diff options
Diffstat (limited to 'results/scraper/fex/4548')
| -rw-r--r-- | results/scraper/fex/4548 | 78 |
1 files changed, 78 insertions, 0 deletions
diff --git a/results/scraper/fex/4548 b/results/scraper/fex/4548 new file mode 100644 index 000000000..ff7b7235e --- /dev/null +++ b/results/scraper/fex/4548 @@ -0,0 +1,78 @@ +Question/Request: FEX x87 reduced precison info and comparison vs new "rosettax87" project hack? +Hi, +just very new to FEX and finding that FEX supports some kind of x87 reduced precision.. + +searching google I find references everywhere but not much info: +a)https://www.reddit.com/r/AsahiGaming/comments/1hv1cix/games_that_perform_better_with_reduced_x87/ +b)https://fex-emu.com/FEX-2310/: +"Until then it is recommended to use FEX’s x87 reduced precision mode to try and alleviate some of the overhead." +c)https://fex-emu.com/FEX-2503/ +"This allows to to see that maybe we should try enabling the x87 reduced precision option to get some performance back." + +and in some forum I found: +FEX_X87REDUCEDPRECISION=1 +so that's the enviroment variable to add, right? + +also what perf can we gain vs non reduced on targeted microtests? what kind of reduction precision loss? +sorry if I don't know where to search for docs.. + +I say this because for Rosetta since recently we have project: +https://github.com/Lifeisawful/rosettax87 +which acceleretes x87 computation in some cases 10X at least on M4: +https://github.com/Lifeisawful/rosettax87/issues/2 + +at least on this simple shared benchmark calculation fsqrt seems: +clang -v -arch x86_64 -mno-sse -mfpmath=387 ./sample/math.c -o ./build/math + +at least +Rosetta M4: + Average time: 123040 ticks + +Rosettax87 M4: + Average time: 12222 ticks + + +part of code: +``` +#define TIMES 1000000 +#define RUNS 10 +#define METHOD run_fsqrt + +clock_t run_fsqrt() { + float sixteen = 16.0f; + + clock_t start = clock(); + + // Run fsqrt many times to get measurable time + float four; + for(int i = 0; i < TIMES; i++) { + four = __builtin_sqrtf(sixteen); + } + + clock_t end = clock(); + clock_t time_spent = (end - start); + + printf("Result: %x\n", *(uint32_t*)&four); + return time_spent; +} + +int main() { + clock_t times[RUNS]; + clock_t sum = 0; + + printf("benchmark %s\n", STRINGIFY(METHOD)); + + // Perform multiple runs + for(int i = 0; i < RUNS; i++) { + times[i] = METHOD(); + sum += times[i]; + printf("Run %d time: %lu ticks\n", i+1, times[i]); + } + + // Calculate average using integer math + clock_t avg = sum / RUNS; + printf("\nAverage time: %lu ticks\n", avg); + + return 0; +} +``` \ No newline at end of file |