Diffstat (limited to 'results/scraper/box64/2595')
| -rw-r--r-- | results/scraper/box64/2595 | 81 |
1 files changed, 81 insertions, 0 deletions
diff --git a/results/scraper/box64/2595 b/results/scraper/box64/2595
new file mode 100644
index 000000000..519279af3
--- /dev/null
+++ b/results/scraper/box64/2595
@@ -0,0 +1,81 @@
+Question/Request: About Box64 x87 reduced precison info and comparison vs new "rosettax87" project hack?
+Hi,
+very new to Box32/64 and finding that Box64 supports some kinds of x87 reduced precision..
+
+from docs:
+https://github.com/ptitSeb/box64/blob/main/docs/USAGE.md
+https://github.com/ptitSeb/box64/blob/main/docs/box64.pod
+we have:
+BOX64_X87_NO80BITS
+BOX64_DYNAREC_X87DOUBLE
+
+I see with BOX64_DYNAREC_X87DOUBLE that you even allow/default to single precision using this option..
+0: Try to use float when possible for x87 emulation. [Default]
+1: Only use Double for x87 emulation.
+2: Check Precision Control low precision on x87 emulation.
+
+so question is if you can share what perf can we gain vs non reduced x87 precision on targeted microtests?
+
+
+I say this because for Rosetta since recently we have project:
+https://github.com/Lifeisawful/rosettax87
+which acceleretes x87 computation in some cases 10X at least on M4:
+https://github.com/Lifeisawful/rosettax87/issues/2
+
+at least on the simple sample benchmark shared on this project (using x87 for calculating fsqrt seems):
+clang -v -arch x86_64 -mno-sse -mfpmath=387 ./sample/math.c -o ./build/math
+
+at least
+Rosetta M4:
+Average time: 123040 ticks
+
+Rosettax87 M4:
+Average time: 12222 ticks
+
+part of code:
+
+```
+#define TIMES 1000000
+#define RUNS 10
+#define METHOD run_fsqrt
+
+clock_t run_fsqrt() {
+    float sixteen = 16.0f;
+
+    clock_t start = clock();
+
+    // Run fsqrt many times to get measurable time
+    float four;
+    for(int i = 0; i < TIMES; i++) {
+        four = __builtin_sqrtf(sixteen);
+    }
+
+    clock_t end = clock();
+    clock_t time_spent = (end - start);
+
+    printf("Result: %x\n", *(uint32_t*)&four);
+    return time_spent;
+}
+
+int main() {
+    clock_t times[RUNS];
+    clock_t sum = 0;
+
+    printf("benchmark %s\n", STRINGIFY(METHOD));
+
+    // Perform multiple runs
+    for(int i = 0; i < RUNS; i++) {
+        times[i] = METHOD();
+        sum += times[i];
+        printf("Run %d time: %lu ticks\n", i+1, times[i]);
+    }
+
+    // Calculate average using integer math
+    clock_t avg = sum / RUNS;
+    printf("\nAverage time: %lu ticks\n", avg);
+
+    return 0;
+}
+
+
+```
\ No newline at end of file