summary refs log tree commit diff stats
path: root/results/scraper/box64/2595
diff options
context:
space:
mode:
Diffstat (limited to 'results/scraper/box64/2595')
-rw-r--r--results/scraper/box64/259581
1 files changed, 81 insertions, 0 deletions
diff --git a/results/scraper/box64/2595 b/results/scraper/box64/2595
new file mode 100644
index 000000000..519279af3
--- /dev/null
+++ b/results/scraper/box64/2595
@@ -0,0 +1,81 @@
+Question/Request: About Box64 x87 reduced precison info and comparison vs new "rosettax87" project hack?
+Hi,
+very new to Box32/64 and finding that Box64 supports some kinds of x87 reduced precision..
+
+from docs:
+https://github.com/ptitSeb/box64/blob/main/docs/USAGE.md
+https://github.com/ptitSeb/box64/blob/main/docs/box64.pod
+we have:
+BOX64_X87_NO80BITS
+BOX64_DYNAREC_X87DOUBLE
+
+I see with BOX64_DYNAREC_X87DOUBLE that you even allow/default to single precision using this option..
+0: Try to use float when possible for x87 emulation. [Default]
+1: Only use Double for x87 emulation.
+2: Check Precision Control low precision on x87 emulation.
+
+so question is if you can share what perf can we gain vs non reduced x87 precision on targeted microtests? 
+
+
+I say this because for Rosetta since recently we have project:
+https://github.com/Lifeisawful/rosettax87
+which acceleretes x87 computation in some cases 10X at least on M4:
+https://github.com/Lifeisawful/rosettax87/issues/2
+
+at least on the  simple sample benchmark shared on this project (using x87 for calculating fsqrt seems):
+clang -v -arch x86_64 -mno-sse -mfpmath=387 ./sample/math.c -o ./build/math
+
+at least
+Rosetta M4:
+Average time: 123040 ticks
+
+Rosettax87 M4:
+Average time: 12222 ticks
+
+part of code:
+
+```
+#define TIMES 1000000
+#define RUNS 10
+#define METHOD run_fsqrt
+
+clock_t run_fsqrt() {
+    float sixteen = 16.0f;
+    
+    clock_t start = clock();
+    
+    // Run fsqrt many times to get measurable time
+    float four;
+    for(int i = 0; i < TIMES; i++) {
+        four = __builtin_sqrtf(sixteen);
+    }
+    
+    clock_t end = clock();
+    clock_t time_spent = (end - start);
+    
+    printf("Result: %x\n", *(uint32_t*)&four);
+    return time_spent;
+}
+
+int main() {
+    clock_t times[RUNS];
+    clock_t sum = 0;
+
+    printf("benchmark %s\n", STRINGIFY(METHOD));
+    
+    // Perform multiple runs
+    for(int i = 0; i < RUNS; i++) {
+        times[i] = METHOD();
+        sum += times[i];
+        printf("Run %d time: %lu ticks\n", i+1, times[i]);
+    }
+    
+    // Calculate average using integer math
+    clock_t avg = sum / RUNS;
+    printf("\nAverage time: %lu ticks\n", avg);
+    
+    return 0;
+}
+
+
+```
\ No newline at end of file