summary refs log tree commit diff stats
path: root/results/classifier/gemma3:12b/performance/2821
diff options
context:
space:
mode:
Diffstat (limited to 'results/classifier/gemma3:12b/performance/2821')
-rw-r--r--results/classifier/gemma3:12b/performance/282124
1 files changed, 24 insertions, 0 deletions
diff --git a/results/classifier/gemma3:12b/performance/2821 b/results/classifier/gemma3:12b/performance/2821
new file mode 100644
index 00000000..1bd60379
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/2821
@@ -0,0 +1,24 @@
+
+Emulated newer x86 chipsets are noticably slower on cpu-bound loads than "-cpu qemu64"
+Description of problem:
+I noticed that "-cpu qemu64" is much faster than "-cpu max" or "-cpu Icelake-Server-noTSX" for cpu bound loads, and with more than one cpu under load.
+Steps to reproduce:
+1. Run a guest as per "qemu-system-x86_64 -cpu max [..]" command from above. Any linux distro should do.
+2. run through the setup questions if you use Fedora-Server-KVM-41-1.4.x86_64.qcow2 from the example command line above
+3. log into the guest via ssh, i.e. "ssh chris@amd64" here
+4. cd /dev/shm; wget http://archive.apache.org/dist/httpd/httpd-2.4.57.tar.bz2; wget https://fluxcoil.net/files/tmp/job_httpd_extract_cpu.sh
+6. bash ./job_httpd_extract_cpu.sh 4 300
+8. cat /tmp/counter
+
+Step 6 is executing a script which simply uses 4 parallel loops, where each loop runs "bzcat httpd-2.4.57.tar.bz2" constantly. After 300sec, the successful uncompressions over all 4 loops are summed up and stored in /tmp/counter.
+
+- result with "-cpu qemu64": 96
+- result with "-cpu max": 84
+- result with "-cpu Icelake-Server-noTSX": 44
+Additional information:
+- For "-cpu Icelake-Server-noTSX" on this Thinkpad T590 I get these warnings, I think they are not relevant:
+  qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.pcid [bit 17]
+  qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.01H:ECX.tsc-deadline [bit 24]
+  [..]
+- I also looked at Broadwell etc, and all of them seem in the same ballpark.
+  Graph over some emulated architectures: https://fluxcoil.net/files/tmp/gnuplot_cpu-performance-emulated-only.png