author Christian Krinitsin <mail@krinitsin.com> 2025-06-30 12:34:26 +0000
committer Christian Krinitsin <mail@krinitsin.com> 2025-06-30 12:35:44 +0000
commit 25f8033d556aa17afaea4a5196ea7a69fe248320
tree 0f056db167683be54ea1e5e72d29d6069af55e7d /results/classifier/deepseek-2-tmp/output/hypervisor/1856335
parent 8e6da29e4ee5fc14bc1cc816a24f21271f14090d
add new temporary deepseek-r1:14b results
Diffstat (limited to 'results/classifier/deepseek-2-tmp/output/hypervisor/1856335')
-rw-r--r-- results/classifier/deepseek-2-tmp/output/hypervisor/1856335 | 36
1 files changed, 36 insertions, 0 deletions
diff --git a/results/classifier/deepseek-2-tmp/output/hypervisor/1856335 b/results/classifier/deepseek-2-tmp/output/hypervisor/1856335
new file mode 100644
index 000000000..525d3b205
--- /dev/null
+++ b/results/classifier/deepseek-2-tmp/output/hypervisor/1856335
@@ -0,0 +1,36 @@
+
+Cache Layout wrong on many Zen Arch CPUs
+
+AMD CPUs have one L3 cache per 2, 3 or 4 cores. Currently, TOPOEXT seems to always map the cache as if it were a 4-core-per-CCX CPU, which is incorrect and costs upwards of 30% performance (more realistically 10%) in applications that are aware of the L3 cache layout.
+
+Example on a CPU with 4 cores per CCX (1950X w/ 8 cores and no SMT):
+
+  <cpu mode='custom' match='exact' check='full'>
+    <model fallback='forbid'>EPYC-IBPB</model>
+    <vendor>AMD</vendor>
+    <topology sockets='1' cores='8' threads='1'/>
+
+In Windows, coreinfo reports correctly:
+
+****----  Unified Cache 1, Level 3,    8 MB, Assoc  16, LineSize  64
+----****  Unified Cache 6, Level 3,    8 MB, Assoc  16, LineSize  64
+
+On a CPU with 3 cores per CCX (3960X w/ 6 cores and no SMT):
+
+ <cpu mode='custom' match='exact' check='full'>
+    <model fallback='forbid'>EPYC-IBPB</model>
+    <vendor>AMD</vendor>
+    <topology sockets='1' cores='6' threads='1'/>
+
+In Windows, coreinfo reports incorrectly:
+
+****--  Unified Cache  1, Level 3,    8 MB, Assoc  16, LineSize  64
+----**  Unified Cache  6, Level 3,    8 MB, Assoc  16, LineSize  64
+
+
+Validated against 3.0, 3.1, 4.1 and 4.2 versions of qemu-kvm. 
+
+With newer QEMU, there is a fix (that does behave correctly) using the dies parameter:
+ <qemu:arg value='cores=3,threads=1,dies=2,sockets=1'/>
+
+The problem is that the dies are exposed differently from how AMD does it natively: they are exposed to Windows as sockets, which means you can never have a machine with more than two CCXs (6 cores), as Windows only supports two sockets. (Should this be reported as a separate bug?)
\ No newline at end of file
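
The L3 grouping described in the report can be illustrated with a minimal sketch. It assumes cores are numbered consecutively and each CCX owns one L3 cache; the function name `l3_sharing_masks` is hypothetical and is not part of QEMU or coreinfo. It only reproduces the coreinfo-style masks quoted in the report.

```python
def l3_sharing_masks(cores: int, cores_per_ccx: int) -> list[str]:
    """Return one coreinfo-style mask per L3 cache (one cache per CCX).

    '*' marks a core that shares the cache, '-' marks a core that does not.
    Assumption: cores are numbered consecutively, CCX by CCX.
    """
    masks = []
    for start in range(0, cores, cores_per_ccx):
        end = min(start + cores_per_ccx, cores)
        masks.append("-" * start + "*" * (end - start) + "-" * (cores - end))
    return masks


if __name__ == "__main__":
    # Layout a 6-core guest with 3 cores per CCX should see.
    print("\n".join(l3_sharing_masks(6, 3)))
    # Layout obtained when 4 cores per CCX is assumed instead.
    print("\n".join(l3_sharing_masks(6, 4)))
```

Under these assumptions, calling the function with 4 cores per CCX for a 6-core guest yields the same `****--` / `----**` masks the report lists as incorrect, while 3 cores per CCX yields two 3-core groups.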