author     Christian Krinitsin <mail@krinitsin.com>  2025-07-03 07:27:52 +0000
committer  Christian Krinitsin <mail@krinitsin.com>  2025-07-03 07:27:52 +0000
commit     d0c85e36e4de67af628d54e9ab577cc3fad7796a (patch)
tree       f8f784b0f04343b90516a338d6df81df3a85dfa2 /results/classifier/gemma3:12b/performance/1888923
parent     7f4364274750eb8cb39a3e7493132fca1c01232e (diff)
add deepseek and gemma results
Diffstat (limited to 'results/classifier/gemma3:12b/performance/1888923')
-rw-r--r--  results/classifier/gemma3:12b/performance/1888923  58
1 file changed, 58 insertions(+), 0 deletions(-)
diff --git a/results/classifier/gemma3:12b/performance/1888923 b/results/classifier/gemma3:12b/performance/1888923
new file mode 100644
index 000000000..dbaaee4ed
--- /dev/null
+++ b/results/classifier/gemma3:12b/performance/1888923
@@ -0,0 +1,58 @@
+
+Configured Memory access latency and bandwidth not taking effect
+
+I was trying to configure the latencies and bandwidths between nodes of an emulated NUMA machine using QEMU 5.0.0.
+
+Host:  Ubuntu 20.04, 64-bit
+Guest: Ubuntu 18.04, 64-bit
+
+The configured machine has two nodes, each with 2 CPUs and 3 GB of memory. Local accesses (i.e. from initiator 0 to target 0 and from initiator 1 to target 1) are configured with a latency of 40 ns and a bandwidth of 10 GB/s; remote accesses (i.e. from initiator 0 to target 1 and from initiator 1 to target 0) with a latency of 80 ns and a bandwidth of 5 GB/s.
+
+The command line used to launch the machine is as follows.
+
+sudo x86_64-softmmu/qemu-system-x86_64 \
+-machine hmat=on \
+-boot c \
+-enable-kvm \
+-m 6G,slots=2,maxmem=7G \
+-object memory-backend-ram,size=3G,id=m0 \
+-object memory-backend-ram,size=3G,id=m1 \
+-numa node,nodeid=0,memdev=m0 \
+-numa node,nodeid=1,memdev=m1 \
+-smp 4,sockets=4,maxcpus=4 \
+-numa cpu,node-id=0,socket-id=0 \
+-numa cpu,node-id=0,socket-id=1 \
+-numa cpu,node-id=1,socket-id=2 \
+-numa cpu,node-id=1,socket-id=3 \
+-numa dist,src=0,dst=1,val=20 \
+-net nic \
+-net user \
+-hda testing.img \
+-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=40 \
+-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G \
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=80 \
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \
+-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=80 \
+-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \
+-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=40 \
+-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G
+
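+As a sanity check inside the guest, the NUMA layout can be inspected with numactl, and on a guest kernel new enough to parse HMAT (roughly v5.1 and later; stock Ubuntu 18.04 ships 4.15) the configured values should appear in sysfs. A minimal check, assuming numactl is installed in the guest:
+
+numactl --hardware
+cat /sys/devices/system/node/node0/access0/initiators/read_latency
+cat /sys/devices/system/node/node0/access0/initiators/read_bandwidth
+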
+The latencies and bandwidths between the nodes were then measured with the Intel Memory Latency Checker v3.9 (https://software.intel.com/content/www/us/en/develop/articles/intelr-memory-latency-checker.html), but the measured values did not match the configuration. The results were as follows.
+
+Latency matrix, idle latencies (in ns):
+
+         Node 0   Node 1
+Node 0     36.2     36.4
+Node 1     34.9     35.4
+
+Bandwidth matrix, memory bandwidths (in MB/s):
+
+          Node 0    Node 1
+Node 0   15167.1   15308.9
+Node 1   15226.0   15234.0
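+
+(These matrices correspond to MLC's matrix modes; the invocation was presumably along the following lines. Root is needed because MLC adjusts the hardware prefetchers.)
+
+sudo ./mlc --latency_matrix
+sudo ./mlc --bandwidth_matrix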
+
+Memory read latencies were also measured with the lat_mem_rd tool from lmbench, and these results did not match the configuration either.
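+
+A typical cross-node measurement, assuming numactl is available in the guest, pins lat_mem_rd to one node's CPUs while allocating memory from the other node (here a 512 MB array with a 128-byte stride):
+
+numactl --cpunodebind=0 --membind=1 lat_mem_rd 512 128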
+
+Any information on why the configured latency and bandwidth values are not being applied would be appreciated.
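+
+For completeness, one way to confirm that QEMU emitted an HMAT at all is to dump the ACPI table inside the guest and disassemble it (assuming the acpica-tools package is installed):
+
+sudo acpidump -n HMAT -b
+iasl -d hmat.dat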
\ No newline at end of file