Diffstat (limited to 'results/classifier/108/other/1888923')
| -rw-r--r-- | results/classifier/108/other/1888923 | 403 |
1 files changed, 403 insertions, 0 deletions
diff --git a/results/classifier/108/other/1888923 b/results/classifier/108/other/1888923
new file mode 100644
index 00000000..27a7dbc5
--- /dev/null
+++ b/results/classifier/108/other/1888923
@@ -0,0 +1,403 @@

permissions: 0.757
other: 0.751
graphic: 0.736
semantic: 0.729
debug: 0.701
performance: 0.679
PID: 0.649
device: 0.634
KVM: 0.623
boot: 0.600
vnc: 0.595
network: 0.579
socket: 0.573
files: 0.475

Configured memory access latency and bandwidth not taking effect

I was trying to configure latencies and bandwidths between nodes in a NUMA emulation using QEMU 5.0.0.

Host : Ubuntu 20.04, 64-bit
Guest: Ubuntu 18.04, 64-bit

The configured machine has 2 NUMA nodes. Each node has 2 CPUs and is allocated 3 GB of memory. The memory access latency and bandwidth for local accesses (i.e. from initiator 0 to target 0, and from initiator 1 to target 1) are set to 40 ns and 10 GB/s, respectively. The latency and bandwidth for remote accesses (i.e. from initiator 1 to target 0, and from initiator 0 to target 1) are set to 80 ns and 5 GB/s, respectively.

The launch command line is as follows:

sudo x86_64-softmmu/qemu-system-x86_64 \
-machine hmat=on \
-boot c \
-enable-kvm \
-m 6G,slots=2,maxmem=7G \
-object memory-backend-ram,size=3G,id=m0 \
-object memory-backend-ram,size=3G,id=m1 \
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1,memdev=m1 \
-smp 4,sockets=4,maxcpus=4 \
-numa cpu,node-id=0,socket-id=0 \
-numa cpu,node-id=0,socket-id=1 \
-numa cpu,node-id=1,socket-id=2 \
-numa cpu,node-id=1,socket-id=3 \
-numa dist,src=0,dst=1,val=20 \
-net nic \
-net user \
-hda testing.img \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-latency,latency=40 \
-numa hmat-lb,initiator=0,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,latency=80 \
-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \
-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-latency,latency=80 \
-numa hmat-lb,initiator=1,target=0,hierarchy=memory,data-type=access-bandwidth,bandwidth=5G \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-latency,latency=40 \
-numa hmat-lb,initiator=1,target=1,hierarchy=memory,data-type=access-bandwidth,bandwidth=10G

The latencies and bandwidths between the nodes were then measured with the Intel Memory Latency Checker v3.9 (https://software.intel.com/content/www/us/en/develop/articles/intelr-memory-latency-checker.html), but the results did not match the configuration. The results obtained were:

Latency matrix, idle latencies (in ns)

Numa Node        0        1
        0     36.2     36.4
        1     34.9     35.4

Bandwidth matrix, memory bandwidths (in MB/s)

Numa Node        0        1
        0  15167.1  15308.9
        1  15226.0  15234.0

A test with lat_mem_rd from lmbench, measuring memory read latencies, also gave results that did not match the configuration.

Any information on why the configured latency and bandwidth values are not applied would be appreciated.

On Sat, 25 Jul 2020 07:26:38 -0000
Vishnu Dixit <email address hidden> wrote:

> Public bug reported:
>
> I was trying to configure latencies and bandwidths between nodes in a
> NUMA emulation using QEMU 5.0.0.
> [...]

There is no information about the host hardware, so I can only hazard a
guess that the host is a non-NUMA machine. In that case all guest RAM and
CPUs are in the same latency domain, which is why you are seeing pretty
much the same timings everywhere.

Neither QEMU nor KVM simulates hardware latencies at all. Everything
configured with '-numa hmat-lb' is intended for guest OS consumption, as a
hint for smarter memory allocation, and it is up to the user to pin vCPUs
and RAM to concrete host NUMA nodes and to use the host's real values in
'-numa hmat-lb' to actually get a performance benefit from it on a NUMA
machine. On a non-NUMA host it is rather pointless, except for cases where
one needs to fake a NUMA configuration (e.g. to test NUMA-related code in
the guest OS).
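To illustrate the pinning described above, here is a minimal sketch, not part of the original report: it assumes a host with at least two NUMA nodes, at least 3G free on each, and uses placeholder host CPU numbers and vCPU thread IDs that must be replaced with the real host topology.

# Bind each guest memory backend to a concrete host NUMA node
# ('host-nodes' and 'policy' are standard memory-backend properties):
-object memory-backend-ram,size=3G,id=m0,host-nodes=0,policy=bind,prealloc=on \
-object memory-backend-ram,size=3G,id=m1,host-nodes=1,policy=bind,prealloc=on \
-numa node,nodeid=0,memdev=m0 \
-numa node,nodeid=1,memdev=m1 \

# vCPU pinning happens outside QEMU: read the vCPU thread IDs from the
# monitor ('info cpus') or QMP ('query-cpus-fast'), then pin each thread
# to host CPUs that belong to the matching host node, e.g.:
taskset -cp 0 <thread_id_of_vcpu0>   # vCPU 0 -> a host CPU on host node 0
taskset -cp 1 <thread_id_of_vcpu1>   # vCPU 1 -> a host CPU on host node 0
taskset -cp 8 <thread_id_of_vcpu2>   # vCPU 2 -> a host CPU on host node 1
taskset -cp 9 <thread_id_of_vcpu3>   # vCPU 3 -> a host CPU on host node 1

With libvirt the same effect is usually achieved declaratively via <numatune> and <vcpupin> rather than manual taskset calls.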
>
> ** Affects: qemu
>      Importance: Undecided
>          Status: New
>
> ** Tags: bandwidth hmat hmat-lb latency
It indeed was a non-NUMA machine.
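For completeness: even on a non-NUMA host the '-numa hmat-lb' values are still emitted into the guest's ACPI HMAT as hints, so they can be inspected from inside the guest rather than measured. A minimal sketch, assuming acpica-tools is installed in the guest and, for the sysfs part, a guest kernel new enough to expose the HMAT-derived node access attributes (roughly 5.2+):

# Dump and disassemble the HMAT table that QEMU generated:
sudo acpidump -n HMAT -b     # writes hmat.dat
iasl -d hmat.dat             # produces hmat.dsl with the latency/bandwidth entries

# On sufficiently new guest kernels the best-case values also appear in sysfs,
# e.g. for node 1 as seen from its nearest initiator (latency in ns, bandwidth in MB/s):
cat /sys/devices/system/node/node1/access0/initiators/read_latency
cat /sys/devices/system/node/node1/access0/initiators/read_bandwidth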