summary refs log tree commit diff stats
path: root/classification_output/02/semantic/53568181
diff options
context:
space:
mode:
Diffstat (limited to 'classification_output/02/semantic/53568181')
-rw-r--r--classification_output/02/semantic/5356818179
1 files changed, 0 insertions, 79 deletions
diff --git a/classification_output/02/semantic/53568181 b/classification_output/02/semantic/53568181
deleted file mode 100644
index 2a2342afc..000000000
--- a/classification_output/02/semantic/53568181
+++ /dev/null
@@ -1,79 +0,0 @@
-semantic: 0.943
-instruction: 0.932
-other: 0.921
-boot: 0.876
-mistranslation: 0.854
-
-[BUG] x86/PAT handling severely crippled AMD-V SVM KVM performance
-
-Hi, I maintain an out-of-tree 3D APIs pass-through QEMU device models at
-https://github.com/kjliew/qemu-3dfx
-that provide 3D acceleration for legacy
-32-bit Windows guests (Win98SE, WinME, Win2k and WinXP) with the focus on
-playing old legacy games from 1996-2003. It currently supports the now-defunct
-3Dfx propriety API called Glide and an alternative OpenGL pass-through based on
-MESA implementation.
-
-The basic concept of both implementations create memory-mapped virtual
-interfaces consist of host/guest shared memory with guest-push model instead of
-a more common host-pull model for typical QEMU device model implementation.
-Guest uses shared memory as FIFOs for drawing commands and data to bulk up the
-operations until serialization event that flushes the FIFOs into host. This
-achieves extremely good performance since virtual CPUs are fast with hardware
-acceleration (Intel VT/AMD-V) and reduces the overhead of frequent VMEXITs to
-service the device emulation. Both implementations work on Windows 10 with WHPX
-and HAXM accelerators as well as KVM in Linux.
-
-On Windows 10, QEMU WHPX implementation does not sync MSR_IA32_PAT during
-host/guest states sync. There is no visibility into the closed-source WHPX on
-how things are managed behind the scene, but from measuring performance figures
-I can conclude that it didn't handle the MSR_IA32_PAT correctly for both Intel
-and AMD. Call this fair enough, if you will, it didn't flag any concerns, in
-fact games such as Quake2 and Quake3 were still within playable frame rate of
-40~60FPS on Win2k/XP guest. Until the same games were run on Win98/ME guest and
-the frame rate blew off the roof (300~500FPS) on the same CPU and GPU. In fact,
-the later seemed to be more inlined with runnng the games bare-metal with vsync
-off.
-
-On Linux (at the time of writing kernel 5.6.7/Mesa 20.0), the difference
-prevailed. Intel CPUs (and it so happened that I was on laptop with Intel GPU),
-the VMX-based kvm_intel got it right while SVM-based kvm_amd did not.
-To put this in simple exaggeration, an aging Core i3-4010U/HD Graphics 4400
-(Haswell GT2) exhibited an insane performance in Quake2/Quake3 timedemos that
-totally crushed more recent AMD Ryzen 2500U APU/Vega 8 Graphics and AMD
-FX8300/NVIDIA GT730 on desktop. Simply unbelievable!
-
-It turned out that there was something to do with AMD-V NPT. By loading kvm_amd
-with npt=0, AMD Ryzen APU and FX8300 regained a huge performance leap. However,
-AMD NPT issue with KVM was supposedly fixed in 2017 kernel commits. NPT=0 would
-actually incur performance loss for VM due to intervention required by
-hypervisors to maintain the shadow page tables.  Finally, I was able to find the
-pointer that pointed to MSR_IA32_PAT register. By updating the MSR_IA32_PAT to
-0x0606xxxx0606xxxxULL, AMD CPUs now regain their rightful performance without
-taking the hit of NPT=0 for Linux KVM. Taking the same solution into Windows,
-both Intel and AMD CPUs no longer require Win98/ME guest to unleash the full
-performance potentials and performance figures based on games measured on WHPX
-were not very far behind Linux KVM.
-
-So I guess the problem lies in host/guest shared memory regions mapped as
-uncacheable from virtual CPU perspective. As virtual CPUs now completely execute
-in hardware context with x86 hardware virtualiztion extensions, the cacheability
-of memory types would severely impact the performance on guests. WHPX didn't
-handle it for both Intel EPT and AMD NPT, but KVM seems to do it right for Intel
-EPT. I don't have the correct fix for QEMU. But what I can do for my 3D APIs
-pass-through device models is to implement host-side hooks to reprogram and
-restore MSR_IA32_PAT upon activation/deactivation of the 3D APIs. Perhaps there
-is also a better solution of having the proper kernel drivers for virtual
-interfaces to manage the memory types of host/guest shared memory in kernel
-space, but to do that and the needs of Microsoft tools/DDKs, I will just forget
-it. The guest stubs uses the same kernel drivers included in 3Dfx drivers for
-memory mapping and the virtual interfaces remain driver-less from Windows OS
-perspective. Considering the current state of halting progress for QEMU native
-virgil3D to support Windows OS, I am just being pragmatic. I understand that
-QEMU virgil3D will eventually bring 3D acceleration for Windows guests, but I do
-not expect anything to support legacy 32-bit Windows OSes which have out-grown
-their commercial usefulness.
-
-Regards,
-KJ Liew
-