SYSRET instruction incorrectly implemented

The Intel architecture manual states that when returning to user mode, the SYSRET instruction reloads the stack selector (%ss) from the IA32_STAR model-specific register using the following logic:

SS.Selector <-- (IA32_STAR[63:48]+8) OR 3; (* RPL forced to 3 *)

Another description of the instruction behavior which shows the same logic in a slightly different form can also be found here:

http://tptp.cc/mirrors/siyobik.info/instruction/SYSRET.html

[...]
        SS(SEL) = IA32_STAR[63:48] + 8;
        SS(PL) = 0x3;
[...]

In other words, the value of the %ss register is supposed to be loaded from bits 63:48 of the IA32_STAR model-specific register, incremented by 8, and then ORed with 3. ORing in the 3 sets the requested privilege level (RPL, the low two bits of the selector) to 3 (user). This is done because SYSRET returns to user mode after a system call.
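
As a rough illustration of that computation, here is a minimal standalone C sketch. It is not QEMU code: the function name sysret_ss_selector() is hypothetical, and the example value 0x20 for IA32_STAR[63:48] is only inferred from the %ss values reported further down (0x28 - 8 = 0x20).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): derive the %ss selector that
 * SYSRET is documented to load, given a raw IA32_STAR value.
 */
static uint16_t sysret_ss_selector(uint64_t star)
{
    uint16_t base = (uint16_t)(star >> 48);   /* IA32_STAR[63:48] */

    /* Add 8 to reach the stack segment entry, then force RPL to 3. */
    return (uint16_t)((base + 8) | 3);
}
```

With IA32_STAR[63:48] set to 0x20, sysret_ss_selector((uint64_t)0x20 << 48) yields 0x2b. Dropping the "| 3" would yield 0x28 instead.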

However, helper_sysret() in target-i386/seg_helper.c does not do the "OR 3" step. The code looks like this:

        cpu_x86_load_seg_cache(env, R_SS, selector + 8,
                               0, 0xffffffff,
                               DESC_G_MASK | DESC_B_MASK | DESC_P_MASK |
                               DESC_S_MASK | (3 << DESC_DPL_SHIFT) |
                               DESC_W_MASK | DESC_A_MASK);

It should look like this:

        cpu_x86_load_seg_cache(env, R_SS, (selector + 8) | 3,
                               0, 0xffffffff,
                               DESC_G_MASK | DESC_B_MASK | DESC_P_MASK |
                               DESC_S_MASK | (3 << DESC_DPL_SHIFT) |
                               DESC_W_MASK | DESC_A_MASK);

The code does correctly set the privilege level bits for the code selector register (%cs) but not for the stack selector (%ss).

The effect of this is that when SYSRET returns control to the user-mode caller, %ss will have the privilege level bits cleared. In my case, it went from 0x2b to 0x28. This caused a crash later: when the user-mode code was preempted by an interrupt and the interrupt handler executed an IRET, a general protection fault occurred because the %ss value being loaded from the exception frame was not valid for user mode. (At least, I think that's what happened.)
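
To make the suspected failure mode concrete, the following C sketch decodes the RPL bits of the two %ss values and models one of the checks IRET performs when returning to an outer privilege level: the RPL of the stack selector must match the RPL of the return code selector. This is a simplified model with hypothetical function names, not the actual IRET logic of QEMU or of any CPU, and the user %cs value 0x33 used in the note below is an assumption (IA32_STAR[63:48] + 16, ORed with 3, for a base of 0x20).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEL_RPL_MASK 0x3u   /* low two bits of a selector hold the RPL */

/* Extract the requested privilege level from a segment selector. */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & SEL_RPL_MASK;
}

/*
 * Simplified model of a single IRET check (hypothetical helper):
 * returns true when the %ss RPL equals the %cs RPL, false when the
 * mismatch would raise a general protection fault.
 */
static bool iret_ss_rpl_ok(uint16_t cs, uint16_t ss)
{
    return selector_rpl(ss) == selector_rpl(cs);
}
```

Under these assumptions, selector_rpl(0x2b) is 3 and selector_rpl(0x28) is 0, so iret_ss_rpl_ok(0x33, 0x2b) holds while iret_ss_rpl_ok(0x33, 0x28) does not, which matches the fault described above.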

This behavior seems inconsistent with real hardware, and also appears to be wrong with respect to the Intel documentation, so I'm pretty confident in calling this a bug. :)

Note that this issue seems to have been around for a long time. I discovered it while using QEMU 2.2.0, but I happened to have the sources for QEMU 0.10.5, and the problem is there too (in os_helper.c). I am using FreeBSD/amd64 9.1-RELEASE as my host system, without KVM.

The fix is fairly simple. I'm attaching a patch which worked for me. Using this fix, the code that I'm testing now behaves the same on the QEMU virtual machine as on real hardware.

- Bill (<email address hidden>)