add tests and include results

author: Christian Krinitsin <mail@krinitsin.com> 2025-05-30 15:56:20 +0200
committer: Christian Krinitsin <mail@krinitsin.com> 2025-05-30 15:56:20 +0200
commit: 904141bfb8d5385b75eb3b7afec1dcda89af65a7 (patch)
tree: c8b5e69b944c9f8b96dbb5afad6214d0406537b2 /classification
parent: 712310482c3dbef91c3eb6458d1bff82a275fa52 (diff)
download: emulator-bug-study-904141bfb8d5385b75eb3b7afec1dcda89af65a7.tar.gz
emulator-bug-study-904141bfb8d5385b75eb3b7afec1dcda89af65a7.zip
8 files changed, 292 insertions, 0 deletions
diff --git a/classification/test_input/README.md b/classification/test_input/README.md
new file mode 100644
index 00000000..d2dad58c
--- /dev/null
+++ b/classification/test_input/README.md
@@ -0,0 +1,74 @@
+Scoring the text of each file in this directory against the categories 'semantic', 'instruction', 'mistranslation' and 'other' gives the following results (highest score first):
+```
+gitlab_semantic_addsubps
+semantic: 0.974
+instruction: 0.931
+other: 0.732
+mistranslation: 0.299
+
+mail_semantic_vmovdqu
+mistranslation: 0.648
+instruction: 0.622
+other: 0.589
+semantic: 0.463
+
+mail_semantic_1
+instruction: 0.915
+other: 0.684
+semantic: 0.670
+mistranslation: 0.198
+
+mail_other_2
+other: 0.952
+instruction: 0.939
+semantic: 0.879
+mistranslation: 0.862
+
+mail_other_3
+other: 0.919
+instruction: 0.680
+mistranslation: 0.679
+semantic: 0.662
+
+mail_semantic_2
+semantic: 0.997
+instruction: 0.974
+mistranslation: 0.637
+other: 0.177
+
+gitlab_semantic_bzhi
+semantic: 0.920
+instruction: 0.623
+mistranslation: 0.171
+other: 0.064
+
+gitlab_semantic_bextr
+semantic: 0.993
+instruction: 0.944
+mistranslation: 0.337
+other: 0.099
+
+gitlab_semantic_blsi
+semantic: 0.983
+instruction: 0.964
+other: 0.609
+mistranslation: 0.606
+
+gitlab_semantic_adox
+semantic: 0.990
+instruction: 0.944
+mistranslation: 0.452
+other: 0.286
+
+gitlab_semantic_blsmsk
+semantic: 0.987
+instruction: 0.962
+mistranslation: 0.603
+other: 0.269
+
+mail_other_1
+other: 0.927
+semantic: 0.916
+instruction: 0.910
+mistranslation: 0.870
+```
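The score listing in the README above can be checked mechanically. The following is a minimal sketch, not part of the commit: it assumes the listing format shown (a file name line followed by `label: score` lines) and assumes that the middle component of each file name (`gitlab_semantic_bzhi` → `semantic`) is the intended category. The helper names `parse_scores`, `expected_label` and `top_label` are hypothetical.

```python
from typing import Dict


def parse_scores(text: str) -> Dict[str, Dict[str, float]]:
    """Parse blocks of 'name' followed by 'label: score' lines."""
    results: Dict[str, Dict[str, float]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" in line:
            if current is None:
                raise ValueError("score line before any file name")
            label, value = line.split(":", 1)
            results[current][label.strip()] = float(value)
        else:
            # A line without a colon starts a new file block.
            current = line
            results[current] = {}
    return results


def expected_label(name: str) -> str:
    """Assumption: file names look like <source>_<label>_<suffix>."""
    return name.split("_")[1]


def top_label(scores: Dict[str, float]) -> str:
    """Return the category with the highest score."""
    return max(scores, key=scores.get)


# Two entries copied from the README listing.
SAMPLE = """\
gitlab_semantic_bzhi
semantic: 0.920
instruction: 0.623
mistranslation: 0.171
other: 0.064

mail_semantic_vmovdqu
mistranslation: 0.648
instruction: 0.622
other: 0.589
semantic: 0.463
"""

if __name__ == "__main__":
    for name, scores in parse_scores(SAMPLE).items():
        verdict = "match" if top_label(scores) == expected_label(name) else "mismatch"
        print(f"{name}: expected={expected_label(name)} top={top_label(scores)} ({verdict})")
```

Under that file-name assumption, `gitlab_semantic_bzhi` is a match, while `mail_semantic_vmovdqu` is a mismatch because `mistranslation` scores highest.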
diff --git a/classification/test_input/gitlab_semantic_addsubps b/classification/test_input/gitlab_semantic_addsubps
new file mode 100644
index 00000000..60438ff0
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_addsubps
@@ -0,0 +1,23 @@
+x86 SSE/SSE2/SSE3 instruction semantic bugs with NaN
+
+Description of problem
+The result of SSE/SSE2/SSE3 instructions with NaN is different from the CPU. From Intel manual Volume 1 Appendix D.4.2.2, they defined the behavior of such instructions with NaN. But I think QEMU did not implement this semantic exactly because the byte result is different.
+
+Steps to reproduce
+
+Compile this code
+
+void main() {
+    asm("mov rax, 0x000000007fffffff; push rax; mov rax, 0x00000000ffffffff; push rax; movdqu XMM1, [rsp];");
+    asm("mov rax, 0x2e711de7aa46af1a; push rax; mov rax, 0x7fffffff7fffffff; push rax; movdqu XMM2, [rsp];");
+    asm("addsubps xmm1, xmm2");
+}
+
+Execute and compare the result with the CPU. This problem happens with other SSE/SSE2/SSE3 instructions specified in the manual, Volume 1 Appendix D.4.2.2.
+
+CPU xmm1[3] = 0xffffffff
+
+QEMU xmm1[3] = 0x7fffffff
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
diff --git a/classification/test_input/gitlab_semantic_adox b/classification/test_input/gitlab_semantic_adox
new file mode 100644
index 00000000..9f4471c9
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_adox
@@ -0,0 +1,36 @@
+x86 ADOX and ADCX semantic bug
+Description of problem
+The result of instruction ADOX and ADCX are different from the CPU. The value of one of EFLAGS is different.
+
+Steps to reproduce
+
+Compile this code
+
+
+void main() {
+    asm("push 512; popfq;");
+    asm("mov rax, 0xffffffff84fdbf24");
+    asm("mov rbx, 0xb197d26043bec15d");
+    asm("adox eax, ebx");
+}
+
+
+
+Execute and compare the result with the CPU. This problem happens with ADCX, too (with CF).
+
+CPU
+
+OF = 0
+
+
+QEMU
+
+OF = 1
+
+
+
+
+
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
diff --git a/classification/test_input/gitlab_semantic_bextr b/classification/test_input/gitlab_semantic_bextr
new file mode 100644
index 00000000..dabe16ac
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_bextr
@@ -0,0 +1,25 @@
+x86 BEXTR semantic bug
+Description of problem
+The result of instruction BEXTR is different with from the CPU. The value of destination register is different. I think QEMU does not consider the operand size limit.
+
+Steps to reproduce
+
+Compile this code
+
+void main() {
+    asm("mov rax, 0x17b3693f77fb6e9");
+    asm("mov rbx, 0x8f635a775ad3b9b4");
+    asm("mov rcx, 0xb717b75da9983018");
+    asm("bextr eax, ebx, ecx");
+}
+
+Execute and compare the result with the CPU.
+
+CPU
+RAX = 0x5a
+
+QEMU
+RAX = 0x635a775a
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
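The BEXTR report above attributes the difference to the operand size limit. A small Python model makes the arithmetic concrete. It is a sketch under assumptions: `bextr` follows the documented behaviour (start in control bits 7:0, length in bits 15:8, source zero-extended from the operand size), and the "untruncated" variant is only one hypothetical way to arrive at the value reported for QEMU, not a description of QEMU's actual code.

```python
MASK32 = 0xFFFFFFFF


def bextr(src: int, control: int, operand_bits: int) -> int:
    """Model of BEXTR: extract `length` bits of `src` starting at bit `start`."""
    src &= (1 << operand_bits) - 1      # source is limited to the operand size
    start = control & 0xFF              # control[7:0]
    length = (control >> 8) & 0xFF      # control[15:8]
    return (src >> start) & ((1 << length) - 1)


RBX = 0x8F635A775AD3B9B4
RCX = 0xB717B75DA9983018

# 32-bit form, as in "bextr eax, ebx, ecx": start = 0x18 (24), length = 0x30 (48).
cpu_like = bextr(RBX, RCX, 32)

# Hypothetical faulty variant: keep all 64 source bits, truncate only the result.
untruncated = bextr(RBX, RCX, 64) & MASK32

if __name__ == "__main__":
    print(hex(cpu_like))     # 0x5a
    print(hex(untruncated))  # 0x635a775a
```

With the source limited to 32 bits the model yields the value reported for the CPU; skipping that truncation yields the value reported for QEMU.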
diff --git a/classification/test_input/gitlab_semantic_blsi b/classification/test_input/gitlab_semantic_blsi
new file mode 100644
index 00000000..92ff92b0
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_blsi
@@ -0,0 +1,20 @@
+x86 BLSI and BLSR semantic bug
+Description of problem
+The result of instruction BLSI and BLSR is different from the CPU. The value of CF is different.
+
+Steps to reproduce
+
+Compile this code
+
+
+void main() {
+    asm("blsi rax, rbx");
+}
+
+
+
+Execute and compare the result with the CPU. The value of CF is exactly the opposite. This problem happens with BLSR, too.
+
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
diff --git a/classification/test_input/gitlab_semantic_blsmsk b/classification/test_input/gitlab_semantic_blsmsk
new file mode 100644
index 00000000..b950faa2
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_blsmsk
@@ -0,0 +1,27 @@
+x86 BLSMSK semantic bug
+Description of problem
+The result of instruction BLSMSK is different with from the CPU. The value of CF is different.
+
+Steps to reproduce
+
+Compile this code
+
+void main() {
+    asm("mov rax, 0x65b2e276ad27c67");
+    asm("mov rbx, 0x62f34955226b2b5d");
+    asm("blsmsk eax, ebx");
+}
+
+Execute and compare the result with the CPU.
+
+CPU
+
+CF = 0
+
+
+QEMU
+
+CF = 1
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
diff --git a/classification/test_input/gitlab_semantic_bzhi b/classification/test_input/gitlab_semantic_bzhi
new file mode 100644
index 00000000..b86da08c
--- /dev/null
+++ b/classification/test_input/gitlab_semantic_bzhi
@@ -0,0 +1,38 @@
+x86 BZHI semantic bug
+Description of problem
+The result of instruction BZHI is different from the CPU. The value of destination register and SF of EFLAGS are different.
+
+Steps to reproduce
+
+Compile this code
+
+
+void main() {
+    asm("mov rax, 0xb1aa9da2fe33fe3");
+    asm("mov rbx, 0x80000000ffffffff");
+    asm("mov rcx, 0xf3fce8829b99a5c6");
+    asm("bzhi rax, rbx, rcx");
+}
+
+
+
+Execute and compare the result with the CPU.
+
+CPU
+
+RAX = 0x0x80000000ffffffff
+SF = 1
+
+
+QEMU
+
+RAX = 0xffffffff
+SF = 0
+
+
+
+
+
+
+Additional information
+This bug is discovered by research conducted by KAIST SoftSec.
diff --git a/classification/test_input/mail_semantic_vmovdqu b/classification/test_input/mail_semantic_vmovdqu
new file mode 100644
index 00000000..49b1da50
--- /dev/null
+++ b/classification/test_input/mail_semantic_vmovdqu
@@ -0,0 +1,49 @@
+AVX instruction VMOVDQU implementation error for YMM registers
+Bug Description
+Hi,
+
+Tested with Qemu 4.2.0, and with git version bddff6f6787c916b0e9d63ef9e4d442114257739.
+
+The x86 AVX instruction VMOVDQU doesn't work properly with YMM registers (32 bytes).
+It works with XMM registers (16 bytes) though.
+
+See the attached test case `ymm.c`: when copying from memory-to-ymm0 and then back from ymm0-to-memory using VMOVDQU, Qemu only copies the first 16 of the total 32 bytes.
+
+```
+user@ubuntu ~/Qemu % gcc -o ymm ymm.c -Wall -Wextra -Werror
+
+user@ubuntu ~/Qemu % ./ymm
+00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
+
+user@ubuntu ~/Qemu % ./x86_64-linux-user/qemu-x86_64 -cpu max ymm
+00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+```
+
+This seems to be because in `translate.c > gen_sse()`, the case handling the VMOVDQU instruction calls `gen_ldo_env_A0` which always performs a 16 bytes copy using two 8 bytes load and store operations (with `tcg_gen_qemu_ld_i64` and `tcg_gen_st_i64`).
+
+Instead, the `gen_ldo_env_A0` function should generate a copy with a size corresponding to the used register.
+
+```
+static void gen_sse(CPUX86State *env, DisasContext *s, int b,
+                    target_ulong pc_start, int rex_r)
+{
+        [...]
+        case 0x26f: /* movdqu xmm, ea */
+            if (mod != 3) {
+                gen_lea_modrm(env, s, modrm);
+                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            } else {
+        [...]
+```
+
+```
+static inline void gen_ldo_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
+}
+```
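The VMOVDQU report above describes a load that always copies two 8-byte quadwords, regardless of register width. The effect can be modelled without any emulator. This is an illustrative Python sketch only: `copy_quadwords` is a hypothetical stand-in for the load helper, and the 32-byte buffer stands in for the memory operand of the `ymm.c` test case, which is not part of this commit.

```python
def copy_quadwords(memory: bytes, n_quadwords: int, register_bytes: int = 32) -> bytes:
    """Copy n 8-byte quadwords from memory into a zeroed register image."""
    reg = bytearray(register_bytes)
    for i in range(n_quadwords):
        reg[8 * i:8 * i + 8] = memory[8 * i:8 * i + 8]
    return bytes(reg)


def hexdump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


memory = bytes(range(32))  # 00 01 ... 1F, as in the report's output

if __name__ == "__main__":
    # Full YMM-width copy: four quadwords, all 32 bytes arrive.
    print(hexdump(copy_quadwords(memory, 4)))
    # Fixed 16-byte copy: two quadwords, the upper half stays zero.
    print(hexdump(copy_quadwords(memory, 2)))
```

The second line printed has the same shape as the output the report shows under `qemu-x86_64`: the first 16 bytes are copied and the remaining 16 are zero.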