Diffstat (limited to 'results/classifier/zero-shot/007/permissions')
-rw-r--r--  results/classifier/zero-shot/007/permissions/12360755  306
-rw-r--r--  results/classifier/zero-shot/007/permissions/14488057  721
-rw-r--r--  results/classifier/zero-shot/007/permissions/14887122  268
-rw-r--r--results/classifier/zero-shot/007/permissions/16056596108
-rw-r--r--results/classifier/zero-shot/007/permissions/21221931338
-rw-r--r--results/classifier/zero-shot/007/permissions/23300761323
-rw-r--r--results/classifier/zero-shot/007/permissions/26095107168
-rw-r--r--results/classifier/zero-shot/007/permissions/26430026175
-rw-r--r--results/classifier/zero-shot/007/permissions/48245039540
-rw-r--r--results/classifier/zero-shot/007/permissions/51610399318
-rw-r--r--results/classifier/zero-shot/007/permissions/552471161320
-rw-r--r--results/classifier/zero-shot/007/permissions/67821138209
-rw-r--r--results/classifier/zero-shot/007/permissions/74715356136
-rw-r--r--results/classifier/zero-shot/007/permissions/85542195130
-rw-r--r--results/classifier/zero-shot/007/permissions/882255722910
-rw-r--r--results/classifier/zero-shot/007/permissions/88281850291
-rw-r--r--results/classifier/zero-shot/007/permissions/95154278165
17 files changed, 8426 insertions, 0 deletions
diff --git a/results/classifier/zero-shot/007/permissions/12360755 b/results/classifier/zero-shot/007/permissions/12360755
new file mode 100644
index 000000000..3de2a3c4a
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/12360755
@@ -0,0 +1,306 @@
+permissions: 0.930
+debug: 0.922
+semantic: 0.911
+device: 0.902
+graphic: 0.899
+performance: 0.895
+other: 0.886
+PID: 0.876
+files: 0.851
+boot: 0.818
+vnc: 0.810
+socket: 0.805
+KVM: 0.770
+network: 0.738
+
+[Qemu-devel] [BUG] virtio-net linux driver fails to probe on MIPS Malta since 'hw/virtio-pci: fix virtio behaviour'
+
+Hi,
+
+I've bisected the following failure of the virtio_net linux v4.10 driver
+to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
+
+virtio_net virtio0: virtio: device uses modern interface but does not have 
+VIRTIO_F_VERSION_1
+virtio_net: probe of virtio0 failed with error -22
+
+To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
+
+It appears that adding ",disable-modern=on,disable-legacy=off" to the
+virtio-net -device makes it work again.
+
+I presume this should really just work out of the box. Any ideas why it
+isn't?
+
+Cheers
+James
+
+On 03/17/2017 11:57 PM, James Hogan wrote:
+Hi,
+
+I've bisected the following failure of the virtio_net linux v4.10 driver
+to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
+
+virtio_net virtio0: virtio: device uses modern interface but does not have 
+VIRTIO_F_VERSION_1
+virtio_net: probe of virtio0 failed with error -22
+
+To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
+
+It appears that adding ",disable-modern=on,disable-legacy=off" to the
+virtio-net -device makes it work again.
+
+I presume this should really just work out of the box. Any ideas why it
+isn't?
+Hi,
+
+
+This is strange. This commit changes virtio devices from legacy to virtio 
+"transitional".
+(your command line changes it to legacy)
+Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU 
+side
+there is nothing new.
+
+Michael, do you have any idea?
+
+Thanks,
+Marcel
+Cheers
+James
+
+On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote:
+> On 03/17/2017 11:57 PM, James Hogan wrote:
+> > Hi,
+> >
+> > I've bisected the following failure of the virtio_net linux v4.10 driver
+> > to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
+> >
+> > virtio_net virtio0: virtio: device uses modern interface but does not have
+> > VIRTIO_F_VERSION_1
+> > virtio_net: probe of virtio0 failed with error -22
+> >
+> > To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
+> >
+> > It appears that adding ",disable-modern=on,disable-legacy=off" to the
+> > virtio-net -device makes it work again.
+> >
+> > I presume this should really just work out of the box. Any ideas why it
+> > isn't?
+>
+> Hi,
+>
+> This is strange. This commit changes virtio devices from legacy to virtio
+> "transitional".
+> (your command line changes it to legacy)
+> Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU
+> side
+> there is nothing new.
+>
+> Michael, do you have any idea?
+>
+> Thanks,
+> Marcel
+
+My guess would be firmware mishandling 64 bit BARs - we saw such
+a case on sparc previously. As a result you are probably reading
+all zeroes from features register or something like that.
+Marcel, could you send a patch making the bar 32 bit?
+If that helps we know what the issue is.
+
+> > Cheers
+> > James
+
+On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote:
+On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote:
+On 03/17/2017 11:57 PM, James Hogan wrote:
+Hi,
+
+I've bisected the following failure of the virtio_net linux v4.10 driver
+to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
+
+virtio_net virtio0: virtio: device uses modern interface but does not have 
+VIRTIO_F_VERSION_1
+virtio_net: probe of virtio0 failed with error -22
+
+To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
+
+It appears that adding ",disable-modern=on,disable-legacy=off" to the
+virtio-net -device makes it work again.
+
+I presume this should really just work out of the box. Any ideas why it
+isn't?
+Hi,
+
+
+This is strange. This commit changes virtio devices from legacy to virtio 
+"transitional".
+(your command line changes it to legacy)
+Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU 
+side
+there is nothing new.
+
+Michael, do you have any idea?
+
+Thanks,
+Marcel
+My guess would be firmware mishandling 64 bit BARs - we saw such
+a case on sparc previously. As a result you are probably reading
+all zeroes from features register or something like that.
+Marcel, could you send a patch making the bar 32 bit?
+If that helps we know what the issue is.
+Sure,
+
+Thanks,
+Marcel
+Cheers
+James
+
+On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote:
+On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote:
+On 03/17/2017 11:57 PM, James Hogan wrote:
+Hi,
+
+I've bisected the following failure of the virtio_net linux v4.10 driver
+to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine:
+
+virtio_net virtio0: virtio: device uses modern interface but does not have 
+VIRTIO_F_VERSION_1
+virtio_net: probe of virtio0 failed with error -22
+
+To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour").
+
+It appears that adding ",disable-modern=on,disable-legacy=off" to the
+virtio-net -device makes it work again.
+
+I presume this should really just work out of the box. Any ideas why it
+isn't?
+Hi,
+
+
+This is strange. This commit changes virtio devices from legacy to virtio 
+"transitional".
+(your command line changes it to legacy)
+Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU 
+side
+there is nothing new.
+
+Michael, do you have any idea?
+
+Thanks,
+Marcel
+My guess would be firmware mishandling 64 bit BARs - we saw such
+a case on sparc previously. As a result you are probably reading
+all zeroes from features register or something like that.
+Marcel, could you send a patch making the bar 32 bit?
+If that helps we know what the issue is.
+Hi James,
+
+Can you please check if the below patch fixes the problem?
+Please note it is not a solution.
+
+diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
+index f9b7244..5b4d429 100644
+--- a/hw/virtio/virtio-pci.c
++++ b/hw/virtio/virtio-pci.c
+@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, 
+Error **errp)
+         }
+
+         pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx,
+-                         PCI_BASE_ADDRESS_SPACE_MEMORY |
+-                         PCI_BASE_ADDRESS_MEM_PREFETCH |
+-                         PCI_BASE_ADDRESS_MEM_TYPE_64,
++                         PCI_BASE_ADDRESS_SPACE_MEMORY,
+                          &proxy->modern_bar);
+
+         proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap);
+
+
+Thanks,
+Marcel
+
+Hi Marcel,
+
+On Tue, Mar 21, 2017 at 04:16:58PM +0200, Marcel Apfelbaum wrote:
+> Can you please check if the below patch fixes the problem?
+> Please note it is not a solution.
+>
+> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
+> index f9b7244..5b4d429 100644
+> --- a/hw/virtio/virtio-pci.c
+> +++ b/hw/virtio/virtio-pci.c
+> @@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d,
+> Error **errp)
+>         }
+>
+>         pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx,
+> -                         PCI_BASE_ADDRESS_SPACE_MEMORY |
+> -                         PCI_BASE_ADDRESS_MEM_PREFETCH |
+> -                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+> +                         PCI_BASE_ADDRESS_SPACE_MEMORY,
+>                          &proxy->modern_bar);
+>
+>         proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap);
+
+Sorry for the delay trying this, I was away last week.
+
+No, it doesn't seem to make any difference.
+
+Thanks
+James
+
diff --git a/results/classifier/zero-shot/007/permissions/14488057 b/results/classifier/zero-shot/007/permissions/14488057
new file mode 100644
index 000000000..6fa010b72
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/14488057
@@ -0,0 +1,721 @@
+permissions: 0.940
+PID: 0.930
+device: 0.929
+debug: 0.925
+other: 0.922
+performance: 0.911
+semantic: 0.905
+boot: 0.892
+graphic: 0.887
+vnc: 0.882
+KVM: 0.880
+network: 0.846
+socket: 0.825
+files: 0.823
+
+[Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
+
+This is an issue in QEMU's system emulation for X86 in TCG mode.
+The issue permits an attacker who can execute code in guest ring 3
+with normal user privileges to inject code into other processes that
+are running in guest ring 3, in particular root-owned processes.
+
+== reproduction steps ==
+
+ - Create an x86-64 VM and install Debian Jessie in it. The following
+   steps should all be executed inside the VM.
+ - Verify that procmail is installed and the correct version:
+       address@hidden:~# apt-cache show procmail | egrep 'Version|SHA'
+       Version: 3.22-24
+       SHA1: 54ed2d51db0e76f027f06068ab5371048c13434c
+       SHA256: 4488cf6975af9134a9b5238d5d70e8be277f70caa45a840dfbefd2dc444bfe7f
+ - Install build-essential and nasm ("apt install build-essential nasm").
+ - Unpack the exploit, compile it and run it:
+       address@hidden:~$ tar xvf procmail_cache_attack.tar
+       procmail_cache_attack/
+       procmail_cache_attack/shellcode.asm
+       procmail_cache_attack/xp.c
+       procmail_cache_attack/compile.sh
+       procmail_cache_attack/attack.c
+       address@hidden:~$ cd procmail_cache_attack
+       address@hidden:~/procmail_cache_attack$ ./compile.sh
+       address@hidden:~/procmail_cache_attack$ ./attack
+       memory mappings set up
+       child is dead, codegen should be complete
+       executing code as root! :)
+       address@hidden:~/procmail_cache_attack# id
+       uid=0(root) gid=0(root) groups=0(root),[...]
+
+Note: While the exploit depends on the precise version of procmail,
+the actual vulnerability is in QEMU, not in procmail. procmail merely
+serves as a seldomly-executed setuid root binary into which code can
+be injected.
+
+
+== detailed issue description ==
+QEMU caches translated basic blocks. To look up a translated basic
+block, the function tb_find() is used, which uses tb_htable_lookup()
+in its slowpath, which in turn compares translated basic blocks
+(TranslationBlock) to the lookup information (struct tb_desc) using
+tb_cmp().
+
+tb_cmp() attempts to ensure (among other things) that both the virtual
+start address of the basic block and the physical addresses that the
+basic block covers match. When checking the physical addresses, it
+assumes that a basic block can span at most two pages.
+
+gen_intermediate_code() attempts to enforce this by stopping the
+translation of a basic block if nearly one page of instructions has
+been translated already:
+
+    /* if too long translation, stop generation too */
+    if (tcg_op_buf_full() ||
+        (pc_ptr - pc_start) >= (TARGET_PAGE_SIZE - 32) ||
+        num_insns >= max_insns) {
+        gen_jmp_im(pc_ptr - dc->cs_base);
+        gen_eob(dc);
+        break;
+    }
+
+However, while real X86 processors have a maximum instruction length
+of 15 bytes, QEMU's instruction decoder for X86 does not place any
+limit on the instruction length or the number of instruction prefixes.
+Therefore, it is possible to create an arbitrarily long instruction
+by e.g. prepending an arbitrary number of LOCK prefixes to a normal
+instruction. This permits creating a basic block that spans three
+pages by simply appending an approximately page-sized instruction to
+the end of a normal basic block that starts close to the end of a
+page.
+
+Such an overlong basic block causes the basic block caching to fail as
+follows: If code is generated and cached for a basic block that spans
+the physical pages (A,E,B), this basic block will be returned by
+lookups in a process in which the physical pages (A,B,C) are mapped
+in the same virtual address range (assuming that all other lookup
+parameters match).
+
+This behavior can be abused by an attacker e.g. as follows: If a
+non-relocatable world-readable setuid executable legitimately contains
+the pages (A,B,C), an attacker can map (A,E,B) into his own process,
+at the normal load address of A, where E is an attacker-controlled
+page. If a legitimate basic block spans the pages A and B, an attacker
+can write arbitrary non-branch instructions at the start of E, then
+append an overlong instruction
+that ends behind the start of C, yielding a modified basic block that
+spans all three pages. If the attacker then executes the modified
+basic block in his process, the modified basic block is cached.
+Next, the attacker can execute the setuid binary, which will reuse the
+cached modified basic block, executing attacker-controlled
+instructions in the context of the privileged process.
+
+I am sending this to qemu-devel because a QEMU security contact
+told me that QEMU does not consider privilege escalation inside a
+TCG VM to be a security concern.
+procmail_cache_attack.tar
+Description:
+Unix tar archive
+
+On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote:
+> This is an issue in QEMU's system emulation for X86 in TCG mode.
+> The issue permits an attacker who can execute code in guest ring 3
+> with normal user privileges to inject code into other processes that
+> are running in guest ring 3, in particular root-owned processes.
+>
+> I am sending this to qemu-devel because a QEMU security contact
+> told me that QEMU does not consider privilege escalation inside a
+> TCG VM to be a security concern.
+Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.
+
+We should really fix the crossing-a-page-boundary code for x86.
+I believe we do get it correct for ARM Thumb instructions.
+
+thanks
+-- PMM
+
+On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote:
+> On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote:
+>> This is an issue in QEMU's system emulation for X86 in TCG mode.
+>> The issue permits an attacker who can execute code in guest ring 3
+>> with normal user privileges to inject code into other processes that
+>> are running in guest ring 3, in particular root-owned processes.
+>>
+>> I am sending this to qemu-devel because a QEMU security contact
+>> told me that QEMU does not consider privilege escalation inside a
+>> TCG VM to be a security concern.
+>
+> Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.
+>
+> We should really fix the crossing-a-page-boundary code for x86.
+> I believe we do get it correct for ARM Thumb instructions.
+How about doing the instruction size check as follows?
+
+diff --git a/target/i386/translate.c b/target/i386/translate.c
+index 72c1b03a2a..94cf3da719 100644
+--- a/target/i386/translate.c
++++ b/target/i386/translate.c
+@@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State
+*env, DisasContext *s,
+     default:
+         goto unknown_op;
+     }
++    if (s->pc - pc_start > 15) {
++        s->pc = pc_start;
++        goto illegal_op;
++    }
+     return s->pc;
+  illegal_op:
+     gen_illegal_opcode(s);
+
+Thanks,
+--
+Pranith
+
+On 22 March 2017 at 14:55, Pranith Kumar <address@hidden> wrote:
+> On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote:
+>> On 20 March 2017 at 14:36, Jann Horn <address@hidden> wrote:
+>>> This is an issue in QEMU's system emulation for X86 in TCG mode.
+>>> The issue permits an attacker who can execute code in guest ring 3
+>>> with normal user privileges to inject code into other processes that
+>>> are running in guest ring 3, in particular root-owned processes.
+>>>
+>>> I am sending this to qemu-devel because a QEMU security contact
+>>> told me that QEMU does not consider privilege escalation inside a
+>>> TCG VM to be a security concern.
+>>
+>> Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.
+>>
+>> We should really fix the crossing-a-page-boundary code for x86.
+>> I believe we do get it correct for ARM Thumb instructions.
+>
+> How about doing the instruction size check as follows?
+>
+> diff --git a/target/i386/translate.c b/target/i386/translate.c
+> index 72c1b03a2a..94cf3da719 100644
+> --- a/target/i386/translate.c
+> +++ b/target/i386/translate.c
+> @@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State
+> *env, DisasContext *s,
+>      default:
+>          goto unknown_op;
+>      }
+> +    if (s->pc - pc_start > 15) {
+> +        s->pc = pc_start;
+> +        goto illegal_op;
+> +    }
+>      return s->pc;
+>   illegal_op:
+>      gen_illegal_opcode(s);
+This doesn't look right because it means we'll check
+only after we've emitted all the code to do the
+instruction operation, so the effect will be
+"execute instruction, then take illegal-opcode
+exception".
+
+We should check what the x86 architecture spec actually
+says and implement that.
+
+thanks
+-- PMM
+
+On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell
+<address@hidden> wrote:
+>> How about doing the instruction size check as follows?
+>>
+>> diff --git a/target/i386/translate.c b/target/i386/translate.c
+>> index 72c1b03a2a..94cf3da719 100644
+>> --- a/target/i386/translate.c
+>> +++ b/target/i386/translate.c
+>> @@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State
+>> *env, DisasContext *s,
+>>      default:
+>>          goto unknown_op;
+>>      }
+>> +    if (s->pc - pc_start > 15) {
+>> +        s->pc = pc_start;
+>> +        goto illegal_op;
+>> +    }
+>>      return s->pc;
+>>   illegal_op:
+>>      gen_illegal_opcode(s);
+>
+> This doesn't look right because it means we'll check
+> only after we've emitted all the code to do the
+> instruction operation, so the effect will be
+> "execute instruction, then take illegal-opcode
+> exception".
+The pc is restored to original address (s->pc = pc_start), so the
+exception will overwrite the generated illegal instruction and will be
+executed first.
+
+But yes, it's better to follow the architecture manual.
+
+Thanks,
+--
+Pranith
+
+On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote:
+> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell
+> <address@hidden> wrote:
+>> This doesn't look right because it means we'll check
+>> only after we've emitted all the code to do the
+>> instruction operation, so the effect will be
+>> "execute instruction, then take illegal-opcode
+>> exception".
+>
+> The pc is restored to original address (s->pc = pc_start), so the
+> exception will overwrite the generated illegal instruction and will be
+> executed first.
+s->pc is the guest PC -- moving that backwards will
+not do anything about the generated TCG IR that's
+already been written. You'd need to rewind the
+write pointer in the IR stream, which there is
+no support for doing AFAIK.
+
+thanks
+-- PMM
+
+On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell
+<address@hidden> wrote:
+> On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote:
+>> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell
+>> <address@hidden> wrote:
+>>> This doesn't look right because it means we'll check
+>>> only after we've emitted all the code to do the
+>>> instruction operation, so the effect will be
+>>> "execute instruction, then take illegal-opcode
+>>> exception".
+>>
+>> The pc is restored to original address (s->pc = pc_start), so the
+>> exception will overwrite the generated illegal instruction and will be
+>> executed first.
+>
+> s->pc is the guest PC -- moving that backwards will
+> not do anything about the generated TCG IR that's
+> already been written. You'd need to rewind the
+> write pointer in the IR stream, which there is
+> no support for doing AFAIK.
+Ah, OK. Thanks for the explanation. May be we should check the size of
+the instruction while decoding the prefixes and error out once we
+exceed the limit. We would not generate any IR code.
+
+--
+Pranith
+
+On 03/23/2017 02:29 AM, Pranith Kumar wrote:
+On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell
+<address@hidden> wrote:
+On 22 March 2017 at 15:14, Pranith Kumar <address@hidden> wrote:
+On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell
+<address@hidden> wrote:
+This doesn't look right because it means we'll check
+only after we've emitted all the code to do the
+instruction operation, so the effect will be
+"execute instruction, then take illegal-opcode
+exception".
+The pc is restored to original address (s->pc = pc_start), so the
+exception will overwrite the generated illegal instruction and will be
+executed first.
+s->pc is the guest PC -- moving that backwards will
+not do anything about the generated TCG IR that's
+already been written. You'd need to rewind the
+write pointer in the IR stream, which there is
+no support for doing AFAIK.
+Ah, OK. Thanks for the explanation. May be we should check the size of
+the instruction while decoding the prefixes and error out once we
+exceed the limit. We would not generate any IR code.
+Yes.
+It would not enforce a true limit of 15 bytes, since you can't know that until
+you've done the rest of the decode.  But you'd be able to say that no more than
+14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25 bytes is used.
+Which does fix the bug.
+
+
+r~
+
+On 22/03/2017 21:01, Richard Henderson wrote:
+>> Ah, OK. Thanks for the explanation. May be we should check the size of
+>> the instruction while decoding the prefixes and error out once we
+>> exceed the limit. We would not generate any IR code.
+>
+> Yes.
+>
+> It would not enforce a true limit of 15 bytes, since you can't know that
+> until you've done the rest of the decode.  But you'd be able to say that
+> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
+> bytes is used.
+>
+> Which does fix the bug.
+Yeah, that would work for 2.9 if somebody wants to put together a patch.
+ Ensuring that all instruction fetching happens before translation side
+effects is a little harder, but perhaps it's also the opportunity to get
+rid of s->rip_offset which is a little ugly.
+
+Paolo
+
+On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote:
+> On 22/03/2017 21:01, Richard Henderson wrote:
+>>> Ah, OK. Thanks for the explanation. May be we should check the size of
+>>> the instruction while decoding the prefixes and error out once we
+>>> exceed the limit. We would not generate any IR code.
+>>
+>> Yes.
+>>
+>> It would not enforce a true limit of 15 bytes, since you can't know that
+>> until you've done the rest of the decode.  But you'd be able to say that
+>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
+>> bytes is used.
+>>
+>> Which does fix the bug.
+>
+> Yeah, that would work for 2.9 if somebody wants to put together a patch.
+> Ensuring that all instruction fetching happens before translation side
+> effects is a little harder, but perhaps it's also the opportunity to get
+> rid of s->rip_offset which is a little ugly.
+How about the following?
+
+diff --git a/target/i386/translate.c b/target/i386/translate.c
+index 72c1b03a2a..67c58b8900 100644
+--- a/target/i386/translate.c
++++ b/target/i386/translate.c
+@@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State
+*env, DisasContext *s,
+     s->vex_l = 0;
+     s->vex_v = 0;
+  next_byte:
++    /* The prefixes can atmost be 14 bytes since x86 has an upper
++       limit of 15 bytes for the instruction */
++    if (s->pc - pc_start > 14) {
++        goto illegal_op;
++    }
+     b = cpu_ldub_code(env, s->pc);
+     s->pc++;
+     /* Collect prefixes.  */
+
+--
+Pranith
+
+On 23/03/2017 17:50, Pranith Kumar wrote:
+> On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote:
+>> On 22/03/2017 21:01, Richard Henderson wrote:
+>>>> Ah, OK. Thanks for the explanation. May be we should check the size of
+>>>> the instruction while decoding the prefixes and error out once we
+>>>> exceed the limit. We would not generate any IR code.
+>>>
+>>> Yes.
+>>>
+>>> It would not enforce a true limit of 15 bytes, since you can't know that
+>>> until you've done the rest of the decode.  But you'd be able to say that
+>>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
+>>> bytes is used.
+>>>
+>>> Which does fix the bug.
+>>
+>> Yeah, that would work for 2.9 if somebody wants to put together a patch.
+>>  Ensuring that all instruction fetching happens before translation side
+>> effects is a little harder, but perhaps it's also the opportunity to get
+>> rid of s->rip_offset which is a little ugly.
+>
+> How about the following?
+>
+> diff --git a/target/i386/translate.c b/target/i386/translate.c
+> index 72c1b03a2a..67c58b8900 100644
+> --- a/target/i386/translate.c
+> +++ b/target/i386/translate.c
+> @@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State
+> *env, DisasContext *s,
+>      s->vex_l = 0;
+>      s->vex_v = 0;
+>   next_byte:
+> +    /* The prefixes can atmost be 14 bytes since x86 has an upper
+> +       limit of 15 bytes for the instruction */
+> +    if (s->pc - pc_start > 14) {
+> +        goto illegal_op;
+> +    }
+>      b = cpu_ldub_code(env, s->pc);
+>      s->pc++;
+>      /* Collect prefixes.  */
+Please make the comment more verbose, based on Richard's remark.  We
+should apply it to 2.9.
+
+Also, QEMU usually formats comments with stars on every line.
+
+Paolo
+
+On Thu, Mar 23, 2017 at 1:37 PM, Paolo Bonzini <address@hidden> wrote:
+> On 23/03/2017 17:50, Pranith Kumar wrote:
+>> On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini <address@hidden> wrote:
+>>> On 22/03/2017 21:01, Richard Henderson wrote:
+>>>>> Ah, OK. Thanks for the explanation. May be we should check the size of
+>>>>> the instruction while decoding the prefixes and error out once we
+>>>>> exceed the limit. We would not generate any IR code.
+>>>>
+>>>> Yes.
+>>>>
+>>>> It would not enforce a true limit of 15 bytes, since you can't know that
+>>>> until you've done the rest of the decode.  But you'd be able to say that
+>>>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
+>>>> bytes is used.
+>>>>
+>>>> Which does fix the bug.
+>>>
+>>> Yeah, that would work for 2.9 if somebody wants to put together a patch.
+>>>  Ensuring that all instruction fetching happens before translation side
+>>> effects is a little harder, but perhaps it's also the opportunity to get
+>>> rid of s->rip_offset which is a little ugly.
+>>
+>> How about the following?
+>>
+>> diff --git a/target/i386/translate.c b/target/i386/translate.c
+>> index 72c1b03a2a..67c58b8900 100644
+>> --- a/target/i386/translate.c
+>> +++ b/target/i386/translate.c
+>> @@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State
+>> *env, DisasContext *s,
+>>      s->vex_l = 0;
+>>      s->vex_v = 0;
+>>   next_byte:
+>> +    /* The prefixes can atmost be 14 bytes since x86 has an upper
+>> +       limit of 15 bytes for the instruction */
+>> +    if (s->pc - pc_start > 14) {
+>> +        goto illegal_op;
+>> +    }
+>>      b = cpu_ldub_code(env, s->pc);
+>>      s->pc++;
+>>      /* Collect prefixes.  */
+>
+> Please make the comment more verbose, based on Richard's remark.  We
+> should apply it to 2.9.
+>
+> Also, QEMU usually formats comments with stars on every line.
+OK. I'll send a proper patch with updated comment.
+
+Thanks,
+--
+Pranith
+
diff --git a/results/classifier/zero-shot/007/permissions/14887122 b/results/classifier/zero-shot/007/permissions/14887122
new file mode 100644
index 000000000..ae50ba435
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/14887122
@@ -0,0 +1,268 @@
+permissions: 0.964
+files: 0.944
+debug: 0.934
+semantic: 0.928
+device: 0.919
+PID: 0.914
+socket: 0.914
+graphic: 0.910
+performance: 0.897
+other: 0.890
+vnc: 0.871
+network: 0.855
+boot: 0.831
+KVM: 0.814
+
+[BUG][RFC] CPR transfer Issues: Socket permissions and PID files
+
+Hello,
+
+While testing CPR transfer I encountered two issues. The first is that the
+transfer fails when running with pidfiles, because the destination qemu process
+attempts to create the pidfile while it is still locked by the source
+process. The second is that the transfer fails when running with the -run-with
+user=$USERID parameter. This is because the destination qemu process creates
+the UNIX sockets used for the CPR transfer before dropping to the
+lower-privileged user, so the sockets are owned by the original user. The
+source qemu process, which is already running as the lower-privileged user,
+then does not have permission to connect to them.
+
+Reproducing the first issue:
+
+Create a source and destination qemu instance associated with the same VM where 
+both processes have the -pidfile parameter passed on the command line. You 
+should see the following error on the command line of the second process:
+
+qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource 
+temporarily unavailable
+
+Reproducing the second issue:
+
+Create a source and destination qemu instance associated with the same VM where 
+both processes have -run-with user=$USERID passed on the command line, where 
+$USERID is a different user from the one launching the processes. Then attempt 
+a CPR transfer using UNIX sockets for the main and cpr sockets. You should 
+receive the following error via QMP:
+{"error": {"class": "GenericError", "desc": "Failed to connect to 'cpr.sock': 
+Permission denied"}}
+
+I provided a minimal patch that works around the second issue.
+
+Thank you,
+Ben Chaney
+
+---
+include/system/os-posix.h | 4 ++++
+os-posix.c | 8 --------
+util/qemu-sockets.c | 21 +++++++++++++++++++++
+3 files changed, 25 insertions(+), 8 deletions(-)
+
+diff --git a/include/system/os-posix.h b/include/system/os-posix.h
+index ce5b3bccf8..2a414a914a 100644
+--- a/include/system/os-posix.h
++++ b/include/system/os-posix.h
+@@ -55,6 +55,10 @@ void os_setup_limits(void);
+void os_setup_post(void);
+int os_mlock(bool on_fault);
+
++extern struct passwd *user_pwd;
++extern uid_t user_uid;
++extern gid_t user_gid;
++
+/**
+* qemu_alloc_stack:
+* @sz: pointer to a size_t holding the requested usable stack size
+diff --git a/os-posix.c b/os-posix.c
+index 52925c23d3..9369b312a0 100644
+--- a/os-posix.c
++++ b/os-posix.c
+@@ -86,14 +86,6 @@ void os_set_proc_name(const char *s)
+}
+
+
+-/*
+- * Must set all three of these at once.
+- * Legal combinations are unset by name by uid
+- */
+-static struct passwd *user_pwd; /* NULL non-NULL NULL */
+-static uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */
+-static gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */
+-
+/*
+* Prepare to change user ID. user_id can be one of 3 forms:
+* - a username, in which case user ID will be changed to its uid,
+diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
+index 77477c1cd5..987977ead9 100644
+--- a/util/qemu-sockets.c
++++ b/util/qemu-sockets.c
+@@ -871,6 +871,14 @@ static bool saddr_is_tight(UnixSocketAddress *saddr)
+#endif
+}
+
++/*
++ * Must set all three of these at once.
++ * Legal combinations are unset by name by uid
++ */
++struct passwd *user_pwd; /* NULL non-NULL NULL */
++uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */
++gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */
++
+static int unix_listen_saddr(UnixSocketAddress *saddr,
+int num,
+Error **errp)
+@@ -947,6 +955,19 @@ static int unix_listen_saddr(UnixSocketAddress *saddr,
+error_setg_errno(errp, errno, "Failed to bind socket to %s", path);
+goto err;
+}
++ if (user_pwd) {
++ if (chown(un.sun_path, user_pwd->pw_uid, user_pwd->pw_gid) < 0) {
++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", path);
++ goto err;
++ }
++ }
++ else if (user_uid != -1 && user_gid != -1) {
++ if (chown(un.sun_path, user_uid, user_gid) < 0) {
++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", path);
++ goto err;
++ }
++ }
++
+if (listen(sock, num) < 0) {
+error_setg_errno(errp, errno, "Failed to listen on socket");
+goto err;
+--
+2.40.1
+
+Thank you Ben.  I appreciate you testing CPR and shaking out the bugs.
+I will study these and propose patches.
+
+My initial reaction to the pidfile issue is that the orchestration layer must
+pass a different filename when starting the destination qemu instance.  When
+using live update without containers, these types of resource conflicts in the
+global namespaces are a known issue.
+
+- Steve
+
+On 3/14/2025 2:33 PM, Chaney, Ben wrote:
+[...]
+
diff --git a/results/classifier/zero-shot/007/permissions/16056596 b/results/classifier/zero-shot/007/permissions/16056596
new file mode 100644
index 000000000..e6f8e1f9c
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/16056596
@@ -0,0 +1,108 @@
+permissions: 0.985
+other: 0.980
+semantic: 0.979
+debug: 0.978
+files: 0.975
+device: 0.973
+boot: 0.971
+graphic: 0.970
+performance: 0.970
+PID: 0.961
+socket: 0.952
+vnc: 0.946
+network: 0.940
+KVM: 0.934
+
+[BUG][powerpc] KVM Guest Boot Failure and Hang at "Booting Linux via __start()"
+
+Bug Description:
+Encountering a boot failure when launching a KVM guest with
+'qemu-system-ppc64'. The guest hangs at boot, and the QEMU monitor
+crashes.
+Reproduction Steps:
+# qemu-system-ppc64 --version
+QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f)
+Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers
+# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine
+pseries,accel=kvm \
+-m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \
+  -device virtio-scsi-pci,id=scsi \
+-drive
+file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2
+\
+-device scsi-hd,drive=drive0,bus=scsi.0 \
+  -netdev bridge,id=net0,br=virbr0 \
+  -device virtio-net-pci,netdev=net0 \
+  -serial pty \
+  -device virtio-balloon-pci \
+  -cpu host
+QEMU 9.2.50 monitor - type 'help' for more information
+char device redirected to /dev/pts/2 (label serial0)
+(qemu)
+(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but
+unavailable: IRQ_XIVE capability must be present for KVM
+Falling back to kernel-irqchip=off
+** Qemu Hang
+
+(In another ssh session)
+# screen /dev/pts/2
+Preparing to boot Linux version 6.10.4-200.fc40.ppc64le
+(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801
+(Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11
+15:20:17 UTC 2024
+Detected machine type: 0000000000000101
+command line:
+BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le
+root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M
+Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
+Calling ibm,client-architecture-support... done
+memory layout at init:
+  memory_limit : 0000000000000000 (16 MB aligned)
+  alloc_bottom : 0000000008200000
+  alloc_top    : 0000000030000000
+  alloc_top_hi : 0000000800000000
+  rmo_top      : 0000000030000000
+  ram_top      : 0000000800000000
+instantiating rtas at 0x000000002fff0000... done
+prom_hold_cpus: skipped
+copying OF device tree...
+Building dt strings...
+Building dt structure...
+Device tree strings 0x0000000008210000 -> 0x0000000008210bd0
+Device tree struct  0x0000000008220000 -> 0x0000000008230000
+Quiescing Open Firmware ...
+Booting Linux via __start() @ 0x0000000000440000 ...
+** Guest Console Hang
+
+
+Git Bisect:
+Performing git bisect points to the following patch:
+# git bisect bad
+e8291ec16da80566c121c68d9112be458954d90b is the first bad commit
+commit e8291ec16da80566c121c68d9112be458954d90b (HEAD)
+Author: Nicholas Piggin <npiggin@gmail.com>
+Date:   Thu Dec 19 13:40:31 2024 +1000
+
+    target/ppc: fix timebase register reset state
+    (H)DEC and PURR get reset before icount does, which causes them to
+    be skewed and not match the init state. This can cause replay to
+    not match the recorded trace exactly. For DEC and HDEC this is
+    usually not noticeable since they tend to get programmed before
+    affecting the target machine. PURR has been observed to cause
+    replay bugs when running Linux.
+
+    Fix this by resetting using a time of 0.
+
+    Message-ID: <20241219034035.1826173-2-npiggin@gmail.com>
+    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
+
+ hw/ppc/ppc.c | 11 ++++++++---
+ 1 file changed, 8 insertions(+), 3 deletions(-)
+
+
+Reverting the patch helps boot the guest.
+Thanks,
+Misbah Anjum N
+
diff --git a/results/classifier/zero-shot/007/permissions/21221931 b/results/classifier/zero-shot/007/permissions/21221931
new file mode 100644
index 000000000..a925c3002
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/21221931
@@ -0,0 +1,338 @@
+permissions: 0.982
+other: 0.979
+network: 0.976
+device: 0.971
+debug: 0.971
+files: 0.967
+semantic: 0.967
+performance: 0.966
+socket: 0.957
+graphic: 0.948
+boot: 0.947
+PID: 0.945
+vnc: 0.944
+KVM: 0.913
+
+[BUG] qemu git error with virgl
+
+Hello,
+
+I can't start any system if I use virgl. I get the following error:
+qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion
+`con->gl' failed.
+./and.sh: line 27: 3337167 Aborted                 qemu-x86_64 -m 4096
+-smp cores=4,sockets=1 -cpu host -machine pc-q35-4.0,accel=kvm -device
+virtio-vga,virgl=on,xres=1280,yres=800 -display sdl,gl=on -device
+intel-hda,id=sound0,msi=on -device
+hda-micro,id=sound0-codec0,bus=sound0.0,cad=0 -device qemu-xhci,id=xhci
+-device usb-tablet,bus=xhci.0 -net
+nic,macaddr=52:54:00:12:34:62,model=e1000 -net
+tap,ifname=$INTERFACE,script=no,downscript=no -drive
+file=/media/daten2/image/lineageos.qcow2,if=virtio,index=1,media=disk,cache=none,aio=threads
+Set 'tap3' nonpersistent
+
+I have bisected the issue:
+
+towo:Defiant> git bisect good
+b4e1a342112e50e05b609e857f38c1f2b7aafdc4 is the first bad commit
+commit b4e1a342112e50e05b609e857f38c1f2b7aafdc4
+Author: Paolo Bonzini <pbonzini@redhat.com>
+Date:   Tue Oct 27 08:44:23 2020 -0400
+
+    vl: remove separate preconfig main_loop
+    Move post-preconfig initialization to the x-exit-preconfig. If
+preconfig
+    is not requested, just exit preconfig mode immediately with the QMP
+    command.
+
+    As a result, the preconfig loop will run with accel_setup_post
+    and os_setup_post restrictions (xen_restrict, chroot, etc.)
+    already done.
+
+    Reviewed-by: Igor Mammedov <imammedo@redhat.com>
+    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
+
+ include/sysemu/runstate.h |  1 -
+ monitor/qmp-cmds.c        |  9 -----
+ softmmu/vl.c              | 95
+++++++++++++++++++++---------------------------
+ 3 files changed, 41 insertions(+), 64 deletions(-)
+
+Regards,
+
+Torsten Wohlfarth
+
+Cc'ing Gerd + patch author/reviewer.
+
+On 1/2/21 2:11 PM, Torsten Wohlfarth wrote:
+[...]
+
+On Sun, 3 Jan 2021 18:28:11 +0100
+Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
+
+> Cc'ing Gerd + patch author/reviewer.
+>
+> On 1/2/21 2:11 PM, Torsten Wohlfarth wrote:
+> > Hello,
+> >
+> > i can't start any system if i use virgl. I get the following error:
+> >
+> > qemu-x86_64: ../ui/console.c:1791: dpy_gl_ctx_create: Assertion
+> > `con->gl' failed.
+
+Does the following fix the issue:
+  [PULL 12/55] vl: initialize displays _after_ exiting preconfiguration
+
+[...]
+
+Hi Igor,
+
+yes, that fixes my issue.
+
+Regards, Torsten
+
+Am 04.01.21 um 19:50 schrieb Igor Mammedov:
+[...]
+
diff --git a/results/classifier/zero-shot/007/permissions/23300761 b/results/classifier/zero-shot/007/permissions/23300761
new file mode 100644
index 000000000..929fc4adf
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/23300761
@@ -0,0 +1,323 @@
+permissions: 0.984
+debug: 0.978
+other: 0.963
+performance: 0.952
+PID: 0.950
+semantic: 0.950
+device: 0.932
+boot: 0.929
+socket: 0.927
+vnc: 0.926
+graphic: 0.924
+files: 0.910
+KVM: 0.897
+network: 0.879
+
+[Qemu-devel] [BUG] 216 Alerts reported by LGTM for QEMU (some might be release critical)
+
+Hi,
+LGTM reports 16 errors, 81 warnings and 119 recommendations:
+https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list .
+Some of them are already known (wrong format strings), others look like
+real errors:
+- several multiplication results which don't work as they should in
+contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only
+32 bit!),  target/i386/translate.c and other files
+- potential buffer overflows in gdbstub.c and other files
+I am afraid that the overflows in the block code are release critical,
+maybe that in target/i386/translate.c and other errors, too.
+About half of the alerts are issues which can be fixed later.
+
+Regards
+
+Stefan
+
+On 13/07/19 19:46, Stefan Weil wrote:
+> LGTM reports 16 errors, 81 warnings and 119 recommendations:
+> https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list .
+>
+> Some of them are already known (wrong format strings), others look like
+> real errors:
+>
+> - several multiplication results which don't work as they should in
+> contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only
+> 32 bit!), target/i386/translate.c and other files
+m->nb_clusters here is limited by s->l2_slice_size (see for example
+handle_alloc) so I wouldn't be surprised if this is a false positive.  I
+couldn't find this particular multiplication in Coverity, but it has
+about 250 issues marked as intentional or false positive so there's
+probably a lot of overlap with what LGTM found.
+
+Paolo
+
+Am 13.07.2019 um 21:42 schrieb Paolo Bonzini:
+> On 13/07/19 19:46, Stefan Weil wrote:
+> > LGTM reports 16 errors, 81 warnings and 119 recommendations:
+> > https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list .
+> > [...]
+>
+> m->nb_clusters here is limited by s->l2_slice_size (see for example
+> handle_alloc) so I wouldn't be surprised if this is a false positive.  I
+> couldn't find this particular multiplication in Coverity, but it has
+> about 250 issues marked as intentional or false positive so there's
+> probably a lot of overlap with what LGTM found.
+>
+> Paolo
+From other projects I know that there is a certain overlap between the
+results from Coverity Scan and LGTM, but it is good to have both
+analyzers, and the results from LGTM are typically quite reliable.
+
+Even if we know that there is no multiplication overflow, the code could
+be modified. Either the assigned value should use the same data type as
+the factors (possible when there is never an overflow, avoids a size
+extension), or the multiplication could use the larger data type by
+adding a type cast to one of the factors (then an overflow cannot
+happen, static code analysers and human reviewers have an easier job,
+but the multiplication costs more time).
+
+Stefan
+
+Am 14.07.2019 um 15:28 hat Stefan Weil geschrieben:
+> Am 13.07.2019 um 21:42 schrieb Paolo Bonzini:
+> > On 13/07/19 19:46, Stefan Weil wrote:
+> >> LGTM reports 16 errors, 81 warnings and 119 recommendations:
+> >> https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list .
+> >>
+> >> Some of them are already known (wrong format strings), others look like
+> >> real errors:
+> >>
+> >> - several multiplication results which don't work as they should in
+> >> contrib/vhost-user-gpu, block/* (m->nb_clusters * s->cluster_size only
+> >> 32 bit!), target/i386/translate.c and other files
+Request sizes are limited to 32 bit in the generic block layer before
+they are even passed to the individual block drivers, so most if not all
+of these are going to be false positives.
+
+> > m->nb_clusters here is limited by s->l2_slice_size (see for example
+> > handle_alloc) so I wouldn't be surprised if this is a false positive.
+> > [...]
+>
+> From other projects I know that there is a certain overlap between the
+> results from Coverity Scan and LGTM, but it is good to have both
+> analyzers, and the results from LGTM are typically quite reliable.
+>
+> Even if we know that there is no multiplication overflow, the code could
+> be modified. Either the assigned value should use the same data type as
+> the factors (possible when there is never an overflow, avoids a size
+> extension), or the multiplication could use the larger data type by
+> adding a type cast to one of the factors (then an overflow cannot
+> happen, static code analysers and human reviewers have an easier job,
+> but the multiplication costs more time).
+But if you look at the code we're talking about, you see that it's
+complaining about things where being more explicit would make things
+less readable.
+
+For example, it complains about the multiplication in this line:
+
+    s->file_size += n * s->header.cluster_size;
+
+We know that n * s->header.cluster_size fits in 32 bits, but
+s->file_size is 64 bits (and has to be 64 bits). Do you really think we
+should introduce another uint32_t variable to store the intermediate
+result? And if we cast n to uint64_t, not only might the multiplication
+cost more time, but also human readers would wonder why the result could
+become larger than 32 bits. So a cast would be misleading.
+
+
+It also complains about this line:
+
+    ret = bdrv_truncate(bs->file, (3 + l1_clusters) * s->cluster_size,
+                        PREALLOC_MODE_OFF, &local_err);
+
+Here, we don't even assign the result to a 64 bit variable, but just
+pass it to a function which takes a 64 bit parameter. Again, I don't
+think introducing additional variables for the intermediate result or
+adding casts would be an improvement of the situation.
+
+
+So I don't think this is a good enough tool to base our code on what it
+does and doesn't understand. It would have too much of a negative impact
+on our code. We'd rather need a way to mark false positives as such and
+move on without changing the code in such cases.
+
+Kevin
+
+On Sat, 13 Jul 2019 at 18:46, Stefan Weil <address@hidden> wrote:
+> LGTM reports 16 errors, 81 warnings and 119 recommendations:
+> https://lgtm.com/projects/g/qemu/qemu/alerts/?mode=list .
+I had a look at some of these before, but mostly I came
+to the conclusion that it wasn't worth trying to put the
+effort into keeping up with the site because they didn't
+seem to provide any useful way to mark things as false
+positives. Coverity has its flaws but at least you can do
+that kind of thing in its UI (it runs at about a 33% fp
+rate, I think.) "Analyzer thinks this multiply can overflow
+but in fact it's not possible" is quite a common false
+positive cause...
+
+Anyway, if you want to fish out specific issues, analyse
+whether they're false positive or real, and report them
+to the mailing list as followups to the patches which
+introduced the issue, that's probably the best way for
+us to make use of this analyzer. (That is essentially
+what I do for coverity.)
+
+thanks
+-- PMM
+
+Am 14.07.2019 um 19:30 schrieb Peter Maydell:
+[...]
+> "Analyzer thinks this multiply can overflow
+> but in fact it's not possible" is quite a common false
+> positive cause...
+The analysers don't complain because a multiply can overflow.
+
+They complain because the code indicates that a larger result is
+expected, for example uint64_t = uint32_t * uint32_t. They would not
+complain for the same multiplication if it were assigned to a uint32_t.
+
+So there is a simple solution to write the code in a way which avoids
+false positives...
+
+Stefan
+
+Stefan Weil <address@hidden> writes:
+
+> Am 14.07.2019 um 19:30 schrieb Peter Maydell:
+> [...]
+> > "Analyzer thinks this multiply can overflow
+> > but in fact it's not possible" is quite a common false
+> > positive cause...
+>
+> The analysers don't complain because a multiply can overflow.
+>
+> They complain because the code indicates that a larger result is
+> expected, for example uint64_t = uint32_t * uint32_t. They would not
+> complain for the same multiplication if it were assigned to a uint32_t.
+I agree this is an anti-pattern.
+
+> So there is a simple solution to write the code in a way which avoids
+> false positives...
+You wrote elsewhere in this thread:
+
+    Either the assigned value should use the same data type as the
+    factors (possible when there is never an overflow, avoids a size
+    extension), or the multiplication could use the larger data type by
+    adding a type cast to one of the factors (then an overflow cannot
+    happen, static code analysers and human reviewers have an easier
+    job, but the multiplication costs more time).
+
+Makes sense to me.
+
+On 7/14/19 5:30 PM, Peter Maydell wrote:
+> I had a look at some of these before, but mostly I came
+> to the conclusion that it wasn't worth trying to put the
+> effort into keeping up with the site because they didn't
+> seem to provide any useful way to mark things as false
+> positives. Coverity has its flaws but at least you can do
+> that kind of thing in its UI (it runs at about a 33% fp
+> rate, I think.)
+Yes, LGTM wants you to modify the source code with
+
+  /* lgtm [cpp/some-warning-code] */
+
+and on the same line as the reported problem.  Which is mildly annoying in that
+you're definitely committing to LGTM in the long term.  Also for any
+non-trivial bit of code, it will almost certainly run over 80 columns.
+
+
+r~
+
diff --git a/results/classifier/zero-shot/007/permissions/26095107 b/results/classifier/zero-shot/007/permissions/26095107
new file mode 100644
index 000000000..d5075496a
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/26095107
@@ -0,0 +1,168 @@
+permissions: 0.993
+debug: 0.993
+files: 0.989
+PID: 0.988
+device: 0.988
+performance: 0.987
+socket: 0.987
+boot: 0.987
+KVM: 0.985
+other: 0.979
+semantic: 0.974
+vnc: 0.972
+graphic: 0.955
+network: 0.879
+
+[Qemu-devel]  [Bug Report] vm paused after succeeding to migrate
+
+Hi, all
+I encountered a bug when I try to migrate a windows vm.
+
+Environment information:
+host A: cpu E5620 (model WestmereEP, without flag xsave)
+host B: cpu E5-2643 (model SandyBridgeEP, with xsave)
+
+The reproduction steps are:
+1. Start a windows 2008 vm with -cpu host (which means host-passthrough).
+2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully).
+3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1.
+4. Then migrate the vm to host A (successfully), but the vm was paused, and
+qemu printed the following log:
+
+KVM: entry failed, hardware error 0x80000021
+
+If you're running a guest on an Intel machine without unrestricted mode
+support, the failure can be most likely due to the guest entering an invalid
+state for Intel VT. For example, the guest maybe running in big real mode
+which is not supported on less recent Intel processors.
+
+EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX=00000000
+ESI=01a62000 EDI=00000000 EBP=00000000 ESP=01718b20
+EIP=0185d982 EFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
+ES =0000 00000000 0000ffff 00009300
+CS =f000 ffff0000 0000ffff 00009b00
+SS =0000 00000000 0000ffff 00009300
+DS =0000 00000000 0000ffff 00009300
+FS =0000 00000000 0000ffff 00009300
+GS =0000 00000000 0000ffff 00009300
+LDT=0000 00000000 0000ffff 00008200
+TR =0000 00000000 0000ffff 00008b00
+GDT=     00000000 0000ffff
+IDT=     00000000 0000ffff
+CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
+DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 
+DR3=0000000000000000
+DR6=00000000ffff0ff0 DR7=0000000000000400
+EFER=0000000000000000
+Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+
+I have found that the problem happens when kvm_put_sregs returns err -22
+(called by kvm_arch_put_registers in qemu), because
+kvm_arch_vcpu_ioctl_set_sregs (in the kvm module) checks that guest_cpuid_has
+reports no X86_FEATURE_XSAVE while cr4.OSXSAVE=1.
+So should we cancel the migration when kvm_arch_put_registers returns an error?
+
+* linzhecheng (address@hidden) wrote:
+> Hi, all
+> I encounterd a bug when I try to migrate a windows vm.
+>
+> Enviroment information:
+> host A: cpu E5620(model WestmereEP without flag xsave)
+> host B: cpu E5-2643(model SandyBridgeEP with xsave)
+>
+> The reproduce steps is :
+> 1. Start a windows 2008 vm with -cpu host(which means host-passthrough).
+> 2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully).
+> 3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1.
+> 4. Then migrate the vm to host A (successfully), but vm was paused, and qemu
+> printed log as followed:
+
+Remember that migrating using -cpu host across different CPU models is NOT
+expected to work.
+
+> KVM: entry failed, hardware error 0x80000021
+>
+> If you're running a guest on an Intel machine without unrestricted mode
+> support, the failure can be most likely due to the guest entering an invalid
+> state for Intel VT. For example, the guest maybe running in big real mode
+> which is not supported on less recent Intel processors.
+>
+> EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX=00000000
+> ESI=01a62000 EDI=00000000 EBP=00000000 ESP=01718b20
+> EIP=0185d982 EFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
+> ES =0000 00000000 0000ffff 00009300
+> CS =f000 ffff0000 0000ffff 00009b00
+> SS =0000 00000000 0000ffff 00009300
+> DS =0000 00000000 0000ffff 00009300
+> FS =0000 00000000 0000ffff 00009300
+> GS =0000 00000000 0000ffff 00009300
+> LDT=0000 00000000 0000ffff 00008200
+> TR =0000 00000000 0000ffff 00008b00
+> GDT=     00000000 0000ffff
+> IDT=     00000000 0000ffff
+> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
+> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
+> DR3=0000000000000000
+> DR6=00000000ffff0ff0 DR7=0000000000000400
+> EFER=0000000000000000
+> Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00
+> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+> 00
+>
+> I have found that problem happened when kvm_put_sregs returns err -22(called
+> by kvm_arch_put_registers(qemu)).
+> Because kvm_arch_vcpu_ioctl_set_sregs(kvm-mod) checked that guest_cpuid_has
+> no X86_FEATURE_XSAVE but cr4.OSXSAVE=1.
+> So should we cancel migration when kvm_arch_put_registers returns error?
+
+It would seem good if we can make the migration fail there rather than
+hitting that KVM error.
+It looks like we need to do a bit of plumbing to convert the places that
+call it to return a bool rather than void.
+
+Dave
+
+--
+Dr. David Alan Gilbert / address@hidden / Manchester, UK
+
diff --git a/results/classifier/zero-shot/007/permissions/26430026 b/results/classifier/zero-shot/007/permissions/26430026
new file mode 100644
index 000000000..d081ea9ab
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/26430026
@@ -0,0 +1,175 @@
+permissions: 0.937
+debug: 0.925
+KVM: 0.919
+semantic: 0.904
+device: 0.904
+performance: 0.898
+PID: 0.894
+vnc: 0.893
+files: 0.879
+graphic: 0.862
+boot: 0.841
+socket: 0.817
+other: 0.813
+network: 0.758
+
+[BUG] cxl,i386: e820 mappings may not be correct for cxl
+
+Context included below from prior discussion
+    - `cxl create-region` would fail on inability to allocate memory
+    - traced this down to the memory region being marked RESERVED
+    - E820 map marks the CXL fixed memory window as RESERVED
+
+
+Re: x86 errors, I found that region creation worked with this patch. (I also
+added the SRAT patches that Davidlohr posted, but I do not think they are
+relevant.)
+
+I don't think this is correct, and setting this to E820_RAM causes the
+system to fail to boot at all, but with this change `cxl create-region`
+succeeds, which suggests our e820 mappings in the i386 machine are
+incorrect.
+
+Anyone who can help or have an idea as to what e820 should actually be
+doing with this region, or if this is correct and something else is
+failing, please help!
+
+
+diff --git a/hw/i386/pc.c b/hw/i386/pc.c
+index 566accf7e6..a5e688a742 100644
+--- a/hw/i386/pc.c
++++ b/hw/i386/pc.c
+@@ -1077,7 +1077,7 @@ void pc_memory_init(PCMachineState *pcms,
+                 memory_region_init_io(&fw->mr, OBJECT(machine), &cfmws_ops, fw,
+                                       "cxl-fixed-memory-region", fw->size);
+                 memory_region_add_subregion(system_memory, fw->base, &fw->mr);
+-                e820_add_entry(fw->base, fw->size, E820_RESERVED);
++                e820_add_entry(fw->base, fw->size, E820_NVS);
+                 cxl_fmw_base += fw->size;
+                 cxl_resv_end = cxl_fmw_base;
+             }
+
+
+On Mon, Oct 10, 2022 at 05:32:42PM +0100, Jonathan Cameron wrote:
+>
+> > but i'm not sure of what to do with this info.  We have some proof
+> > that real hardware works with this no problem, and the only difference
+> > is that the EFI/bios/firmware is setting the memory regions as `usable`
+> > or `soft reserved`, which would imply the EDK2 is the blocker here
+> > regardless of the OS driver status.
+> >
+> > But I'd seen elsewhere you had gotten some of this working, and I'm
+> > failing to get anything working at the moment.  If you have any input i
+> > would greatly appreciate the help.
+> >
+> > QEMU config:
+> >
+> > /opt/qemu-cxl2/bin/qemu-system-x86_64 \
+> > -drive file=/var/lib/libvirt/images/cxl.qcow2,format=qcow2,index=0,media=d \
+> > -m 2G,slots=4,maxmem=4G \
+> > -smp 4 \
+> > -machine type=q35,accel=kvm,cxl=on \
+> > -enable-kvm \
+> > -nographic \
+> > -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \
+> > -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \
+> > -object memory-backend-file,id=cxl-mem0,mem-path=/tmp/cxl-mem0,size=256M \
+> > -object memory-backend-file,id=lsa0,mem-path=/tmp/cxl-lsa0,size=256M \
+> > -device cxl-type3,bus=rp0,pmem=true,memdev=cxl-mem0,lsa=lsa0,id=cxl-pmem0 \
+> > -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=256M
+> >
+> > I'd seen on the lists that you had seen issues with single-rp setups,
+> > but no combination of configuration I've tried (including all the ones
+> > in the docs and tests) lead to a successful region creation with
+> > `cxl create-region`
+>
+> Hmm. Let me have a play.  I've not run x86 tests for a while so
+> perhaps something is missing there.
+>
+> I'm carrying a patch to override check_last_peer() in
+> cxl_port_setup_targets() as that is wrong for some combinations,
+> but that doesn't look like it's related to what you are seeing.
+
+I'm not sure if it's relevant, but turned out I'd forgotten I'm carrying 3
+patches that aren't upstream (and one is a horrible hack).
+
+Hack: https://lore.kernel.org/linux-cxl/20220819094655.000005ed@huawei.com/
+Shouldn't affect a simple case like this...
+
+https://lore.kernel.org/linux-cxl/20220819093133.00006c22@huawei.com/T/#t
+(Dan's version)
+
+https://lore.kernel.org/linux-cxl/20220815154044.24733-1-Jonathan.Cameron@huawei.com/T/#t
+
+For writes to work you will currently need two rps (nothing on the second is
+fine) as we still haven't resolved if the kernel should support an HDM decoder
+on a host bridge with one port.  I think it should (Spec allows it), others
+unconvinced.
+
+Note I haven't shifted over to x86 yet so may still be something different
+from arm64.
+
+Jonathan
+
diff --git a/results/classifier/zero-shot/007/permissions/48245039 b/results/classifier/zero-shot/007/permissions/48245039
new file mode 100644
index 000000000..18cdaa24d
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/48245039
@@ -0,0 +1,540 @@
+permissions: 0.966
+debug: 0.961
+PID: 0.954
+device: 0.953
+other: 0.953
+semantic: 0.939
+graphic: 0.935
+socket: 0.932
+boot: 0.932
+vnc: 0.926
+files: 0.924
+performance: 0.890
+KVM: 0.855
+network: 0.818
+
+[Qemu-devel] [BUG] gcov support appears to be broken
+
+Hello, according to our docs, here is the procedure that should produce a
+coverage report for execution of the complete "make check":
+
+#./configure --enable-gcov
+#make
+#make check
+#make coverage-report
+
+It seems that the first three commands execute as expected. (For example, there
+are plenty of files generated by "make check" that would not have been
+generated if "--enable-gcov" hadn't been chosen.) However, the last command
+complains about some missing files related to FP support. If those files are
+added (for example, artificially, using "touch <missing-file>"), then it starts
+complaining about missing some decodetree-generated files. Other kinds of files
+are involved too.
+
+It would be nice to have coverage support working. Please somebody take a
+look, or explain if I made a mistake or misunderstood our gcov support.
+
+Yours,
+Aleksandar
+
+On Mon, 5 Aug 2019 at 11:39, Aleksandar Markovic <address@hidden> wrote:
+>
+> Hello, according to out docs, here is the procedure that should produce
+> coverage report for execution of the complete "make check":
+>
+> #./configure --enable-gcov
+> #make
+> #make check
+> #make coverage-report
+>
+> It seems that first three commands execute as expected. (For example, there
+> are plenty of files generated by "make check" that would've not been
+> generated if "enable-gcov" hadn't been chosen.) However, the last command
+> complains about some missing files related to FP support. If those files are
+> added (for example, artificially, using "touch <missing-file"), that it
+> starts complaining about missing some decodetree-generated files. Other kinds
+> of files are involved too.
+>
+> It would be nice to have coverage support working. Please somebody take a
+> look, or explain if I make a mistake or misunderstood our gcov support.
+
+Cc'ing Alex who's probably the closest we have to a gcov expert.
+
+(make/make check of a --enable-gcov build is in the set of things our
+Travis CI setup runs, so we do defend that part against regressions.)
+
+thanks
+-- PMM
+
+Peter Maydell <address@hidden> writes:
+
+> On Mon, 5 Aug 2019 at 11:39, Aleksandar Markovic <address@hidden> wrote:
+> >
+> > Hello, according to out docs, here is the procedure that should produce
+> > coverage report for execution of the complete "make check":
+> >
+> > #./configure --enable-gcov
+> > #make
+> > #make check
+> > #make coverage-report
+> >
+> > It seems that first three commands execute as expected. (For example,
+> > there are plenty of files generated by "make check" that would've not
+> > been generated if "enable-gcov" hadn't been chosen.) However, the
+> > last command complains about some missing files related to FP
+> > support. If those files are added (for example, artificially, using
+> > "touch <missing-file"), that it starts complaining about missing some
+> > decodetree-generated files. Other kinds of files are involved too.
+
+The gcov tool is fairly noisy about missing files but that just
+indicates the tests haven't exercised those code paths. "make check"
+especially doesn't touch much of the TCG code and a chunk of floating
+point.
+
+> > It would be nice to have coverage support working. Please somebody
+> > take a look, or explain if I make a mistake or misunderstood our gcov
+> > support.
+
+So your failure mode is no report is generated at all? It's working for
+me here.
+
+> Cc'ing Alex who's probably the closest we have to a gcov expert.
+>
+> (make/make check of a --enable-gcov build is in the set of things our
+> Travis CI setup runs, so we do defend that part against regressions.)
+
+We defend the build but I have just checked and it seems our
+check_coverage script is currently failing:
+https://travis-ci.org/stsquad/qemu/jobs/567809808#L10328
+But as it's an after_success script it doesn't fail the build.
+
+> thanks
+> -- PMM
+
+--
+Alex Bennée
+
+>
+> #./configure --enable-gcov
+>
+> #make
+>
+> #make check
+>
+> #make coverage-report
+>
+>
+>
+> It seems that first three commands execute as expected. (For example,
+>
+> there are plenty of files generated by "make check" that would've not
+>
+> been generated if "enable-gcov" hadn't been chosen.) However, the
+>
+> last command complains about some missing files related to FP
+>
+So your failure mode is no report is generated at all? It's working for
+>
+me here.
+Alex, no report is generated for my test setups - in fact, "make
+coverage-report" even says that it explicitly deletes what appears to be the
+main coverage report html file.
+
+This is the terminal output of an unsuccessful execution of "make
+coverage-report" for recent ToT:
+
+~/Build/qemu-TOT-TEST$ make coverage-report
+make[1]: Entering directory '/home/user/Build/qemu-TOT-TEST/slirp'
+make[1]: Nothing to be done for 'all'.
+make[1]: Leaving directory '/home/user/Build/qemu-TOT-TEST/slirp'
+        CHK version_gen.h
+  GEN     coverage-report.html
+Traceback (most recent call last):
+  File "/usr/bin/gcovr", line 1970, in <module>
+    print_html_report(covdata, options.html_details)
+  File "/usr/bin/gcovr", line 1473, in print_html_report
+    INPUT = open(data['FILENAME'], 'r')
+IOError: [Errno 2] No such file or directory: 'wrap.inc.c'
+Makefile:1048: recipe for target 
+'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html' failed
+make: *** 
+[/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html] Error 1
+make: *** Deleting file 
+'/home/user/Build/qemu-TOT-TEST/reports/coverage/coverage-report.html'
+
+This instance was executed in the QEMU 3.0 source tree (so it looks like the
+problem has existed for quite some time):
+
+~/Build/qemu-3.0$ make coverage-report
+        CHK version_gen.h
+  GEN     coverage-report.html
+Traceback (most recent call last):
+  File "/usr/bin/gcovr", line 1970, in <module>
+    print_html_report(covdata, options.html_details)
+  File "/usr/bin/gcovr", line 1473, in print_html_report
+    INPUT = open(data['FILENAME'], 'r')
+IOError: [Errno 2] No such file or directory: 
+'/home/user/Build/qemu-3.0/target/openrisc/decode.inc.c'
+Makefile:992: recipe for target 
+'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html' failed
+make: *** [/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html] 
+Error 1
+make: *** Deleting file 
+'/home/user/Build/qemu-3.0/reports/coverage/coverage-report.html'
+
+Fond regards,
+Aleksandar
+
+
+> Alex Bennée
+
+> > #./configure --enable-gcov
+> > #make
+> > #make check
+> > #make coverage-report
+> >
+> > It seems that first three commands execute as expected. (For example,
+> > there are plenty of files generated by "make check" that would've not
+> > been generated if "enable-gcov" hadn't been chosen.) However, the
+> > last command complains about some missing files related to FP
+>
+> So your failure mode is no report is generated at all? It's working for
+> me here.
+Another piece of info:
+
+~/Build/qemu-TOT-TEST$ gcov --version
+gcov (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
+Copyright (C) 2015 Free Software Foundation, Inc.
+This is free software; see the source for copying conditions.
+There is NO warranty; not even for MERCHANTABILITY or 
+FITNESS FOR A PARTICULAR PURPOSE.
+
+:~/Build/qemu-TOT-TEST$ gcc --version
+gcc (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0
+Copyright (C) 2017 Free Software Foundation, Inc.
+This is free software; see the source for copying conditions.  There is NO
+warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+
+
+
+
+
+>
+> #./configure --enable-gcov
+>
+> #make
+>
+> #make check
+>
+> #make coverage-report
+>
+>
+>
+> It seems that first three commands execute as expected. (For example,
+>
+> there are plenty of files generated by "make check" that would've not
+>
+> been generated if "enable-gcov" hadn't been chosen.) However, the
+>
+> last command complains about some missing files related to FP
+>
+So your failure mode is no report is generated at all? It's working for
+>
+me here.
+Alex, here is the thing:
+
+Seeing that my gcovr was a relatively old (2014) 3.2 version, I upgraded it
+from the git repo to the most recent 4.1 (actually, to a dev version, from the
+very tip of the tree), and "make coverage-report" started generating coverage
+reports. It did emit some error messages (totally different from the previous
+ones), but still it did not stop like it used to do with gcovr 3.2.
+
+Perhaps you would want to add some gcov/gcovr minimal version info in our docs. 
+(or at least a statement "this was tested with such and such gcc, gcov and 
+gcovr", etc.?)
+
+Coverage report looked fine at first glance, but it kind of disappointed me
+when I dug deeper into its content - for example, it shows very low coverage
+for our FP code (softfloat), while, in fact, we know that "make check"
+contains detailed tests on FP functionalities. But this is most likely a
+separate problem of a very different nature, perhaps the issue of the separate
+git repo for FP tests (testfloat) that our FP tests use as a mid-layer.
+
+I'll try how everything works with my test examples, and will let you know.
+
+Your help is greatly appreciated,
+Aleksandar
+
+Fond regards,
+Aleksandar
+
+
+>
+Alex Bennée
+
+Aleksandar Markovic <address@hidden> writes:
+
+>
+>> #./configure --enable-gcov
+>
+>> #make
+>
+>> #make check
+>
+>> #make coverage-report
+>
+>>
+>
+>> It seems that first three commands execute as expected. (For example,
+>
+>> there are plenty of files generated by "make check" that would've not
+>
+>> been generated if "enable-gcov" hadn't been chosen.) However, the
+>
+>> last command complains about some missing files related to FP
+>
+>
+> So your failure mode is no report is generated at all? It's working for
+>
+> me here.
+>
+>
+Alex, here is the thing:
+>
+>
+Seeing that my gcovr is relatively old (2014) 3.2 version, I upgraded it from
+>
+git repo to the most recent 4.1 (actually, to a dev version, from the very
+>
+tip of the tree), and "make coverage-report" started generating coverage
+>
+reports. It did emit some error messages (totally different than previous),
+>
+but still it did not stop like it used to do with gcovr 3.2.
+>
+>
+Perhaps you would want to add some gcov/gcovr minimal version info in our
+>
+docs. (or at least a statement "this was tested with such and such gcc, gcov
+>
+and gcovr", etc.?)
+>
+>
+Coverage report looked fine at first glance, but it a kind of
+>
+disappointed me when I digged deeper into its content - for example,
+>
+it shows very low coverage for our FP code (softfloat), while, in
+>
+fact, we know that "make check" contains detailed tests on FP
+>
+functionalities. But this is most likely a separate problem of a very
+>
+different nature, perhaps the issue of separate git repo for FP tests
+>
+(testfloat) that our FP tests use as a mid-layer.
+I get:
+
+68.6 %  2593 / 3782     62.2 %  1690 / 2718
+
+Which is not bad considering we don't exercise the 80 and 128 bit
+softfloat code at all (which is not shared by the re-factored 16/32/64
+bit code).
+
+>
+>
+I'll try how everything works with my test examples, and will let you know.
+>
+>
+Your help is greatly appreciated,
+>
+Aleksandar
+>
+>
+Fond regards,
+>
+Aleksandar
+>
+>
+>
+> Alex Bennée
+--
+Alex Bennée
+
+>
+> it shows very low coverage for our FP code (softfloat), while, in
+>
+> fact, we know that "make check" contains detailed tests on FP
+>
+> functionalities. But this is most likely a separate problem of a very
+>
+> different nature, perhaps the issue of separate git repo for FP tests
+>
+> (testfloat) that our FP tests use as a mid-layer.
+>
+>
+I get:
+>
+>
+68.6 %  2593 / 3782     62.2 %  1690 / 2718
+>
+I would expect that kind of result too.
+
+However, I get:
+
+File:   fpu/softfloat.c                 Lines:  8       3334    0.2 %
+Date:   2019-08-05 19:56:58             Branches:       3       2376    0.1 %
+
+:(
+
+OK, I'll try to figure that out, and most likely I could live with it if it is 
+an isolated problem.
+
+Thank you for your assistance in this matter,
+Aleksandar
+
+>
+Which is not bad considering we don't exercise the 80 and 128 bit
+>
+softfloat code at all (which is not shared by the re-factored 16/32/64
+>
+bit code).
+>
+>
+Alex Bennée
+
+>
+> it shows very low coverage for our FP code (softfloat), while, in
+>
+> fact, we know that "make check" contains detailed tests on FP
+>
+> functionalities. But this is most likely a separate problem of a very
+>
+> different nature, perhaps the issue of separate git repo for FP tests
+>
+> (testfloat) that our FP tests use as a mid-layer.
+>
+>
+I get:
+>
+>
+68.6 %  2593 / 3782     62.2 %  1690 / 2718
+>
+This problem is solved too. (and it is my fault)
+
+I worked with multiple versions of QEMU, and my previous low-coverage results 
+were for QEMU 3.0, and for that version the directory tests/fp did not even 
+exist. :D (<blush>)
+
+For QEMU ToT, I get now:
+
+fpu/softfloat.c         
+        68.8 %  2592 / 3770     62.3 %  1693 / 2718
+
+which is identical for all intents and purposes to your result.
+
+Yours cordially,
+Aleksandar
+
diff --git a/results/classifier/zero-shot/007/permissions/51610399 b/results/classifier/zero-shot/007/permissions/51610399
new file mode 100644
index 000000000..2e420e72d
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/51610399
@@ -0,0 +1,318 @@
+permissions: 0.988
+debug: 0.986
+boot: 0.986
+graphic: 0.986
+other: 0.985
+semantic: 0.984
+device: 0.984
+performance: 0.983
+files: 0.981
+PID: 0.978
+socket: 0.978
+KVM: 0.975
+vnc: 0.974
+network: 0.973
+
+[BUG][powerpc] KVM Guest Boot Failure – Hangs at "Booting Linux via __start()”
+
+Bug Description:
+Encountering a boot failure when launching a KVM guest with
+qemu-system-ppc64. The guest hangs at boot, and the QEMU monitor
+crashes.
+Reproduction Steps:
+# qemu-system-ppc64 --version
+QEMU emulator version 9.2.50 (v9.2.0-2799-g0462a32b4f)
+Copyright (c) 2003-2025 Fabrice Bellard and the QEMU Project developers
+# /usr/bin/qemu-system-ppc64 -name avocado-vt-vm1 -machine
+pseries,accel=kvm \
+-m 32768 -smp 32,sockets=1,cores=32,threads=1 -nographic \
+  -device virtio-scsi-pci,id=scsi \
+-drive
+file=/home/kvmci/tests/data/avocado-vt/images/rhel8.0devel-ppc64le.qcow2,if=none,id=drive0,format=qcow2
+\
+-device scsi-hd,drive=drive0,bus=scsi.0 \
+  -netdev bridge,id=net0,br=virbr0 \
+  -device virtio-net-pci,netdev=net0 \
+  -serial pty \
+  -device virtio-balloon-pci \
+  -cpu host
+QEMU 9.2.50 monitor - type 'help' for more information
+char device redirected to /dev/pts/2 (label serial0)
+(qemu)
+(qemu) qemu-system-ppc64: warning: kernel_irqchip allowed but
+unavailable: IRQ_XIVE capability must be present for KVM
+Falling back to kernel-irqchip=off
+** Qemu Hang
+
+(In another ssh session)
+# screen /dev/pts/2
+Preparing to boot Linux version 6.10.4-200.fc40.ppc64le
+(mockbuild@c23cc4e677614c34bb22d54eeea4dc1f) (gcc (GCC) 14.2.1 20240801
+(Red Hat 14.2.1-1), GNU ld version 2.41-37.fc40) #1 SMP Sun Aug 11
+15:20:17 UTC 2024
+Detected machine type: 0000000000000101
+command line:
+BOOT_IMAGE=(ieee1275/disk,msdos2)/vmlinuz-6.10.4-200.fc40.ppc64le
+root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root crashkernel=1024M
+Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
+Calling ibm,client-architecture-support... done
+memory layout at init:
+  memory_limit : 0000000000000000 (16 MB aligned)
+  alloc_bottom : 0000000008200000
+  alloc_top    : 0000000030000000
+  alloc_top_hi : 0000000800000000
+  rmo_top      : 0000000030000000
+  ram_top      : 0000000800000000
+instantiating rtas at 0x000000002fff0000... done
+prom_hold_cpus: skipped
+copying OF device tree...
+Building dt strings...
+Building dt structure...
+Device tree strings 0x0000000008210000 -> 0x0000000008210bd0
+Device tree struct  0x0000000008220000 -> 0x0000000008230000
+Quiescing Open Firmware ...
+Booting Linux via __start() @ 0x0000000000440000 ...
+** Guest Console Hang
+
+
+Git Bisect:
+Performing git bisect points to the following patch:
+# git bisect bad
+e8291ec16da80566c121c68d9112be458954d90b is the first bad commit
+commit e8291ec16da80566c121c68d9112be458954d90b (HEAD)
+Author: Nicholas Piggin <npiggin@gmail.com>
+Date:   Thu Dec 19 13:40:31 2024 +1000
+
+    target/ppc: fix timebase register reset state
+    (H)DEC and PURR get reset before icount does, which causes them to be
+    skewed and not match the init state. This can cause replay to not
+    match the recorded trace exactly. For DEC and HDEC this is usually not
+    noticable since they tend to get programmed before affecting the
+    target machine. PURR has been observed to cause replay bugs when
+    running Linux.
+
+    Fix this by resetting using a time of 0.
+
+    Message-ID: <20241219034035.1826173-2-npiggin@gmail.com>
+    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
+
+ hw/ppc/ppc.c | 11 ++++++++---
+ 1 file changed, 8 insertions(+), 3 deletions(-)
+
+
+Reverting the patch helps boot the guest.
+Thanks,
+Misbah Anjum N
+
+Thanks for the report.
+
+Tricky problem. A secondary CPU is hanging before it is started by the
+primary via rtas call.
+
+That secondary keeps calling kvm_cpu_exec(), which keeps exiting out
+early with EXCP_HLT because kvm_arch_process_async_events() returns
+true because that cpu has ->halted=1. That just goes around the run
+loop because there is an interrupt pending (DEC).
+
+So it never runs. It also never releases the BQL, and another CPU,
+the primary which is actually supposed to be running, is stuck in
+spapr_set_all_lpcrs() in run_on_cpu() waiting for the BQL.
+
+This patch just exposes the bug I think, by causing the interrupt,
+although I'm not quite sure why it's okay previously (-ve decrementer
+values should be causing a timer exception too). The timer exception
+should not be taken as an interrupt by those secondary CPUs, and it
+doesn't because it is masked, until set_all_lpcrs sets an LPCR value
+that enables powersave wakeup on decrementer interrupt.
+
+The start_powered_off state just sets ->halted, which makes it look
+like a powersaving state. Logically I think it's not the same thing
+as far as spapr goes. I don't know why start_powered_off only sets
+->halted, and not ->stop/stopped as well.
+
+Not sure how best to solve it cleanly. I'll send a revert if I can't
+get something working soon.
+
+Thanks,
+Nick
+
+
diff --git a/results/classifier/zero-shot/007/permissions/55247116 b/results/classifier/zero-shot/007/permissions/55247116
new file mode 100644
index 000000000..cbe7dfafd
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/55247116
@@ -0,0 +1,1320 @@
+permissions: 0.946
+other: 0.945
+debug: 0.941
+performance: 0.938
+graphic: 0.933
+PID: 0.929
+socket: 0.929
+semantic: 0.928
+device: 0.919
+boot: 0.918
+network: 0.916
+vnc: 0.916
+files: 0.912
+KVM: 0.894
+
+[Qemu-devel]  [RFC/BUG] xen-mapcache: buggy invalidate map cache?
+
+Hi,
+
+In xen_map_cache_unlocked(), the mapping for guest memory may be in
+entry->next instead of the first-level entry (if a mapping for a ROM,
+rather than guest memory, was created first). But in
+xen_invalidate_map_cache(), when the VM balloons out memory, QEMU does
+not invalidate the cache entries in the linked list (entry->next). So
+when the VM balloons the memory back in, the gfns are probably mapped to
+different mfns; if the guest then asks a device to DMA to these GPAs,
+QEMU may DMA to stale MFNs.
+
+So I think the linked lists should also be checked and invalidated in
+xen_invalidate_map_cache().
+
+What’s your opinion? Is this a bug? Is my analysis correct?
+
+On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> Hi,
+>
+> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> instead of first level entry (if map to rom other than guest memory
+> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> out memory, qemu did not invalidate cache entries in linked
+> list(entry->next), so when VM balloon back in memory, gfns probably
+> mapped to different mfns, thus if guest asks device to DMA to these
+> GPA, qemu may DMA to stale MFNs.
+>
+> So I think in xen_invalidate_map_cache() linked lists should also be
+> checked and invalidated.
+>
+> What’s your opinion? Is this a bug? Is my analyze correct?
+Added Jun Nakajima and Alexander Graf
+
+On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > Hi,
+> >
+> > In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> > instead of first level entry (if map to rom other than guest memory
+> > comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> > out memory, qemu did not invalidate cache entries in linked
+> > list(entry->next), so when VM balloon back in memory, gfns probably
+> > mapped to different mfns, thus if guest asks device to DMA to these
+> > GPA, qemu may DMA to stale MFNs.
+> >
+> > So I think in xen_invalidate_map_cache() linked lists should also be
+> > checked and invalidated.
+> >
+> > What’s your opinion? Is this a bug? Is my analyze correct?
+>
+> Added Jun Nakajima and Alexander Graf
+And correct Stefano Stabellini's email address.
+
+On Mon, 10 Apr 2017 00:36:02 +0800
+hrg <address@hidden> wrote:
+
+Hi,
+
+> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> >> Hi,
+> >>
+> >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> >> instead of first level entry (if map to rom other than guest memory
+> >> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> >> out memory, qemu did not invalidate cache entries in linked
+> >> list(entry->next), so when VM balloon back in memory, gfns probably
+> >> mapped to different mfns, thus if guest asks device to DMA to these
+> >> GPA, qemu may DMA to stale MFNs.
+> >>
+> >> So I think in xen_invalidate_map_cache() linked lists should also be
+> >> checked and invalidated.
+> >>
+> >> What’s your opinion? Is this a bug? Is my analyze correct?
+>
+> > Added Jun Nakajima and Alexander Graf
+> And correct Stefano Stabellini's email address.
+There is in fact a real issue with xen-mapcache corruption. I encountered
+it a few months ago while experimenting with Q35 support on Xen. Q35
+emulation uses an AHCI controller by default, with NCQ mode enabled. The
+issue can be reproduced fairly easily there; with the normal i440
+emulation it may be reproducible as well, but only with dedicated test
+code on the guest side. With Q35+NCQ the issue reproduces as is.
+
+The issue occurs when a guest domain performs intensive disk I/O, e.g.
+while the guest OS is booting. QEMU crashes with a "Bad ram offset
+980aa000" message logged, where the address differs each time. The hard
+part is that the issue has a very low reproducibility rate.
+
+The corruption happens when there are multiple I/O commands in the NCQ
+queue. There are then overlapping emulated DMA operations in flight, and
+QEMU issues a sequence of mapcache actions that can execute in the
+"wrong" order, leaving the xen-mapcache inconsistent - so a bad address
+from the wrong entry is returned.
+
+The unpleasant part is that a QEMU crash due to "Bad ram offset" is the
+relatively good outcome, since that error is at least caught. A much
+worse (if contrived) situation is possible where the returned address
+looks valid but points into a different mapped region.
+
+The fix itself is not hard (e.g. an additional checked field in
+MapCacheEntry), but a reliable way to test it is needed, given the low
+reproducibility rate.
+
+Regards,
+Alex
+
+On Mon, 10 Apr 2017, hrg wrote:
+> On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> >> Hi,
+> >>
+> >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> >> instead of first level entry (if map to rom other than guest memory
+> >> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> >> out memory, qemu did not invalidate cache entries in linked
+> >> list(entry->next), so when VM balloon back in memory, gfns probably
+> >> mapped to different mfns, thus if guest asks device to DMA to these
+> >> GPA, qemu may DMA to stale MFNs.
+> >>
+> >> So I think in xen_invalidate_map_cache() linked lists should also be
+> >> checked and invalidated.
+> >>
+> >> What’s your opinion? Is this a bug? Is my analyze correct?
+Yes, you are right. We need to go through the list for each element of
+the array in xen_invalidate_map_cache. Can you come up with a patch?
+
+On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> On Mon, 10 Apr 2017, hrg wrote:
+> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > >> Hi,
+> > >>
+> > >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> > >> instead of first level entry (if map to rom other than guest memory
+> > >> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> > >> out memory, qemu did not invalidate cache entries in linked
+> > >> list(entry->next), so when VM balloon back in memory, gfns probably
+> > >> mapped to different mfns, thus if guest asks device to DMA to these
+> > >> GPA, qemu may DMA to stale MFNs.
+> > >>
+> > >> So I think in xen_invalidate_map_cache() linked lists should also be
+> > >> checked and invalidated.
+> > >>
+> > >> What’s your opinion? Is this a bug? Is my analyze correct?
+>
+> Yes, you are right. We need to go through the list for each element of
+> the array in xen_invalidate_map_cache. Can you come up with a patch?
+I spoke too soon. In the regular case there should be no locked mappings
+when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+beginning of the function). Without locked mappings, there should never
+be more than one element in each list (see xen_map_cache_unlocked:
+entry->lock == true is a necessary condition to append a new entry to
+the list, otherwise it is just remapped).
+
+Can you confirm that what you are seeing are locked mappings
+when xen_invalidate_map_cache is called? To find out, enable the DPRINTF
+by turning it into a printf or by defining MAPCACHE_DEBUG.
+
+On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+<address@hidden> wrote:
+> On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> > On Mon, 10 Apr 2017, hrg wrote:
+> > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > > >> Hi,
+> > > >>
+> > > >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> > > >> instead of first level entry (if map to rom other than guest memory
+> > > >> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> > > >> out memory, qemu did not invalidate cache entries in linked
+> > > >> list(entry->next), so when VM balloon back in memory, gfns probably
+> > > >> mapped to different mfns, thus if guest asks device to DMA to these
+> > > >> GPA, qemu may DMA to stale MFNs.
+> > > >>
+> > > >> So I think in xen_invalidate_map_cache() linked lists should also be
+> > > >> checked and invalidated.
+> > > >>
+> > > >> What’s your opinion? Is this a bug? Is my analyze correct?
+> >
+> > Yes, you are right. We need to go through the list for each element of
+> > the array in xen_invalidate_map_cache. Can you come up with a patch?
+>
+> I spoke too soon. In the regular case there should be no locked mappings
+> when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+> beginning of the functions). Without locked mappings, there should never
+> be more than one element in each list (see xen_map_cache_unlocked:
+> entry->lock == true is a necessary condition to append a new entry to
+> the list, otherwise it is just remapped).
+>
+> Can you confirm that what you are seeing are locked mappings
+> when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+> by turning it into a printf or by defininig MAPCACHE_DEBUG.
+In fact, I think the DPRINTF above is incorrect too. In
+pci_add_option_rom(), rtl8139 rom is locked mapped in
+pci_add_option_rom->memory_region_get_ram_ptr (after
+memory_region_init_ram). So actually I think we should remove the
+DPRINTF warning as it is normal.
+
+On Tue, 11 Apr 2017, hrg wrote:
+> On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+> <address@hidden> wrote:
+> > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> >> On Mon, 10 Apr 2017, hrg wrote:
+> >> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> >> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> >> > >> Hi,
+> >> > >>
+> >> > >> In xen_map_cache_unlocked(), map to guest memory maybe in entry->next
+> >> > >> instead of first level entry (if map to rom other than guest memory
+> >> > >> comes first), while in xen_invalidate_map_cache(), when VM ballooned
+> >> > >> out memory, qemu did not invalidate cache entries in linked
+> >> > >> list(entry->next), so when VM balloon back in memory, gfns probably
+> >> > >> mapped to different mfns, thus if guest asks device to DMA to these
+> >> > >> GPA, qemu may DMA to stale MFNs.
+> >> > >>
+> >> > >> So I think in xen_invalidate_map_cache() linked lists should also be
+> >> > >> checked and invalidated.
+> >> > >>
+> >> > >> What’s your opinion? Is this a bug? Is my analyze correct?
+> >>
+> >> Yes, you are right. We need to go through the list for each element of
+> >> the array in xen_invalidate_map_cache. Can you come up with a patch?
+> >
+> > I spoke too soon. In the regular case there should be no locked mappings
+> > when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+> > beginning of the functions). Without locked mappings, there should never
+> > be more than one element in each list (see xen_map_cache_unlocked:
+> > entry->lock == true is a necessary condition to append a new entry to
+> > the list, otherwise it is just remapped).
+> >
+> > Can you confirm that what you are seeing are locked mappings
+> > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+> > by turning it into a printf or by defininig MAPCACHE_DEBUG.
+>
+> In fact, I think the DPRINTF above is incorrect too. In
+> pci_add_option_rom(), rtl8139 rom is locked mapped in
+> pci_add_option_rom->memory_region_get_ram_ptr (after
+> memory_region_init_ram). So actually I think we should remove the
+> DPRINTF warning as it is normal.
+Let me explain why the DPRINTF warning is there: emulated dma operations
+can involve locked mappings. Once a dma operation completes, the related
+mapping is unlocked and can be safely destroyed. But if we destroy a
+locked mapping in xen_invalidate_map_cache, while a dma is still
+ongoing, QEMU will crash. We cannot handle that case.
+
+However, the scenario you described is different. It has nothing to do
+with DMA. It looks like pci_add_option_rom calls
+memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+locked mapping and it is never unlocked or destroyed.
+
+It looks like "ptr" is not used after pci_add_option_rom returns. Does
+the appended patch fix the problem you are seeing? For the proper fix, I
+think we probably need some sort of memory_region_unmap wrapper or maybe
+a call to address_space_unmap.
+
+
+diff --git a/hw/pci/pci.c b/hw/pci/pci.c
+index e6b08e1..04f98b7 100644
+--- a/hw/pci/pci.c
++++ b/hw/pci/pci.c
+@@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool 
+is_default_rom,
+     }
+ 
+     pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
++    xen_invalidate_map_cache_entry(ptr);
+ }
+ 
+ static void pci_del_option_rom(PCIDevice *pdev)
+
+On Tue, 11 Apr 2017 15:32:09 -0700 (PDT)
+Stefano Stabellini <address@hidden> wrote:
+
+> On Tue, 11 Apr 2017, hrg wrote:
+> > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+> > <address@hidden> wrote:
+> > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> > >> On Mon, 10 Apr 2017, hrg wrote:
+> > >> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > >> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > >> > >> Hi,
+> > >> > >>
+> > >> > >> In xen_map_cache_unlocked(), map to guest memory maybe in
+> > >> > >> entry->next instead of first level entry (if map to rom other than
+> > >> > >> guest memory comes first), while in xen_invalidate_map_cache(),
+> > >> > >> when VM ballooned out memory, qemu did not invalidate cache entries
+> > >> > >> in linked list(entry->next), so when VM balloon back in memory,
+> > >> > >> gfns probably mapped to different mfns, thus if guest asks device
+> > >> > >> to DMA to these GPA, qemu may DMA to stale MFNs.
+> > >> > >>
+> > >> > >> So I think in xen_invalidate_map_cache() linked lists should also be
+> > >> > >> checked and invalidated.
+> > >> > >>
+> > >> > >> What’s your opinion? Is this a bug? Is my analyze correct?
+> > >>
+> > >> Yes, you are right. We need to go through the list for each element of
+> > >> the array in xen_invalidate_map_cache. Can you come up with a patch?
+> > >
+> > > I spoke too soon. In the regular case there should be no locked mappings
+> > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+> > > beginning of the functions). Without locked mappings, there should never
+> > > be more than one element in each list (see xen_map_cache_unlocked:
+> > > entry->lock == true is a necessary condition to append a new entry to
+> > > the list, otherwise it is just remapped).
+> > >
+> > > Can you confirm that what you are seeing are locked mappings
+> > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+> > > by turning it into a printf or by defininig MAPCACHE_DEBUG.
+> >
+> > In fact, I think the DPRINTF above is incorrect too. In
+> > pci_add_option_rom(), rtl8139 rom is locked mapped in
+> > pci_add_option_rom->memory_region_get_ram_ptr (after
+> > memory_region_init_ram). So actually I think we should remove the
+> > DPRINTF warning as it is normal.
+>
+> Let me explain why the DPRINTF warning is there: emulated dma operations
+> can involve locked mappings. Once a dma operation completes, the related
+> mapping is unlocked and can be safely destroyed. But if we destroy a
+> locked mapping in xen_invalidate_map_cache, while a dma is still
+> ongoing, QEMU will crash. We cannot handle that case.
+>
+> However, the scenario you described is different. It has nothing to do
+> with DMA. It looks like pci_add_option_rom calls
+> memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+> locked mapping and it is never unlocked or destroyed.
+>
+> It looks like "ptr" is not used after pci_add_option_rom returns. Does
+> the append patch fix the problem you are seeing? For the proper fix, I
+> think we probably need some sort of memory_region_unmap wrapper or maybe
+> a call to address_space_unmap.
+Hmm, for some reason my message to the Xen-devel list got rejected but was
+sent to Qemu-devel instead, without any notice. Sorry if I'm missing
+something obvious as a list newbie.
+
+Stefano, hrg,
+
+There is an inconsistency between the list of normal MapCacheEntry's and
+their 'reverse' counterparts - the MapCacheRev's in locked_entries. When
+the bad situation happens, there are multiple (locked) MapCacheEntry
+entries in the bucket's linked list along with a number of MapCacheRev's.
+When it comes to a reverse lookup, xen-mapcache picks the wrong entry from
+the first list and calculates a wrong pointer from it, which may then be
+caught by the "Bad RAM offset" check (or not). Mapcache invalidation might
+be related to this issue as well, I think.
+
+I'll try to provide test code which can reproduce the issue from the guest
+side using an emulated IDE controller, though it's much simpler to achieve
+this result with an AHCI controller using multiple NCQ I/O commands. So far
+I've seen this issue only with Windows 7 (and above) guests on AHCI, but
+any block I/O DMA should be enough, I think.
+
+On 2017/4/12 14:17, Alexey G wrote:
+> On Tue, 11 Apr 2017 15:32:09 -0700 (PDT)
+> Stefano Stabellini <address@hidden> wrote:
+> > On Tue, 11 Apr 2017, hrg wrote:
+> > > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+> > > <address@hidden> wrote:
+> > > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> > > > > On Mon, 10 Apr 2017, hrg wrote:
+> > > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > > > > > > > Hi,
+> > > > > > > >
+> > > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in
+> > > > > > > > entry->next instead of first level entry (if map to rom other
+> > > > > > > > than guest memory comes first), while in
+> > > > > > > > xen_invalidate_map_cache(), when VM ballooned out memory, qemu
+> > > > > > > > did not invalidate cache entries in linked list(entry->next),
+> > > > > > > > so when VM balloon back in memory, gfns probably mapped to
+> > > > > > > > different mfns, thus if guest asks device to DMA to these GPA,
+> > > > > > > > qemu may DMA to stale MFNs.
+> > > > > > > >
+> > > > > > > > So I think in xen_invalidate_map_cache() linked lists should
+> > > > > > > > also be checked and invalidated.
+> > > > > > > >
+> > > > > > > > What’s your opinion? Is this a bug? Is my analyze correct?
+> > > > > Yes, you are right. We need to go through the list for each element
+> > > > > of the array in xen_invalidate_map_cache. Can you come up with a
+> > > > > patch?
+> > > > I spoke too soon. In the regular case there should be no locked
+> > > > mappings when xen_invalidate_map_cache is called (see the DPRINTF
+> > > > warning at the beginning of the functions). Without locked mappings,
+> > > > there should never be more than one element in each list (see
+> > > > xen_map_cache_unlocked: entry->lock == true is a necessary condition
+> > > > to append a new entry to the list, otherwise it is just remapped).
+> > > >
+> > > > Can you confirm that what you are seeing are locked mappings
+> > > > when xen_invalidate_map_cache is called? To find out, enable the
+> > > > DPRINTK by turning it into a printf or by defininig MAPCACHE_DEBUG.
+> > > In fact, I think the DPRINTF above is incorrect too. In
+> > > pci_add_option_rom(), rtl8139 rom is locked mapped in
+> > > pci_add_option_rom->memory_region_get_ram_ptr (after
+> > > memory_region_init_ram). So actually I think we should remove the
+> > > DPRINTF warning as it is normal.
+> > Let me explain why the DPRINTF warning is there: emulated dma operations
+> > can involve locked mappings. Once a dma operation completes, the related
+> > mapping is unlocked and can be safely destroyed. But if we destroy a
+> > locked mapping in xen_invalidate_map_cache, while a dma is still
+> > ongoing, QEMU will crash. We cannot handle that case.
+> >
+> > However, the scenario you described is different. It has nothing to do
+> > with DMA. It looks like pci_add_option_rom calls
+> > memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+> > locked mapping and it is never unlocked or destroyed.
+> >
+> > It looks like "ptr" is not used after pci_add_option_rom returns. Does
+> > the append patch fix the problem you are seeing? For the proper fix, I
+> > think we probably need some sort of memory_region_unmap wrapper or maybe
+> > a call to address_space_unmap.
+> Hmm, for some reason my message to the Xen-devel list got rejected but was
+> sent to Qemu-devel instead, without any notice. Sorry if I'm missing
+> something obvious as a list newbie.
+>
+> Stefano, hrg,
+>
+> There is an issue with inconsistency between the list of normal
+> MapCacheEntry's and their 'reverse' counterparts - MapCacheRev's in
+> locked_entries. When bad situation happens, there are multiple (locked)
+> MapCacheEntry entries in the bucket's linked list along with a number of
+> MapCacheRev's. And when it comes to a reverse lookup, xen-mapcache picks
+> the wrong entry from the first list and calculates a wrong pointer from it
+> which may then be caught with the "Bad RAM offset" check (or not).
+> Mapcache invalidation might be related to this issue as well I think.
+>
+> I'll try to provide a test code which can reproduce the issue from the
+> guest side using an emulated IDE controller, though it's much simpler to
+> achieve this result with an AHCI controller using multiple NCQ I/O
+> commands. So far I've seen this issue only with Windows 7 (and above)
+> guest on AHCI, but any block I/O DMA should be enough I think.
+Yes, I think there may be other bugs lurking, considering the complexity,
+though we need to reproduce them if we want to delve into this.
+
+On Wed, 12 Apr 2017, Alexey G wrote:
+> On Tue, 11 Apr 2017 15:32:09 -0700 (PDT)
+> Stefano Stabellini <address@hidden> wrote:
+>
+> > On Tue, 11 Apr 2017, hrg wrote:
+> > > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+> > > <address@hidden> wrote:
+> > > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> > > >> On Mon, 10 Apr 2017, hrg wrote:
+> > > >> > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > > >> > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > > >> > >> Hi,
+> > > >> > >>
+> > > >> > >> In xen_map_cache_unlocked(), map to guest memory maybe in
+> > > >> > >> entry->next instead of first level entry (if map to rom other than
+> > > >> > >> guest memory comes first), while in xen_invalidate_map_cache(),
+> > > >> > >> when VM ballooned out memory, qemu did not invalidate cache entries
+> > > >> > >> in linked list(entry->next), so when VM balloon back in memory,
+> > > >> > >> gfns probably mapped to different mfns, thus if guest asks device
+> > > >> > >> to DMA to these GPA, qemu may DMA to stale MFNs.
+> > > >> > >>
+> > > >> > >> So I think in xen_invalidate_map_cache() linked lists should also be
+> > > >> > >> checked and invalidated.
+> > > >> > >>
+> > > >> > >> What’s your opinion? Is this a bug? Is my analyze correct?
+> > > >>
+> > > >> Yes, you are right. We need to go through the list for each element of
+> > > >> the array in xen_invalidate_map_cache. Can you come up with a patch?
+> > > >
+> > > > I spoke too soon. In the regular case there should be no locked mappings
+> > > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+> > > > beginning of the functions). Without locked mappings, there should never
+> > > > be more than one element in each list (see xen_map_cache_unlocked:
+> > > > entry->lock == true is a necessary condition to append a new entry to
+> > > > the list, otherwise it is just remapped).
+> > > >
+> > > > Can you confirm that what you are seeing are locked mappings
+> > > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+> > > > by turning it into a printf or by defininig MAPCACHE_DEBUG.
+> > >
+> > > In fact, I think the DPRINTF above is incorrect too. In
+> > > pci_add_option_rom(), rtl8139 rom is locked mapped in
+> > > pci_add_option_rom->memory_region_get_ram_ptr (after
+> > > memory_region_init_ram). So actually I think we should remove the
+> > > DPRINTF warning as it is normal.
+> >
+> > Let me explain why the DPRINTF warning is there: emulated dma operations
+> > can involve locked mappings. Once a dma operation completes, the related
+> > mapping is unlocked and can be safely destroyed. But if we destroy a
+> > locked mapping in xen_invalidate_map_cache, while a dma is still
+> > ongoing, QEMU will crash. We cannot handle that case.
+> >
+> > However, the scenario you described is different. It has nothing to do
+> > with DMA. It looks like pci_add_option_rom calls
+> > memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+> > locked mapping and it is never unlocked or destroyed.
+> >
+> > It looks like "ptr" is not used after pci_add_option_rom returns. Does
+> > the append patch fix the problem you are seeing? For the proper fix, I
+> > think we probably need some sort of memory_region_unmap wrapper or maybe
+> > a call to address_space_unmap.
+>
+> Hmm, for some reason my message to the Xen-devel list got rejected but was
+> sent to Qemu-devel instead, without any notice. Sorry if I'm missing
+> something obvious as a list newbie.
+>
+> Stefano, hrg,
+>
+> There is an issue with inconsistency between the list of normal
+> MapCacheEntry's and their 'reverse' counterparts - MapCacheRev's in
+> locked_entries. When bad situation happens, there are multiple (locked)
+> MapCacheEntry entries in the bucket's linked list along with a number of
+> MapCacheRev's. And when it comes to a reverse lookup, xen-mapcache picks
+> the wrong entry from the first list and calculates a wrong pointer from it
+> which may then be caught with the "Bad RAM offset" check (or not).
+> Mapcache invalidation might be related to this issue as well I think.
+>
+> I'll try to provide a test code which can reproduce the issue from the
+> guest side using an emulated IDE controller, though it's much simpler to
+> achieve this result with an AHCI controller using multiple NCQ I/O
+> commands. So far I've seen this issue only with Windows 7 (and above)
+> guest on AHCI, but any block I/O DMA should be enough I think.
+That would be helpful. Please see if you can reproduce it after fixing
+the other issue (
+http://marc.info/?l=qemu-devel&m=149195042500707&w=2
+).
+
+On 2017/4/12 6:32, Stefano Stabellini wrote:
+> On Tue, 11 Apr 2017, hrg wrote:
+> > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+> > <address@hidden> wrote:
+> > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+> > > > On Mon, 10 Apr 2017, hrg wrote:
+> > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+> > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+> > > > > > > Hi,
+> > > > > > >
+> > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in
+> > > > > > > entry->next instead of first level entry (if map to rom other
+> > > > > > > than guest memory comes first), while in
+> > > > > > > xen_invalidate_map_cache(), when VM ballooned out memory, qemu
+> > > > > > > did not invalidate cache entries in linked list(entry->next), so
+> > > > > > > when VM balloon back in memory, gfns probably mapped to different
+> > > > > > > mfns, thus if guest asks device to DMA to these GPA, qemu may DMA
+> > > > > > > to stale MFNs.
+> > > > > > >
+> > > > > > > So I think in xen_invalidate_map_cache() linked lists should
+> > > > > > > also be checked and invalidated.
+> > > > > > >
+> > > > > > > What’s your opinion? Is this a bug? Is my analyze correct?
+> > > > Yes, you are right. We need to go through the list for each element of
+> > > > the array in xen_invalidate_map_cache. Can you come up with a patch?
+> > > I spoke too soon. In the regular case there should be no locked mappings
+> > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+> > > beginning of the functions). Without locked mappings, there should never
+> > > be more than one element in each list (see xen_map_cache_unlocked:
+> > > entry->lock == true is a necessary condition to append a new entry to
+> > > the list, otherwise it is just remapped).
+> > >
+> > > Can you confirm that what you are seeing are locked mappings
+> > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+> > > by turning it into a printf or by defininig MAPCACHE_DEBUG.
+> > In fact, I think the DPRINTF above is incorrect too. In
+> > pci_add_option_rom(), rtl8139 rom is locked mapped in
+> > pci_add_option_rom->memory_region_get_ram_ptr (after
+> > memory_region_init_ram). So actually I think we should remove the
+> > DPRINTF warning as it is normal.
+> Let me explain why the DPRINTF warning is there: emulated dma operations
+> can involve locked mappings. Once a dma operation completes, the related
+> mapping is unlocked and can be safely destroyed. But if we destroy a
+> locked mapping in xen_invalidate_map_cache, while a dma is still
+> ongoing, QEMU will crash. We cannot handle that case.
+>
+> However, the scenario you described is different. It has nothing to do
+> with DMA. It looks like pci_add_option_rom calls
+> memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+> locked mapping and it is never unlocked or destroyed.
+>
+> It looks like "ptr" is not used after pci_add_option_rom returns. Does
+> the append patch fix the problem you are seeing? For the proper fix, I
+> think we probably need some sort of memory_region_unmap wrapper or maybe
+> a call to address_space_unmap.
+Yes, I think so, maybe this is the proper way to fix this.
+> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
+> index e6b08e1..04f98b7 100644
+> --- a/hw/pci/pci.c
+> +++ b/hw/pci/pci.c
+> @@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool
+> is_default_rom,
+>      }
+>
+>      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
+> +    xen_invalidate_map_cache_entry(ptr);
+>  }
+>
+>  static void pci_del_option_rom(PCIDevice *pdev)
+
+On Wed, 12 Apr 2017, Herongguang (Stephen) wrote:
+>
+On 2017/4/12 6:32, Stefano Stabellini wrote:
+>
+> On Tue, 11 Apr 2017, hrg wrote:
+>
+> > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+>
+> > <address@hidden> wrote:
+>
+> > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+>
+> > > > On Mon, 10 Apr 2017, hrg wrote:
+>
+> > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+>
+> > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+>
+> > > > > > > Hi,
+>
+> > > > > > >
+>
+> > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in
+>
+> > > > > > > entry->next
+>
+> > > > > > > instead of first level entry (if map to rom other than guest
+>
+> > > > > > > memory
+>
+> > > > > > > comes first), while in xen_invalidate_map_cache(), when VM
+>
+> > > > > > > ballooned
+>
+> > > > > > > out memory, qemu did not invalidate cache entries in linked
+>
+> > > > > > > list(entry->next), so when VM balloon back in memory, gfns
+>
+> > > > > > > probably
+>
+> > > > > > > mapped to different mfns, thus if guest asks device to DMA to
+>
+> > > > > > > these
+>
+> > > > > > > GPA, qemu may DMA to stale MFNs.
+>
+> > > > > > >
+>
+> > > > > > > So I think in xen_invalidate_map_cache() linked lists should
+>
+> > > > > > > also be
+>
+> > > > > > > checked and invalidated.
+>
+> > > > > > >
+>
+> > > > > > > What’s your opinion? Is this a bug? Is my analyze correct?
+>
+> > > > Yes, you are right. We need to go through the list for each element of
+>
+> > > > the array in xen_invalidate_map_cache. Can you come up with a patch?
+>
+> > > I spoke too soon. In the regular case there should be no locked mappings
+>
+> > > when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+>
+> > > beginning of the functions). Without locked mappings, there should never
+>
+> > > be more than one element in each list (see xen_map_cache_unlocked:
+>
+> > > entry->lock == true is a necessary condition to append a new entry to
+>
+> > > the list, otherwise it is just remapped).
+>
+> > >
+>
+> > > Can you confirm that what you are seeing are locked mappings
+>
+> > > when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+>
+> > > by turning it into a printf or by defininig MAPCACHE_DEBUG.
+>
+> > In fact, I think the DPRINTF above is incorrect too. In
+>
+> > pci_add_option_rom(), rtl8139 rom is locked mapped in
+>
+> > pci_add_option_rom->memory_region_get_ram_ptr (after
+>
+> > memory_region_init_ram). So actually I think we should remove the
+>
+> > DPRINTF warning as it is normal.
+>
+> Let me explain why the DPRINTF warning is there: emulated dma operations
+>
+> can involve locked mappings. Once a dma operation completes, the related
+>
+> mapping is unlocked and can be safely destroyed. But if we destroy a
+>
+> locked mapping in xen_invalidate_map_cache, while a dma is still
+>
+> ongoing, QEMU will crash. We cannot handle that case.
+>
+>
+>
+> However, the scenario you described is different. It has nothing to do
+>
+> with DMA. It looks like pci_add_option_rom calls
+>
+> memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+>
+> locked mapping and it is never unlocked or destroyed.
+>
+>
+>
+> It looks like "ptr" is not used after pci_add_option_rom returns. Does
+>
+> the append patch fix the problem you are seeing? For the proper fix, I
+>
+> think we probably need some sort of memory_region_unmap wrapper or maybe
+>
+> a call to address_space_unmap.
+>
+>
+Yes, I think so, maybe this is the proper way to fix this.
+Would you be up for sending a proper patch and testing it? We cannot call
+xen_invalidate_map_cache_entry directly from pci.c though, it would need
+to be one of the other functions like address_space_unmap for example.
+
+
+>
+> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
+>
+> index e6b08e1..04f98b7 100644
+>
+> --- a/hw/pci/pci.c
+>
+> +++ b/hw/pci/pci.c
+>
+> @@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool
+>
+> is_default_rom,
+>
+>       }
+>
+>         pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
+>
+> +    xen_invalidate_map_cache_entry(ptr);
+>
+>   }
+>
+>     static void pci_del_option_rom(PCIDevice *pdev)
+
+On 2017/4/13 7:51, Stefano Stabellini wrote:
+On Wed, 12 Apr 2017, Herongguang (Stephen) wrote:
+On 2017/4/12 6:32, Stefano Stabellini wrote:
+On Tue, 11 Apr 2017, hrg wrote:
+On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+<address@hidden> wrote:
+On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+On Mon, 10 Apr 2017, hrg wrote:
+On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden> wrote:
+On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden> wrote:
+Hi,
+
+In xen_map_cache_unlocked(), map to guest memory maybe in
+entry->next
+instead of first level entry (if map to rom other than guest
+memory
+comes first), while in xen_invalidate_map_cache(), when VM
+ballooned
+out memory, qemu did not invalidate cache entries in linked
+list(entry->next), so when VM balloon back in memory, gfns
+probably
+mapped to different mfns, thus if guest asks device to DMA to
+these
+GPA, qemu may DMA to stale MFNs.
+
+So I think in xen_invalidate_map_cache() linked lists should
+also be
+checked and invalidated.
+
+What’s your opinion? Is this a bug? Is my analysis correct?
+Yes, you are right. We need to go through the list for each element of
+the array in xen_invalidate_map_cache. Can you come up with a patch?
+I spoke too soon. In the regular case there should be no locked mappings
+when xen_invalidate_map_cache is called (see the DPRINTF warning at the
+beginning of the functions). Without locked mappings, there should never
+be more than one element in each list (see xen_map_cache_unlocked:
+entry->lock == true is a necessary condition to append a new entry to
+the list, otherwise it is just remapped).
+
+Can you confirm that what you are seeing are locked mappings
+when xen_invalidate_map_cache is called? To find out, enable the DPRINTK
+by turning it into a printf or by defining MAPCACHE_DEBUG.
+In fact, I think the DPRINTF above is incorrect too. In
+pci_add_option_rom(), rtl8139 rom is locked mapped in
+pci_add_option_rom->memory_region_get_ram_ptr (after
+memory_region_init_ram). So actually I think we should remove the
+DPRINTF warning as it is normal.
+Let me explain why the DPRINTF warning is there: emulated dma operations
+can involve locked mappings. Once a dma operation completes, the related
+mapping is unlocked and can be safely destroyed. But if we destroy a
+locked mapping in xen_invalidate_map_cache, while a dma is still
+ongoing, QEMU will crash. We cannot handle that case.
+
+However, the scenario you described is different. It has nothing to do
+with DMA. It looks like pci_add_option_rom calls
+memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+locked mapping and it is never unlocked or destroyed.
+
+It looks like "ptr" is not used after pci_add_option_rom returns. Does
+the append patch fix the problem you are seeing? For the proper fix, I
+think we probably need some sort of memory_region_unmap wrapper or maybe
+a call to address_space_unmap.
+Yes, I think so, maybe this is the proper way to fix this.
+Would you be up for sending a proper patch and testing it? We cannot call
+xen_invalidate_map_cache_entry directly from pci.c though, it would need
+to be one of the other functions like address_space_unmap for example.
+Yes, I will look into this.
+diff --git a/hw/pci/pci.c b/hw/pci/pci.c
+index e6b08e1..04f98b7 100644
+--- a/hw/pci/pci.c
++++ b/hw/pci/pci.c
+@@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool
+is_default_rom,
+       }
+         pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
++    xen_invalidate_map_cache_entry(ptr);
+   }
+     static void pci_del_option_rom(PCIDevice *pdev)
+
+On Thu, 13 Apr 2017, Herongguang (Stephen) wrote:
+>
+On 2017/4/13 7:51, Stefano Stabellini wrote:
+>
+> On Wed, 12 Apr 2017, Herongguang (Stephen) wrote:
+>
+> > On 2017/4/12 6:32, Stefano Stabellini wrote:
+>
+> > > On Tue, 11 Apr 2017, hrg wrote:
+>
+> > > > On Tue, Apr 11, 2017 at 3:50 AM, Stefano Stabellini
+>
+> > > > <address@hidden> wrote:
+>
+> > > > > On Mon, 10 Apr 2017, Stefano Stabellini wrote:
+>
+> > > > > > On Mon, 10 Apr 2017, hrg wrote:
+>
+> > > > > > > On Sun, Apr 9, 2017 at 11:55 PM, hrg <address@hidden>
+>
+> > > > > > > wrote:
+>
+> > > > > > > > On Sun, Apr 9, 2017 at 11:52 PM, hrg <address@hidden>
+>
+> > > > > > > > wrote:
+>
+> > > > > > > > > Hi,
+>
+> > > > > > > > >
+>
+> > > > > > > > > In xen_map_cache_unlocked(), map to guest memory maybe in
+>
+> > > > > > > > > entry->next
+>
+> > > > > > > > > instead of first level entry (if map to rom other than guest
+>
+> > > > > > > > > memory
+>
+> > > > > > > > > comes first), while in xen_invalidate_map_cache(), when VM
+>
+> > > > > > > > > ballooned
+>
+> > > > > > > > > out memory, qemu did not invalidate cache entries in linked
+>
+> > > > > > > > > list(entry->next), so when VM balloon back in memory, gfns
+>
+> > > > > > > > > probably
+>
+> > > > > > > > > mapped to different mfns, thus if guest asks device to DMA
+>
+> > > > > > > > > to
+>
+> > > > > > > > > these
+>
+> > > > > > > > > GPA, qemu may DMA to stale MFNs.
+>
+> > > > > > > > >
+>
+> > > > > > > > > So I think in xen_invalidate_map_cache() linked lists should
+>
+> > > > > > > > > also be
+>
+> > > > > > > > > checked and invalidated.
+>
+> > > > > > > > >
+>
+> > > > > > > > > What’s your opinion? Is this a bug? Is my analysis correct?
+>
+> > > > > > Yes, you are right. We need to go through the list for each
+>
+> > > > > > element of
+>
+> > > > > > the array in xen_invalidate_map_cache. Can you come up with a
+>
+> > > > > > patch?
+>
+> > > > > I spoke too soon. In the regular case there should be no locked
+>
+> > > > > mappings
+>
+> > > > > when xen_invalidate_map_cache is called (see the DPRINTF warning at
+>
+> > > > > the
+>
+> > > > > beginning of the functions). Without locked mappings, there should
+>
+> > > > > never
+>
+> > > > > be more than one element in each list (see xen_map_cache_unlocked:
+>
+> > > > > entry->lock == true is a necessary condition to append a new entry
+>
+> > > > > to
+>
+> > > > > the list, otherwise it is just remapped).
+>
+> > > > >
+>
+> > > > > Can you confirm that what you are seeing are locked mappings
+>
+> > > > > when xen_invalidate_map_cache is called? To find out, enable the
+>
+> > > > > DPRINTK
+>
+> > > > > by turning it into a printf or by defining MAPCACHE_DEBUG.
+>
+> > > > In fact, I think the DPRINTF above is incorrect too. In
+>
+> > > > pci_add_option_rom(), rtl8139 rom is locked mapped in
+>
+> > > > pci_add_option_rom->memory_region_get_ram_ptr (after
+>
+> > > > memory_region_init_ram). So actually I think we should remove the
+>
+> > > > DPRINTF warning as it is normal.
+>
+> > > Let me explain why the DPRINTF warning is there: emulated dma operations
+>
+> > > can involve locked mappings. Once a dma operation completes, the related
+>
+> > > mapping is unlocked and can be safely destroyed. But if we destroy a
+>
+> > > locked mapping in xen_invalidate_map_cache, while a dma is still
+>
+> > > ongoing, QEMU will crash. We cannot handle that case.
+>
+> > >
+>
+> > > However, the scenario you described is different. It has nothing to do
+>
+> > > with DMA. It looks like pci_add_option_rom calls
+>
+> > > memory_region_get_ram_ptr to map the rtl8139 rom. The mapping is a
+>
+> > > locked mapping and it is never unlocked or destroyed.
+>
+> > >
+>
+> > > It looks like "ptr" is not used after pci_add_option_rom returns. Does
+>
+> > > the append patch fix the problem you are seeing? For the proper fix, I
+>
+> > > think we probably need some sort of memory_region_unmap wrapper or maybe
+>
+> > > a call to address_space_unmap.
+>
+> >
+>
+> > Yes, I think so, maybe this is the proper way to fix this.
+>
+>
+>
+> Would you be up for sending a proper patch and testing it? We cannot call
+>
+> xen_invalidate_map_cache_entry directly from pci.c though, it would need
+>
+> to be one of the other functions like address_space_unmap for example.
+>
+>
+>
+>
+>
+Yes, I will look into this.
+Any updates?
+
+
+>
+> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
+>
+> > > index e6b08e1..04f98b7 100644
+>
+> > > --- a/hw/pci/pci.c
+>
+> > > +++ b/hw/pci/pci.c
+>
+> > > @@ -2242,6 +2242,7 @@ static void pci_add_option_rom(PCIDevice *pdev,
+>
+> > > bool
+>
+> > > is_default_rom,
+>
+> > >        }
+>
+> > >          pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
+>
+> > > +    xen_invalidate_map_cache_entry(ptr);
+>
+> > >    }
+>
+> > >      static void pci_del_option_rom(PCIDevice *pdev)
+>
+
diff --git a/results/classifier/zero-shot/007/permissions/67821138 b/results/classifier/zero-shot/007/permissions/67821138
new file mode 100644
index 000000000..37755ea48
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/67821138
@@ -0,0 +1,209 @@
+permissions: 0.935
+device: 0.916
+PID: 0.909
+boot: 0.881
+debug: 0.870
+other: 0.853
+performance: 0.845
+semantic: 0.843
+graphic: 0.826
+files: 0.824
+KVM: 0.822
+vnc: 0.734
+network: 0.718
+socket: 0.699
+
+[BUG, RFC] Base node is in RW after making external snapshot
+
+Hi everyone,
+
+When making an external snapshot, we end up in a situation when 2 block
+graph nodes related to the same image file (format and storage nodes)
+have different RO flags set on them.
+
+E.g.
+
+# ls -la /proc/PID/fd
+lrwx------ 1 root qemu 64 Apr 24 20:14 12 -> /path/to/harddisk.hdd
+
+# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
+--pretty | egrep '"node-name"|"ro"'
+      "ro": false,
+      "node-name": "libvirt-1-format",
+      "ro": false,
+      "node-name": "libvirt-1-storage",
+
+# virsh snapshot-create-as VM --name snap --disk-only
+Domain snapshot snap created
+
+# ls -la /proc/PID/fd
+lr-x------ 1 root qemu 64 Apr 24 20:14 134 -> /path/to/harddisk.hdd
+lrwx------ 1 root qemu 64 Apr 24 20:14 135 -> /path/to/harddisk.snap
+
+# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
+--pretty | egrep '"node-name"|"ro"'
+      "ro": false,
+      "node-name": "libvirt-2-format",
+      "ro": false,
+      "node-name": "libvirt-2-storage",
+      "ro": true,
+      "node-name": "libvirt-1-format",
+      "ro": false,                        <--------------
+      "node-name": "libvirt-1-storage",
+
+File descriptor has been reopened in RO, but "libvirt-1-storage" node
+still has RW permissions set.
+
+I'm wondering if this is a bug or if it is intended?  Looks like a bug to
+me, although I see that some iotests (e.g. 273) expect 2 nodes related
+to the same image file to have different RO flags.
+
+bdrv_reopen_set_read_only()
+  bdrv_reopen()
+    bdrv_reopen_queue()
+      bdrv_reopen_queue_child()
+    bdrv_reopen_multiple()
+      bdrv_list_refresh_perms()
+        bdrv_topological_dfs()
+        bdrv_do_refresh_perms()
+      bdrv_reopen_commit()
+
+In the stack above bdrv_reopen_set_read_only() is only being called for
+the parent (libvirt-1-format) node.  There're 2 lists: BDSs from
+refresh_list are used by bdrv_drv_set_perm and this leads to actual
+reopen with RO of the file descriptor.  And then there's reopen queue
+bs_queue -- BDSs from this queue get their parameters updated.  While
+refresh_list ends up having the whole subtree (including children, this
+is done in bdrv_topological_dfs()) bs_queue only has the parent.  And
+that is because the storage (child) node's bs->inherits_from == NULL, so
+bdrv_reopen_queue_child() never adds it to the queue.  Could it be the
+source of this bug?
+
+Anyway, would greatly appreciate a clarification.
+
+Andrey
+
+On 4/24/24 21:00, Andrey Drobyshev wrote:
+>
+Hi everyone,
+>
+>
+When making an external snapshot, we end up in a situation when 2 block
+>
+graph nodes related to the same image file (format and storage nodes)
+>
+have different RO flags set on them.
+>
+>
+E.g.
+>
+>
+# ls -la /proc/PID/fd
+>
+lrwx------ 1 root qemu 64 Apr 24 20:14 12 -> /path/to/harddisk.hdd
+>
+>
+# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
+>
+--pretty | egrep '"node-name"|"ro"'
+>
+"ro": false,
+>
+"node-name": "libvirt-1-format",
+>
+"ro": false,
+>
+"node-name": "libvirt-1-storage",
+>
+>
+# virsh snapshot-create-as VM --name snap --disk-only
+>
+Domain snapshot snap created
+>
+>
+# ls -la /proc/PID/fd
+>
+lr-x------ 1 root qemu 64 Apr 24 20:14 134 -> /path/to/harddisk.hdd
+>
+lrwx------ 1 root qemu 64 Apr 24 20:14 135 -> /path/to/harddisk.snap
+>
+>
+# virsh qemu-monitor-command VM '{"execute": "query-named-block-nodes"}'
+>
+--pretty | egrep '"node-name"|"ro"'
+>
+"ro": false,
+>
+"node-name": "libvirt-2-format",
+>
+"ro": false,
+>
+"node-name": "libvirt-2-storage",
+>
+"ro": true,
+>
+"node-name": "libvirt-1-format",
+>
+"ro": false,                        <--------------
+>
+"node-name": "libvirt-1-storage",
+>
+>
+File descriptor has been reopened in RO, but "libvirt-1-storage" node
+>
+still has RW permissions set.
+>
+>
+I'm wondering if this is a bug or if it is intended?  Looks like a bug to
+>
+me, although I see that some iotests (e.g. 273) expect 2 nodes related
+>
+to the same image file to have different RO flags.
+>
+>
+bdrv_reopen_set_read_only()
+>
+bdrv_reopen()
+>
+bdrv_reopen_queue()
+>
+bdrv_reopen_queue_child()
+>
+bdrv_reopen_multiple()
+>
+bdrv_list_refresh_perms()
+>
+bdrv_topological_dfs()
+>
+bdrv_do_refresh_perms()
+>
+bdrv_reopen_commit()
+>
+>
+In the stack above bdrv_reopen_set_read_only() is only being called for
+>
+the parent (libvirt-1-format) node.  There're 2 lists: BDSs from
+>
+refresh_list are used by bdrv_drv_set_perm and this leads to actual
+>
+reopen with RO of the file descriptor.  And then there's reopen queue
+>
+bs_queue -- BDSs from this queue get their parameters updated.  While
+>
+refresh_list ends up having the whole subtree (including children, this
+>
+is done in bdrv_topological_dfs()) bs_queue only has the parent.  And
+>
+that is because the storage (child) node's bs->inherits_from == NULL, so
+>
+bdrv_reopen_queue_child() never adds it to the queue.  Could it be the
+>
+source of this bug?
+>
+>
+Anyway, would greatly appreciate a clarification.
+>
+>
+Andrey
+Friendly ping.  Could somebody confirm that it is a bug indeed?
+
diff --git a/results/classifier/zero-shot/007/permissions/74715356 b/results/classifier/zero-shot/007/permissions/74715356
new file mode 100644
index 000000000..d579d4ad3
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/74715356
@@ -0,0 +1,136 @@
+permissions: 0.930
+other: 0.927
+semantic: 0.916
+debug: 0.907
+performance: 0.905
+device: 0.900
+PID: 0.897
+graphic: 0.894
+boot: 0.881
+KVM: 0.863
+vnc: 0.850
+files: 0.850
+socket: 0.843
+network: 0.838
+
+[Bug] x86 EFLAGS refresh is not happening correctly
+
+Hello,
+I'm posting this here instead of opening an issue as it is not clear to me if this is a bug or not.
+The issue is located in function "cpu_compute_eflags" in target/i386/cpu.h
+(
+https://gitlab.com/qemu-project/qemu/-/blob/master/target/i386/cpu.h#L2071
+)
+This function is executed in an out-of-CPU-loop context.
+It is used to synchronize TCG internal eflags registers (CC_OP, CC_SRC,  etc...) with the CPU eflags field upon loop exit.
+It does:
+    eflags |= cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+Shouldn't it be:
+    eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+as eflags is entirely reevaluated by "cpu_cc_compute_all" ?
+Thanks,
+Kind regards,
+Stevie
+
+On 05/08/21 11:51, Stevie Lavern wrote:
+Shouldn't it be:
+eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+as eflags is entirely reevaluated by "cpu_cc_compute_all" ?
+No, both are wrong.  env->eflags contains flags other than the
+arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved.
+The right code is in helper_read_eflags.  You can move it into
+cpu_compute_eflags, and make helper_read_eflags use it.
+Paolo
+
+On 05/08/21 13:24, Paolo Bonzini wrote:
+On 05/08/21 11:51, Stevie Lavern wrote:
+Shouldn't it be:
+eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+as eflags is entirely reevaluated by "cpu_cc_compute_all" ?
+No, both are wrong.  env->eflags contains flags other than the
+arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved.
+The right code is in helper_read_eflags.  You can move it into
+cpu_compute_eflags, and make helper_read_eflags use it.
+Ah, actually the two are really the same, the TF/VM bits do not apply to
+cpu_compute_eflags so it's correct.
+What seems wrong is migration of the EFLAGS register.  There should be
+code in cpu_pre_save and cpu_post_load to special-case it and setup
+CC_DST/CC_OP as done in cpu_load_eflags.
+Also, cpu_load_eflags should assert that update_mask does not include
+any of the arithmetic flags.
+Paolo
+
+Thank for your reply!
+It's still a bit cryptic for me.
+I think I need to clarify that I'm using a custom x86_64 user-mode, based on Linux user-mode, that I'm developing (unfortunately I cannot share the code) with modifications in the translation loop (I've added cpu loop exits on specific instructions which are not control flow instructions).
+If my understanding is correct, in the user-mode case 'cpu_compute_eflags' is called directly by 'x86_cpu_exec_exit' with the intention of synchronizing the CPU env->eflags field with its real value (represented by the CC_* fields).
+I'm not sure how 'cpu_pre_save' and 'cpu_post_load' are involved in this case.

+As you said in your first email, 'helper_read_eflags' seems to be the correct way to go.
+Here is some detail about my current experimentation/understanding of this "issue":
+With the current implementation
+        eflags |= cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+if I exit the loop with a CC_OP different from CC_OP_EFLAGS, I found that the resulting env->eflags may be invalid.
+In my test case, the loop was exiting with eflags = 0x44 and CC_OP = CC_OP_SUBL with CC_DST=1, CC_SRC=258, CC_SRC2=0.
+While 'cpu_cc_compute_all' computes the correct flags (ZF:0, PF:0), the result will still be 0x44 (ZF:1, PF:1) due to the 'or' operation, thus leading to an incorrect eflags value loaded into the CPU env.
+In my case, after loop reentry, it led to an invalid branch to be taken.
+Thanks for your time!
+Regards
+Stevie

+On Thu, Aug 5, 2021 at 1:33 PM Paolo Bonzini <
+pbonzini@redhat.com
+> wrote:
+On 05/08/21 13:24, Paolo Bonzini wrote:
+> On 05/08/21 11:51, Stevie Lavern wrote:
+>>
+>> Shouldn't it be:
+>> eflags = cpu_cc_compute_all(env, CC_OP) | (env->df & DF_MASK);
+>> as eflags is entirely reevaluated by "cpu_cc_compute_all" ?
+>
+> No, both are wrong.  env->eflags contains flags other than the
+> arithmetic flags (OF/SF/ZF/AF/PF/CF) and those have to be preserved.
+>
+> The right code is in helper_read_eflags.  You can move it into
+> cpu_compute_eflags, and make helper_read_eflags use it.
+Ah, actually the two are really the same, the TF/VM bits do not apply to
+cpu_compute_eflags so it's correct.
+What seems wrong is migration of the EFLAGS register.  There should be
+code in cpu_pre_save and cpu_post_load to special-case it and setup
+CC_DST/CC_OP as done in cpu_load_eflags.
+Also, cpu_load_eflags should assert that update_mask does not include
+any of the arithmetic flags.
+Paolo
+
diff --git a/results/classifier/zero-shot/007/permissions/85542195 b/results/classifier/zero-shot/007/permissions/85542195
new file mode 100644
index 000000000..328f31033
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/85542195
@@ -0,0 +1,130 @@
+permissions: 0.968
+PID: 0.945
+other: 0.944
+semantic: 0.941
+graphic: 0.938
+device: 0.936
+performance: 0.933
+boot: 0.932
+vnc: 0.923
+files: 0.920
+debug: 0.915
+socket: 0.905
+network: 0.899
+KVM: 0.898
+
+[Qemu-devel] [Bug in qemu-system-ppc running Mac OS 9 on Windows 10]
+
+Hi all,
+
+I've been experiencing issues when installing Mac OS 9.x using
+qemu-system-ppc.exe in Windows 10. After booting from CD image,
+partitioning a fresh disk image often hangs Qemu. When using a
+pre-partitioned disk image, the OS installation process halts
+somewhere during the process. The issues can be resolved by setting
+qemu-system-ppc.exe to run in Windows 7 compatibility mode.
+AFAIK all Qemu builds for Windows since Mac OS 9 became available as
+guest are affected.
+The issue is reproducible by installing Qemu for Windows from Stephan
+Weil on Windows 10 and boot/install Mac OS 9.x
+
+Best regards and thanks for looking into this,
+Howard
+
+On Nov 25, 2016, at 9:26 AM, address@hidden wrote:
+Hi all,
+
+I've been experiencing issues when installing Mac OS 9.x using
+qemu-system-ppc.exe in Windows 10. After booting from CD image,
+partitioning a fresh disk image often hangs Qemu. When using a
+pre-partitioned disk image, the OS installation process halts
+somewhere during the process. The issues can be resolved by setting
+qemu-system-ppc.exe to run in Windows 7 compatibility mode.
+AFAIK all Qemu builds for Windows since Mac OS 9 became available as
+guest are affected.
+The issue is reproducible by installing Qemu for Windows from Stephan
+Weil on Windows 10 and boot/install Mac OS 9.x
+
+Best regards and thanks for looking into this,
+Howard
+I assume there was some kind of behavior change for some of the
+Windows API between Windows 7 and Windows 10, that is my guess as to
+why the compatibility mode works. Could you run 'make check' on your
+system, once in Windows 7 and once in Windows 10. Maybe the tests
+will tell us something. I'm hoping that one of the tests succeeds in
+Windows 7 and fails in Windows 10. That would help us pinpoint what
+the problem is.
+What I mean by run in Windows 7 is set the mingw environment to run
+in Windows 7 compatibility mode (if possible). If you have Windows 7
+on another partition you could boot from, that would be better.
+Good luck.
+p.s. use 'make check -k' to allow all the tests to run (even if one
+or more of the tests fails).
+
+>
+> Hi all,
+>
+>
+>
+> I've been experiencing issues when installing Mac OS 9.x using
+>
+> qemu-system-ppc.exe in Windows 10. After booting from CD image,
+>
+> partitioning a fresh disk image often hangs Qemu. When using a
+>
+> pre-partitioned disk image, the OS installation process halts
+>
+> somewhere during the process. The issues can be resolved by setting
+>
+> qemu-system-ppc.exe to run in Windows 7 compatibility mode.
+>
+> AFAIK all Qemu builds for Windows since Mac OS 9 became available as
+>
+> guest are affected.
+>
+> The issue is reproducible by installing Qemu for Windows from Stephan
+>
+> Weil on Windows 10 and boot/install Mac OS 9.x
+>
+>
+>
+> Best regards and thanks for looking into this,
+>
+> Howard
+>
+>
+>
+I assume there was some kind of behavior change for some of the Windows API
+>
+between Windows 7 and Windows 10, that is my guess as to why the
+>
+compatibility mode works. Could you run 'make check' on your system, once in
+>
+Windows 7 and once in Windows 10. Maybe the tests will tell us something.
+>
+I'm hoping that one of the tests succeeds in Windows 7 and fails in Windows
+>
+10. That would help us pinpoint what the problem is.
+>
+>
+What I mean by run in Windows 7 is set the mingw environment to run in
+>
+Windows 7 compatibility mode (if possible). If you have Windows 7 on another
+>
+partition you could boot from, that would be better.
+>
+>
+Good luck.
+>
+>
+p.s. use 'make check -k' to allow all the tests to run (even if one or more
+>
+of the tests fails).
+Hi,
+
+Thank you for your suggestion, but I have no means to run the check you
+suggest. I cross-compile from Linux.
+
+Best regards,
+Howard
+
diff --git a/results/classifier/zero-shot/007/permissions/88225572 b/results/classifier/zero-shot/007/permissions/88225572
new file mode 100644
index 000000000..292ea66b8
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/88225572
@@ -0,0 +1,2910 @@
+permissions: 0.992
+other: 0.987
+debug: 0.986
+PID: 0.984
+semantic: 0.976
+graphic: 0.974
+device: 0.970
+boot: 0.969
+performance: 0.965
+vnc: 0.958
+files: 0.957
+socket: 0.955
+network: 0.950
+KVM: 0.924
+
+[BUG qemu 4.0] segfault when unplugging virtio-blk-pci device
+
+Hi,
+
+I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
+think it's because io completion hits use-after-free when device is
+already gone. Is this a known bug that has been fixed? (I went through
+the git log but didn't find anything obvious).
+
+gdb backtrace is:
+
+Core was generated by `/usr/local/libexec/qemu-kvm -name 
+sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
+Program terminated with signal 11, Segmentation fault.
+#0 object_get_class (obj=obj@entry=0x0) at 
+/usr/src/debug/qemu-4.0/qom/object.c:903
+903        return obj->class;
+(gdb) bt
+#0  object_get_class (obj=obj@entry=0x0) at 
+/usr/src/debug/qemu-4.0/qom/object.c:903
+#1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0,
+    vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118
+#2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete (
+    opaque=0x558a2f2fd420, ret=0)
+    at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186
+#3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420)
+    at /usr/src/debug/qemu-4.0/block/block-backend.c:1305
+#4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>,
+    i1=<optimized out>) at /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116
+#5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6
+#6  0x00007fff9ed75780 in ?? ()
+#7  0x0000000000000000 in ?? ()
+
+It seems like qemu was completing a discard/write_zero request, but
+parent BusState was already freed & set to NULL.
+
+Do we need to drain all pending request before unrealizing virtio-blk
+device? Like the following patch proposed?
+https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
+If more info is needed, please let me know.
+
+Thanks,
+Eryu
+
+On Tue, 31 Dec 2019 18:34:34 +0800
+Eryu Guan <address@hidden> wrote:
+
+>
+Hi,
+>
+>
+I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
+>
+think it's because io completion hits use-after-free when device is
+>
+already gone. Is this a known bug that has been fixed? (I went through
+>
+the git log but didn't find anything obvious).
+>
+>
+gdb backtrace is:
+>
+>
+Core was generated by `/usr/local/libexec/qemu-kvm -name
+>
+sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
+>
+Program terminated with signal 11, Segmentation fault.
+>
+#0 object_get_class (obj=obj@entry=0x0) at
+>
+/usr/src/debug/qemu-4.0/qom/object.c:903
+>
+903        return obj->class;
+>
+(gdb) bt
+>
+#0  object_get_class (obj=obj@entry=0x0) at
+>
+/usr/src/debug/qemu-4.0/qom/object.c:903
+>
+#1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0,
+>
+    vector=<optimized out>) at /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118
+>
+#2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete (
+>
+    opaque=0x558a2f2fd420, ret=0)
+>
+    at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186
+>
+#3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420)
+>
+    at /usr/src/debug/qemu-4.0/block/block-backend.c:1305
+>
+#4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>,
+>
+    i1=<optimized out>) at
+>
+/usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116
+>
+#5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6
+>
+#6  0x00007fff9ed75780 in ?? ()
+>
+#7  0x0000000000000000 in ?? ()
+>
+>
+It seems like qemu was completing a discard/write_zero request, but
+>
+parent BusState was already freed & set to NULL.
+>
+>
+Do we need to drain all pending request before unrealizing virtio-blk
+>
+device? Like the following patch proposed?
+>
+>
+https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
+>
+>
+If more info is needed, please let me know.
+Maybe this will help:
+https://patchwork.kernel.org/patch/11213047/
+>
+>
+Thanks,
+>
+Eryu
+>
+
+On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote:
+> On Tue, 31 Dec 2019 18:34:34 +0800
+> Eryu Guan <address@hidden> wrote:
+>
+> > Hi,
+> >
+> > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
+> > think it's because io completion hits use-after-free when device is
+> > already gone. Is this a known bug that has been fixed? (I went through
+> > the git log but didn't find anything obvious).
+> >
+> > gdb backtrace is:
+> >
+> > Core was generated by `/usr/local/libexec/qemu-kvm -name
+> > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
+> > Program terminated with signal 11, Segmentation fault.
+> > #0 object_get_class (obj=obj@entry=0x0) at
+> > /usr/src/debug/qemu-4.0/qom/object.c:903
+> > 903        return obj->class;
+> > (gdb) bt
+> > #0  object_get_class (obj=obj@entry=0x0) at
+> > /usr/src/debug/qemu-4.0/qom/object.c:903
+> > #1  0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0,
+> >     vector=<optimized out>) at
+> > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118
+> > #2  0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete (
+> >     opaque=0x558a2f2fd420, ret=0)
+> >     at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186
+> > #3  0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420)
+> >     at /usr/src/debug/qemu-4.0/block/block-backend.c:1305
+> > #4  0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>,
+> >     i1=<optimized out>) at
+> > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116
+> > #5  0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6
+> > #6  0x00007fff9ed75780 in ?? ()
+> > #7  0x0000000000000000 in ?? ()
+> >
+> > It seems like qemu was completing a discard/write_zero request, but
+> > parent BusState was already freed & set to NULL.
+> >
+> > Do we need to drain all pending request before unrealizing virtio-blk
+> > device? Like the following patch proposed?
+> >
+> > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
+> >
+> > If more info is needed, please let me know.
+>
+> may be this will help:
+> https://patchwork.kernel.org/patch/11213047/
+
+Yeah, this looks promising! I'll try it out (though it's a one-time
+crash for me). Thanks!
+
+Eryu
+
+On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote:
+> On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote:
+> > On Tue, 31 Dec 2019 18:34:34 +0800
+> > Eryu Guan <address@hidden> wrote:
+> >
+> > > [...]
+> > > It seems like qemu was completing a discard/write_zero request, but
+> > > parent BusState was already freed & set to NULL.
+> > >
+> > > Do we need to drain all pending request before unrealizing virtio-blk
+> > > device? Like the following patch proposed?
+> > >
+> > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
+> > >
+> > > If more info is needed, please let me know.
+> >
+> > may be this will help:
+> > https://patchwork.kernel.org/patch/11213047/
+>
+> Yeah, this looks promising! I'll try it out (though it's a one-time
+> crash for me). Thanks!
+
+After applying this patch, I don't see the original segfault and
+backtrace, but I see this crash:
+
+[Thread debugging using libthread_db enabled]
+Using host libthread_db library "/lib64/libthread_db.so.1".
+Core was generated by `/usr/local/libexec/qemu-kvm -name 
+sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'.
+Program terminated with signal 11, Segmentation fault.
+#0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, 
+addr=0, val=<optimized out>, size=<optimized out>) at 
+/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+1324        VirtIOPCIProxy *proxy = 
+VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
+Missing separate debuginfos, use: debuginfo-install 
+glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64 
+libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64 
+libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64 
+pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64
+(gdb) bt
+#0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0, 
+addr=0, val=<optimized out>, size=<optimized out>) at 
+/usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+#1  0x0000561216835b22 in memory_region_write_accessor (mr=<optimized out>, 
+addr=<optimized out>, value=<optimized out>, size=<optimized out>, 
+shift=<optimized out>, mask=<optimized out>, attrs=...) at 
+/usr/src/debug/qemu-4.0/memory.c:502
+#2  0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0, 
+value=value@entry=0x7fcdeab1b8a8, size=size@entry=2, access_size_min=<optimized 
+out>, access_size_max=<optimized out>, access_fn=0x561216835ac0 
+<memory_region_write_accessor>, mr=0x56121846d340, attrs=...)
+    at /usr/src/debug/qemu-4.0/memory.c:568
+#3  0x0000561216837c66 in memory_region_dispatch_write 
+(mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2, 
+attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503
+#4  0x00005612167e036f in flatview_write_continue (fv=fv@entry=0x56121852edd0, 
+addr=addr@entry=841813602304, attrs=..., buf=buf@entry=0x7fce7dd97028 <Address 
+0x7fce7dd97028 out of bounds>, len=len@entry=2, addr1=<optimized out>, 
+l=<optimized out>, mr=0x56121846d340)
+    at /usr/src/debug/qemu-4.0/exec.c:3279
+#5  0x00005612167e0506 in flatview_write (fv=0x56121852edd0, addr=841813602304, 
+attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>, len=2) at 
+/usr/src/debug/qemu-4.0/exec.c:3318
+#6  0x00005612167e4a1b in address_space_write (as=<optimized out>, 
+addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at 
+/usr/src/debug/qemu-4.0/exec.c:3408
+#7  0x00005612167e4aa5 in address_space_rw (as=<optimized out>, addr=<optimized 
+out>, attrs=..., attrs@entry=..., buf=buf@entry=0x7fce7dd97028 <Address 
+0x7fce7dd97028 out of bounds>, len=<optimized out>, is_write=<optimized out>) 
+at /usr/src/debug/qemu-4.0/exec.c:3419
+#8  0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at 
+/usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034
+#9  0x000056121682255e in qemu_kvm_cpu_thread_fn (arg=arg@entry=0x56121849aa00) 
+at /usr/src/debug/qemu-4.0/cpus.c:1281
+#10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at 
+/usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502
+#11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0
+#12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6
+
+And I searched and found
+https://bugzilla.redhat.com/show_bug.cgi?id=1706759, which has the same
+backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add
+blk_drain() to virtio_blk_device_unrealize()") is to fix this particular
+bug.
+
+But I can still hit the bug even after applying the commit. Do I miss
+anything?
+
+Thanks,
+Eryu
+
+On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote:
+> > [...]
+> > Yeah, this looks promising! I'll try it out (though it's a one-time
+> > crash for me). Thanks!
+>
+> After applying this patch, I don't see the original segfault and
+> backtrace, but I see this crash:
+>
+> [...]
+> #0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0,
+> addr=0, val=<optimized out>, size=<optimized out>) at
+> /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+> 1324        VirtIOPCIProxy *proxy =
+> VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
+> [...]
+>
+> And I searched and found
+> https://bugzilla.redhat.com/show_bug.cgi?id=1706759, which has the same
+> backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add
+> blk_drain() to virtio_blk_device_unrealize()") is to fix this particular
+> bug.
+>
+> But I can still hit the bug even after applying the commit. Do I miss
+> anything?
+
+Hi Eryu,
+
+This backtrace seems to be caused by this bug (there were two bugs in
+1706759):
+https://bugzilla.redhat.com/show_bug.cgi?id=1708480
+
+Although the solution hasn't been tested on virtio-blk yet, you may
+want to apply this patch:
+https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
+
+Let me know if this works.
+
+Best regards, Julia Suvorova.
+
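+The second crash dereferences DEVICE(vdev)->parent_bus->parent after the
+bus is gone, and the patch referenced above takes the approach of
+resetting cached queue state on deletion. The idea can be sketched with
+a toy model (plain C, no QEMU APIs; all names below are illustrative): a
+fast path keeps a cached pointer into queue memory, and deletion must
+invalidate that cache so a late notify bails out instead of chasing a
+dangling pointer.
+
+```c
+#include <assert.h>
+#include <stdlib.h>
+
+/* Toy model of "reset the cache on queue deletion": names are
+ * illustrative, not QEMU's. */
+
+typedef struct QueueCache {
+    int *ring;                  /* cached pointer into queue memory */
+} QueueCache;
+
+static int *ring_mem;
+static QueueCache cache;
+
+static void queue_create(void)
+{
+    ring_mem = calloc(4, sizeof(int));
+    cache.ring = ring_mem;      /* fast path caches the mapping */
+}
+
+/* Deletion invalidates the cache *before* freeing the backing memory. */
+static void queue_delete(void)
+{
+    cache.ring = NULL;          /* the "reset cache" step */
+    free(ring_mem);
+    ring_mem = NULL;
+}
+
+/* A notify racing with deletion: with the cache reset it is a no-op
+ * instead of a use-after-free. */
+static int notify(void)
+{
+    if (!cache.ring) {
+        return -1;              /* queue gone; nothing to do */
+    }
+    cache.ring[0]++;
+    return 0;
+}
+
+int main(void)
+{
+    queue_create();
+    assert(notify() == 0);      /* normal path: cache is valid */
+    queue_delete();
+    assert(notify() == -1);     /* late notify is safely rejected */
+    return 0;
+}
+```
+
+Without the cache.ring = NULL step, the late notify() would write through
+a freed pointer, which is the shape of the virtio_pci_notify_write crash.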
+On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
+> On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> > [...]
+> > But I can still hit the bug even after applying the commit. Do I miss
+> > anything?
+>
+> Hi Eryu,
+>
+> This backtrace seems to be caused by this bug (there were two bugs in
+> 1706759):
+> https://bugzilla.redhat.com/show_bug.cgi?id=1708480
+>
+> Although the solution hasn't been tested on virtio-blk yet, you may
+> want to apply this patch:
+> https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
+>
+> Let me know if this works.
+
+Will try it out, thanks a lot!
+
+Eryu
+
+On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
+> On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> > [...]
+> > But I can still hit the bug even after applying the commit. Do I miss
+> > anything?
+>
+> Hi Eryu,
+>
+> This backtrace seems to be caused by this bug (there were two bugs in
+> 1706759):
+> https://bugzilla.redhat.com/show_bug.cgi?id=1708480
+>
+> Although the solution hasn't been tested on virtio-blk yet, you may
+> want to apply this patch:
+> https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
+>
+> Let me know if this works.
+
+Unfortunately, I still see the same segfault & backtrace after applying
+commit 421afd2fe8dd ("virtio: reset region cache when on queue
+deletion").
+
+Anything I can help to debug?
+
+Thanks,
+Eryu
+
+On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote:
+> On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
+> > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> > > [...]
+> > > After applying this patch, I don't see the original segfault and
+> > > backtrace, but I see this crash:
+> > >
+> > > [...]
+> > > #0  0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0,
+> > > addr=0, val=<optimized out>, size=<optimized out>) at
+> > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+> > > 1324        VirtIOPCIProxy *proxy =
+> > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
+> > > [...]
+> > > #5  0x00005612167e0506 in flatview_write (fv=0x56121852edd0,
+> > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028
+> > > out of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318
+>
+> > #6  0x00005612167e4a1b in address_space_write (as=<optimized out>,
+>
+> > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized
+>
+> > out>) at /usr/src/debug/qemu-4.0/exec.c:3408
+>
+> > #7  0x00005612167e4aa5 in address_space_rw (as=<optimized out>,
+>
+> > addr=<optimized out>, attrs=..., attrs@entry=...,
+>
+> > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>,
+>
+> > len=<optimized out>, is_write=<optimized out>) at
+>
+> > /usr/src/debug/qemu-4.0/exec.c:3419
+>
+> > #8  0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at
+>
+> > /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034
+>
+> > #9  0x000056121682255e in qemu_kvm_cpu_thread_fn
+>
+> > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281
+>
+> > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at
+>
+> > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502
+>
+> > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0
+>
+> > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6
+>
+> >
+>
+> > And I searched and found
+>
+> >
+https://bugzilla.redhat.com/show_bug.cgi?id=1706759
+, which has the same
+>
+> > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add
+>
+> > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular
+>
+> > bug.
+>
+> >
+>
+> > But I can still hit the bug even after applying the commit. Do I miss
+>
+> > anything?
+>
+>
+>
+> Hi Eryu,
+>
+> This backtrace seems to be caused by this bug (there were two bugs in
+>
+> 1706759):
+https://bugzilla.redhat.com/show_bug.cgi?id=1708480
+>
+> Although the solution hasn't been tested on virtio-blk yet, you may
+>
+> want to apply this patch:
+>
+>
+https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
+>
+> Let me know if this works.
+>
+>
+Unfortunately, I still see the same segfault & backtrace after applying
+>
+commit 421afd2fe8dd ("virtio: reset region cache when on queue
+>
+deletion")
+>
+>
+Anything I can help to debug?
+Please post the QEMU command-line and the QMP commands use to remove the
+device.
+
+The backtrace shows a vcpu thread submitting a request.  The device
+seems to be partially destroyed.  That's surprising because the monitor
+and the vcpu thread should use the QEMU global mutex to avoid race
+conditions.  Maybe seeing the QMP commands will make it clearer...
+
+Stefan
+signature.asc
+Description:
+PGP signature
+
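+The use-after-free pattern being discussed above (a request-completion
+callback firing after the device's parent bus pointer has already been
+cleared by unplug) can be illustrated with a toy model. This is not QEMU
+code and the names are invented for illustration; it only shows why
+draining in-flight requests before tearing the device down matters:
+
+```python
+class ToyDevice:
+    """Illustrative stand-in for a hot-pluggable device with async I/O."""
+
+    def __init__(self):
+        self.parent_bus = object()   # stands in for DEVICE(vdev)->parent_bus
+        self.in_flight = []          # pending completion callbacks
+
+    def submit(self):
+        # Queue a completion that, like a virtio notify, needs parent_bus.
+        self.in_flight.append(lambda: self.parent_bus is not None)
+
+    def drain(self):
+        # Run every pending completion before teardown (blk_drain()'s role).
+        results = [cb() for cb in self.in_flight]
+        self.in_flight.clear()
+        return results
+
+    def unrealize(self):
+        self.parent_bus = None       # bus reference goes away on unplug
+
+# Without draining: a late completion sees a dead parent_bus (the "crash").
+dev = ToyDevice()
+dev.submit()
+dev.unrealize()
+late = [cb() for cb in dev.in_flight]   # completion runs after teardown
+
+# Draining first: every completion still sees a live bus.
+dev2 = ToyDevice()
+dev2.submit()
+ok = dev2.drain()
+dev2.unrealize()
+```
+
+In the toy model `late` contains False (completion ran against a freed
+bus) while `ok` contains True, which is the ordering the proposed
+drain-before-unrealize patches aim to guarantee.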
+On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote:
+> On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote:
+> > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
+> > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> > > > [...]
+> >
+> > [...]
+> >
+> > Anything I can help to debug?
+>
+> Please post the QEMU command-line and the QMP commands used to remove the
+> device.
+
+It's a normal kata instance using virtio-fs as rootfs.
+
+/usr/local/libexec/qemu-kvm -name sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \
+ -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 \
+ -machine q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \
+ -cpu host \
+ -qmp unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait \
+ -qmp unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait \
+ -m 2048M,slots=10,maxmem=773893M \
+ -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \
+ -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= \
+ -device virtconsole,chardev=charconsole0,id=console0 \
+ -chardev socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait \
+ -device virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10 \
+ -chardev socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait \
+ -device nvdimm,id=nv0,memdev=mem0 \
+ -object memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456 \
+ -object rng-random,id=rng0,filename=/dev/urandom \
+ -device virtio-rng,rng=rng0,romfile= \
+ -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \
+ -chardev socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait \
+ -chardev socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock \
+ -device vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M \
+ -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \
+ -device driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile= \
+ -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize \
+ -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on \
+ -numa node,memdev=dimm1 -kernel /usr/local/share/kernel \
+ -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=96 agent.use_vsock=false init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket \
+ -pidfile /run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid \
+ -smp 1,cores=1,threads=1,sockets=96,maxcpus=96
+
+QMP command to delete the device (the device id is just an example, not the
+one that caused the crash):
+
+"{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}"
+
+which had been hot-plugged by:
+
+"{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}"
+"{\"return\": {}}"
+"{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}"
+"{\"return\": {}}"
+
+> The backtrace shows a vcpu thread submitting a request.  The device
+> seems to be partially destroyed.  That's surprising because the monitor
+> and the vcpu thread should use the QEMU global mutex to avoid race
+> conditions.  Maybe seeing the QMP commands will make it clearer...
+
+Thanks!
+
+Eryu
+
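+For readers following along: the hot-plug/unplug sequence above is plain
+JSON sent over the QMP UNIX socket. A minimal sketch of building those
+exact commands (the `qmp_cmd` helper is an illustration, not part of any
+QEMU API; device and node names are copied from the log above):
+
+```python
+import json
+
+def qmp_cmd(execute, **arguments):
+    """Build a QMP command as the JSON string sent over the QMP socket."""
+    cmd = {"execute": execute}
+    if arguments:
+        cmd["arguments"] = arguments
+    return json.dumps(cmd)
+
+# Hot plug: create the block node, then attach a virtio-blk-pci device to it.
+blockdev_add = qmp_cmd(
+    "blockdev-add",
+    driver="raw",
+    file={"driver": "file", "filename": "/dev/dm-18"},
+    **{"node-name": "drive-5967abfb917c8da6"},
+)
+device_add = qmp_cmd(
+    "device_add",
+    driver="virtio-blk-pci",
+    id="virtio-drive-5967abfb917c8da6",
+    drive="drive-5967abfb917c8da6",
+    bus="pci-bridge-0",
+)
+# Hot unplug: device_del triggers the unrealize path discussed in this thread.
+device_del = qmp_cmd("device_del", id="virtio-drive-5967abfb917c8da6")
+```
+
+A real client must first read the QMP greeting and send `qmp_capabilities`
+before issuing these commands.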
+On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote:
+> On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote:
+> > Please post the QEMU command-line and the QMP commands used to remove the
+> > device.
+>
+> It's a normal kata instance using virtio-fs as rootfs.
+>
+> [...]
+>
+> QMP command to delete the device (the device id is just an example, not the
+> one that caused the crash):
+>
+> "{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}"
+>
+> which had been hot-plugged by:
+>
+> "{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}"
+> "{\"return\": {}}"
+> "{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}"
+> "{\"return\": {}}"
+
+Thanks.  I wasn't able to reproduce this crash with qemu.git/master.
+
+One thing that is strange about the latest backtrace you posted: QEMU is
+dispatching the memory access instead of using the ioeventfd code path
+that virtio-blk-pci normally takes when a virtqueue is notified.  I guess
+this means ioeventfd has already been disabled due to the hot unplug.
+
+Could you try with machine type "i440fx" instead of "q35"?  I wonder if
+pci-bridge/shpc is part of the problem.
+
+Stefan
+
+On Tue, Jan 14, 2020 at 04:16:24PM +0000, Stefan Hajnoczi wrote:
+> On Tue, Jan 14, 2020 at 10:50:58AM +0800, Eryu Guan wrote:
+> > On Mon, Jan 13, 2020 at 04:38:55PM +0000, Stefan Hajnoczi wrote:
+> > > On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote:
+> > > > On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
+> > > > > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
+> > > > > > [...]
+>
+> > > > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'.
+>
+> > > > > Program terminated with signal 11, Segmentation fault.
+>
+> > > > > #0  0x0000561216a57609 in virtio_pci_notify_write
+>
+> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>,
+>
+> > > > > size=<optimized out>) at
+>
+> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+>
+> > > > > 1324        VirtIOPCIProxy *proxy =
+>
+> > > > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
+>
+> > > > > Missing separate debuginfos, use: debuginfo-install
+>
+> > > > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64
+>
+> > > > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64
+>
+> > > > > libstdc++-4.8.5-28.alios7.1.x86_64
+>
+> > > > > numactl-libs-2.0.9-5.1.alios7.x86_64
+>
+> > > > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64
+>
+> > > > > (gdb) bt
+>
+> > > > > #0  0x0000561216a57609 in virtio_pci_notify_write
+>
+> > > > > (opaque=0x5612184747e0, addr=0, val=<optimized out>,
+>
+> > > > > size=<optimized out>) at
+>
+> > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
+>
+> > > > > #1  0x0000561216835b22 in memory_region_write_accessor
+>
+> > > > > (mr=<optimized out>, addr=<optimized out>, value=<optimized out>,
+>
+> > > > > size=<optimized out>, shift=<optimized out>, mask=<optimized out>,
+>
+> > > > > attrs=...) at /usr/src/debug/qemu-4.0/memory.c:502
+>
+> > > > > #2  0x0000561216833c5d in access_with_adjusted_size
+>
+> > > > > (addr=addr@entry=0, value=value@entry=0x7fcdeab1b8a8,
+>
+> > > > > size=size@entry=2, access_size_min=<optimized out>,
+>
+> > > > > access_size_max=<optimized out>, access_fn=0x561216835ac0
+>
+> > > > > <memory_region_write_accessor>, mr=0x56121846d340, attrs=...)
+>
+> > > > >     at /usr/src/debug/qemu-4.0/memory.c:568
+>
+> > > > > #3  0x0000561216837c66 in memory_region_dispatch_write
+>
+> > > > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2,
+>
+> > > > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503
+>
+> > > > > #4  0x00005612167e036f in flatview_write_continue
+>
+> > > > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304,
+>
+> > > > > attrs=..., buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out
+>
+> > > > > of bounds>, len=len@entry=2, addr1=<optimized out>, l=<optimized
+>
+> > > > > out>, mr=0x56121846d340)
+>
+> > > > >     at /usr/src/debug/qemu-4.0/exec.c:3279
+>
+> > > > > #5  0x00005612167e0506 in flatview_write (fv=0x56121852edd0,
+>
+> > > > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address
+>
+> > > > > 0x7fce7dd97028 out of bounds>, len=2) at
+>
+> > > > > /usr/src/debug/qemu-4.0/exec.c:3318
+>
+> > > > > #6  0x00005612167e4a1b in address_space_write (as=<optimized out>,
+>
+> > > > > addr=<optimized out>, attrs=..., buf=<optimized out>,
+>
+> > > > > len=<optimized out>) at /usr/src/debug/qemu-4.0/exec.c:3408
+>
+> > > > > #7  0x00005612167e4aa5 in address_space_rw (as=<optimized out>,
+>
+> > > > > addr=<optimized out>, attrs=..., attrs@entry=...,
+>
+> > > > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of
+>
+> > > > > bounds>, len=<optimized out>, is_write=<optimized out>) at
+>
+> > > > > /usr/src/debug/qemu-4.0/exec.c:3419
+>
+> > > > > #8  0x0000561216849da1 in kvm_cpu_exec
+>
+> > > > > (cpu=cpu@entry=0x56121849aa00) at
+>
+> > > > > /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034
+>
+> > > > > #9  0x000056121682255e in qemu_kvm_cpu_thread_fn
+>
+> > > > > (arg=arg@entry=0x56121849aa00) at
+>
+> > > > > /usr/src/debug/qemu-4.0/cpus.c:1281
+>
+> > > > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>)
+>
+> > > > > at /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502
+>
+> > > > > #11 0x00007fce7bef6e25 in start_thread () from
+>
+> > > > > /lib64/libpthread.so.0
+>
+> > > > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6
+>
+> > > > >
+>
+> > > > > And I searched and found
+>
+> > > > >
+https://bugzilla.redhat.com/show_bug.cgi?id=1706759
+, which has the
+>
+> > > > > same
+>
+> > > > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk:
+>
+> > > > > Add
+>
+> > > > > blk_drain() to virtio_blk_device_unrealize()") is to fix this
+>
+> > > > > particular
+>
+> > > > > bug.
+>
+> > > > >
+>
+> > > > > But I can still hit the bug even after applying the commit. Do I
+>
+> > > > > miss
+>
+> > > > > anything?
+>
+> > > >
+>
+> > > > Hi Eryu,
+>
+> > > > This backtrace seems to be caused by this bug (there were two bugs in
+>
+> > > > 1706759):
+https://bugzilla.redhat.com/show_bug.cgi?id=1708480
+>
+> > > > Although the solution hasn't been tested on virtio-blk yet, you may
+>
+> > > > want to apply this patch:
+>
+> > > >
+>
+> > > >
+https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
+>
+> > > > Let me know if this works.
+>
+> > >
+>
+> > > Unfortunately, I still see the same segfault & backtrace after applying
+>
+> > > commit 421afd2fe8dd ("virtio: reset region cache when on queue
+>
+> > > deletion")
+>
+> > >
+>
+> > > Anything I can help to debug?
+>
+> >
+>
+> > Please post the QEMU command-line and the QMP commands use to remove the
+>
+> > device.
+>
+>
+>
+> It's a normal kata instance using virtio-fs as rootfs.
+>
+>
+>
+> /usr/local/libexec/qemu-kvm -name
+>
+> sandbox-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d \
+>
+>  -uuid e03f6b6b-b80b-40c0-8d5b-0cbfed1305d2 -machine
+>
+> q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit \
+>
+>  -cpu host -qmp
+>
+> unix:/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait
+>
+>  \
+>
+>  -qmp
+>
+> unix:/run/vc/vm/debug-a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/qmp.sock,server,nowait
+>
+>  \
+>
+>  -m 2048M,slots=10,maxmem=773893M -device
+>
+> pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= \
+>
+>  -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device
+>
+> virtconsole,chardev=charconsole0,id=console0 \
+>
+>  -chardev
+>
+> socket,id=charconsole0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/console.sock,server,nowait
+>
+>  \
+>
+>  -device
+>
+> virtserialport,chardev=metricagent,id=channel10,name=metric.agent.channel.10
+>
+>  \
+>
+>  -chardev
+>
+> socket,id=metricagent,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/metric.agent.channel.sock,server,nowait
+>
+>  \
+>
+>  -device nvdimm,id=nv0,memdev=mem0 -object
+>
+> memory-backend-file,id=mem0,mem-path=/usr/local/share/containers-image-1.9.0.img,size=268435456
+>
+>  \
+>
+>  -object rng-random,id=rng0,filename=/dev/urandom -device
+>
+> virtio-rng,rng=rng0,romfile= \
+>
+>  -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 \
+>
+>  -chardev
+>
+> socket,id=charch0,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/kata.sock,server,nowait
+>
+>  \
+>
+>  -chardev
+>
+> socket,id=char-6fca044b801a78a1,path=/run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/vhost-fs.sock
+>
+>  \
+>
+>  -device
+>
+> vhost-user-fs-pci,chardev=char-6fca044b801a78a1,tag=kataShared,cache-size=8192M
+>
+>  -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 \
+>
+>  -device
+>
+> driver=virtio-net-pci,netdev=network-0,mac=76:57:f1:ab:51:5c,disable-modern=false,mq=on,vectors=4,romfile=
+>
+>  \
+>
+>  -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config
+>
+> -nodefaults -nographic -daemonize \
+>
+>  -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on
+>
+> -numa node,memdev=dimm1 -kernel /usr/local/share/kernel \
+>
+>  -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1
+>
+> i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp
+>
+> reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests
+>
+> net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1
+>
+> rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 quiet
+>
+> systemd.show_status=false panic=1 nr_cpus=96 agent.use_vsock=false
+>
+> init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target
+>
+> systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket \
+>
+>  -pidfile
+>
+> /run/vc/vm/a670786fcb1758d2348eb120939d90ffacf9f049f10b337284ad49bbcd60936d/pid
+>
+>  \
+>
+>  -smp 1,cores=1,threads=1,sockets=96,maxcpus=96
+>
+>
+>
+> QMP command to delete device (the device id is just an example, not the
+>
+> one caused the crash):
+>
+>
+>
+> "{\"arguments\":{\"id\":\"virtio-drive-5967abfb917c8da6\"},\"execute\":\"device_del\"}"
+>
+>
+>
+> which has been hot plugged by:
+>
+> "{\"arguments\":{\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":{\"driver\":\"file\",\"filename\":\"/dev/dm-18\"},\"node-name\":\"drive-5967abfb917c8da6\"},\"execute\":\"blockdev-add\"}"
+>
+> "{\"return\": {}}"
+>
+> "{\"arguments\":{\"addr\":\"01\",\"bus\":\"pci-bridge-0\",\"drive\":\"drive-5967abfb917c8da6\",\"driver\":\"virtio-blk-pci\",\"id\":\"virtio-drive-5967abfb917c8da6\",\"romfile\":\"\",\"share-rw\":\"on\"},\"execute\":\"device_add\"}"
+>
+> "{\"return\": {}}"
+>
+>
+Thanks.  I wasn't able to reproduce this crash with qemu.git/master.
+>
+>
+One thing that is strange about the latest backtrace you posted: QEMU is
+>
+dispatching the memory access instead of using the ioeventfd code that
+>
+that virtio-blk-pci normally takes when a virtqueue is notified.  I
+>
+guess this means ioeventfd has already been disabled due to the hot
+>
+unplug.
+>
+>
+Could you try with machine type "i440fx" instead of "q35"?  I wonder if
+>
+pci-bridge/shpc is part of the problem.
+Sure, will try it. But it may take some time, as the test bed is busy
+with other testing tasks. I'll report back once I got the results.
+
+Thanks,
+Eryu
+
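The QMP exchange quoted in the thread above follows the usual blockdev-add / device_add / device_del hot-plug pattern. A minimal Python sketch that builds the same three commands (the node name, device id, and bus below are placeholders, not the ones from the report); it only constructs the JSON messages and does not talk to a live QMP socket:

```python
import json

def qmp_blockdev_add(node_name, filename):
    # blockdev-add with the raw driver over a host file, as in the report
    return {"execute": "blockdev-add",
            "arguments": {"driver": "raw",
                          "cache": {"direct": True, "no-flush": False},
                          "file": {"driver": "file", "filename": filename},
                          "node-name": node_name}}

def qmp_device_add(drive, dev_id, bus, addr):
    # attach a virtio-blk-pci device on a pci-bridge slot
    return {"execute": "device_add",
            "arguments": {"driver": "virtio-blk-pci", "drive": drive,
                          "id": dev_id, "bus": bus, "addr": addr,
                          "share-rw": "on", "romfile": ""}}

def qmp_device_del(dev_id):
    # hot unplug; the crash window discussed above is between this
    # command and completion of in-flight requests
    return {"execute": "device_del", "arguments": {"id": dev_id}}

cmds = [qmp_blockdev_add("drive-demo", "/dev/dm-18"),
        qmp_device_add("drive-demo", "virtio-drive-demo", "pci-bridge-0", "01"),
        qmp_device_del("virtio-drive-demo")]
wire = [json.dumps(c) for c in cmds]  # what would be written to the QMP socket
```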
diff --git a/results/classifier/zero-shot/007/permissions/88281850 b/results/classifier/zero-shot/007/permissions/88281850
new file mode 100644
index 000000000..02513facf
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/88281850
@@ -0,0 +1,291 @@
+permissions: 0.985
+other: 0.983
+debug: 0.979
+graphic: 0.974
+network: 0.973
+device: 0.970
+performance: 0.969
+semantic: 0.968
+boot: 0.967
+socket: 0.966
+files: 0.962
+PID: 0.959
+vnc: 0.945
+KVM: 0.881
+
+[Bug] Take more 150s to boot qemu on ARM64
+
+Hi all,
+I encountered an issue with kernel 5.19-rc1 on an ARM64 board: it takes
+about 150s between beginning to run the qemu command and the Linux kernel
+beginning to boot ("EFI stub: Booting Linux Kernel...").
+But in kernel 5.18-rc4, it only takes about 5s. I git bisected the kernel
+code and it found c2445d387850 ("srcu: Add contention check to
+call_srcu() srcu_data ->lock acquisition").
+The qemu (version 6.2.92) command I run is:
+
+./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \
+--trace "kvm*" \
+-cpu host \
+-machine virt,accel=kvm,gic-version=3  \
+-machine smp.cpus=2,smp.sockets=2 \
+-no-reboot \
+-nographic \
+-monitor unix:/home/cx/qmp-test,server,nowait \
+-bios /home/cx/boot/QEMU_EFI.fd \
+-kernel /home/cx/boot/Image  \
+-device
+pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1
+\
+-device vfio-pci,host=7d:01.3,id=net0 \
+-device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4  \
+-drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \
+-append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \
+-net none \
+-D /home/cx/qemu_log.txt
+I am not familiar with rcu code, and don't know how it causes the issue.
+Do you have any idea about this issue?
+Best Regards,
+
+Xiang Chen
+
+On Mon, Jun 13, 2022 at 08:26:34PM +0800, chenxiang (M) wrote:
+> Hi all,
+>
+> I encountered an issue with kernel 5.19-rc1 on an ARM64 board: it takes
+> about 150s between beginning to run the qemu command and the Linux kernel
+> beginning to boot ("EFI stub: Booting Linux Kernel...").
+>
+> But in kernel 5.18-rc4, it only takes about 5s. I git bisected the kernel
+> code and it found c2445d387850 ("srcu: Add contention check to call_srcu()
+> srcu_data ->lock acquisition").
+>
+> The qemu (version 6.2.92) command I run is:
+>
+> ./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \
+> --trace "kvm*" \
+> -cpu host \
+> -machine virt,accel=kvm,gic-version=3 \
+> -machine smp.cpus=2,smp.sockets=2 \
+> -no-reboot \
+> -nographic \
+> -monitor unix:/home/cx/qmp-test,server,nowait \
+> -bios /home/cx/boot/QEMU_EFI.fd \
+> -kernel /home/cx/boot/Image \
+> -device pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1 \
+> -device vfio-pci,host=7d:01.3,id=net0 \
+> -device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 \
+> -drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \
+> -append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \
+> -net none \
+> -D /home/cx/qemu_log.txt
+>
+> I am not familiar with rcu code, and don't know how it causes the issue.
+> Do you have any idea about this issue?
+
+Please see the discussion here:
+
+https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
+
+Though that report requires ACPI to be forced on to get the
+delay, which results in more than 9,000 back-to-back calls to
+synchronize_srcu_expedited().  I cannot reproduce this on my setup, even
+with an artificial tight loop invoking synchronize_srcu_expedited(),
+but then again I don't have ARM hardware.
+
+My current guess is that the fix is the patch below, but with larger values
+for SRCU_MAX_NODELAY_PHASE.  Here "larger" might well be up in the hundreds,
+or perhaps even larger.
+
+If you get a chance to experiment with this, could you please reply
+to the discussion at the above URL?  (Or let me know, and I can CC
+you on the next message in that thread.)
+
+                                                Thanx, Paul
+
+------------------------------------------------------------------------
+
+diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
+index 50ba70f019dea..0db7873f4e95b 100644
+--- a/kernel/rcu/srcutree.c
++++ b/kernel/rcu/srcutree.c
+@@ -513,7 +513,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
+ 
+ #define SRCU_INTERVAL          1       // Base delay if no expedited GPs pending.
+ #define SRCU_MAX_INTERVAL      10      // Maximum incremental delay from slow readers.
+-#define SRCU_MAX_NODELAY_PHASE 1       // Maximum per-GP-phase consecutive no-delay instances.
++#define SRCU_MAX_NODELAY_PHASE 3       // Maximum per-GP-phase consecutive no-delay instances.
+ #define SRCU_MAX_NODELAY       100     // Maximum consecutive no-delay instances.
+ 
+ /*
+@@ -522,16 +522,22 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
+  */
+ static unsigned long srcu_get_delay(struct srcu_struct *ssp)
+ {
++       unsigned long gpstart;
++       unsigned long j;
+        unsigned long jbase = SRCU_INTERVAL;
+ 
+        if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+                jbase = 0;
+-       if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)))
+-               jbase += jiffies - READ_ONCE(ssp->srcu_gp_start);
+-       if (!jbase) {
+-               WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
+-               if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
+-                       jbase = 1;
++       if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
++               j = jiffies - 1;
++               gpstart = READ_ONCE(ssp->srcu_gp_start);
++               if (time_after(j, gpstart))
++                       jbase += j - gpstart;
++               if (!jbase) {
++                       WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
++                       if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
++                               jbase = 1;
++               }
+        }
+        return jbase > SRCU_MAX_INTERVAL ? SRCU_MAX_INTERVAL : jbase;
+ }
+
+On 2022/6/13 21:22, Paul E. McKenney wrote:
+> On Mon, Jun 13, 2022 at 08:26:34PM +0800, chenxiang (M) wrote:
+> > Hi all,
+> >
+> > I encountered an issue with kernel 5.19-rc1 on an ARM64 board: it takes
+> > about 150s between beginning to run the qemu command and the Linux
+> > kernel beginning to boot ("EFI stub: Booting Linux Kernel...").
+> >
+> > But in kernel 5.18-rc4, it only takes about 5s. I git bisected the
+> > kernel code and it found c2445d387850 ("srcu: Add contention check to
+> > call_srcu() srcu_data ->lock acquisition").
+> >
+> > The qemu (version 6.2.92) command I run is:
+> >
+> > ./qemu-system-aarch64 -m 4G,slots=4,maxmem=8g \
+> > --trace "kvm*" \
+> > -cpu host \
+> > -machine virt,accel=kvm,gic-version=3 \
+> > -machine smp.cpus=2,smp.sockets=2 \
+> > -no-reboot \
+> > -nographic \
+> > -monitor unix:/home/cx/qmp-test,server,nowait \
+> > -bios /home/cx/boot/QEMU_EFI.fd \
+> > -kernel /home/cx/boot/Image \
+> > -device pcie-root-port,port=0x8,chassis=1,id=net1,bus=pcie.0,multifunction=on,addr=0x1 \
+> > -device vfio-pci,host=7d:01.3,id=net0 \
+> > -device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 \
+> > -drive file=/home/cx/boot/boot_ubuntu.img,if=none,id=drive0 \
+> > -append "rdinit=init console=ttyAMA0 root=/dev/vda rootfstype=ext4 rw " \
+> > -net none \
+> > -D /home/cx/qemu_log.txt
+> >
+> > I am not familiar with rcu code, and don't know how it causes the issue.
+> > Do you have any idea about this issue?
+>
+> Please see the discussion here:
+>
+> https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
+>
+> Though that report requires ACPI to be forced on to get the
+> delay, which results in more than 9,000 back-to-back calls to
+> synchronize_srcu_expedited().  I cannot reproduce this on my setup, even
+> with an artificial tight loop invoking synchronize_srcu_expedited(),
+> but then again I don't have ARM hardware.
+>
+> My current guess is that the fix is the patch below, but with larger
+> values for SRCU_MAX_NODELAY_PHASE.  Here "larger" might well be up in the
+> hundreds, or perhaps even larger.
+>
+> If you get a chance to experiment with this, could you please reply
+> to the discussion at the above URL?  (Or let me know, and I can CC
+> you on the next message in that thread.)
+
+Ok, thanks, I will reply to it at the above URL.
+
+>                                                 Thanx, Paul
+>
+> ------------------------------------------------------------------------
+>
+> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
+> index 50ba70f019dea..0db7873f4e95b 100644
+> --- a/kernel/rcu/srcutree.c
+> +++ b/kernel/rcu/srcutree.c
+> @@ -513,7 +513,7 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
+>  
+>  #define SRCU_INTERVAL          1       // Base delay if no expedited GPs pending.
+>  #define SRCU_MAX_INTERVAL      10      // Maximum incremental delay from slow readers.
+> -#define SRCU_MAX_NODELAY_PHASE 1       // Maximum per-GP-phase consecutive no-delay instances.
+> +#define SRCU_MAX_NODELAY_PHASE 3       // Maximum per-GP-phase consecutive no-delay instances.
+>  #define SRCU_MAX_NODELAY       100     // Maximum consecutive no-delay instances.
+>  
+>  /*
+> @@ -522,16 +522,22 @@ static bool srcu_readers_active(struct srcu_struct *ssp)
+>   */
+>  static unsigned long srcu_get_delay(struct srcu_struct *ssp)
+>  {
+> +       unsigned long gpstart;
+> +       unsigned long j;
+>         unsigned long jbase = SRCU_INTERVAL;
+>  
+>         if (ULONG_CMP_LT(READ_ONCE(ssp->srcu_gp_seq), READ_ONCE(ssp->srcu_gp_seq_needed_exp)))
+>                 jbase = 0;
+> -       if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)))
+> -               jbase += jiffies - READ_ONCE(ssp->srcu_gp_start);
+> -       if (!jbase) {
+> -               WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
+> -               if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
+> -                       jbase = 1;
+> +       if (rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq))) {
+> +               j = jiffies - 1;
+> +               gpstart = READ_ONCE(ssp->srcu_gp_start);
+> +               if (time_after(j, gpstart))
+> +                       jbase += j - gpstart;
+> +               if (!jbase) {
+> +                       WRITE_ONCE(ssp->srcu_n_exp_nodelay, READ_ONCE(ssp->srcu_n_exp_nodelay) + 1);
+> +                       if (READ_ONCE(ssp->srcu_n_exp_nodelay) > SRCU_MAX_NODELAY_PHASE)
+> +                               jbase = 1;
+> +               }
+>         }
+>         return jbase > SRCU_MAX_INTERVAL ? SRCU_MAX_INTERVAL : jbase;
+>  }
+
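The behaviour the patch changes in srcu_get_delay() can be modelled outside the kernel. A small Python paraphrase (all names are stand-ins; jiffies, grace-period state, and the no-delay counter are passed in explicitly rather than read from a struct srcu_struct) showing how raising SRCU_MAX_NODELAY_PHASE lets a few more expedited grace-period phases run without sleeping before a 1-jiffy delay is re-imposed:

```python
SRCU_INTERVAL = 1
SRCU_MAX_INTERVAL = 10
SRCU_MAX_NODELAY_PHASE = 3  # patched value; was 1

def srcu_get_delay(exp_pending, gp_in_progress, jiffies, gp_start, state):
    """Paraphrase of the patched srcu_get_delay(); `state` carries the
    consecutive no-delay counter across calls, as ->srcu_n_exp_nodelay does."""
    jbase = 0 if exp_pending else SRCU_INTERVAL
    if gp_in_progress:
        # credit only time that has actually elapsed since the GP started
        if jiffies - 1 > gp_start:
            jbase += (jiffies - 1) - gp_start
        if jbase == 0:
            state["n_exp_nodelay"] += 1
            if state["n_exp_nodelay"] > SRCU_MAX_NODELAY_PHASE:
                jbase = 1  # stop hogging the CPU: sleep one jiffy
    return min(jbase, SRCU_MAX_INTERVAL)

# expedited GP in progress, no time elapsed yet: the first
# SRCU_MAX_NODELAY_PHASE calls return 0, later calls return 1
state = {"n_exp_nodelay": 0}
delays = [srcu_get_delay(True, True, 100, 100, state) for _ in range(5)]
```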
diff --git a/results/classifier/zero-shot/007/permissions/95154278 b/results/classifier/zero-shot/007/permissions/95154278
new file mode 100644
index 000000000..2dc0c2ffc
--- /dev/null
+++ b/results/classifier/zero-shot/007/permissions/95154278
@@ -0,0 +1,165 @@
+permissions: 0.989
+other: 0.953
+debug: 0.951
+device: 0.951
+graphic: 0.950
+PID: 0.949
+vnc: 0.948
+semantic: 0.937
+performance: 0.936
+files: 0.918
+KVM: 0.916
+socket: 0.913
+network: 0.913
+boot: 0.902
+
+[Qemu-devel] [BUG] checkpatch.pl hangs on target/mips/msa_helper.c
+
+If checkpatch.pl is applied (using switch "-f") to the file
+target/mips/msa_helper.c, it will hang.
+
+There is a workaround for this particular file:
+
+These lines in msa_helper.c:
+
+        uint## BITS ##_t S = _S, T = _T;                            \
+        uint## BITS ##_t as, at, xs, xt, xd;                        \
+
+should be replaced with:
+
+        uint## BITS ## _t S = _S, T = _T;                           \
+        uint## BITS ## _t as, at, xs, xt, xd;                       \
+
+(a space is added after the second "##" in each line)
+
+The workaround was found by partially deleting and restoring the code in
+msa_helper.c in binary-search fashion.
+
+This workaround will soon be submitted by me as a patch within a series on
+misc MIPS issues.
+
+I took a look at the checkpatch.pl code, and it looks like it would be fairly
+complicated to fix the issue, since it happens in a code segment involving
+intricate logic conditions.
+
+Regards,
+Aleksandar
+
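The workaround is a purely mechanical edit. A hypothetical Python helper (illustrative only, not part of checkpatch.pl or the submitted patch) that applies the described change, inserting a space after the second token-paste operator:

```python
import re

def add_space_after_paste(line):
    # Rewrite "##_t" as "## _t" so the "uint## BITS ##_t" token-paste
    # pattern described above no longer trips checkpatch.pl; the first
    # "##" already has a trailing space and is left untouched.
    return re.sub(r'##_t', '## _t', line)

before = "        uint## BITS ##_t S = _S, T = _T;"
after = add_space_after_paste(before)
```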
+On Wed, Jul 04, 2018 at 03:35:18PM +0000, Aleksandar Markovic wrote:
+> If checkpatch.pl is applied (using switch "-f") to the file
+> target/mips/msa_helper.c, it will hang.
+>
+> There is a workaround for this particular file:
+>
+> These lines in msa_helper.c:
+>
+>         uint## BITS ##_t S = _S, T = _T;                            \
+>         uint## BITS ##_t as, at, xs, xt, xd;                        \
+>
+> should be replaced with:
+>
+>         uint## BITS ## _t S = _S, T = _T;                           \
+>         uint## BITS ## _t as, at, xs, xt, xd;                       \
+>
+> (a space is added after the second "##" in each line)
+>
+> The workaround was found by partially deleting and restoring the code in
+> msa_helper.c in binary-search fashion.
+>
+> This workaround will soon be submitted by me as a patch within a series on
+> misc MIPS issues.
+>
+> I took a look at the checkpatch.pl code, and it looks like it would be
+> fairly complicated to fix the issue, since it happens in a code segment
+> involving intricate logic conditions.
+
+Thanks for figuring this out, Aleksandar.  Not sure if anyone else has
+the appetite to fix checkpatch.pl.
+
+Stefan
+
+On 07/11/2018 09:36 AM, Stefan Hajnoczi wrote:
+> On Wed, Jul 04, 2018 at 03:35:18PM +0000, Aleksandar Markovic wrote:
+> > If checkpatch.pl is applied (using switch "-f") to the file
+> > target/mips/msa_helper.c, it will hang.
+> >
+> > There is a workaround for this particular file:
+> >
+> > These lines in msa_helper.c:
+> >
+> >         uint## BITS ##_t S = _S, T = _T;                            \
+> >         uint## BITS ##_t as, at, xs, xt, xd;                        \
+> >
+> > should be replaced with:
+> >
+> >         uint## BITS ## _t S = _S, T = _T;                           \
+> >         uint## BITS ## _t as, at, xs, xt, xd;                       \
+> >
+> > (a space is added after the second "##" in each line)
+> >
+> > The workaround was found by partially deleting and restoring the code in
+> > msa_helper.c in binary-search fashion.
+> >
+> > This workaround will soon be submitted by me as a patch within a series
+> > on misc MIPS issues.
+> >
+> > I took a look at the checkpatch.pl code, and it looks like it would be
+> > fairly complicated to fix the issue, since it happens in a code segment
+> > involving intricate logic conditions.
+>
+> Thanks for figuring this out, Aleksandar.  Not sure if anyone else has
+> the appetite to fix checkpatch.pl.
+
+Anyone else but Paolo ;P
+
+http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg01250.html