summary refs log tree commit diff stats
path: root/llm/deepseek
diff options
context:
space:
mode:
Diffstat (limited to 'llm/deepseek')
-rw-r--r--llm/deepseek656
1 files changed, 0 insertions, 656 deletions
diff --git a/llm/deepseek b/llm/deepseek
deleted file mode 100644
index 674b9d607..000000000
--- a/llm/deepseek
+++ /dev/null
@@ -1,656 +0,0 @@
-If the tiny code generator of qemu makes an error in the binary translation, when emulating an instruction set architecture, we call it a semantic mistranslation bug. Here are three gitlab issues in the toml format. Tell me, if they describe a semantic mistranslation bug. Answer only with yes and no.
-
-1. issue:
-id = 1371
-title = "x86 BLSMSK semantic bug"
-state = "closed"
-created_at = "2022-12-16T06:43:29.794Z"
-closed_at = "2023-03-01T01:08:38.844Z"
-labels = ["Closed::Fixed", "accel: TCG", "target: i386"]
-url = "https://gitlab.com/qemu-project/qemu/-/issues/1371"
-host-os = "Windows 10 20H2"
-host-arch = "x86"
-qemu-version = "7.1.90 (v7.2.0-rc0)"
-guest-os = "None"
-guest-arch = "x86"
-description = """The result of instruction BLSMSK is different with from the CPU. The value of CF is different."""
-reproduce = """1. Compile this code
-void main() {
-    asm("mov rax, 0x65b2e276ad27c67");
-    asm("mov rbx, 0x62f34955226b2b5d");
-    asm("blsmsk eax, ebx");
-}
-
-2. Execute and compare the result with the CPU.
-    - CPU
-        - CF = 0
-    - QEMU
-        - CF = 1"""
-additional = """This bug is discovered by research conducted by KAIST SoftSec."""
-
-2. issue:
-id = 1057
-title = "AArch64: ISV is set to 1 in ESR_EL2 when taking a data abort with post-indexed instructions"
-state = "closed"
-created_at = "2022-06-02T20:50:04.695Z"
-closed_at = "2022-07-18T16:47:38.098Z"
-labels = ["accel: TCG", "target: arm", "workflow::In Progress"]
-url = "https://gitlab.com/qemu-project/qemu/-/issues/1057"
-host-os = "MacOS 12.4"
-host-arch = "ARM64"
-qemu-version = "QEMU emulator version 7.0.0"
-guest-os = "Custom - BedRock Hypervisor running with the NOVA microkernel"
-guest-arch = "ARMv8"
-description = """I think that I have a Qemu bug in my hands, but, I could still be missing something. Consider the following instruction:
-0x0000000000000000:  C3 44 00 B8    str   w3, [x6], #4
-
-notice the last #4, I think this is what we would call a post-indexed instruction (falls into the category of instructions with writeback). As I understand it, those instructions should not have ISV=1 in ESR_EL2 when faulting.
-
-Here is the relevant part of the manual:
-
-For other faults reported in ESR_EL2, ISV is 0 except for the following stage 2 aborts:
-• AArch64 loads and stores of a single general-purpose register (including the register specified with 0b11111, including those with Acquire/Release semantics, but excluding Load Exclusive or Store Exclusive and excluding those with writeback).
-
-
-However, I can see that Qemu sets ISV to 1 here. The ARM hardware that I tested gave me a value of ISV=0 for similar instructions.
-
-Another example of instruction: 0x00000000000002f8:  01 1C 40 38    ldrb  w1, [x0, #1]!"""
-reproduce = """1. Run some hypervisor in EL2
-2. Create a guest running at EL1 that executes one of the mentioned instructions (and make the instruction fault by writing to some unmapped page in SLP)
-3. Observe the value of ESR_EL2 on data abort
-
-Unfortunately, I cannot provide an image to reproduce this (the software is not open-source). But, I would be happy to help test a patch."""
-additional = "n/a"
-
------------------------------------------------------------------------------------------
-
-yes
-
-no
-
------------------------------------------------------------------------------------------
-
-Now I will give you Mailing Threads. Transform them into the toml format i provided you, which should be the only output. If a variable is not given in the mails, write "n/a". The id's should be 10000 and upwards.  the state is closed, when the bug is fixed, and open, when the bug is not fixed. Additionally, add a 'mistranslation' variable, which is set to 'yes', if the bug is a mistranslation bug, and 'no', if not. First Thread:
-
-On Tue, Apr 29, 2025 at 01:55:59PM +0800, Xiaoyao Li wrote:
-> Date: Tue, 29 Apr 2025 13:55:59 +0800
-> From: Xiaoyao Li <xiaoyao.li@intel.com>
-> Subject: Re: [Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT /
->  CMP_LEG bits
-> 
-> On 4/29/2025 11:02 AM, Ewan Hai wrote:
-> > Hi Community,
-> > 
-> > This email contains 3 bugs appear to share the same root cause.
-> > 
-> > [1] We ran into the following warnings when running QEMU v10.0.0 in TCG
-> > mode:
-> > 
-> > qemu-system-x86_64 \
-> >    -machine q35 \
-> >    -m 4G -smp 4 \
-> >    -kernel ./arch/x86/boot/bzImage \
-> >    -bios /usr/share/ovmf/OVMF.fd \
-> >    -drive file=~/kernel/rootfs.ext4,index=0,format=raw,media=disk \
-> >    -drive file=~/kernel/swap.img,index=1,format=raw,media=disk \
-> >    -nographic \
-> >    -append 'root=/dev/sda rw resume=/dev/sdb console=ttyS0 nokaslr'
-> > 
-> > qemu-system-x86_64: warning: TCG doesn't support requested feature:
-> > CPUID.01H:EDX.ht [bit 28]
-> > qemu-system-x86_64: warning: TCG doesn't support requested feature:
-> > CPUID.80000001H:ECX.cmp-legacy [bit 1]
-> > (repeats 4 times, once per vCPU)
-> > 
-> > Tracing the history shows that commit c6bd2dd63420 "i386/cpu: Set up
-> > CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()" is
-> > what introduced the warnings.
-> > 
-> > Since that commit, TCG unconditionally advertises HTT (CPUID 1 EDX[28])
-> > and CMP_LEG (CPUID 8000_0001 ECX[1]). Because TCG itself has no SMT
-> > support, these bits trigger the warnings above.
-> > 
-> > [2] Also, Zhao pointed me to a similar report on GitLab:
-> > https://gitlab.com/qemu-project/qemu/-/issues/2894
-> > The symptoms there look identical to what we're seeing.
-> > 
-> > By convention we file one issue per email, but these two appear to share
-> > the same root cause, so I'm describing them together here.
-> 
-> It was caused by my two patches. I think the fix can be as follow.
-> If no objection from the community, I can submit the formal patch.
-> 
-> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
-> index 1f970aa4daa6..fb95aadd6161 100644
-> --- a/target/i386/cpu.c
-> +++ b/target/i386/cpu.c
-> @@ -776,11 +776,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t
-> vendor1,
->            CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC | CPUID_SEP | \
->            CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV | CPUID_PAT | \
->            CPUID_PSE36 | CPUID_CLFLUSH | CPUID_ACPI | CPUID_MMX | \
-> -          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE)
-> +          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE | \
-> +          CPUID_HT)
->            /* partly implemented:
->            CPUID_MTRR, CPUID_MCA, CPUID_CLFLUSH (needed for Win64) */
->            /* missing:
-> -          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_HT, CPUID_TM, CPUID_PBE */
-> +          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_TM, CPUID_PBE */
-> 
->  /*
->   * Kernel-only features that can be shown to usermode programs even if
-> @@ -848,7 +849,8 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t
-> vendor1,
-> 
->  #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
->            CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A | \
-> -          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES)
-> +          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES | \
-> +          CPUID_EXT3_CMP_LEG)
-> 
->  #define TCG_EXT4_FEATURES 0
-
-This fix is fine for me...at least from SDM, HTT depends on topology and
-it should exist when user sets "-smp 4".
-
-> > [3] My colleague Alan noticed what appears to be a related problem: if
-> > we launch a guest with '-cpu <model>,-ht --enable-kvm', which means
-> > explicitly removing the ht flag, but the guest still reports HT(cat
-> > /proc/cpuinfo in linux guest) enabled. In other words, under KVM the ht
-> > bit seems to be forced on even when the user tries to disable it.
-> 
-> XiaoYao reminded me that issue [3] stems from a different patch. Please
-> ignore it for now—I'll start a separate thread to discuss that one
-> independently.
-
-I haven't found any other thread :-).
-
-By the way, just curious, in what cases do you need to disbale the HT
-flag? "-smp 4" means 4 cores with 1 thread per core, and is it not
-enough?
-
-As for the “-ht” behavior, I'm also unsure whether this should be fixed
-or not - one possible consideration is whether “-ht” would be useful.
-
-On Tue, Apr 29, 2025 at 01:55:59PM +0800, Xiaoyao Li wrote:
-> Date: Tue, 29 Apr 2025 13:55:59 +0800
-> From: Xiaoyao Li <xiaoyao.li@intel.com>
-> Subject: Re: [Bug] QEMU TCG warnings after commit c6bd2dd63420 - HTT /
->  CMP_LEG bits
-> 
-> On 4/29/2025 11:02 AM, Ewan Hai wrote:
-> > Hi Community,
-> > 
-> > This email contains 3 bugs appear to share the same root cause.
-> > 
-> > [1] We ran into the following warnings when running QEMU v10.0.0 in TCG
-> > mode:
-> > 
-> > qemu-system-x86_64 \
-> >    -machine q35 \
-> >    -m 4G -smp 4 \
-> >    -kernel ./arch/x86/boot/bzImage \
-> >    -bios /usr/share/ovmf/OVMF.fd \
-> >    -drive file=~/kernel/rootfs.ext4,index=0,format=raw,media=disk \
-> >    -drive file=~/kernel/swap.img,index=1,format=raw,media=disk \
-> >    -nographic \
-> >    -append 'root=/dev/sda rw resume=/dev/sdb console=ttyS0 nokaslr'
-> > 
-> > qemu-system-x86_64: warning: TCG doesn't support requested feature:
-> > CPUID.01H:EDX.ht [bit 28]
-> > qemu-system-x86_64: warning: TCG doesn't support requested feature:
-> > CPUID.80000001H:ECX.cmp-legacy [bit 1]
-> > (repeats 4 times, once per vCPU)
-> > 
-> > Tracing the history shows that commit c6bd2dd63420 "i386/cpu: Set up
-> > CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()" is
-> > what introduced the warnings.
-> > 
-> > Since that commit, TCG unconditionally advertises HTT (CPUID 1 EDX[28])
-> > and CMP_LEG (CPUID 8000_0001 ECX[1]). Because TCG itself has no SMT
-> > support, these bits trigger the warnings above.
-> > 
-> > [2] Also, Zhao pointed me to a similar report on GitLab:
-> > https://gitlab.com/qemu-project/qemu/-/issues/2894
-> > The symptoms there look identical to what we're seeing.
-> > 
-> > By convention we file one issue per email, but these two appear to share
-> > the same root cause, so I'm describing them together here.
-> 
-> It was caused by my two patches. I think the fix can be as follow.
-> If no objection from the community, I can submit the formal patch.
-> 
-> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
-> index 1f970aa4daa6..fb95aadd6161 100644
-> --- a/target/i386/cpu.c
-> +++ b/target/i386/cpu.c
-> @@ -776,11 +776,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t
-> vendor1,
->            CPUID_PAE | CPUID_MCE | CPUID_CX8 | CPUID_APIC | CPUID_SEP | \
->            CPUID_MTRR | CPUID_PGE | CPUID_MCA | CPUID_CMOV | CPUID_PAT | \
->            CPUID_PSE36 | CPUID_CLFLUSH | CPUID_ACPI | CPUID_MMX | \
-> -          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE)
-> +          CPUID_FXSR | CPUID_SSE | CPUID_SSE2 | CPUID_SS | CPUID_DE | \
-> +          CPUID_HT)
->            /* partly implemented:
->            CPUID_MTRR, CPUID_MCA, CPUID_CLFLUSH (needed for Win64) */
->            /* missing:
-> -          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_HT, CPUID_TM, CPUID_PBE */
-> +          CPUID_VME, CPUID_DTS, CPUID_SS, CPUID_TM, CPUID_PBE */
-> 
->  /*
->   * Kernel-only features that can be shown to usermode programs even if
-> @@ -848,7 +849,8 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t
-> vendor1,
-> 
->  #define TCG_EXT3_FEATURES (CPUID_EXT3_LAHF_LM | CPUID_EXT3_SVM | \
->            CPUID_EXT3_CR8LEG | CPUID_EXT3_ABM | CPUID_EXT3_SSE4A | \
-> -          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES)
-> +          CPUID_EXT3_3DNOWPREFETCH | CPUID_EXT3_KERNEL_FEATURES | \
-> +          CPUID_EXT3_CMP_LEG)
-> 
->  #define TCG_EXT4_FEATURES 0
-
-This fix is fine for me...at least from SDM, HTT depends on topology and
-it should exist when user sets "-smp 4".
-
-> > [3] My colleague Alan noticed what appears to be a related problem: if
-> > we launch a guest with '-cpu <model>,-ht --enable-kvm', which means
-> > explicitly removing the ht flag, but the guest still reports HT(cat
-> > /proc/cpuinfo in linux guest) enabled. In other words, under KVM the ht
-> > bit seems to be forced on even when the user tries to disable it.
-> 
-> XiaoYao reminded me that issue [3] stems from a different patch. Please
-> ignore it for now—I'll start a separate thread to discuss that one
-> independently.
-
-I haven't found any other thread :-).
-
-By the way, just curious, in what cases do you need to disbale the HT
-flag? "-smp 4" means 4 cores with 1 thread per core, and is it not
-enough?
-
-As for the “-ht” behavior, I'm also unsure whether this should be fixed
-or not - one possible consideration is whether “-ht” would be useful.
-
------------------------------------------------------------------------------------------
-
-toml
-[[issue]]
-id = 10000
-title = "QEMU TCG warnings after commit c6bd2dd63420 - HTT / CMP_LEG bits"
-state = "closed"
-created_at = "2025-04-29T11:02:00.000Z"
-closed_at = "n/a"
-labels = ["accel: TCG", "target: i386"]
-url = "n/a"
-host-os = "n/a"
-host-arch = "x86"
-qemu-version = "10.0.0"
-guest-os = "Linux"
-guest-arch = "x86"
-description = """TCG unconditionally advertises HTT (CPUID 1 EDX[28]) and CMP_LEG (CPUID 8000_0001 ECX[1]) despite lacking SMT support, causing warnings."""
-reproduce = """Run QEMU in TCG mode with a Linux guest and observe CPUID-related warnings."""
-additional = """Related to GitLab issue #2894. Fix involves adjusting CPU feature flags in target/i386/cpu.c."""
-mistranslation = "no"
-
-[[issue]]
-id = 10001
-title = "KVM forces HT flag even when explicitly disabled"
-state = "open"
-created_at = "2025-04-29T11:02:00.000Z"
-closed_at = "n/a"
-labels = ["accel: KVM", "target: i386"]
-url = "n/a"
-host-os = "n/a"
-host-arch = "x86"
-qemu-version = "10.0.0"
-guest-os = "Linux"
-guest-arch = "x86"
-description = """When launching a guest with '-cpu <model>,-ht --enable-kvm', the HT flag remains enabled in /proc/cpuinfo."""
-reproduce = """Launch a KVM guest with HT explicitly disabled and check CPU flags."""
-additional = """Separate issue from the TCG warnings. Discussion pending in another thread."""
-mistranslation = "no"
-Second Thread:
-This bug was fixed in the package qemu - 1:9.0.2+ds-4ubuntu5.3
-
------------------------------------------------------------------------------------------
-
-qemu (1:9.0.2+ds-4ubuntu5.3) oracular; urgency=medium
-
-  * d/p/u/lp2049698/*: Add full boot order support on s390x (LP: #2049698)
-  * Cherry-pick prerequisite for above backport (to avoid FTBFS):
-    - d/p/u/lp2049698/0-hw-s390x-sclp.c-include-s390-virtio-ccw.h-to-make.patch
-  * d/qemu-system-data.links: symlink s390-netboot.img -> s390-ccw.img for
-    backwards compatibility, as the code is now combined.
-  * Fix qemu-aarch64-static segfaults running ldconfig.real (LP: #2072564)
-    - lp-2072564-01-linux-user-Honor-elf-alignment-when-placing-images.patch
-    - lp-2072564-02-elfload-Fix-alignment-when-unmapping-excess-reservat.patch
-    Thanks to Dimitry Andric for identifying the fix.
-
- -- Lukas Märdian <slyon@ubuntu.com>  Thu, 13 Mar 2025 17:18:50 +0100
-
-** Changed in: qemu (Ubuntu Oracular)
-       Status: Fix Committed => Fix Released
-
--- 
-You received this bug notification because you are a member of qemu-
-devel-ml, which is subscribed to QEMU.
-https://bugs.launchpad.net/bugs/2072564
-
-Title:
-  qemu-aarch64-static segfaults running ldconfig.real (amd64 host)
-
-Status in QEMU:
-  Fix Released
-Status in qemu package in Ubuntu:
-  Fix Released
-Status in qemu source package in Noble:
-  Fix Committed
-Status in qemu source package in Oracular:
-  Fix Released
-
-Bug description:
-  [ Impact ]
-
-   * QEMU crashes when running (emulating) ldconfig in a Ubuntu 22.04
-  arm64 guest
-
-   * This affects the qemu-user-static 1:8.2.2+ds-0ubuntu1 package on
-  Ubuntu 24.04+, running on a amd64 host.
-
-   * When running docker containers with Ubuntu 22.04 in them, emulating
-  arm64 with qemu-aarch64-static, invocations of ldconfig (actually
-  ldconfig.real) segfault, leading to problems when loading shared
-  libraries.
-
-  [ Test Plan ]
-
-   * Reproducer is very easy:
-
-  $ sudo snap install docker
-  docker 27.5.1 from Canonical** installed
-  $ docker run -ti --platform linux/arm64/v8 ubuntu:22.04
-  Unable to find image 'ubuntu:22.04' locally
-  22.04: Pulling from library/ubuntu
-  0d1c17d4e593: Pull complete 
-  Digest: 
-sha256:ed1544e454989078f5dec1bfdabd8c5cc9c48e0705d07b678ab6ae3fb61952d2
-  Status: Downloaded newer image for ubuntu:22.04
-
-  # Execute ldconfig.real inside the arm64 guest.
-  # This should not crash after the fix!
-  root@ad80af5378dc:/# /sbin/ldconfig.real
-  qemu: uncaught target signal 11 (Segmentation fault) - core dumped
-  Segmentation fault (core dumped)
-
-  [ Where problems could occur ]
-
-   * This changes the alignment of sections in the ELF binary via QEMUs
-  elfloader, if something goes wrong with this change, it could lead to
-  all kind of crashes (segfault) of any emulated binaries.
-
-  [ Other Info ]
-
-   * Upstream bug: https://gitlab.com/qemu-project/qemu/-/issues/1913
-   * Upstream fix: https://gitlab.com/qemu-project/qemu/-/commit/4b7b20a3
-     - Fix dependency (needed for QEMU < 9.20): 
-https://gitlab.com/qemu-project/qemu/-/commit/c81d1faf
-
-  --- original bug report ---
-
-  
-  This affects the qemu-user-static 1:8.2.2+ds-0ubuntu1 package on Ubuntu 
-24.04, running on a amd64 host.
-
-  When running docker containers with Ubuntu 22.04 in them, emulating
-  arm64 with qemu-aarch64-static, invocations of ldconfig (actually
-  ldconfig.real) segfault. For example:
-
-  $ docker run -ti --platform linux/arm64/v8 ubuntu:22.04
-  root@8861ff640a1c:/# /sbin/ldconfig.real
-  Segmentation fault
-
-  If you copy the ldconfig.real binary to the host, and run it directly
-  via qemu-aarch64-static:
-
-  $ gdb --args qemu-aarch64-static ./ldconfig.real
-  GNU gdb (Ubuntu 15.0.50.20240403-0ubuntu1) 15.0.50.20240403-git
-  Copyright (C) 2024 Free Software Foundation, Inc.
-  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
-  This is free software: you are free to change and redistribute it.
-  There is NO WARRANTY, to the extent permitted by law.
-  Type "show copying" and "show warranty" for details.
-  This GDB was configured as "x86_64-linux-gnu".
-  Type "show configuration" for configuration details.
-  For bug reporting instructions, please see:
-  <https://www.gnu.org/software/gdb/bugs/>.
-  Find the GDB manual and other documentation resources online at:
-      <http://www.gnu.org/software/gdb/documentation/>.
-
-  For help, type "help".
-  Type "apropos word" to search for commands related to "word"...
-  Reading symbols from qemu-aarch64-static...
-  Reading symbols from 
-/home/dim/.cache/debuginfod_client/86579812b213be0964189499f62f176bea817bf2/debuginfo...
-  (gdb) r
-  Starting program: /usr/bin/qemu-aarch64-static ./ldconfig.real
-  [Thread debugging using libthread_db enabled]
-  Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
-  [New Thread 0x7ffff76006c0 (LWP 28378)]
-
-  Thread 1 "qemu-aarch64-st" received signal SIGSEGV, Segmentation fault.
-  0x00007fffe801645b in ?? ()
-  (gdb) disassemble
-  No function contains program counter for selected frame.
-
-  It looks like this is a known qemu regression after v8.1.1:
-  https://gitlab.com/qemu-project/qemu/-/issues/1913
-
-  Downgrading the package to qemu-user-
-  static_8.0.4+dfsg-1ubuntu3_amd64.deb fixes the segfault.
-
-To manage notifications about this bug go to:
-https://bugs.launchpad.net/qemu/+bug/2072564/+subscriptions
-
------------------------------------------------------------------------------------------
-
-toml
-[[issue]]
-id = 10002
-title = "qemu-aarch64-static segfaults running ldconfig.real (amd64 host)"
-state = "closed"
-created_at = "n/a"
-closed_at = "2025-03-13T17:18:50+01:00"
-labels = ["accel: TCG", "target: arm", "ubuntu"]
-url = "https://bugs.launchpad.net/bugs/2072564"
-host-os = "Ubuntu 24.04+"
-host-arch = "amd64"
-qemu-version = "1:9.0.2+ds-4ubuntu5.3"
-guest-os = "Ubuntu 22.04 (arm64)"
-guest-arch = "arm64"
-description = """QEMU crashes when running ldconfig in an emulated arm64 guest, affecting qemu-user-static on amd64 hosts."""
-reproduce = """Run 'ldconfig.real' in an emulated arm64 container or via qemu-aarch64-static."""
-additional = """Upstream bug: https://gitlab.com/qemu-project/qemu/-/issues/1913. Fixed by ELF alignment patches."""
-mistranslation = "no"
-Third Thread:
-Hi, all
-
-When I did the cxl memory hot-plug test on QEMU, I accidentally connected 
-two memdev to the same downstream port, the command like below:
-
-> -object memory-backend-ram,size=262144k,share=on,id=vmem0 \
-> -object memory-backend-ram,size=262144k,share=on,id=vmem1 \
-> -device pxb-cxl,bus_nr=12,bus=pcie.0,id=cxl.1 \
-> -device cxl-rp,port=0,bus=cxl.1,id=root_port0,chassis=0,slot=0 \
-> -device cxl-upstream,bus=root_port0,id=us0 \
-> -device cxl-downstream,port=0,bus=us0,id=swport00,chassis=0,slot=5 \
-> -device cxl-downstream,port=0,bus=us0,id=swport01,chassis=0,slot=7 \
-same downstream port but has different slot!
-
-> -device cxl-type3,bus=swport00,volatile-memdev=vmem0,id=cxl-vmem0 \
-> -device cxl-type3,bus=swport01,volatile-memdev=vmem1,id=cxl-vmem1 \
-> -M 
-> cxl-fmw.0.targets.0=cxl.1,cxl-fmw.0.size=64G,cxl-fmw.0.interleave-granularity=4k
->  \
-
-There is no error occurred when vm start, but when I executed the “cxl list” 
-command to view
-the CXL objects info, the process can not end properly.
-
-Then I used strace to trace the process, I found that the process is in 
-infinity loop:
-# strace cxl list
-......
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=1000000}, NULL) = 0
-openat(AT_FDCWD, "/sys/bus/cxl/flush", O_WRONLY|O_CLOEXEC) = 3
-write(3, "1\n\0", 3)                    = 3
-close(3)                                = 0
-access("/run/udev/queue", F_OK)         = 0
-
-[Environment]:
-linux: V6.10-rc3
-QEMU: V9.0.0
-ndctl: v79
-
-I know this is because of the wrong use of the QEMU command, but I think we 
-should 
-be aware of this error in one of the QEMU, OS or ndctl side at least.
-
-Thanks
-Xingtao
-
------------------------------------------------------------------------------------------
-
-toml
-[[issue]]
-id = 10003
-title = "CXL memory hot-plug with incorrect configuration causes infinite loop in 'cxl list'"
-state = "open"
-created_at = "n/a"
-closed_at = "n/a"
-labels = ["device: cxl", "bug"]
-url = "n/a"
-host-os = "n/a"
-host-arch = "n/a"
-qemu-version = "9.0.0"
-guest-os = "Linux (v6.10-rc3)"
-guest-arch = "n/a"
-description = """When incorrectly connecting two memdevs to the same CXL downstream port, 'cxl list' command enters infinite loop."""
-reproduce = """1. Configure CXL with two memdevs on same downstream port (different slots)
-2. Start VM
-3. Run 'cxl list' command"""
-additional = """Environment: QEMU v9.0.0, Linux v6.10-rc3, ndctl v79. Issue occurs despite incorrect configuration - should be caught by either QEMU, OS or ndctl."""
-mistranslation = "no"
-Fourth Thread:
-[SRU] migration was active, but no RAM info was set
-[Impact]
-
- * While live-migrating many instances concurrently, libvirt sometimes return `internal error: migration was active, but no RAM info was set:`
-
- * Effects of this bug are mostly observed in large scale clusters with a lot of live migration activity.
-
- * Has second order effects for consumers of migration monitor such as libvirt and openstack.
-
-[Test Case]
-
-Synthetic reproducer with GDB in comment #21.
-
-Steps to Reproduce:
-1. live evacuate a compute
-2. live migration of one or more instances fails with the above error
-
-N.B Due to the nature of this bug it is difficult consistently reproduce.
-In an environment where it has been observed it is estimated to occur approximately 1/1000 migrations.
-
-[Where problems could occur]
- * In the event of a regression the migration monitor may report an inconsistent state.
-
-[Original Bug Description]
-
-While live-migrating many instances concurrently, libvirt sometimes return internal error: migration was active, but no RAM info was set:
-~~~
-2022-03-30 06:08:37.197 7 WARNING nova.virt.libvirt.driver [req-5c3296cf-88ee-4af6-ae6a-ddba99935e23 - - - - -] [instance: af339c99-1182-4489-b15c-21e52f50f724] Error monitoring migration: internal error: migration was active, but no RAM info was set: libvirt.libvirtError: internal error: migration was active, but no RAM info was set
-[Impact]
-
- * While live-migrating many instances concurrently, libvirt sometimes return `internal error: migration was active, but no RAM info was set:`
-
- * Effects of this bug are mostly observed in large scale clusters with a lot of live migration activity.
-
- * Has second order effects for consumers of migration monitor such as libvirt and openstack.
-
-[Test Case]
-
-Synthetic reproducer with GDB in comment #21.
-
-Steps to Reproduce:
-1. live evacuate a compute
-2. live migration of one or more instances fails with the above error
-
-N.B Due to the nature of this bug it is difficult consistently reproduce.
-In an environment where it has been observed it is estimated to occur approximately 1/1000 migrations.
-
-[Where problems could occur]
- * In the event of a regression the migration monitor may report an inconsistent state.
-
-[Original Bug Description]
-
-While live-migrating many instances concurrently, libvirt sometimes return internal error: migration was active, but no RAM info was set:
-~~~
-2022-03-30 06:08:37.197 7 WARNING nova.virt.libvirt.driver [req-5c3296cf-88ee-4af6-ae6a-ddba99935e23 - - - - -] [instance: af339c99-1182-4489-b15c-21e52f50f724] Error monitoring migration: internal error: migration was active, but no RAM info was set: libvirt.libvirtError: internal error: migration was active, but no RAM info was set
-toml
-[[issue]]
-id = 10004
-title = "Migration active but no RAM info set during concurrent live migrations"
-state = "open"
-created_at = "n/a"
-closed_at = "n/a"
-labels = ["migration", "libvirt", "openstack"]
-url = "n/a"
-host-os = "n/a"
-host-arch = "n/a"
-qemu-version = "n/a"
-guest-os = "n/a"
-guest-arch = "n/a"
-description = """During concurrent live migrations, libvirt reports 'migration was active, but no RAM info was set' error intermittently"""
-reproduce = """1. Perform live evacuation of compute node
-2. Observe failure in approximately 1/1000 migrations with error message
-3. Synthetic reproducer available via GDB in comments"""
-additional = """Affects large scale clusters with heavy migration activity. Impacts libvirt and OpenStack migration monitoring. Difficult to reproduce consistently."""
-mistranslation = "no"