summary refs log tree commit diff stats
path: root/gitlab/issues/target_missing/host_missing/accel_missing/1378.toml
diff options
context:
space:
mode:
Diffstat (limited to 'gitlab/issues/target_missing/host_missing/accel_missing/1378.toml')
-rw-r--r--gitlab/issues/target_missing/host_missing/accel_missing/1378.toml28
1 files changed, 28 insertions, 0 deletions
diff --git a/gitlab/issues/target_missing/host_missing/accel_missing/1378.toml b/gitlab/issues/target_missing/host_missing/accel_missing/1378.toml
new file mode 100644
index 00000000..2a40a9ba
--- /dev/null
+++ b/gitlab/issues/target_missing/host_missing/accel_missing/1378.toml
@@ -0,0 +1,28 @@
+id = 1378
+title = "iSCSI causes memory corruption"
+state = "closed"
+created_at = "2022-12-16T10:44:32.351Z"
+closed_at = "2023-02-16T13:09:22.627Z"
+labels = ["Storage", "kind::Bug", "workflow::Patch available"]
+url = "https://gitlab.com/qemu-project/qemu/-/issues/1378"
+host-os = "Proxmox v7.3-3"
+host-arch = "x86_64 on AMD"
+qemu-version = "kvm --version` => `QEMU emulator version 7.1.0 (pve-qemu-kvm_7.1.0-4)"
+guest-os = "Linux, multiple flavors (debian, home assistant, pure debian 11)"
+guest-arch = "x86_64"
+description = """This is a compound problem, which most likely involves a combination of how TrueNAS SCALE handles iSCSI triggering a problem **and** some memory-handling issue in QEMU leading to a crash. In short any Linux machine started with iSCSI handled by QEMU directly leads to a hard crash within 30s-1h. I was able to find a pattern in logs:
+
+1. First, a message like `QEMU[53139]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 0 ms): TASK_SET_FULL` is logged
+  - it is always `TASK_SET_FULL`
+  - it is always `retry #1 in ... ms`, where only number of miliseconds varies
+  - the line is repeated multiple times, sometimes 5x and sometimes >200x
+2. It is followed by a single line with one of the following:
+  - `double free or corruption (out)`
+  - `double free or corruption (!prev)`
+  - `kvm: ../block/block-backend.c:1567: blk_aio_write_entry: Assertion `!qiov || qiov->size == acb->bytes' failed.`
+  - `kvm: malloc.c:2379: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.`
+  - `kvm: iSCSI CheckCondition: SENSE KEY:UNIT_ATTENTION(6) ASCQ:BUS_RESET(0x2900)`
+  - `malloc(): invalid size (unsorted)`
+3. The virtual machine crashes"""
+reproduce = """I don't have a specific concrete steps, only clues really. This problem started happening after TrueNAS SCALE updated their iSCSI code in Bluefin release to a new upstream version. That iSCSI server still works when iSCSI is mounted by the kernel and QEMU uses a normal `/dev` entry. While there's probably some problem with it, QEMU shouldn't probably crash with memory errors."""
+additional = """While I'm a software developer, I don't code in C on a daily basis. However, looking at the errors, I have a suspicion the problem may be somewhere in the `iscsi_co_generic_cb()`, as it seems the struct is getting damaged (out of bound write?) and causes explosion somewhere down the line."""