Diffstat (limited to 'results/classifier/accel-gemma3:12b/vmm/721825')
-rw-r--r--  results/classifier/accel-gemma3:12b/vmm/721825  77
1 files changed, 77 insertions, 0 deletions
diff --git a/results/classifier/accel-gemma3:12b/vmm/721825 b/results/classifier/accel-gemma3:12b/vmm/721825
new file mode 100644
index 000000000..d30b66c3e
--- /dev/null
+++ b/results/classifier/accel-gemma3:12b/vmm/721825
@@ -0,0 +1,77 @@
+
+VDI block driver bugs
+
+Chunqiang Tang reports the following issues with the VDI block driver, these are present in QEMU 0.14:
+
+"Bug 1. The most serious bug is caused by race condition in updating a new
+bmap entry in memory and on disk. Considering the following operation
+sequence.
+    O1: VM issues a write to sector X
+    O2: VDI allocates a new bmap entry and updates in-memory s->bmap
+    O3: VDI writes data to disk
+    O4: The disk I/O for writing sector X fails
+    O5: VDI reports error to VM and returns.
+
+Note that the bmap entry is updated in memory, but not persisted on disk.
+Now consider another write that immediately follows:
+    P1: VM issues a write to sector X+1, which locates in the same block as
+the previously used sector X.
+    P2: s->bmap already has one entry for the block, and hence VDI writes
+data directly without persisting the new s->bmap entry on disk.
+    P3: The write disk I/O succeeds
+    P4: VDI report success to VM, but the bitmap entry is still not
+persisted on disk.
+
+Now suppose the VM powers off gracefully (i.e., the QEMU process quits)
+and reboots. The second write to sector X+1, which is reported as finished
+successfully, is simply lost, because the corresponding in-memory s->bmap
+entry is never persisted on disk. This is exactly what FVD's testing tool
+discovers. After the block device is closed and then re-opened, disk
+content verification fails.
+
+This is just one example of the problem. Race condition plus host crash
+also causes problems. Consider another example below.
+    Q1: VM issues a write to sector X
+    Q2: VDI allocates a new bmap entry and updates in-memory s->bmap
+    Q3: VDI writes sector X to disk and waits for the callback
+    Q4: VM issues a write to another sector X+1, which is in the same block
+as sector X.
+    Q5: VDI sees the bitmap entry in s->bmap is already allocated, and
+writes sector X+1 to disk.
+    Q6: Write to sector X+1 finishes, and VDI's callback is invoked.
+    Q7: VDI acknowledges to the VM the completion of writing sector X+1
+    Q8: After observing the completion of writing sector X+1, VM issues a
+flush to ensure that sector X+1 is persisted on disk.
+    Q9: VDI finishes the flush and acknowledge the completion of the
+operation.
+    Q10: ... (some other arbitrary operations, but the disk I/O for writing
+sector X is still not finished....)
+    Q11: The host crashes
+
+Now the new bitmap entry is not persisted on disk, while both writing to
+sector X+1 and the flush has been acknowledged as finished. Sector X+1 is
+lost, which is a corruption. This problem exists even if it uses O_DSYNC.
+The root cause of the problem is that, if a request updates in-memory
+s->bmap, another request that sees this update assumes that the update is
+already persisted on disk, which is not.
+
+Bug 2: Similar to the bugs the FVD testing tool found for QCOW2, there are
+several cases of the code below on failure handling path without setting
+error return code, which mistakenly reports failure as success. This
+mistake is caught by FVD when doing image content validation.
+    if (acb->hd_aiocb == NULL) {
+        /* missing ret = -EIO; */
+        goto done;
+    }
+
+Bug 3: Similar to the bugs the FVD testing tool found for QCOW2,
+vdi_aio_cancel does not perform a complete clean up and there are several
+related bugs. First, memory buffer is not freed, acb->orig_buf and
+acb->block_buffer. Second, acb->bh is not cancelled. Third,
+vdi_aio_setup() does not initialize acb->bh to NULL so that when a request
+acb is cancelled and then later reused for another request, its acb->bh !=
+NULL and the new request fails in vdi_schedule_bh(). This is caught by
+FVD's testing tool, when it observes that no I/O failure is injected but
+VDI reports a failed I/O request, which indicates a bug in the driver."
+
+http://permalink.gmane.org/gmane.comp.emulators.qemu/94340
\ No newline at end of file
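
The O1-O5 / P1-P4 sequence from Bug 1 in the report can be reproduced with a small toy model. The sketch below is not QEMU code: `ToyImage`, `toy_open`, `toy_write_buggy`, `toy_write_fixed` and `toy_survives_reopen` are hypothetical names invented for illustration, the I/O is synchronous, and the "fixed" ordering (persist the map entry, then publish it in memory) is only one possible approach, not necessarily the fix applied upstream. It does not model the concurrent Q1-Q11 case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BLOCKS 4
#define TOY_UNALLOCATED UINT32_MAX

/* Toy model only; every name here is hypothetical. */
typedef struct {
    uint32_t mem_bmap[TOY_BLOCKS];  /* in-memory block map, like s->bmap */
    uint32_t disk_bmap[TOY_BLOCKS]; /* what a close and re-open would read */
    uint32_t next_free;             /* next unused block index in the image */
} ToyImage;

void toy_open(ToyImage *img)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        img->mem_bmap[i] = TOY_UNALLOCATED;
        img->disk_bmap[i] = TOY_UNALLOCATED;
    }
    img->next_free = 0;
}

/* Buggy ordering: the new entry is visible in memory before anything
 * is persisted, and it stays there when the data write fails. */
int toy_write_buggy(ToyImage *img, uint32_t block, bool data_io_ok)
{
    if (img->mem_bmap[block] == TOY_UNALLOCATED) {
        img->mem_bmap[block] = img->next_free++;      /* O2 */
        if (!data_io_ok) {
            return -EIO;                              /* O4, O5 */
        }
        img->disk_bmap[block] = img->mem_bmap[block];
        return 0;
    }
    /* P2: entry exists in memory, so the on-disk map is never updated. */
    return data_io_ok ? 0 : -EIO;
}

/* One possible ordering: publish the entry in memory only after it
 * has been persisted, so a failed write leaves no trace in memory. */
int toy_write_fixed(ToyImage *img, uint32_t block, bool data_io_ok)
{
    if (img->mem_bmap[block] == TOY_UNALLOCATED) {
        if (!data_io_ok) {
            return -EIO;                  /* nothing was published */
        }
        uint32_t entry = img->next_free++;
        img->disk_bmap[block] = entry;    /* persist first */
        img->mem_bmap[block] = entry;     /* then publish */
        return 0;
    }
    /* Invariant of this ordering: memory never runs ahead of disk. */
    assert(img->disk_bmap[block] == img->mem_bmap[block]);
    return data_io_ok ? 0 : -EIO;
}

/* An acknowledged write survives a re-open only if its block is
 * mapped in the on-disk map. */
bool toy_survives_reopen(const ToyImage *img, uint32_t block)
{
    return img->disk_bmap[block] != TOY_UNALLOCATED;
}
```

Calling `toy_write_buggy` once with a failing data write and once with a succeeding one on the same block returns success for the second call while `toy_survives_reopen` stays false, which matches the lost write described in the report. The same two calls through `toy_write_fixed` leave the block mapped on disk.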