qemu-nbd fails to discard bigger chunks

This report is moved from systemd to here:
https://github.com/systemd/systemd/issues/16242

A qemu-nbd device reports that it can discard a lot of bytes:

$ cat /sys/block/nbd0/queue/discard_max_bytes
2199023255040

And indeed, discard works with small images:

$ qemu-img create -f qcow2 /tmp/image.img 2M
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0

but not for bigger ones (still smaller than discard_max_bytes):

$ qemu-img create -f qcow2 /tmp/image.img 5G
$ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
$ sudo blkdiscard /dev/nbd0

On 6/23/20 3:19 PM, TobiasHunger wrote:
> Public bug reported:
>
> This report is moved from systemd to here:
> https://github.com/systemd/systemd/issues/16242
>
> A qemu-nbd device reports that it can discard a lot of bytes:
>
> cat /sys/block/nbd0/queue/discard_max_bytes
> 2199023255040

That smells fishy. It is 0xffffffff * 512. But in reality, the NBD protocol is (currently) capped at 32 bits, so it cannot handle any single request of 4G or larger. It is not qemu-nbd that populates /sys/block/nbd0/queue/discard_max_bytes, but the kernel. Are you sure this is not a bug in the kernel's nbd.ko module, where it may be reporting -1 as a 32-bit value that then gets mistakenly turned into a faulty advertisement? Can you tweak your software to behave as if /dev/nbd0 had a discard_max_bytes of 0xfffff000 instead?

In fact, to prove the bug is in the kernel's nbd.ko and not in qemu-nbd, I created an NBD server using nbdkit:

# modprobe nbd
# nbdkit memory 5G
# nbd-client -b 512 localhost /dev/nbd0
# cat /sys/block/nbd0/queue/discard_max_bytes
2199023255040

Same answer, different NBD server. So it's not qemu's fault.

> And indeed, discard works with small images:
>
> $ qemu-img create -f qcow2 /tmp/image.img 2M
> $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
> $ sudo blkdiscard /dev/nbd0
>
> but not for bigger ones (still smaller than discard_max_bytes):
>
> $ qemu-img create -f qcow2 /tmp/image.img 5G
> $ sudo qemu-nbd --connect=/dev/nbd0 /tmp/image.img
> $ sudo blkdiscard /dev/nbd0
>
> ** Affects: qemu
>      Importance: Undecided
>          Status: New

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Hmm, carrying on further with the nbd-client connection, I'm seeing that the kernel DID break the single BLKDISCARD into two separate NBD_CMD_TRIM requests, as seen from the nbdkit side of things:

# from the blkdiscard strace:
ioctl(3, BLKGETSIZE64, [5368709120]) = 0
ioctl(3, BLKSSZGET, [512]) = 0
ioctl(3, BLKDISCARD, [0, 5368709120]) = 0

# from the nbdkit debug log:
nbdkit: memory.0: debug: memory: trim count=4294966784 offset=0 fua=0
nbdkit: memory.3: debug: memory: trim count=1073742336 offset=4294966784 fua=0

I'm now comparing the ioctl calls that nbd-client and qemu-nbd make when setting up /dev/nbd0, to see what might explain why the discard worked when nbd-client connected the device to nbdkit but failed when qemu-nbd connected it to its own server. In the meantime, since nbd-client worked but qemu-nbd did not, it does look like this may be qemu's problem after all.
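As a rough sketch of the client-side workaround suggested above (treating discard_max_bytes as 0xfffff000 and splitting anything larger), the chunking could look like this in Python; the constant and helper name are illustrative, not code from qemu or nbd.ko:

MAX_DISCARD = 0xfffff000  # suggested cap: largest 4 KiB-aligned length below 2^32

def split_discard(offset, length):
    # Yield (offset, length) pairs, each small enough to fit in the
    # 32-bit length field of a single NBD_CMD_TRIM request.
    while length > 0:
        chunk = min(length, MAX_DISCARD)
        yield offset, chunk
        offset += chunk
        length -= chunk

# The 5G blkdiscard from the report splits into two requests:
# list(split_discard(0, 5 * 1024**3))
# -> [(0, 4294963200), (4294963200, 1073745920)]

This mirrors what nbd.ko itself did in the log above: it capped the first trim at 4294966784 bytes (the 32-bit maximum rounded down to the 512-byte block size) and sent the remainder as a second request.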
Let's get nbd.ko out of the picture. The problem can be reproduced in user space (here, where I built qemu-nbd to log trace messages to stderr):

$ truncate --size=3G file
$ qemu-nbd -f raw file --trace=nbd_\*
$ nbdsh -u nbd://localhost:10810 -c 'h.trim(3*1024*1024*1024,0)'
Traceback (most recent call last):
  File "/usr/lib64/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib64/python3.8/site-packages/nbd.py", line 1762, in <module>
    nbdsh.shell()
  File "/usr/lib64/python3.8/site-packages/nbdsh.py", line 100, in shell
    exec (c, d, d)
  File "<string>", line 1, in <module>
  File "/usr/lib64/python3.8/site-packages/nbd.py", line 1098, in trim
    return libnbdmod.trim (self._o, count, offset, flags)
nbd.Error: nbd_trim: trim: command failed: Input/output error (EIO)

and looking at the trace output from qemu-nbd, I see:

493771@1592948038.044141:nbd_negotiate_success Negotiation succeeded
493771@1592948038.044167:nbd_trip Reading request
493771@1592948038.044262:nbd_receive_request Got request: { magic = 0x25609513, .flags = 0x0, .type = 0x4, from = 0, len = 3221225472 }
493771@1592948038.044272:nbd_co_receive_request_decode_type Decoding type: handle = 1, type = 4 (trim)
493771@1592948038.044291:nbd_co_send_structured_error Send structured error reply: handle = 1, error = 5 (EIO), msg = 'discard failed'

so this is definitely a case of qemu as NBD server NOT honoring requests between 2G and 4G. I'll have a patch posted soon.

On 6/23/20 4:35 PM, Eric Blake wrote:
> Let's get nbd.ko out of the picture. The problem can be reproduced in
> user space (here, where I built qemu-nbd to log trace messages to
> stderr):
>
> $ truncate --size=3G file
> $ qemu-nbd -f raw file --trace=nbd_\*
> $ nbdsh -u nbd://localhost:10810 -c 'h.trim(3*1024*1024*1024,0)'
> nbd.Error: nbd_trim: trim: command failed: Input/output error (EIO)
>
> so this is definitely a case of qemu as NBD server NOT honoring
> requests between 2G and 4G. I'll have a patch posted soon.

https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg06592.html

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

The QEMU project is currently moving its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now.

If the bug has already been fixed in the latest upstream version of QEMU, then please close this ticket as "Fix released". If it is not fixed yet and you think that this bug report is still valid, then you have two options:

1) If you already have an account on gitlab.com, please open a new ticket for this problem in our new tracker here:

https://gitlab.com/qemu-project/qemu/-/issues

and then close this ticket here on Launchpad (or let it expire automatically after 60 days). Please mention the URL of this bug ticket on Launchpad in the new ticket on GitLab.

2) If you don't have an account on gitlab.com and don't intend to get one, but still would like to keep this ticket open, then please switch the state back to "New" within the next 60 days (otherwise it will get closed as "Expired"). We will then eventually migrate the ticket automatically to the new system (but you won't be the reporter of the bug in the new system and thus won't get notified of changes anymore).

Thank you and sorry for the inconvenience.

Commit 890cbccb0 included in upstream release 5.1.0.
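With that commit in qemu 5.1.0, the failing reproducer above should succeed. A minimal sketch of replaying it with libnbd's Python binding (the same module nbdsh drives); it assumes a qemu-nbd server is still listening on the port from the trace above:

import nbd  # libnbd Python binding

h = nbd.NBD()
h.connect_uri("nbd://localhost:10810")  # port taken from the reproducer above
h.trim(3 * 1024 * 1024 * 1024, 0)       # the 3G trim that previously failed with EIO
h.shutdown()

Against an unfixed qemu-nbd this still raises nbd.Error with EIO; with the fix it completes cleanly.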