summary refs log tree commit diff stats
path: root/scripts/qapi/expr.py (unfollow)
Commit message (Collapse)AuthorFilesLines
2023-09-08vmstate: Mark VMStateInfo.get/put() coroutine_mixed_fnKevin Wolf1-3/+5
Migration code can run both in coroutine context (the usual case) and non-coroutine context (at least savevm/loadvm for snapshots). This also affects the VMState callbacks, and devices must consider this. Change the callback definition in VMStateInfo to be explicit about it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230905145002.46391-2-kwolf@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: Make more BlockDriver definitions staticKevin Wolf3-3/+3
Most block driver implementations don't have any reason for their BlockDriver to be public. The only exceptions are bdrv_file, bdrv_raw and bdrv_qcow2, which are actually used in other source files. Make all other BlockDriver definitions static if they aren't yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230905130607.35134-3-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block/meson.build: Restore alphabetical order of filesKevin Wolf1-6/+6
When commit 5e5733e5999 created block/meson.build, the list of unconditionally added files was in alphabetical order. Later commits added new files in random places. Reorder the list to be alphabetical again. (As for ordering foo.c against foo-*.c, there are both ways used currently; standardise on having foo.c first, even though this is different from the original commit 5e5733e5999.) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230905130607.35134-2-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: Remove unnecessary variable in bdrv_block_device_infoFabiano Rosas1-3/+2
The commit 5d8813593f ("block/qapi: Let bdrv_query_image_info() recurse") removed the loop where we set the 'bs0' variable, so now it is just the same as 'bs'. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230901184605.32260-3-farosas@suse.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: Remove bdrv_query_block_node_infoFabiano Rosas2-30/+0
The last call site of this function has been removed by commit c04d0ab026 ("qemu-img: Let info print block graph"). Reviewed-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20230901184605.32260-2-farosas@suse.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08vmdk: Clean up bdrv_open_child() return value checkDmitry Frolov1-1/+1
bdrv_open_child() may return NULL. Usually return value is checked for this function. Check for return value is more reliable. Fixes: 24bc15d1f6 ("vmdk: Use BdrvChild instead of BDS for references to extents") Signed-off-by: Dmitry Frolov <frolov@swemel.ru> Message-ID: <20230831125926.796205-1-frolov@swemel.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08qemu-img: Update documentation for compressed imagesKevin Wolf1-2/+17
Document the 'compression_type' option for qcow2, and mention that streamOptimized vmdk supports compression, too. Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230901102430.23856-1-kwolf@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: Be more verbose in create fallbackHanna Czenczek1-2/+4
For image creation code, we have central fallback code for protocols that do not support creating new images (like NBD or iscsi). So for them, you can only specify existing paths/exports that are overwritten to make clean new images. In such a case, if the given path cannot be opened (assuming a pre-existing image there), we print an error message that tries to describe what is going on: That with this protocol, you cannot create new images, but only overwrite existing ones; and the given path could not be opened as a pre-existing image. However, the current message is confusing, because it does not say that the protocol in question does not support creating new images, but instead that "image creation" is unsupported. This can be interpreted to mean that `qemu-img create` will not work in principle, which is not true. Be more verbose for clarity. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2217204 Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-ID: <20230720140024.46836-1-hreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block/iscsi: Document why we use raw malloc()Peter Maydell1-0/+1
In block/iscsi.c we use a raw malloc() call, which is unusual given the project standard is to use the glib memory allocation functions. Document why we do so, to avoid it being converted to g_malloc() by mistake. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-ID: <20230727150705.2664464-1-peter.maydell@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08qemu-img: omit errno value in error messageMichael Tokarev4-9/+9
I'm getting io-qcow2-244 test failure on mips* due to output mismatch: Take an internal snapshot: -qemu-img: Could not create snapshot 'test': -95 (Operation not supported) +qemu-img: Could not create snapshot 'test': -122 (Operation not supported) No errors were found on the image. This is because errno values might be different across different architectures. This error message in qemu-img.c is the only one which prints errno directly, all the rest print strerror(errno) only. Fix this error message and the expected output of the 3 test cases too. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Message-ID: <20230811110946.2435067-1-mjt@tls.msk.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: change reqs_lock to QemuMutexStefan Hajnoczi3-14/+16
CoMutex has poor performance when lock contention is high. The tracked requests list is accessed frequently and performance suffers in QEMU multi-queue block layer scenarios. It is not necessary to use CoMutex for the requests lock. The lock is always released across coroutine yield operations. It is held for relatively short periods of time and it is not beneficial to yield when the lock is held by another coroutine. Change the lock type from CoMutex to QemuMutex to improve multi-queue block layer performance. fio randread bs=4k iodepth=64 with 4 IOThreads handling a virtio-blk device with 8 virtqueues improves from 254k to 517k IOPS (+203%). Full benchmark results and configuration details are available here: https://gitlab.com/stefanha/virt-playbooks/-/commit/980c40845d540e3669add1528739503c2e817b57 In the future we may wish to introduce thread-local tracked requests lists to avoid lock contention completely. That would be much more involved though. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230808155852.2745350-3-stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block: minimize bs->reqs_lock section in tracked_request_end()Stefan Hajnoczi1-1/+7
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230808155852.2745350-2-stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08iotests: adapt test output for new qemu_cleanup() behaviorFiona Ebner3-0/+30
Since commit ca2a5e630d ("qemu_cleanup: begin drained section after vm_shutdown()"), there will be an additional pause for jobs during qemu_cleanup(). The reason is that the bdrv_drain_all() call in do_vm_stop() is not inside the drained section used by qemu_cleanup() anymore. I.e., there is a second drained section now that ends before the final one in qemu_cleanup() starts. Thus, job_pause() is called twice during cleanup (via child_job_drained_begin()). Test 185 needs to be adapted directly too, because it waits for a specific number of JOB_STATUS_CHANGE events before the BLOCK_JOB_CANCELLED event. Reported-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com> Message-ID: <20230817112538.255111-1-f.ebner@proxmox.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-08block/vpc: Avoid dynamic stack allocationPhilippe Mathieu-Daudé1-2/+2
Use autofree heap allocation instead of variable-length array on the stack. Here we don't expect the bitmap size to be enormous, and since we're about to read/write it to disk the overhead of the allocation should be fine. The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> [PMM: expanded commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-ID: <20230811175229.808139-1-peter.maydell@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-06hw/ide/ahci: fix broken SError handlingNiklas Cassel1-2/+1
When encountering an NCQ error, you should not write the NCQ tag to the SError register. This is completely wrong. The SError register has a clear definition, where each bit represents a different error, see PxSERR definition in AHCI 1.3.1. If we write a random value (like the NCQ tag) in SError, e.g. Linux will read SError, and will trigger arbitrary error handling depending on the NCQ tag that happened to be executing. In case of success, ncq_cb() will call ncq_finish(). In case of error, ncq_cb() will call ncq_err() (which will clear ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is sufficient to tell if finished should get set or not. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230609140844.202795-9-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/ahci: fix ahci_write_fis_sdb()Niklas Cassel1-2/+8
When there is an error, we need to raise a TFES error irq, see AHCI 1.3.1, 5.3.13.1 SDB:Entry. If ERR_STAT is set, we jump to state ERR:FatalTaskfile, which will raise a TFES IRQ unconditionally, regardless if the I bit is set in the FIS or not. Thus, we should never raise a normal IRQ after having sent an error IRQ. It is valid to signal successfully completed commands as finished in the same SDB FIS that generates the error IRQ. The important thing is that commands that did not complete successfully (e.g. commands that were aborted, do not get the finished bit set). Before this commit, there was never a TFES IRQ raised on NCQ error. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230609140844.202795-8-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/ahci: PxCI should not get cleared when ERR_STAT is setNiklas Cassel3-33/+88
For NCQ, PxCI is cleared on command queued successfully. For non-NCQ, PxCI is cleared on command completed successfully. Successfully means ERR_STAT, BUSY and DRQ are all cleared. A command that has ERR_STAT set, does not get to clear PxCI. See AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI, and 5.3.16.5 ERR:FatalTaskfile. In the case of non-NCQ commands, not clearing PxCI is needed in order for host software to be able to see which command slot that failed. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Message-id: 20230609140844.202795-7-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is clearedNiklas Cassel1-0/+5
According to AHCI 1.3.1 definition of PxSACT: This field is cleared when PxCMD.ST is written from a '1' to a '0' by software. This field is not cleared by a COMRESET or a software reset. According to AHCI 1.3.1 definition of PxCI: This field is also cleared when PxCMD.ST is written from a '1' to a '0' by software. Clearing PxCMD.ST is part of the error recovery procedure, see AHCI 1.3.1, section "6.2 Error Recovery". If we don't clear PxCI on error recovery, the previous command will incorrectly still be marked as pending after error recovery. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230609140844.202795-6-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/ahci: simplify and document PxCI handlingNiklas Cassel1-20/+50
The AHCI spec states that: For NCQ, PxCI is cleared on command queued successfully. For non-NCQ, PxCI is cleared on command completed successfully. (A non-NCQ command that completes with error does not clear PxCI.) The current QEMU implementation either clears PxCI in check_cmd(), or in ahci_cmd_done(). check_cmd() will clear PxCI for a command if handle_cmd() returns 0. handle_cmd() will return -1 if BUSY or DRQ is set. The QEMU implementation for NCQ commands will currently not set BUSY or DRQ, so they will always have PxCI cleared by handle_cmd(). ahci_cmd_done() will never even get called for NCQ commands. Non-NCQ commands are executed by ide_bus_exec_cmd(). Non-NCQ commands in QEMU are implemented either in a sync or in an async way. For non-NCQ commands implemented in a sync way, the command handler will return true, and when ide_bus_exec_cmd() sees that a command handler returns true, it will call ide_cmd_done() (which will call ahci_cmd_done()). For a command implemented in a sync way, ahci_cmd_done() will do nothing (since busy_slot is not set). Instead, after ide_bus_exec_cmd() has finished, check_cmd() will clear PxCI for these commands. For non-NCQ commands implemented in an async way (using either aiocb or pio_aiocb), the command handler will return false, ide_bus_exec_cmd() will not call ide_cmd_done(), instead it is expected that the async callback function will call ide_cmd_done() once the async command is done. handle_cmd() will set busy_slot, if and only if BUSY or DRQ is set, and this is checked _after_ ide_bus_exec_cmd() has returned. handle_cmd() will return -1, so check_cmd() will not clear PxCI. When the async callback calls ide_cmd_done() (which will call ahci_cmd_done()), it will see that busy_slot is set, and ahci_cmd_done() will clear PxCI. This seems racy, since busy_slot is set _after_ ide_bus_exec_cmd() has returned. The callback might come before busy_slot gets set. And it is quite confusing that ahci_cmd_done() will be called for all non-NCQ commands when the command is done, but will only clear PxCI in certain cases, even though it will always write a D2H FIS and raise an IRQ. Even worse, in the case where ahci_cmd_done() does not clear PxCI, it still raises an IRQ. Host software might thus read an old PxCI value, since PxCI is cleared (by check_cmd()) after the IRQ has been raised. Try to simplify this by always setting busy_slot for non-NCQ commands, such that ahci_cmd_done() will always be responsible for clearing PxCI for non-NCQ commands. For NCQ commands, clear PxCI when we receive the D2H FIS, but before raising the IRQ, see AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Message-id: 20230609140844.202795-5-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/ahci: write D2H FIS when processing NCQ commandNiklas Cassel1-6/+11
The way that BUSY + PxCI is cleared for NCQ (FPDMA QUEUED) commands is described in SATA 3.5a Gold: 11.15 FPDMA QUEUED command protocol DFPDMAQ2: ClearInterfaceBsy "Transmit Register Device to Host FIS with the BSY bit cleared to zero and the DRQ bit cleared to zero and Interrupt bit cleared to zero to mark interface ready for the next command." PxCI is currently cleared by handle_cmd(), but we don't write the D2H FIS to the FIS Receive Area that actually caused PxCI to be cleared. Similar to how ahci_pio_transfer() calls ahci_write_fis_pio() with an additional parameter to write a PIO Setup FIS without raising an IRQ, add a parameter to ahci_write_fis_d2h() so that ahci_write_fis_d2h() also can write the FIS to the FIS Receive Area without raising an IRQ. Change process_ncq_command() to call ahci_write_fis_d2h() without raising an IRQ (similar to ahci_pio_transfer()), such that the FIS Receive Area is in sync with the PxTFD shadow register. E.g. Linux reads status and error fields from the FIS Receive Area directly, so it is wise to keep the FIS Receive Area and the PxTFD shadow register in sync. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Message-id: 20230609140844.202795-4-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06hw/ide/core: set ERR_STAT in unsupported command completionNiklas Cassel1-1/+1
Currently, the first time sending an unsupported command (e.g. READ LOG DMA EXT) will not have ERR_STAT set in the completion. Sending the unsupported command again, will correctly have ERR_STAT set. When ide_cmd_permitted() returns false, it calls ide_abort_command(). ide_abort_command() first calls ide_transfer_stop(), which will call ide_transfer_halt() and ide_cmd_done(), after that ide_abort_command() sets ERR_STAT in status. ide_cmd_done() for AHCI will call ahci_write_fis_d2h() which writes the current status in the FIS, and raises an IRQ. (The status here will not have ERR_STAT set!). Thus, we cannot call ide_transfer_stop() before setting ERR_STAT, as ide_transfer_stop() will result in the FIS being written and an IRQ being raised. The reason why it works the second time, is that ERR_STAT will still be set from the previous command, so when writing the FIS, the completion will correctly have ERR_STAT set. Set ERR_STAT before writing the FIS (calling cmd_done), so that we will raise an error IRQ correctly when receiving an unsupported command. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-id: 20230609140844.202795-3-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
2023-09-06iotests: Add test for data_off checkAlexander Ivanov2-0/+25
Write a pattern to the first cluster. Corrupt the data_off field and check if the field was repaired on image opening and the pattern has not changed. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Fix test 131 after repair was added to parallels_open()Alexander Ivanov2-17/+4
Images repairing in parallels_open() was added, thus parallels tests fail. Access to an image leads to repairing the image. Further image check don't detect any corruption. Remove reads after image creation in test 131. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Fix cluster size in parallels images tests (131)Alexander Ivanov2-23/+26
In this test cluster size is 64k, but modern tools generate images with cluster size 1M. Calculate cluster size using track field from image header. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Refactor tests of parallels images checks (131)Alexander Ivanov1-13/+16
Replace hardcoded numbers by variables. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Add test for BAT entries duplication checkAlexander Ivanov2-0/+63
Fill a parallels image with a pattern and write another pattern to the second cluster. Corrupt the image and check if the pattern changes. Repair the image and check the patterns on guest and host sides. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Add leak check test for parallels formatAlexander Ivanov2-0/+49
Write a pattern to the last cluster, extend the image by 1 claster, repair and check that the last cluster still has the same pattern. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06iotests: Add out-of-image check test for parallels formatAlexander Ivanov2-0/+83
Fill the image with a pattern to generate entries in the BAT, set the first BAT entry outside the image, try to read the corrupted image. At the image opening it should be repaired, check for zeroes in the first cluster. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Add data_off repairing to parallels_open()Alexander Ivanov1-13/+16
Place data_start/data_end calculation after reading the image header to s->header. Set s->data_start to the offset calculated in parallels_test_data_off(). Call bdrv_check() if data_off is incorrect. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Add data_off checkAlexander Ivanov1-0/+80
data_off field of the parallels image header can be corrupted. Check if this field greater than the header + BAT size and less than file size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Use bdrv_co_getlength() in parallels_check_outside_image()Alexander Ivanov1-1/+1
bdrv_co_getlength() should be used in coroutine context. Replace bdrv_getlength() by bdrv_co_getlength() in parallels_check_outside_image(). Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Image repairing in parallels_open()Alexander Ivanov1-32/+38
Repair an image at opening if the image is unclean or out-of-image corruption was detected. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Add checking and repairing duplicate offsets in BATAlexander Ivanov1-0/+144
Cluster offsets must be unique among all the BAT entries. Find duplicate offsets in the BAT and fix it by copying the content of the relevant cluster to a newly allocated cluster and set the new cluster offset to the duplicated entry. Add host_cluster_index() helper to deduplicate the code. When new clusters are allocated, the file size increases by 128 Mb. Call parallels_check_leak() to fix this leak. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Add data_start field to BDRVParallelsStateAlexander Ivanov2-3/+5
In the next patch we will need the offset of the data area for host cluster index calculation. Add this field and setting up code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Add "explicit" argument to parallels_check_leak()Alexander Ivanov1-7/+12
In the on of the next patches we need to repair leaks without changing leaks and leaks_fixed info in res. Also we don't want to print any warning about leaks. Add "explicit" argument to skip info changing if the argument is false. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Check if data_end greater than the file sizeAlexander Ivanov1-0/+5
Initially data_end is set to the data_off image header field and must not be greater than the file size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Incorrect data end calculation in parallels_open()Alexander Ivanov1-2/+2
The BDRVParallelsState structure contains data_end field that is measured in sectors. In parallels_open() initially this field is set by data_off field from parallels image header. According to the parallels format documentation, data_off field contains an offset, in sectors, from the start of the file to the start of the data area. For "WithoutFreeSpace" images: if data_off is zero, the offset is calculated as the end of the BAT table plus some padding to ensure sector size alignment. The parallels_open() function has code for handling zero value in data_off, but in the result data_end contains the offset in bytes. Replace the alignment to sector size by division by sector size and fix the comparision with s->header_size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06parallels: Fix comments formatting inside parallels driverAlexander Ivanov1-6/+12
This patch is technically necessary as git patch rendering could result in moving some code from one place to the another and that hits checkpatch.pl warning. This problem specifically happens within next series. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>
2023-09-06MAINTAINERS: add tree to keep parallels format driver changesDenis V. Lunev1-0/+1
Driver changes are driving by me for now. At least we need to get functionally complete check and repair procedure for now. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-06ppc/xive: Add support for the PC MMIOsCédric Le Goater1-36/+48
The XIVE interrupt contoller maintains various fields on interrupt targets in a structure called NVT. Each unit has a NVT cache, backed by RAM. When the NVT structure is not local (in RAM) to the chip, the XIVE interrupt controller forwards the memory operation to the owning chip using the PC MMIO region configured for this purpose. QEMU does not need to be so precise since software shouldn't perform any of these operations. The model implementation is simplified to return the RAM address of the NVT structure which is then used by pnv_xive_vst_write or read to perform the operation in RAM. Remove the last use of pnv_xive_get_remote(). Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06ppc/xive: Handle END triggers between chips with MMIOsCédric Le Goater2-2/+68
The notify page of the interrupt controller can either be used to receive trigger events from the HW controllers (PHB, PSI) or to reroute interrupts between Interrupt Controllers. In which case, the VSD table is used to determine the address of the notify page of the remote IC and the store data is forwarded. Today, our model grabs the remote VSD (EAS, END, NVT) address using pnv_xive_get_remote() helper. Be more precise and implement remote END triggers using a store on the remote IC notify page. We still have a shortcut in the model for the NVT accesses which we will address later. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06ppc/xive: Introduce a new XiveRouter end_notify() handlerCédric Le Goater2-10/+20
It will help us model the END triggers on the PowerNV machine, which can be rerouted to another interrupt controller. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06ppc/xive: Use address_space routines to access the machine RAMCédric Le Goater2-8/+46
to log an error in case of bad configuration of the XIVE tables by the FW. Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06target/ppc: Fix the order of kvm_enable judgment about kvmppc_set_interrupt()jianchunfu2-3/+7
It's unnecessary for non-KVM accelerators(TCG, for example), to call this function, so change the order of kvm_enable() judgment. The static inline function that returns -1 directly does not work in TCG's situation. Signed-off-by: jianchunfu <chunfu.jian@shingroup.cn> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06hw/ppc/e500: fix broken snapshot replayMaksim Kostin1-1/+1
ppce500_reset_device_tree is registered for system reset, but after c4b075318eb1 this function rerandomizes rng-seed via qemu_guest_getrandom_nofail. And when loading a snapshot, it tries to read EVENT_RANDOM that doesn't exist, so we have an error: qemu-system-ppc: Missing random event in the replay log To fix this, use qemu_register_reset_nosnapshotload instead of qemu_register_reset. Reported-by: Vitaly Cheptsov <cheptsov@ispras.ru> Fixes: c4b075318eb1 ("hw/ppc: pass random seed to fdt ") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1634 Signed-off-by: Maksim Kostin <maksim.kostin@ispras.ru> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06target/ppc: Flush inputs to zero with NJ in ppc_store_vscrRichard Henderson1-0/+1
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1779 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06target/ppc: Fix LQ, STQ register-pair order for big-endianNicholas Piggin1-8/+8
LQ, STQ have the same register-pair ordering as LQARX/STQARX., which is the even (lower) register contains the most significant bits. This is not implemented correctly for big-endian. do_ldst_quad() has variables low_addr_gpr and high_addr_gpr which is confusing because they are low and high addresses, whereas LQARX/STQARX. and most such things use the low and high values for lo/hi variables. The conversion to native 128-bit memory access functions missed this strangeness. Fix this by changing the if condition, and change the variable names to hi/lo to match convention. Cc: qemu-stable@nongnu.org Reported-by: Ivan Warren <ivan@vmfacility.fr> Fixes: 57b38ffd0c6f ("target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1836 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06tests/avocado: ppc64 reverse debugging tests for pseries and powernvNicholas Piggin1-0/+29
These machines run reverse-debugging well enough to pass basic tests. Wire them up. Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06tests/avocado: reverse-debugging cope with re-executing breakpointsNicholas Piggin1-4/+21
The reverse-debugging test creates a trace, then replays it and: 1. Steps the first 10 instructions and records their addresses. 2. Steps backward and verifies their addresses match. 3. Runs to (near) the end of the trace. 4. Sets breakpoints on the first 10 instructions. 5. Continues backward and verifies execution stops at the last breakpoint. Step 5 breaks if any of the other 9 breakpoints are re-executed in the trace after the 10th instruction is run, because those will be unexpectedly hit when reverse continuing. This situation does arise with the ppc pseries machine, the SLOF bios branches to its own entry point. Deal with this by switching steps 3 and 4, so the trace will be run to the end *or* one of the breakpoints being re-executed. Step 5 then reverses from there to the 10th instruction will not hit a breakpoint in between, by definition. Another step is added between steps 2 and 3, which steps forward over the first 10 instructions and verifies their addresses, to support this. Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
2023-09-06tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mountNicholas Piggin1-2/+1
This the ppc64 record-replay test is able to replay the full kernel boot so try enabling it. Acked-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>