path: root/include/exec/memory.h
Commit message (Author, Date, Files, Lines)
* include/system: Move exec/memory.h to system/memory.h (Richard Henderson, 2025-04-23, 1 file, -3201/+0)
  Convert the existing includes with:
    sed -i ,exec/memory.h,system/memory.h,g
  Move the include within cpu-all.h into a !CONFIG_USER_ONLY block.
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* include/exec/memory: move devend functions to memory-internal.h (Pierrick Bouvier, 2025-04-23, 1 file, -18/+0)
  Only system/physmem.c and system/memory.c use those functions, so we can
  move them to an internal header.
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  Message-ID: <20250317183417.285700-17-pierrick.bouvier@linaro.org>
* include/exec/memory: extract devend_big_endian from devend_memop (Pierrick Bouvier, 2025-04-23, 1 file, -6/+12)
  We'll use it in system/memory.c.
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  Message-ID: <20250317183417.285700-16-pierrick.bouvier@linaro.org>
* exec/memory.h: make devend_memop "target defines" agnostic (Pierrick Bouvier, 2025-04-23, 1 file, -12/+4)
  This will allow making system/memory.c common later.
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  Message-ID: <20250317183417.285700-6-pierrick.bouvier@linaro.org>
* exec/memory_ldst_phys: extract memory_ldst_phys declarations from cpu-all.h (Pierrick Bouvier, 2025-04-23, 1 file, -0/+10)
  They are now accessible through exec/memory.h instead, and we make sure all
  variants are available for common or target-dependent code. Move the
  stl_phys_notdirty function as well. The cached, endianness-agnostic versions
  rely on st/ld*_p, which are available through tswap.h.
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  Message-ID: <20250317183417.285700-5-pierrick.bouvier@linaro.org>
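  For orientation, a minimal usage sketch of the phys load/store helpers
  covered by this move; the guest-physical address is a placeholder and the
  include paths are illustrative (they vary across QEMU versions), so treat
  this as an assumption-laden example rather than code from the patch.

    #include "qemu/osdep.h"
    #include "exec/memory.h"
    #include "exec/address-spaces.h"   /* for address_space_memory */

    /* Read a 32-bit value at a guest-physical address and store it back
     * incremented, using the system memory address space. */
    static void example_bump_word(void)
    {
        hwaddr addr = 0x1000;                                  /* placeholder address */
        uint32_t val = ldl_phys(&address_space_memory, addr);  /* 32-bit phys load */

        stl_phys(&address_space_memory, addr, val + 1);        /* 32-bit phys store */
    }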
* docs: Fix some typos (found by codespell and typos) (Stefan Weil, 2025-04-13, 1 file, -2/+2)
  Signed-off-by: Stefan Weil <sw@weilnetz.de>
  Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* migration: ram block cpr blockers (Steve Sistare, 2025-03-10, 1 file, -0/+3)
  Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks in
  the migration stream file and recreate them later, because the physical
  memory for the blocks is pinned and registered for vfio. Add a blocker for
  volatile ram blocks.
  Also add a blocker for RAM_GUEST_MEMFD. Preserving guest_memfd may be
  sufficient for CPR, but it has not been tested yet.
  Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
  Reviewed-by: Fabiano Rosas <farosas@suse.de>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com>
  Signed-off-by: Fabiano Rosas <farosas@suse.de>
* physmem: teach cpu_memory_rw_debug() to write to more memory regions (David Hildenbrand, 2025-02-12, 1 file, -1/+2)
  Right now, we only allow writing to memory regions that allow direct access
  using memcpy etc.; all other writes are simply ignored. This implies that
  debugging guests will not work as expected when writing to MMIO device
  regions.
  Let's extend cpu_memory_rw_debug() to write to more memory regions,
  including MMIO device regions. Reshuffle the condition in
  memory_access_is_direct() to make it easier to read and add a comment.
  This change also means that ELF image loads and similar users of
  cpu_memory_rw_debug() can now write to MMIO devices; previously we ignored
  these writes. Peter assumes [1] that there's probably a class of guest
  images which will start writing junk (likely zeroes) into device model
  registers; we previously would silently ignore any such bogus ELF sections.
  Likely these images are of questionable correctness and this can be ignored.
  If it ever becomes a problem, we could make these cases use
  address_space_write_rom() instead, which is left unchanged for now.
  This patch is based on previous work by Stefan Zabka.
  [1] https://lore.kernel.org/all/CAFEAcA_2CEJKFyjvbwmpt=on=GgMVamQ5hiiVt+zUr6AY3X=Xg@mail.gmail.com/
  Resolves: https://gitlab.com/qemu-project/qemu/-/issues/213
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Link: https://lore.kernel.org/r/20250210084648.33798-8-david@redhat.com
  Signed-off-by: Peter Xu <peterx@redhat.com>
* memory: pass MemTxAttrs to memory_access_is_direct() (David Hildenbrand, 2025-02-12, 1 file, -2/+3)
  We want to pass another flag that will be stored in MemTxAttrs. So pass
  MemTxAttrs directly.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Link: https://lore.kernel.org/r/20250210084648.33798-6-david@redhat.com
  [peterx: Fix MacOS builds]
  Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: factor out direct access check into memory_region_supports_direct_access() (David Hildenbrand, 2025-02-12, 1 file, -3/+11)
  Let's factor out the complete "directly accessible" check, independent of
  the "write" condition, so we can reuse it next. We can now split up the RAM
  and ROMD checks, so we really only check for RAM DEVICE in the RAM case;
  a ROM DEVICE is neither RAM nor RAM DEVICE.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Link: https://lore.kernel.org/r/20250210084648.33798-4-david@redhat.com
  Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: factor out RAM/ROMD check in memory_access_is_direct() (David Hildenbrand, 2025-02-12, 1 file, -4/+6)
  Let's factor out more of the generic "is this directly accessible" check,
  independent of the "write" condition. Note that the "!mr->rom_device" check
  in the write case essentially disallows the memory_region_is_romd()
  condition again. Further note that RAM DEVICE regions are also RAM regions,
  so we can check for RAM+ROMD first. This is a preparation for further
  changes.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Link: https://lore.kernel.org/r/20250210084648.33798-3-david@redhat.com
  Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: factor out memory_region_is_ram_device() check in memory_access_is_direct() (David Hildenbrand, 2025-02-12, 1 file, -3/+10)
  As documented in commit 4a2e242bbb306 ("memory: Don't use memcpy for
  ram_device regions"), we disallow direct access to RAM DEVICE regions.
  Let's make this clearer to prepare for further changes. Note that romd
  regions will never be RAM DEVICE at the same time.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Link: https://lore.kernel.org/r/20250210084648.33798-2-david@redhat.com
  Signed-off-by: Peter Xu <peterx@redhat.com>
* memory: add RAM_PRIVATE (Steve Sistare, 2025-01-29, 1 file, -0/+10)
  Define the RAM_PRIVATE flag. In RAMBlock creation functions, if MAP_SHARED
  is 0 in the flags parameter, a subsequent patch may still create a shared
  mapping if other conditions require it. Callers who specifically want a
  private mapping, e.g. for objects specified by the user, must pass
  RAM_PRIVATE. After RAMBlock creation, MAP_SHARED in the block's flags
  indicates whether the block is shared or private, and MAP_PRIVATE is
  omitted.
  Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Link: https://lore.kernel.org/r/1736967650-129648-6-git-send-email-steven.sistare@oracle.com
  Signed-off-by: Fabiano Rosas <farosas@suse.de>
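  A small sketch of the flag semantics described above; the helper and its
  condition are hypothetical and only illustrate when a caller would pass
  RAM_PRIVATE rather than merely omit RAM_SHARED.

    #include "qemu/osdep.h"
    #include "exec/memory.h"   /* RAM_SHARED; RAM_PRIVATE is added by the commit above */

    /* Hypothetical helper: pick the ram_flags value handed to a RAMBlock
     * creation function. Passing RAM_PRIVATE requests an explicitly private
     * mapping instead of merely leaving RAM_SHARED unset. */
    static uint32_t example_ram_flags(bool user_requested_shared)
    {
        return user_requested_shared ? RAM_SHARED : RAM_PRIVATE;
    }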
* include/exec: remove warning_printed from MemoryRegion (Alex Bennée, 2025-01-17, 1 file, -1/+0)
  Since d197063fcf9 (memory: move unassigned_mem_ops to memory.c) this field
  is unused.
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
  Message-Id: <20250116160306.1709518-31-alex.bennee@linaro.org>
* include/exec: fix some copy and paste errors in kdoc (Alex Bennée, 2025-01-17, 1 file, -2/+2)
  A number of copy and paste kdoc comments are referring to the wrong
  definition. Fix those cases.
  Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
  Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
  Message-Id: <20250116160306.1709518-30-alex.bennee@linaro.org>
* softmmu: Expand comments describing max_bounce_buffer_size (Mattias Nissler, 2024-11-04, 1 file, -1/+8)
  Clarify how the parameter gets configured and how it is used when servicing
  DMA mapping requests targeting indirect memory regions.
  Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
  Message-Id: <20240910213512.843130-1-mnissler@rivosinc.com>
  Acked-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* softmmu: Support concurrent bounce buffers (Mattias Nissler, 2024-09-09, 1 file, -9/+5)
  When DMA memory can't be directly accessed, as is the case when running the
  device model in a separate process without shareable DMA file descriptors,
  bounce buffering is used. It is not uncommon for device models to request
  mapping of several DMA regions at the same time. Examples include:
  * net devices, e.g. when transmitting a packet that is split across several
    TX descriptors (observed with igb)
  * USB host controllers, when handling a packet with multiple data TRBs
    (observed with xhci)
  Previously, QEMU only provided a single bounce buffer per AddressSpace and
  would fail DMA map requests while the buffer was already in use. In turn,
  this would cause DMA failures that ultimately manifest as hardware errors
  from the guest perspective.
  This change allocates DMA bounce buffers dynamically instead of supporting
  only a single buffer. Thus, multiple DMA mappings work correctly also when
  RAM can't be mmap()-ed.
  The total bounce buffer allocation size is limited individually for each
  AddressSpace. The default limit is 4096 bytes, matching the previous maximum
  buffer size. A new x-max-bounce-buffer-size parameter is provided to
  configure the limit for PCI devices.
  Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Acked-by: Peter Xu <peterx@redhat.com>
  Link: https://lore.kernel.org/r/20240819135455.2957406-1-mnissler@rivosinc.com
  Signed-off-by: Peter Xu <peterx@redhat.com>
* docs: Fix some typos (found by typos) and grammar issues (Stefan Weil, 2024-08-16, 1 file, -1/+1)
  Fix the misspellings of "overriden" also in code comments.
  Signed-off-by: Stefan Weil <sw@weilnetz.de>
  Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
  Message-Id: <20240813125638.395461-1-sw@weilnetz.de>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Eric Auger <eric.auger@redhat.com>
  Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
  Message-Id: <20240813202329.1237572-20-alex.bennee@linaro.org>
* memory: remove IOMMU MR iommu_set_page_size_mask() callback (Eric Auger, 2024-07-09, 1 file, -38/+0)
  Everything is now in place to use the Host IOMMU Device callbacks to
  retrieve the page size mask usable with a given assigned device. This new
  method has the advantage of passing the info to the virtual IOMMU much
  earlier, before the IOMMU MR gets enabled. So let's remove the call to
  memory_region_iommu_set_page_size_mask in vfio/common.c and remove the
  single implementation of the IOMMU MR callback in virtio-iommu.c.
  Signed-off-by: Eric Auger <eric.auger@redhat.com>
  Reviewed-by: Cédric Le Goater <clg@redhat.com>
  Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
* exec: don't use void* in pointer arithmetic in headers (Roman Kiryanov, 2024-06-28, 1 file, -1/+1)
  void* pointer arithmetic is a GCC extension that may not be available in
  other build tools (e.g. C++). This change removes that assumption.
  Signed-off-by: Roman Kiryanov <rkir@google.com>
  Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
  Link: https://lore.kernel.org/r/20240620201654.598024-1-rkir@google.com
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
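  A minimal, non-QEMU illustration of the portability point above: arithmetic
  on void* relies on a GNU C extension, so headers meant to be consumed by C++
  code cast to a byte pointer first.

    #include <stddef.h>

    /* Advance a pointer by 'off' bytes. */
    static inline void *advance(void *p, size_t off)
    {
        /* return p + off;       relies on the GNU C extension (sizeof(void) == 1) */
        return (char *)p + off;  /* portable: well-defined in both C and C++ */
    }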
* exec: avoid using C++ keywords in function parameters (Roman Kiryanov, 2024-06-28, 1 file, -2/+2)
  This allows using the QEMU headers with a C++ compiler.
  Signed-off-by: Roman Kiryanov <rkir@google.com>
  Link: https://lore.kernel.org/r/20240618224553.878869-1-rkir@google.com
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
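  Again a generic illustration rather than the actual QEMU hunk: a prototype
  that names a parameter after a C++ keyword compiles as C but breaks C++
  consumers of the header, so the parameter gets renamed.

    /* Problematic in a shared header: 'new' is a C++ keyword.
     *   void remap_block(void *old_ptr, void *new);
     * Renaming the parameter keeps the header usable from both languages. */
    void remap_block(void *old_ptr, void *new_ptr);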
* memory: Remove IOMMU MR iommu_set_iova_range API (Eric Auger, 2024-06-24, 1 file, -32/+0)
  Since the host IOVA ranges are now passed through the PCIIOMMUOps
  set_host_resv_regions and we have removed the only implementation of
  iommu_set_iova_range() in the virtio-iommu and the only call site in
  vfio/common, let's retire the IOMMU MR API and its memory wrapper.
  Signed-off-by: Eric Auger <eric.auger@redhat.com>
  Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
* memory: Constify IOMMUTLBEvent in memory_region_notify_iommu() (Philippe Mathieu-Daudé, 2024-06-19, 1 file, -1/+1)
  @event access is read-only.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Message-Id: <20240612132532.85928-3-philmd@linaro.org>
* memory: Constify IOMMUTLBEvent in memory_region_notify_iommu_one() (Philippe Mathieu-Daudé, 2024-06-19, 1 file, -1/+1)
  @event access is read-only.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Message-Id: <20240612132532.85928-2-philmd@linaro.org>
* memory: Introduce memory_region_init_ram_guest_memfd() (Xiaoyao Li, 2024-06-05, 1 file, -0/+6)
  Introduce memory_region_init_ram_guest_memfd() to allocate a private guest
  memfd at MemoryRegion initialization. It's for the use case of TDVF, which
  must be private in the TDX case.
  Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
  Signed-off-by: Michael Roth <michael.roth@amd.com>
  Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
  Message-ID: <20240530111643.1091816-4-pankaj.gupta@amd.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Add Error** argument to memory_get_xlat_addr() (Cédric Le Goater, 2024-05-16, 1 file, -1/+14)
  Let the callers do the reporting. This will be useful in
  vfio_iommu_map_dirty_notify().
  Cc: Michael S. Tsirkin <mst@redhat.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: David Hildenbrand <david@redhat.com>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Eric Auger <eric.auger@redhat.com>
  Reviewed-by: Avihai Horon <avihaih@nvidia.com>
  Signed-off-by: Cédric Le Goater <clg@redhat.com>
* system/physmem: Per-AddressSpace bounce buffering (Mattias Nissler, 2024-05-08, 1 file, -0/+19)
  Instead of using a single global bounce buffer, give each AddressSpace its
  own bounce buffer. The MapClient callback mechanism moves to AddressSpace
  accordingly.
  This is in preparation for generalizing bounce buffer handling further to
  allow multiple bounce buffers, with a total allocation limit configured per
  AddressSpace.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
  Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
  Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  [PMD: Split patch, part 2/2]
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* system/physmem: Propagate AddressSpace to MapClient helpers (Mattias Nissler, 2024-05-08, 1 file, -2/+24)
  Propagate the AddressSpace to the following helpers:
  - register_map_client()
  - unregister_map_client()
  - notify_map_clients[_locked]()
  Rename them using the 'address_space_' prefix instead of 'cpu_'. The
  AddressSpace argument will be used in the next commit.
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
  Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
  Message-ID: <20240507094210.300566-2-mnissler@rivosinc.com>
  [PMD: Split patch, part 1/2]
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
* exec: Rename NEED_CPU_H -> COMPILING_PER_TARGET (Philippe Mathieu-Daudé, 2024-04-26, 1 file, -2/+2)
  'NEED_CPU_H' guards target-specific code; it is defined by meson together
  with the 'CONFIG_TARGET' definition. Rename NEED_CPU_H as
  COMPILING_PER_TARGET to clarify its meaning.
  Mechanical change running:
    $ sed -i s/NEED_CPU_H/COMPILING_PER_TARGET/g $(git grep -l NEED_CPU_H)
  then manually add a /* COMPILING_PER_TARGET */ comment after the '#endif'
  when the block is large.
  Inspired-by: Peter Maydell <peter.maydell@linaro.org>
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Message-Id: <20240322161439.6448-4-philmd@linaro.org>
* Merge tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu into staging (Richard Henderson, 2024-04-23, 1 file, -2/+8)
  Migration pull for 9.1
  - Het's new test cases for "channels"
  - Het's fix for a typo for vsock parsing
  - Cedric's VFIO error report series
  - Cedric's one more patch for dirty-bitmap error reports
  - Zhijian's rdma deprecation patch
  - Yuan's zeropage optimization to fix double faults on anon mem
  - Zhijian's COLO fix on a crash
  # -----BEGIN PGP SIGNATURE-----
  #
  # iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZig4HxIccGV0ZXJ4QHJl
  # ZGhhdC5jb20ACgkQO1/MzfOr1wbQiwD/V5nSJzSuAG4Ra1Fjo+LRG2TT6qk8eNCi
  # fIytehSw6cYA/0wqarxOF0tr7ikeyhtG3w4xFf44kk6KcPkoVSl1tqoL
  # =pJmQ
  # -----END PGP SIGNATURE-----
  # gpg: Signature made Tue 23 Apr 2024 03:37:19 PM PDT
  # gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
  # gpg: issuer "peterx@redhat.com"
  # gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
  # gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
  # gpg: WARNING: This key is not certified with a trusted signature!
  # gpg: There is no indication that the signature belongs to the owner.
  # Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
  * tag 'migration-20240423-pull-request' of https://gitlab.com/peterx/qemu: (26 commits)
    migration/colo: Fix bdrv_graph_rdlock_main_loop: Assertion `!qemu_in_coroutine()' failed.
    migration/multifd: solve zero page causing multiple page faults
    migration: Add Error** argument to add_bitmaps_to_list()
    migration: Modify ram_init_bitmaps() to report dirty tracking errors
    migration: Add Error** argument to xbzrle_init()
    migration: Add Error** argument to ram_state_init()
    memory: Add Error** argument to the global_dirty_log routines
    migration: Introduce ram_bitmaps_destroy()
    memory: Add Error** argument to .log_global_start() handler
    migration: Add Error** argument to .load_setup() handler
    migration: Add Error** argument to .save_setup() handler
    migration: Add Error** argument to qemu_savevm_state_setup()
    migration: Add Error** argument to vmstate_save()
    migration: Always report an error in ram_save_setup()
    migration: Always report an error in block_save_setup()
    vfio: Always report an error in vfio_save_setup()
    s390/stattrib: Add Error** argument to set_migrationmode() handler
    tests/qtest/migration: Fix typo for vsock in SocketAddress_to_str
    tests/qtest/migration: Add negative tests to validate migration QAPIs
    tests/qtest/migration: Add multifd_tcp_plain test using list of channels instead of uri
    ...
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  * memory: Add Error** argument to the global_dirty_log routines (Cédric Le Goater, 2024-04-23, 1 file, -1/+4)
    Now that the log_global*() handlers take an Error** parameter and return
    a bool, do the same for memory_global_dirty_log_start() and
    memory_global_dirty_log_stop(). The error is reported in the callers for
    now and it will be propagated in the call stack in the next changes.
    Of note, there is a functional change in ram_init_bitmaps(): if the dirty
    pages logger fails to start, there is no need to synchronize the dirty
    pages bitmaps. colo_incoming_start_dirty_log() could be modified in a
    similar way.
    Cc: Stefano Stabellini <sstabellini@kernel.org>
    Cc: Anthony Perard <anthony.perard@citrix.com>
    Cc: Paul Durrant <paul@xen.org>
    Cc: "Michael S. Tsirkin" <mst@redhat.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Hyman Huang <yong.huang@smartx.com>
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Fabiano Rosas <farosas@suse.de>
    Acked-by: Peter Xu <peterx@redhat.com>
    Link: https://lore.kernel.org/r/20240320064911.545001-12-clg@redhat.com
    Signed-off-by: Peter Xu <peterx@redhat.com>
  * memory: Add Error** argument to .log_global_start() handler (Cédric Le Goater, 2024-04-23, 1 file, -1/+4)
    Modify all .log_global_start() handlers to take an Error** parameter and
    return a bool. Adapt memory_global_dirty_log_start() to interrupt the
    loop over handlers on the first error. In such a case, a rollback is
    performed to stop dirty logging on all listeners where it was previously
    enabled.
    Cc: Stefano Stabellini <sstabellini@kernel.org>
    Cc: Anthony Perard <anthony.perard@citrix.com>
    Cc: Paul Durrant <paul@xen.org>
    Cc: "Michael S. Tsirkin" <mst@redhat.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Cc: David Hildenbrand <david@redhat.com>
    Signed-off-by: Cédric Le Goater <clg@redhat.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Link: https://lore.kernel.org/r/20240320064911.545001-10-clg@redhat.com
    [peterx: modify & enrich the comment for listener_add_address_space()]
    Signed-off-by: Peter Xu <peterx@redhat.com>
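  A sketch of the handler shape after this change, based only on the
  description above; the listener callback and the backend helper it calls
  are made up for illustration, not taken from a particular QEMU device.

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "exec/memory.h"

    static bool example_enable_dirty_tracking(void);   /* hypothetical backend call */

    /* Hypothetical .log_global_start() implementation following the new
     * convention: set *errp and return false on failure, true on success. */
    static bool example_log_global_start(MemoryListener *listener, Error **errp)
    {
        if (!example_enable_dirty_tracking()) {
            error_setg(errp, "failed to enable dirty page tracking");
            return false;
        }
        return true;
    }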
* RAMBlock: Add support of KVM private guest memfd (Xiaoyao Li, 2024-04-23, 1 file, -3/+17)
  Add KVM guest_memfd support to RAMBlock so both normal hva-based memory and
  KVM guest memfd based private memory can be associated in one RAMBlock.
  Introduce the new flag RAM_GUEST_MEMFD. When it's set, a KVM ioctl is called
  to create a private guest_memfd during RAMBlock setup.
  Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
  is more flexible and extensible than simply relying on the VM type, because
  in the future we may have the case that not all the memory of a VM needs
  guest memfd. As a benefit, it also avoids pulling MachineState into the
  memory subsystem.
  Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
  confidential guests, such as TDX VMs. How and when to set it for memory
  backends will be implemented in the following patches.
  Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
  KVM guest_memfd allocated.
  Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Message-ID: <20240320083945.991426-7-michael.roth@amd.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
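  A usage sketch of the query helper named above; the surrounding decision is
  hypothetical and only shows how a caller might branch on guest_memfd-backed
  regions.

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Hypothetical caller: treat a region's pages as private guest memory
     * when the MemoryRegion has a KVM guest_memfd allocated. */
    static bool example_is_private_backed(MemoryRegion *mr)
    {
        return memory_region_has_guest_memfd(mr);
    }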
* include/exec/memory.h: correct typos (Manos Pitsidianakis, 2024-02-21, 1 file, -1/+1)
  Correct typos automatically found with the `typos` tool
  <https://crates.io/crates/typos>.
  Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
  Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* Replace "iothread lock" with "BQL" in comments (Stefan Hajnoczi, 2024-01-08, 1 file, -2/+2)
  The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL) in
  their names. Update the code comments to use "BQL" instead of "iothread
  lock".
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Paul Durrant <paul@xen.org>
  Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
  Reviewed-by: Cédric Le Goater <clg@kaod.org>
  Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
  Message-id: 20240102153529.486531-5-stefanha@redhat.com
  Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* memory: Have memory_region_init_ram_from_fd() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_ram_from_fd return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-14-philmd@linaro.org>
* memory: Have memory_region_init_ram_from_file() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_ram_from_file return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-13-philmd@linaro.org>
* memory: Have memory_region_init_resizeable_ram() return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_resizeable_ram return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-12-philmd@linaro.org>
* memory: Have memory_region_init_rom_device() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_rom_device return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-11-philmd@linaro.org>
* memory: Have memory_region_init_rom_device_nomigrate() return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_rom_device_nomigrate()
  return a boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-9-philmd@linaro.org>
* memory: Have memory_region_init_rom() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_rom() return a boolean
  indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-8-philmd@linaro.org>
* memory: Have memory_region_init_ram() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_ram() return a boolean
  indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-7-philmd@linaro.org>
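  A usage sketch of the bool-returning convention adopted here and in the
  sibling commits nearby; the owner object, region name, and size are
  placeholders rather than values from any of the patches.

    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "exec/memory.h"

    /* Hypothetical device init path: rely on the new boolean return to bail
     * out early, leaving the error for the caller via errp. */
    static void example_init_ram(MemoryRegion *mr, Object *owner, Error **errp)
    {
        if (!memory_region_init_ram(mr, owner, "example.ram", 64 * 1024, errp)) {
            return;   /* errp has already been set by memory_region_init_ram() */
        }
    }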
* memory: Have memory_region_init_rom_nomigrate() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_rom_nomigrate return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-4-philmd@linaro.org>
  [PMD: Only update 'readonly' field on success (Manos Pitsidianakis)]
  Message-Id: <af352e7d-3346-4705-be77-6eed86858d18@linaro.org>
* memory: Have memory_region_init_ram_nomigrate() handler return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_ram_nomigrate return a
  boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-3-philmd@linaro.org>
* memory: Have memory_region_init_ram_flags_nomigrate() return a boolean (Philippe Mathieu-Daudé, 2024-01-05, 1 file, -1/+3)
  Following the example documented since commit e3fe3988d7 ("error: Document
  Error API usage rules"), have memory_region_init_ram_flags_nomigrate return
  a boolean indicating whether an error is set or not.
  Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
  Reviewed-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: Gavin Shan <gshan@redhat.com>
  Message-Id: <20231120213301.24349-2-philmd@linaro.org>
* memory: Remove "qemu:" prefix from the "qemu:ram-discard-manager" type name (Thomas Huth, 2023-12-20, 1 file, -1/+1)
  Type names should not contain special characters like ":". Let's remove the
  whole prefix since it does not really seem to be helpful here. The type
  name is only used internally for an interface type, so the renaming should
  not affect the user interface or migration.
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Message-ID: <20231117114457.177308-4-thuth@redhat.com>
  Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
  Signed-off-by: Thomas Huth <thuth@redhat.com>
* memory: Introduce memory_region_iommu_set_iova_ranges (Eric Auger, 2023-11-03, 1 file, -0/+32)
  This helper will allow conveying information about valid IOVA ranges to
  virtual IOMMUs.
  Signed-off-by: Eric Auger <eric.auger@redhat.com>
  Acked-by: Peter Xu <peterx@redhat.com>
  Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
  [clg: fixes in memory_region_iommu_set_iova_ranges() and
  iommu_set_iova_ranges() documentation]
  Signed-off-by: Cédric Le Goater <clg@redhat.com>
* memory: Let ReservedRegion use Range (Eric Auger, 2023-11-03, 1 file, -2/+2)
  A reserved region is a range tagged with a type. Let's directly use the
  Range type, with the prospect of reusing some of the library helpers
  shipped with the Range type.
  Signed-off-by: Eric Auger <eric.auger@redhat.com>
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Cédric Le Goater <clg@redhat.com>
  Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
  Signed-off-by: Cédric Le Goater <clg@redhat.com>
* memory: initialize 'fv' in MemoryRegionCache to make Coverity happy (Ilya Maximets, 2023-10-22, 1 file, -0/+2)
  Coverity scan reports multiple false-positive "defects" for the following
  series of actions in virtio.c:
    MemoryRegionCache indirect_desc_cache;
    address_space_cache_init_empty(&indirect_desc_cache);
    address_space_cache_destroy(&indirect_desc_cache);
  For some reason it's unable to recognize the dependency between 'mrs.mr'
  and 'fv' and insists that the '!mrs.mr' check in address_space_cache_destroy
  may take a 'false' branch, even though it is explicitly initialized to NULL
  in address_space_cache_init_empty():
    *** CID 1522371: Memory - illegal accesses (UNINIT)
    /qemu/hw/virtio/virtio.c: 1627 in virtqueue_split_pop()
    1621     }
    1622
    1623     vq->inuse++;
    1624
    1625     trace_virtqueue_pop(vq, elem, elem->in_num, elem->out_num);
    1626 done:
    >>> CID 1522371: Memory - illegal accesses (UNINIT)
    >>> Using uninitialized value "indirect_desc_cache.fv" when
    >>> calling "address_space_cache_destroy".
    1627     address_space_cache_destroy(&indirect_desc_cache);
    1628
    1629     return elem;
    1630
    1631 err_undo_map:
    1632     virtqueue_undo_map_desc(out_num, in_num, iov);
    ** CID 1522370: Memory - illegal accesses (UNINIT)
  Instead of trying to silence these false-positive reports in 4 different
  places, initialize 'fv' as well, as this doesn't result in any noticeable
  performance impact.
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
  Message-Id: <20231009104322.3085887-1-i.maximets@ovn.org>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
* memory,vhost: Allow for marking memory device memory regions unmergeable (David Hildenbrand, 2023-10-12, 1 file, -0/+22)
  Let's allow for marking memory regions unmergeable, to teach flatview code
  and vhost not to merge adjacent aliases to the same memory region into a
  larger memory section; instead, we want separate aliases to stay separate
  such that we can atomically map/unmap aliases without affecting other
  aliases.
  This is desired for virtio-mem mapping device memory located on a RAM
  memory region via multiple aliases into a memory region container,
  resulting in separate memslots that can get (un)mapped atomically.
  As an example with virtio-mem, the layout would look something like this:
    [...]
    0000000240000000-00000020bfffffff (prio 0, i/o): device-memory
      0000000240000000-000000043fffffff (prio 0, i/o): virtio-mem
        0000000240000000-000000027fffffff (prio 0, ram): alias memslot-0 @mem2 0000000000000000-000000003fffffff
        0000000280000000-00000002bfffffff (prio 0, ram): alias memslot-1 @mem2 0000000040000000-000000007fffffff
        00000002c0000000-00000002ffffffff (prio 0, ram): alias memslot-2 @mem2 0000000080000000-00000000bfffffff
    [...]
  Without unmergeable memory regions, all three memslots would get merged
  into a single memory section. For example, when mapping another alias
  (e.g., virtio-mem-memslot-3) or when unmapping any of the mapped aliases,
  memory listeners will first get notified about the removal of the big
  memory section to then get notified about re-adding of the new (differently
  merged) memory section(s).
  In an ideal world, memory listeners would be able to deal with that
  atomically, like KVM nowadays does. However, (a) supporting this for other
  memory listeners (vhost-user, vfio) is fairly hard: temporary removal can
  result in all kinds of issues on concurrent access to guest memory; and
  (b) this handling is undesired, because temporarily removing+readding can
  consume quite some time on bigger memslots and is not efficient (e.g., vfio
  unpinning and repinning pages ...).
  Let's allow for marking a memory region unmergeable, such that we can
  atomically (un)map aliases to the same memory region, similar to
  (un)mapping individual DIMMs.
  Similarly, teach vhost code not to redo what flatview core stopped doing:
  don't merge such sections. Merging in vhost code is really only relevant
  for handling random holes in boot memory where, without this merging, the
  vhost-user backend wouldn't be able to mmap() some boot memory backed on
  hugetlb.
  We'll use this for virtio-mem next.
  Message-ID: <20230926185738.277351-18-david@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>