summary refs log tree commit diff stats
path: root/system/physmem.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* exec: Expose 'target_page.h' API to user emulationPhilippe Mathieu-Daudé2024-04-261-30/+0
| | | | | | | | | | User-only objects might benefit from the "exec/target_page.h" API, which allows to build some objects once for all targets. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Warner Losh <imp@bsdimp.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20231211212003.21686-3-philmd@linaro.org>
* physmem: Introduce ram_block_discard_guest_memfd_range()Xiaoyao Li2024-04-231-0/+23
| | | | | | | | | | | | | | | When memory page is converted from private to shared, the original private memory is back'ed by guest_memfd. Introduce ram_block_discard_guest_memfd_range() for discarding memory in guest_memfd. Based on a patch by Isaku Yamahata <isaku.yamahata@intel.com>. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Message-ID: <20240320083945.991426-12-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* RAMBlock: make guest_memfd require uncoordinated discardPaolo Bonzini2024-04-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some subsystems like VFIO might disable ram block discard, but guest_memfd uses discard operations to implement conversions between private and shared memory. Because of this, sequences like the following can result in stale IOMMU mappings: 1. allocate shared page 2. convert page shared->private 3. discard shared page 4. convert page private->shared 5. allocate shared page 6. issue DMA operations against that shared page This is not a use-after-free, because after step 3 VFIO is still pinning the page. However, DMA operations in step 6 will hit the old mapping that was allocated in step 1. Address this by taking ram_block_discard_is_enabled() into account when deciding whether or not to discard pages. Since kvm_convert_memory()/guest_memfd doesn't implement a RamDiscardManager handler to convey and replay discard operations, this is a case of uncoordinated discard, which is blocked/released by ram_block_discard_require(). Interestingly, this function had no use so far. Alternative approaches would be to block discard of shared pages, but this would cause guests to consume twice the memory if they use VFIO; or to implement a RamDiscardManager and only block uncoordinated discard, i.e. use ram_block_coordinated_discard_require(). [Commit message mostly by Michael Roth <michael.roth@amd.com>] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* RAMBlock: Add support of KVM private guest memfdXiaoyao Li2024-04-231-3/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | Add KVM guest_memfd support to RAMBlock so both normal hva based memory and kvm guest memfd based private memory can be associated in one RAMBlock. Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to create private guest_memfd during RAMBlock setup. Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd is more flexible and extensible than simply relying on the VM type because in the future we may have the case that not all the memory of a VM need guest memfd. As a benefit, it also avoid getting MachineState in memory subsystem. Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of confidential guests, such as TDX VM. How and when to set it for memory backends will be implemented in the following patches. Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has KVM guest_memfd allocated. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-ID: <20240320083945.991426-7-michael.roth@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* physmem: Factor cpu_physical_memory_dirty_bits_cleared() outNicholas Piggin2024-03-121-5/+3
| | | | | | | | | | | Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Thomas Huth <thuth@redhat.com> Message-ID: <20240219061731.232570-1-npiggin@gmail.com> [PMD: Split patch in 2: part 1/2] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240312201458.79532-3-philmd@linaro.org Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: Expose tlb_reset_dirty_range_all()Philippe Mathieu-Daudé2024-03-121-1/+1
| | | | | | | | | | In order to call tlb_reset_dirty_range_all() outside of system/physmem.c, expose its prototype. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20240312201458.79532-2-philmd@linaro.org Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: Fix wrong address in large address_space_read/write_cached_slow()Jonathan Cameron2024-03-111-6/+57
| | | | | | | | | | | | | | | | | | | | | | | If the access is bigger than the MemoryRegion supports, flatview_read/write_continue() will attempt to update the Memory Region. but the address passed to flatview_translate() is relative to the cache, not to the FlatView. On arm/virt with interleaved CXL memory emulation and virtio-blk-pci this lead to the first part of descriptor being read from the CXL memory and the second part from PA 0x8 which happens to be a blank region of a flash chip and all ffs on this particular configuration. Note this test requires the out of tree ARM support for CXL, but the problem is more general. Avoid this by adding new address_space_read_continue_cached() and address_space_write_continue_cached() which share all the logic with the flatview versions except for the MemoryRegion lookup which is unnecessary as the MemoryRegionCache only covers one MemoryRegion. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240307153710.30907-5-Jonathan.Cameron@huawei.com Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: Factor out body of flatview_read/write_continue() loopJonathan Cameron2024-03-111-70/+99
| | | | | | | | | | | | | This code will be reused for the address_space_cached accessors shortly. Also reduce scope of result variable now we aren't directly calling this in the loop. Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20240307153710.30907-4-Jonathan.Cameron@huawei.com Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: Reduce local variable scope in flatview_read/write_continue()Jonathan Cameron2024-03-111-20/+20
| | | | | | | | | | | Precursor to factoring out the inner loops for reuse. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Link: https://lore.kernel.org/r/20240307153710.30907-3-Jonathan.Cameron@huawei.com Signed-off-by: Peter Xu <peterx@redhat.com>
* physmem: Rename addr1 to more informative mr_addr in flatview_read/write() ↵Jonathan Cameron2024-03-111-25/+25
| | | | | | | | | | | | | | | | and similar The calls to flatview_read/write[_continue]() have parameters addr and addr1 but the names give no indication of what they are addresses of. Rename addr1 to mr_addr to reflect that it is the translated address offset within the MemoryRegion returned by flatview_translate(). Similarly rename the parameter in address_space_read/write_cached_slow() Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20240307153710.30907-2-Jonathan.Cameron@huawei.com Signed-off-by: Peter Xu <peterx@redhat.com>
* system/physmem: Do not include 'hw/xen/xen.h' but 'sysemu/xen.h'Philippe Mathieu-Daudé2024-03-091-1/+1
| | | | | | | | | | | physmem.c doesn't use any declaration from "hw/xen/xen.h", it only requires "sysemu/xen.h" and "system/xen-mapcache.h". Suggested-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20231114143816.71079-5-philmd@linaro.org>
* softmmu/physmem: Remove HOST_PAGE_ALIGNRichard Henderson2024-02-291-4/+11
| | | | | | | | | | Align allocation sizes to the maximum of host and target page sizes. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Acked-by: Helge Deller <deller@gmx.de> Message-Id: <20240102015808.132373-15-richard.henderson@linaro.org>
* softmmu/physmem: Remove qemu_host_page_sizeRichard Henderson2024-02-291-1/+1
| | | | | | | | | | Use qemu_real_host_page_size() instead. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Helge Deller <deller@gmx.de> Message-Id: <20240102015808.132373-14-richard.henderson@linaro.org>
* system/physmem: remove redundant arg reassignmentManos Pitsidianakis2024-02-201-5/+2
| | | | | | | | | | Arguments `ram_block` are reassigned to local declarations `block` without further use. Remove re-assignment to reduce noise. Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* physmem: replace function name with __func__ in ram_block_discard_range()Xiaoyao Li2024-02-161-21/+17
| | | | | | | | | | Use __func__ to avoid hard-coded function name. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20240125023328.2520888-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* target: Make qemu_target_page_mask() available for *-userIlya Leoshkevich2024-01-291-5/+0
| | | | | | | | | | | | | Currently qemu_target_page_mask() is usable only from the softmmu code. Make it possible to use it from the *-user code as well. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Message-ID: <20231208003754.3688038-2-iii@linux.ibm.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240124075609.14756-2-philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [rth: Split out change to accel/tcg/perf.c] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Replace "iothread lock" with "BQL" in commentsStefan Hajnoczi2024-01-081-3/+3
| | | | | | | | | | | | | | | The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL) in their names. Update the code comments to use "BQL" instead of "iothread lock". Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Message-id: 20240102153529.486531-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()Stefan Hajnoczi2024-01-081-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Big QEMU Lock (BQL) has many names and they are confusing. The actual QemuMutex variable is called qemu_global_mutex but it's commonly referred to as the BQL in discussions and some code comments. The locking APIs, however, are called qemu_mutex_lock_iothread() and qemu_mutex_unlock_iothread(). The "iothread" name is historic and comes from when the main thread was split into into KVM vcpu threads and the "iothread" (now called the main loop thread). I have contributed to the confusion myself by introducing a separate --object iothread, a separate concept unrelated to the BQL. The "iothread" name is no longer appropriate for the BQL. Rename the locking APIs to: - void bql_lock(void) - void bql_unlock(void) - bool bql_locked(void) There are more APIs with "iothread" in their names. Subsequent patches will rename them. There are also comments and documentation that will be updated in later patches. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Fabiano Rosas <farosas@suse.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Cédric Le Goater <clg@kaod.org> Acked-by: Peter Xu <peterx@redhat.com> Acked-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: Hyman Huang <yong.huang@smartx.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-id: 20240102153529.486531-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* coverity: physmem: use simple assertions instead of modellingVladimir Sementsov-Ogievskiy2023-11-241-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unfortunately Coverity doesn't follow the logic aroung "len" and "l" variables in stacks finishing with flatview_{read,write}_continue() and generate a lot of OVERRUN false-positives. When small buffer (2 or 4 bytes) is passed to mem read/write path, Coverity assumes the worst case of sz=8 in stn_he_p()/ldn_he_p() (defined in include/qemu/bswap.h), and reports buffer overrun. To silence these false-positives we have model functions, which hide real logic from Coverity. However, it turned out that these new two assertions are enough to quiet Coverity. Assertions are better than hiding the logic, so let's drop the modelling and move to assertions for memory r/w call stacks. After patch, the sequence cov-make-library --output-file /tmp/master.xmldb \ scripts/coverity-scan/model.c cov-build --dir ~/covtmp/master make -j9 cov-analyze --user-model-file /tmp/master.xmldb \ --dir ~/covtmp/master --all --strip-path "$(pwd) cov-format-errors --dir ~/covtmp/master \ --html-output ~/covtmp/master_html_report Generate for me the same big set of CIDs excepept for 6 disappeared (so it becomes even better). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Acked-by: David Hildenbrand <david@redhat.com> Message-ID: <20231005140326.332830-1-vsementsov@yandex-team.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* softmmu/physmem: Fixup qemu_ram_block_from_host() documentationDavid Hildenbrand2023-10-121-17/+0
| | | | | | | | | | | | | | Let's fixup the documentation (e.g., removing traces of the ram_addr parameter that no longer exists) and move it to the header file while at it. Message-ID: <20230926185738.277351-4-david@redhat.com> Suggested-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
* system: Rename softmmu/ directory as system/Philippe Mathieu-Daudé2023-10-081-0/+3796
The softmmu/ directory contains files specific to system emulation. Rename it as system/. Update meson rules, the MAINTAINERS file and all the documentation and comments. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20231004090629.37473-14-philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>