summary refs log tree commit diff stats
path: root/include/qemu/osdep.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* include: Move qemu_mprotect_*() to new qemu/mprotect.hPeter Maydell2022-02-211-4/+0
| | | | | | | | | | The qemu_mprotect_*() family of functions are used in very few files; move them from osdep.h to a new qemu/mprotect.h. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-3-peter.maydell@linaro.org
* include: Move qemu_madvise() and related #defines to new qemu/madvise.hPeter Maydell2022-02-211-82/+0
| | | | | | | | | | | The function qemu_madvise() and the QEMU_MADV_* constants associated with it are used in only 10 files. Move them out of osdep.h to a new qemu/madvise.h header that is included where it is needed. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
* 9pfs: Fix segfault in do_readdir_many caused by struct dirent overreadVitaly Chikunov2022-02-171-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | `struct dirent' returned from readdir(3) could be shorter (or longer) than `sizeof(struct dirent)', thus memcpy of sizeof length will overread into unallocated page causing SIGSEGV. Example stack trace: #0 0x00005555559ebeed v9fs_co_readdir_many (/usr/bin/qemu-system-x86_64 + 0x497eed) #1 0x00005555559ec2e9 v9fs_readdir (/usr/bin/qemu-system-x86_64 + 0x4982e9) #2 0x0000555555eb7983 coroutine_trampoline (/usr/bin/qemu-system-x86_64 + 0x963983) #3 0x00007ffff73e0be0 n/a (n/a + 0x0) While fixing this, provide a helper for any future `struct dirent' cloning. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/841 Cc: qemu-stable@nongnu.org Co-authored-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Greg Kurz <groug@kaod.org> Tested-by: Vitaly Chikunov <vt@altlinux.org> Message-Id: <20220216181821.3481527-1-vt@altlinux.org> [C.S. - Fix typo in source comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
* util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()David Hildenbrand2022-01-071-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's sense support and use it for preallocation. MADV_POPULATE_WRITE does not require a SIGBUS handler, doesn't actually touch page content, and avoids context switches; it is, therefore, faster and easier to handle than our current approach. While MADV_POPULATE_WRITE is, in general, faster than manual prefaulting, and especially faster with 4k pages, there is still value in prefaulting using multiple threads to speed up preallocation. More details on MADV_POPULATE_WRITE can be found in the Linux commits 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1]. This resolves the TODO in do_touch_pages(). In the future, we might want to look into using fallocate(), eventually combined with MADV_POPULATE_READ, when dealing with shared file/fd mappings and not caring about memory bindings. [1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20211217134611.31172-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* osdep: fix HAVE_BROKEN_SIZE_MAX casePaolo Bonzini2021-07-091-1/+1
| | | | | | | | | | | | | While config-host.mak entries are expanded to "1" for compatibility with create-config.sh, tests done directly in meson.build expand to the empty string and cannot be placed to the right of the && operator. Adjust osdep.h after commit e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX to meson", 2021-07-06) changed the way HAVE_BROKEN_SIZE_MAX is defined. Reported-by: Frederic Bezies <fredbezies@gmail.com> Fixes: e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX to meson", 2021-07-06) Resolves: #463 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* introduce QEMU_AUTO_VFREEVladimir Sementsov-Ogievskiy2021-06-291-0/+15
| | | | | | | | | Introduce a convenient macro, that works for qemu_memalign() like g_autofree works with g_malloc. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210628121133.193984-2-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* osdep: provide ROUND_DOWN macroPaolo Bonzini2021-06-251-6/+22
| | | | | | | | | | osdep.h provides a ROUND_UP macro to hide bitwise operations for the purpose of rounding a number up to a power of two; add a ROUND_DOWN macro that does the same with truncation towards zero. While at it, change the formatting of some comments. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under LinuxDavid Hildenbrand2021-06-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no effect on most shared mappings - except for hugetlbfs and anonymous memory. Linux man page: "MAP_NORESERVE: Do not reserve swap space for this mapping. When swap space is reserved, one has the guarantee that it is possible to modify the mapping. When swap space is not reserved one might get SIGSEGV upon a write if no physical memory is available. See also the discussion of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before 2.6, this flag had effect only for private writable mappings." Note that the "guarantee" part is wrong with memory overcommit in Linux. Also, in Linux hugetlbfs is treated differently - we configure reservation of huge pages from the pool, not reservation of swap space (huge pages cannot be swapped). The rough behavior is [1]: a) !Hugetlbfs: 1) Without MAP_NORESERVE *or* with memory overcommit under Linux disabled ("/proc/sys/vm/overcommit_memory == 2"), the following accounting/reservation happens: For a file backed map SHARED or READ-only - 0 cost (the file is the map not swap) PRIVATE WRITABLE - size of mapping per instance For an anonymous or /dev/zero map SHARED - size of mapping PRIVATE READ-only - 0 cost (but of little use) PRIVATE WRITABLE - size of mapping per instance 2) With MAP_NORESERVE, no accounting/reservation happens. b) Hugetlbfs: 1) Without MAP_NORESERVE, huge pages are reserved. 2) With MAP_NORESERVE, no huge pages are reserved. Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able to configure it for !hugetlbfs globally; this toggle now allows configuring it more fine-grained, not for the whole system. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. [1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-10-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()David Hildenbrand2021-06-151-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The new flag has the following semantics: " RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge pages if applicable) is skipped: will bail out if not supported. When not set, the OS will do the reservation, if supported for the memory type. " Allow passing it into: - memory_region_init_ram_nomigrate() - memory_region_init_resizeable_ram() - memory_region_init_ram_from_file() ... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag. Bail out if the flag is not supported, which is the case right now for both, POSIX and win32. We will add Linux support next and allow specifying RAM_NORESERVE via memory backends. The target use case is virtio-mem, which dynamically exposes memory inside a large, sparse memory area to the VM. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-9-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap()David Hildenbrand2021-06-151-0/+18
| | | | | | | | | | | | | | | | | | | | | | | | Let's pass flags instead of bools to prepare for passing other flags and update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_ flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify it. We expose only flags that are currently supported by qemu_ram_mmap(). Maybe, we'll see qemu_mmap() in the future as well that can implement these flags. Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only defined for some systems and we want to always be able to identify these flags reliably inside qemu_ram_mmap() -- for example, to properly warn when some future flags are not available or effective on a system. Also, this way we can simplify PROT_ handling as well. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210510114328.21835-8-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* softmmu/physmem: Fix ram_block_discard_range() to handle shared anonymous memoryDavid Hildenbrand2021-06-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | We can create shared anonymous memory via "-object memory-backend-ram,share=on,..." which is, for example, required by PVRDMA for mremap() to work. Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all relevant page table entries of the current process, the backend storage will not get removed, resulting in no reduced memory consumption and a repopulation of previous content on next access. Shared anonymous memory is internally really just shmem, but without a fd exposed. As we cannot use fallocate() without the fd to discard the backing storage, MADV_REMOVE gets the same job done without a fd as documented in "man 2 madvise". Removing backing storage implicitly invalidates all page table entries with relevant mappings - an additional MADV_DONTNEED is not required. Fixes: 06329ccecfa0 ("mem: add share parameter to memory-backend-ram") Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20210406080126.24010-3-david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* util/osdep: Add qemu_mprotect_rwRichard Henderson2021-06-131-0/+1
| | | | | | | | | For --enable-tcg-interpreter on Windows, we will need this. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* osdep: Make os-win32.h and os-posix.h handle 'extern "C"' themselvesPeter Maydell2021-05-101-4/+4
| | | | | | | | | | | | | Both os-win32.h and os-posix.h include system header files. Instead of having osdep.h include them inside its 'extern "C"' block, make these headers handle that themselves, so that we don't include the system headers inside 'extern "C"'. This doesn't fix any current problems, but it's conceptually the right way to handle system headers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* include/qemu/osdep.h: Move system includes to topPeter Maydell2021-04-171-7/+13
| | | | | | | | | | | | | | | | Mostly osdep.h puts the system includes at the top of the file; but there are a couple of exceptions where we include a system header halfway through the file. Move these up to the top with the rest so that all the system headers we include are included before we include os-win32.h or os-posix.h. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210416135543.20382-4-peter.maydell@linaro.org Message-id: 20210414184343.26235-1-peter.maydell@linaro.org
* osdep: protect qemu/osdep.h with extern "C"Paolo Bonzini2021-04-171-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | System headers may include templates if compiled with a C++ compiler, which cause the compiler to complain if qemu/osdep.h is included within a C++ source file's 'extern "C"' block. Add an 'extern "C"' block directly to qemu/osdep.h, so that system headers can be kept out of it. There is a stray declaration early in qemu/osdep.h, which needs to be special cased. Add a definition in qemu/compiler.h to make it look nice. config-host.h, CONFIG_TARGET, exec/poison.h and qemu/compiler.h are included outside the 'extern "C"' block; that is not an issue because they consist entirely of preprocessor directives. This allows us to move the include of osdep.h in our two C++ source files outside the extern "C" block they were previously using for it, which in turn means that they compile successfully against newer versions of glib which insist that glib.h is *not* inside an extern "C" block. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210416135543.20382-3-peter.maydell@linaro.org [PMM: Moved disas/arm-a64.cc osdep.h include out of its extern "C" block; explained in commit message why we're doing this] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* osdep: include glib-compat.h before other QEMU headersPaolo Bonzini2021-04-171-1/+7
| | | | | | | | | | | | | | | | | glib-compat.h is sort of like a system header, and it needs to include system headers (glib.h) that may dislike being included under 'extern "C"'. Move it right after all system headers and before all other QEMU headers. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20210416135543.20382-2-peter.maydell@linaro.org [PMM: Added comment about why glib-compat.h is special] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* osdep: build with non-working system() functionJoelle van Dyne2021-01-291-0/+12
| | | | | | | | | | Build without error on hosts without a working system(). If system() is called, return -1 with ENOSYS. Signed-off-by: Joelle van Dyne <j@getutm.app> Message-id: 20210126012457.39046-6-j@getutm.app Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tcg: Toggle page execution for Apple SiliconRoman Bolshakov2021-01-231-0/+28
| | | | | | | | | | | | | | | Pages can't be both write and executable at the same time on Apple Silicon. macOS provides public API to switch write protection [1] for JIT applications, like TCG. 1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon Tested-by: Alexander Graf <agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com> [rth: Inline the qemu_thread_jit_* functions; drop the MAP_JIT change for a follow-on patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* osdep.h: Remove <sys/signal.h> includeMichael Forney2021-01-201-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to 2a4b472c3c, sys/signal.h was only included on OpenBSD (apart from two .c files). The POSIX standard location for this header is just <signal.h> and in fact, OpenBSD's signal.h includes sys/signal.h itself. Unconditionally including <sys/signal.h> on musl causes warnings for just about every source file: /usr/include/sys/signal.h:1:2: warning: #warning redirecting incorrect #include <sys/signal.h> to <signal.h> [-Wcpp] 1 | #warning redirecting incorrect #include <sys/signal.h> to <signal.h> | ^~~~~~~ Since there don't seem to be any platforms which require including <sys/signal.h> in addition to <signal.h>, and some platforms like Haiku lack it completely, just remove it. Tested building on OpenBSD after removing this include. Signed-off-by: Michael Forney <mforney@mforney.org> Tested-by: John Snow <jsnow@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210113215600.16100-1-mforney@mforney.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
* oslib: do not call g_strdup from qemu_get_exec_dirPaolo Bonzini2020-09-301-6/+2
| | | | | | | | Just return the directory without requiring the caller to free it. This also removes a bogus check for NULL in os_find_datadir and module_load_one; g_strdup of a static variable cannot return NULL. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge remote-tracking branch ↵Peter Maydell2020-09-171-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/huth-gitlab/tags/pull-request-2020-09-16' into staging * Fix "readlink -f" problem in iotests on macOS (to fix the Cirrus-CI tests) * Some minor qtest improvements * Fix the unit tests to work on MSYS2, too * Enable building and testing on MSYS2 in the Cirrus-CI * Build FreeBSD with one task again in the Cirrus-CI # gpg: Signature made Wed 16 Sep 2020 12:24:29 BST # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full] # gpg: aka "Thomas Huth <thuth@redhat.com>" [full] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * remotes/huth-gitlab/tags/pull-request-2020-09-16: (24 commits) cirrus: Building freebsd in a single shot ci: Enable msys2 ci in cirrus tests: Fixes test-qdev-global-props.c tests: fix test-util-sockets.c tests: Fixes test-io-channel-file by mask only owner file state mask bits tests: fixes aio-win32 about aio_remove_fd_handler, get it consistence with aio-posix.c tests: Fixes test-io-channel-socket.c tests under msys2/mingw vmstate: Fixes test-vmstate.c on msys2/mingw meson: remove empty else and duplicated gio deps meson: Use -b to ignore CR vs. CR-LF issues on Windows osdep: file locking functions are not available on Win32 tests: test-replication disable /replication/secondary/* on msys2/mingw. tests: Fixes test-replication.c on msys2/mingw. meson: disable crypto tests are empty under win32 meson: Disable test-char on msys2/mingw for fixing tests stuck rcu: fixes test-logging.c by call drain_call_rcu before rmdir_full tests: Convert g_free to g_autofree macro in test-logging.c rcu: Implement drain_call_rcu qga/commands-win32: Fix problem with redundant protype declaration Simplify the .gitignore file ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * osdep: file locking functions are not available on Win32Yonggang Luo2020-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Do not declare the following locking functions on Win32: int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive); int qemu_unlock_fd(int fd, int64_t start, int64_t len); int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive); bool qemu_has_ofd_lock(void); Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200915121318.247-10-luoyonggang@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* | util: introduce qemu_open and qemu_create with error reportingDaniel P. Berrangé2020-09-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qemu_open_old() works like open(): set errno and return -1 on failure. It has even more failure modes, though. Reporting the error clearly to users is basically impossible for many of them. Our standard cure for "errno is too coarse" is the Error object. Introduce two new helper methods: int qemu_open(const char *name, int flags, Error **errp); int qemu_create(const char *name, int flags, mode_t mode, Error **errp); Note that with this design we no longer require or even accept the O_CREAT flag. Avoiding overloading the two distinct operations means we can avoid variable arguments which would prevent 'errp' from being the last argument. It also gives us a guarantee that the 'mode' is given when creating files, avoiding a latent security bug. Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
* | util: rename qemu_open() to qemu_open_old()Daniel P. Berrangé2020-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | We want to introduce a new version of qemu_open() that uses an Error object for reporting problems and make this it the preferred interface. Rename the existing method to release the namespace for the new impl. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
* | monitor: simplify functions for getting a dup'd fdset entryDaniel P. Berrangé2020-09-161-0/+1
|/ | | | | | | | | | | Currently code has to call monitor_fdset_get_fd, then dup the return fd, and then add the duplicate FD back into the fdset. This dance is overly verbose for the caller and introduces extra failure modes which can be avoided by folding all the logic into monitor_fdset_dup_fd_add and removing monitor_fdset_get_fd entirely. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
* meson: infrastructure for building emulatorsPaolo Bonzini2020-08-211-1/+1
| | | | | Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* linux-user: don't use MAP_FIXED in pgd_find_hole_fallbackAlex Bennée2020-07-271-0/+3
| | | | | | | | | | | | | Plain MAP_FIXED has the undesirable behaviour of splatting exiting maps so we don't actually achieve what we want when looking for gaps. We should be using MAP_FIXED_NOREPLACE. As this isn't always available we need to potentially check the returned address to see if the kernel gave us what we asked for. Fixes: ad592e37dfc ("linux-user: provide fallback pgd_find_hole for bare chroots") Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20200724064509.331-9-alex.bennee@linaro.org>
* util: add qemu_get_host_physmem utility functionAlex Bennée2020-07-271-0/+12
| | | | | | | | | | | | | | | | This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES isn't standardised but at least appears in the man pages for Open/FreeBSD. The result is advisory so any users of it shouldn't just fail if we can't work it out. The win32 stub currently returns 0 until someone with a Windows system can develop and test a patch. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com> Message-Id: <20200724064509.331-5-alex.bennee@linaro.org>
* qemu/osdep: Reword qemu_get_exec_dir() documentationPhilippe Mathieu-Daudé2020-07-211-1/+4
| | | | | | | | | | This comment is confuse, reword it a bit. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Michael Rolnik <mrolnik@gmail.com> Tested-by: Michael Rolnik <mrolnik@gmail.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200714164257.23330-3-f4bug@amsat.org>
* util: Introduce qemu_get_host_name()Michal Privoznik2020-07-131-0/+10
| | | | | | | | | | | This function offers operating system agnostic way to fetch host name. It is implemented for both POSIX-like and Windows systems. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* osdep.h: For Haiku, define SIGIO as equivalent to SIGPOLLDavid CARLIER2020-07-131-0/+4
| | | | | | | | | | | | | | Haiku doesn't provide SIGIO; fix this up in osdep.h by defining it as equal to SIGPOLL. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-6-peter.maydell@linaro.org [PMM: Expanded commit message] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* osdep.h: Always include <sys/signal.h> if it existsDavid CARLIER2020-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | Regularize our handling of <sys/signal.h>: currently we include it in osdep.h, but only for OpenBSD, and we include it without an ifdef guard in a couple of C files. This causes problems for Haiku, which doesn't have that header. Instead, check in configure whether sys/signal.h exists, and if it does then always include it from osdep.h. Signed-off-by: David Carlier <devnexen@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200703145614.16684-5-peter.maydell@linaro.org [PMM: Expanded commit message; rename to HAVE_SYS_SIGNAL_H] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* coverity: provide Coverity-friendly MIN_CONST and MAX_CONSTEric Blake2020-07-101-7/+14
| | | | | | | | | | | | | | | | | | | | | | Coverity has problems seeing through __builtin_choose_expr, which result in it abandoning analysis of later functions that utilize a definition that used MIN_CONST or MAX_CONST, such as in qemu-file.c: 50 DECLARE_BITMAP(may_free, MAX_IOV_SIZE); CID 1429992 (#1 of 1): Unrecoverable parse warning (PARSE_ERROR)1. expr_not_constant: expression must have a constant value As has been done in the past (see 07d66672), it's okay to dumb things down when compiling for static analyzers. (Of course, now the syntax-checker has a false positive on our reference to __COVERITY__...) Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: CID 1429992, CID 1429995, CID 1429997, CID 1429999 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200629162804.1096180-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* osdep: Make MIN/MAX evaluate arguments only onceEric Blake2020-06-261-10/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not aware of any immediate bugs in qemu where a second runtime evaluation of the arguments to MIN() or MAX() causes a problem, but proactively preventing such abuse is easier than falling prey to an unintended case down the road. At any rate, here's the conversation that sparked the current patch: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html Update the MIN/MAX macros to only evaluate their argument once at runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are promoting the temporaries to the same type as the final comparison (we have to trigger type promotion, as typeof(bitfield) won't compile; and we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our uses of MAX are on void* pointers where such addition is undefined). However, we are unable to work around gcc refusing to compile ({}) in a constant context (such as the array length of a static variable), even when only used in the dead branch of a __builtin_choose_expr(), so we have to provide a second macro pair MIN_CONST and MAX_CONST for use when both arguments are known to be compile-time constants and where the result must also be usable as a constant; this second form evaluates arguments multiple times but that doesn't matter for constants. By using a void expression as the expansion if a non-constant is presented to this second form, we can enlist the compiler to ensure the double evaluation is not attempted on non-constants. Alas, as both macros now rely on compiler intrinsics, they are no longer usable in preprocessor #if conditions; those will just have to be open-coded or the logic rewritten into #define or runtime 'if' conditions (but where the compiler dead-code-elimination will probably still apply). I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all forms of macro mis-use. As the errors can sometimes be cryptic, I'm demonstrating the gcc output: Use of MIN when MIN_CONST is needed: In file included from /home/eblake/qemu/qemu-img.c:25: /home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function 249 | ({ \ | ^ /home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’ 92 | char array[MIN(1, 2)] = ""; | ^~~ Use of MIN_CONST when MIN is needed: /home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’: /home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be 1225 | i = MIN_CONST(i, n); | ^ Use of MIN in the preprocessor: In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20: /home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’: /home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions 249 | ({ \ | ^ Fix the resulting callsites that used #if or computed a compile-time constant min or max to use the new macros. cpu-defs.h is interesting, as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes dynamic. It may be worth improving glib's MIN/MAX definitions to be saner, but that is a task for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200625162602.700741-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* chardev: Add macOS to list of OSes that support -chardev serialMikhail Gusarov2020-05-041-1/+1
| | | | | | | | | | macOS API for dealing with serial ports/ttys is identical to BSDs. Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20200426210956.17324-1-dottedmag@dottedmag.net> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* osdep.h: Drop no-longer-needed Coverity workaroundsPeter Maydell2020-04-141-14/+0
| | | | | | | | | | | | In commit a1a98357e3fd in 2018 we added some workarounds for Coverity not being able to handle the _Float* types introduced by recent glibc. Newer versions of the Coverity scan tools have support for these types, and will fail with errors about duplicate typedefs if we have our workaround. Remove our copy of the typedefs. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200319193323.2038-2-peter.maydell@linaro.org
* osdep: add qemu_unlink()Marc-André Lureau2020-01-021-0/+1
| | | | | | | | | Add a helper function to match qemu_open() which may return files under the /dev/fdset prefix. Those shouldn't be removed, since it's only a qemu namespace. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
* core: replace getpagesize() with qemu_real_host_page_sizeWei Yang2019-10-261-2/+2
| | | | | | | | | | | | | | | | | | | | | There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* memory: fetch pmem size in get_file_size()Stefan Hajnoczi2019-09-161-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Neither stat(2) nor lseek(2) report the size of Linux devdax pmem character device nodes. Commit 314aec4a6e06844937f1677f6cba21981005f389 ("hostmem-file: reject invalid pmem file sizes") added code to hostmem-file.c to fetch the size from sysfs and compare against the user-provided size=NUM parameter: if (backend->size > size) { error_setg(errp, "size property %" PRIu64 " is larger than " "pmem file \"%s\" size %" PRIu64, backend->size, fb->mem_path, size); return; } It turns out that exec.c:qemu_ram_alloc_from_fd() already has an equivalent size check but it skips devdax pmem character devices because lseek(2) returns 0: if (file_size > 0 && file_size < size) { error_setg(errp, "backing store %s size 0x%" PRIx64 " does not match 'size' option 0x" RAM_ADDR_FMT, mem_path, file_size, size); return NULL; } This patch moves the devdax pmem file size code into get_file_size() so that we check the memory size in a single place: qemu_ram_alloc_from_fd(). This simplifies the code and makes it more general. This also fixes the problem that hostmem-file only checks the devdax pmem file size when the pmem=on parameter is given. An unchecked size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch the file size for Linux devdax pmem character device nodes. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190830093056.12572-1-stefanha@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* osdep: Fix mingw compilation regarding stdio formatsCao Jiaxi2019-05-071-5/+5
| | | | | | | | | | | | | | | | | | | | I encountered the following compilation error on mingw: /mnt/d/qemu/include/qemu/osdep.h:97:9: error: '__USE_MINGW_ANSI_STDIO' macro redefined [-Werror,-Wmacro-redefined] #define __USE_MINGW_ANSI_STDIO 1 ^ /mnt/d/llvm-mingw/aarch64-w64-mingw32/include/_mingw.h:433:9: note: previous definition is here #define __USE_MINGW_ANSI_STDIO 0 /* was not defined so it should be 0 */ It turns out that __USE_MINGW_ANSI_STDIO must be set before any system headers are included, not just before stdio.h. Signed-off-by: Cao Jiaxi <driver1998@foxmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Message-id: 20190503003719.10233-1-driver1998@foxmail.com Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* hostmem-file: reject invalid pmem file sizesStefan Hajnoczi2019-03-111-0/+13
| | | | | | | | | | | | | | | | | | | | | Guests started with NVDIMMs larger than the underlying host file produce confusing errors inside the guest. This happens because the guest accesses pages beyond the end of the file. Check the pmem file size on startup and print a clear error message if the size is invalid. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053 Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Zhang Yi <yi.z.zhang@linux.intel.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190214031004.32522-3-stefanha@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* qemu-io: Add generic function for reinitializing optind.Richard W.M. Jones2019-01-311-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On FreeBSD 11.2: $ nbdkit memory size=1M --run './qemu-io -f raw -c "aio_write 0 512" $nbd' Parsing error: non-numeric argument, or extraneous/unrecognized suffix -- aio_write After main option parsing, we reinitialize optind so we can parse each command. However reinitializing optind to 0 does not work on FreeBSD. What happens when you do this is optind remains 0 after the option parsing loop, and the result is we try to parse argv[optind] == argv[0] == "aio_write" as if it was the first parameter. The FreeBSD manual page says: In order to use getopt() to evaluate multiple sets of arguments, or to evaluate a single set of arguments multiple times, the variable optreset must be set to 1 before the second and each additional set of calls to getopt(), and the variable optind must be reinitialized. (From the rest of the man page it is clear that optind must be reinitialized to 1). The glibc man page says: A program that scans multiple argument vectors, or rescans the same vector more than once, and wants to make use of GNU extensions such as '+' and '-' at the start of optstring, or changes the value of POSIXLY_CORRECT between scans, must reinitialize getopt() by resetting optind to 0, rather than the traditional value of 1. (Resetting to 0 forces the invocation of an internal initialization routine that rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.) This commit introduces an OS-portability function called qemu_reset_optind which provides a way of resetting optind that works on FreeBSD and platforms that use optreset, while keeping it the same as now on other platforms. Note that the qemu codebase sets optind in many other places, but in those other places it's setting a local variable and not using getopt. This change is only needed in places where we are using getopt and the associated global variable optind. Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Message-id: 20190118101114.11759-2-rjones@redhat.com Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
* build-sys: build with Vista API by defaultMarc-André Lureau2019-01-111-2/+2
| | | | | | | | | | | | | Both qemu & qga build with Vista API by default already, by defining _WIN32_WINNT 0x0600. Set it globally in osdep.h instead. This replaces WINVER by _WIN32_WINNT in osdep.h. WINVER doesn't seem to be really useful these days. (see also https://blogs.msdn.microsoft.com/oldnewthing/20070411-00/?p=27283) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181122110039.15972-4-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* build-sys: move windows defines in osdep.h headerMarc-André Lureau2019-01-111-0/+17
| | | | | | | | | | | | This removes some clutter in compilation logging, and allows some easier tweaking per compilation unit/CFLAGS overriding. Note that we can't move those define in os-win32.h, since they must be set before the first system headers are included. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181122110039.15972-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* osdep: Work around MinGW assertRichard Henderson2018-10-231-0/+12
| | | | | | | | | | | | | | | | | | | | In several places we use assert(FEATURE), and assume that if FEATURE is disabled, all following code is removed as unreachable. Which allows us to compile-out functions that are only present with FEATURE, and have a link-time failure if the functions remain used. MinGW does not mark its internal function _assert() as noreturn, so the compiler cannot see when code is unreachable, which leads to link errors for this host that are not present elsewhere. The current build-time failure concerns 62823083b8a2, but I remember having seen this same error before. Fix it once and for all for MinGW. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20181022181623.8810-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* util: add qemu_write_pidfile()Marc-André Lureau2018-10-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are variants of qemu_create_pidfile() in qemu-pr-helper and qemu-ga. Let's have a common implementation in libqemuutil. The code is initially based from pr-helper write_pidfile(), with various improvements and suggestions from Daniel Berrangé: QEMU will leave the pidfile existing on disk when it exits which initially made me think it avoids the deletion race. The app managing QEMU, however, may well delete the pidfile after it has seen QEMU exit, and even if the app locks the pidfile before deleting it, there is still a race. eg consider the following sequence QEMU 1 libvirtd QEMU 2 1. lock(pidfile) 2. exit() 3. open(pidfile) 4. lock(pidfile) 5. open(pidfile) 6. unlink(pidfile) 7. close(pidfile) 8. lock(pidfile) IOW, at step 8 the new QEMU has successfully acquired the lock, but the pidfile no longer exists on disk because it was deleted after the original QEMU exited. While we could just say no external app should ever delete the pidfile, I don't think that is satisfactory as people don't read docs, and admins don't like stale pidfiles being left around on disk. To make this robust, I think we might want to copy libvirt's approach to pidfile acquisition which runs in a loop and checks that the file on disk /after/ acquiring the lock matches the file that was locked. Then we could in fact safely let QEMU delete its own pidfiles on clean exit.. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180831145314.14736-2-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cacheinfo: add i/d cache_linesize_logEmilio G. Cota2018-10-021-0/+2
| | | | | | Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20180910232752.31565-2-cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* osdep: work around Coverity parsing errorsPaolo Bonzini2018-06-281-0/+15
| | | | | | | | | | Coverity does not like the new _Float* types that are used by recent glibc, and croaks on every single file that includes stdlib.h. Add dummy typedefs to please it. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* osdep: powerpc64 align memory to allow 2MB radix THP page tablesNicholas Piggin2018-06-121-1/+2
| | | | | | | | | This allows KVM with the Book3S radix MMU mode to take advantage of THP and install larger pages in the partition scope page tables (the host translation). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* osdep: add wait.h compat macrosMichael S. Tsirkin2018-05-241-0/+10
| | | | | | | | | | | | | | | Man page for WCOREDUMP says: WCOREDUMP(wstatus) returns true if the child produced a core dump. This macro should be employed only if WIFSIGNALED returned true. This macro is not specified in POSIX.1-2001 and is not available on some UNIX implementations (e.g., AIX, SunOS). Therefore, enclose its use inside #ifdef WCOREDUMP ... #endif. Let's do exactly this. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>