focaccia-qemu

	Commit message	Author	Files	Lines
2020-10-09	nbd/server: Reject embedded NUL in NBD strings	Eric Blake	1	-10/+20
	The NBD spec is clear that any string sent from the client must not contain embedded NUL characters. If the client passes "a\0", we should reject that option request rather than act on "a". Testing this is not possible with a compliant client, but I was able to use gdb to coerce libnbd into temporarily behaving as such a client. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
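The check described above can be illustrated with a minimal Python sketch. It is not the code from nbd/server.c; the function name `parse_nbd_string` is hypothetical, and decoding the payload as UTF-8 is an assumption of this sketch. The point is only that a length-prefixed string carrying a NUL byte is rejected instead of being silently truncated to "a".

```python
def parse_nbd_string(raw: bytes) -> str:
    """Decode a length-prefixed string payload, rejecting embedded NULs.

    Hypothetical helper: a sketch of the rule, not QEMU's implementation.
    """
    if b"\x00" in raw:
        # Acting on the prefix before the NUL would mean acting on a
        # different string than the one the client actually sent.
        raise ValueError("NBD string contains an embedded NUL")
    return raw.decode("utf-8")
```

With this helper, `parse_nbd_string(b"a")` yields `"a"`, while `parse_nbd_string(b"a\x00")` raises instead of returning `"a"`.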
2020-10-09	qemu-nbd: Honor SIGINT and SIGHUP	Eric Blake	1	-7/+8
	Honoring just SIGTERM on Linux is too weak; we also want to handle other common signals, and do so even on BSD. Why? Because at least 'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on bitmaps when the server is shut down via a signal. See also: http://bugzilla.redhat.com/1883608 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: apply comment tweak suggested by Vladimir; fix ifdef around termsig_handler] Signed-off-by: Eric Blake <eblake@redhat.com>
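As a rough illustration of the idea (not qemu-nbd's C code), the Python sketch below installs one shutdown handler for SIGTERM, SIGINT and SIGHUP, skipping any signal the platform does not define. The names `request_shutdown` and `install_shutdown_handlers` are hypothetical; a real server would check the flag in its main loop and run its cleanup (such as clearing an in-use bit) before exiting.

```python
import signal

shutdown_requested = False  # polled by the (not shown) main loop


def request_shutdown(signum, frame):
    """Signal handler: only record the request, clean up later."""
    global shutdown_requested
    shutdown_requested = True


def install_shutdown_handlers(handler=request_shutdown):
    """Install `handler` for every common termination signal available."""
    installed = []
    for name in ("SIGTERM", "SIGINT", "SIGHUP"):
        signum = getattr(signal, name, None)  # e.g. SIGHUP is absent on Windows
        if signum is not None:
            signal.signal(signum, handler)
            installed.append(name)
    return installed
```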
2020-10-09	block/nbd: nbd_co_reconnect_loop(): don't connect if drained	Vladimir Sementsov-Ogievskiy	1	-0/+3
	In a recent commit 12c75e20a269ac we've improved nbd_co_reconnect_loop() to not make drain wait for additional sleep. Similarly, we shouldn't try to connect, if previous sleep was interrupted by drain begin, otherwise drain_begin will have to wait for the whole connection attempt. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09	block/nbd: fix reconnect-delay	Vladimir Sementsov-Ogievskiy	1	-9/+50
	reconnect-delay has a design flaw: we handle it in the same loop where we do the connection attempt. So, reconnect-delay may be exceeded by the unpredictable duration of a connection attempt. Let's instead use a separate timer. How to reproduce the bug: 1. Create an image on node1: qemu-img create -f qcow2 xx 100M 2. Start NBD server on node1: qemu-nbd xx 3. On node2 start qemu-io: ./build/qemu-io --image-opts \ driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15 4. Type 'read 0 512' in the qemu-io interface to check that the connection works. Be careful: you should perform steps 5-7 in a short time, less than 15 seconds. 5. Kill the nbd server on node1 6. Run 'read 0 512' in the qemu-io interface again, to be sure that the nbd client goes into the reconnect loop. 7. On node1 run the following command: sudo iptables -A INPUT -p tcp --dport 10809 -j DROP This will make the connect() call of qemu-io at node2 take a long time. And you'll see that the read command in qemu-io hangs for a long time, more than the 15 seconds specified by the reconnect-delay parameter. That is the bug. 8. Don't forget to drop the iptables rule on node1: sudo iptables -D INPUT -p tcp --dport 10809 -j DROP Important note: Step [5] is necessary to reproduce _this_ bug. If we skip step [5], the read command (step 6) will hang for a long time and this commit doesn't help, because the hang then comes not from a long connect() to an unreachable host but from a long sendmsg() to an unreachable host, which should be fixed by enabling and adjusting keep-alive on the socket; that is a matter for a further patch set. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
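The design flaw can be sketched with a small virtual-time model in Python. This is only an illustration under simplifying assumptions: every connection attempt fails and takes a fixed `attempt_duration`, and the function name `requests_fail_at` is hypothetical. It compares checking the delay between attempts (the old loop) with a separate timer that fires regardless of an attempt in flight.

```python
def requests_fail_at(reconnect_delay, attempt_duration, separate_timer):
    """Return the (virtual) time at which waiting requests are failed.

    Sketch only: all attempts fail and each one blocks for
    `attempt_duration` seconds.
    """
    now = 0.0
    while True:
        attempt_end = now + attempt_duration
        if separate_timer and reconnect_delay <= attempt_end:
            # The timer fires while the attempt is still blocked in connect().
            return reconnect_delay
        now = attempt_end  # the blocking attempt finally returns
        if now >= reconnect_delay:
            # Old behaviour: the delay is only noticed between attempts.
            return now
```

For reconnect-delay=15 and a connect() that blocks for 120 seconds, the loop-only variant gives up at 120 seconds, while the timer variant gives up at 15.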
2020-10-09	block/nbd: correctly use qio_channel_detach_aio_context when needed	Vladimir Sementsov-Ogievskiy	1	-2/+2
	Don't use the nbd_client_detach_aio_context() driver handler where we want to finalize the connection. We should directly use qio_channel_detach_aio_context() in such cases. The driver handler may (and will) contain other things unrelated to the qio channel. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09	block/nbd: fix drain dead-lock because of nbd reconnect-delay	Vladimir Sementsov-Ogievskiy	1	-0/+5
	We pause the reconnect process during a drained section. So, if we have some requests waiting for reconnect, we should cancel them, otherwise they deadlock the drained section. How to reproduce: 1. Create an image: qemu-img create -f qcow2 xx 100M 2. Start NBD server: qemu-nbd xx 3. Start vm with second nbd disk on node2, like this: ./build/x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \ file=/work/images/cent7.qcow2 -drive \ driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=60 \ -vnc :0 -m 2G -enable-kvm -vga std 4. Access the vm through vnc (or some other way?), and check that the NBD drive works: dd if=/dev/sdb of=/dev/null bs=1M count=10 - the command should succeed. 5. Now, kill the nbd server, and run dd in the guest again: dd if=/dev/sdb of=/dev/null bs=1M count=10 Now Qemu is trying to reconnect, and dd-generated requests are waiting for the connection (they will wait up to 60 seconds (see reconnect-delay option above) and then fail). But suddenly, the vm may totally hang in the deadlock. You may need to increase the reconnect-delay period to catch the dead-lock. The VM doesn't respond because the drain dead-lock happens in the cpu thread with the global mutex taken. That's not a good thing by itself and is not fixed by this commit (the true way is using iothreads). Still, this commit fixes the drain dead-lock itself. Note: probably, we can instead continue to reconnect during the drained section. To achieve this, we may move negotiation to the connect thread to make it independent of the bs aio context. But expanding the drained section doesn't seem good anyway. So, let's now fix the bug the simplest way. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200903190301.367620-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09	nbd: silence maybe-uninitialized warnings	Christian Borntraeger	1	-1/+1
	gcc 10 from Fedora 32 gives me: Compiling C object libblock.fa.p/nbd_server.c.o ../nbd/server.c: In function ‘nbd_co_client_start’: ../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 625 \| rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, name, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 626 \| errp); \| ~~~~~ ../nbd/server.c:564:14: note: ‘namelen’ was declared here 564 \| uint32_t namelen; \| ^~~~~~~ cc1: all warnings being treated as errors As I cannot see how this can happen, let us silence the warning. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
2020-10-09	tests/acceptance: disable machine_rx_gdbsim on GitLab	Alex Bennée	1	-0/+1
	While I can get the ssh test to fail on my test setup this seems a lot more stable except when on GitLab. Hopefully we can re-enable both once the serial timing patches have been added. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Willian Rampazzo <willianr@redhat.com> Reviewed-by: Cleber Rosa <crosa@redhat.com> Message-Id: <20201007160038.26953-23-alex.bennee@linaro.org>
2020-10-09	cirrus: use V=1 when running tests on FreeBSD and macOS	Paolo Bonzini	1	-3/+3
	Using "V=1" makes it easier to identify hanging tests, especially since they are run with -j1. It is already used on Windows builds, do the same for FreeBSD and macOS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Ed Maste <emaste@FreeBSD.org> Message-Id: <20201007140103.711142-1-pbonzini@redhat.com> Message-Id: <20201007160038.26953-22-alex.bennee@linaro.org>
2020-10-09	plugin: Fixes compiling errors on msys2/mingw	Yonggang Luo	2	-3/+3
	Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20201001163429.1348-3-luoyonggang@gmail.com> Message-Id: <20201007160038.26953-21-alex.bennee@linaro.org>
2020-10-09	plugins: Fixes a issue when dlsym failed, the handle not closed	Yonggang Luo	1	-0/+1
	Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20201001163429.1348-2-luoyonggang@gmail.com> Message-Id: <20201007160038.26953-20-alex.bennee@linaro.org>
2020-10-09	.mailmap: Fix more contributor entries	Philippe Mathieu-Daudé	1	-0/+2
	These authors have some incorrect author email field. For each of them, there is one commit with the replaced entry. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Stefan Berger <stefanb@linux.ibm.com> Acked-by: Erik Smit <erik.lucas.smit@gmail.com> Message-Id: <20201006160653.2391972-13-f4bug@amsat.org> Message-Id: <20201007160038.26953-19-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Yandex to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributors from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru> Message-Id: <20201006160653.2391972-12-f4bug@amsat.org> Message-Id: <20201007160038.26953-18-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Yadro to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributions from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20201006160653.2391972-11-f4bug@amsat.org> Message-Id: <20201007160038.26953-17-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add SUSE to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributors from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Bruce Rogers <brogers@suse.com> Message-Id: <20201006160653.2391972-10-f4bug@amsat.org> Message-Id: <20201007160038.26953-16-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Nir Soffer to Red Hat domain	Philippe Mathieu-Daudé	1	-0/+1
	Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Nir Soffer <nirsof@gmail.com> Message-Id: <20201006160653.2391972-9-f4bug@amsat.org> Message-Id: <20201007160038.26953-15-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Qualcomm to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributions from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Taylor Simpson <tsimpson@quicinc.com> Message-Id: <20201006160653.2391972-8-f4bug@amsat.org> Message-Id: <20201007160038.26953-14-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Nuvia to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributions from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Graeme Gregory <graeme@nuviainc.com> Message-Id: <20201006160653.2391972-7-f4bug@amsat.org> Message-Id: <20201007160038.26953-13-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Google to the domain map	Philippe Mathieu-Daudé	1	-1/+2
	There is a number of contributors from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Erik Kline <ek@google.com> Message-Id: <20201006160653.2391972-6-f4bug@amsat.org> Message-Id: <20201007160038.26953-12-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add ByteDance to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributors from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com> Message-Id: <20201006160653.2391972-5-f4bug@amsat.org> Message-Id: <20201007160038.26953-11-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add Baidu to the domain map	Philippe Mathieu-Daudé	1	-0/+1
	There is a number of contributors from this domain, add its own entry to the gitdm domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Chai Wen <chaiwen@baidu.com> Message-Id: <20201006160653.2391972-4-f4bug@amsat.org> Message-Id: <20201007160038.26953-10-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add more individual contributors	Philippe Mathieu-Daudé	1	-0/+7
	These individual contributors have a number of contributions, add them to the 'individual' group map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Stefan Weil <sw@weilnetz.de> Acked-by: Niek Linnenbank <nieklinnenbank@gmail.com> Acked-by: David Carlier <devnexen@gmail.com> Acked-by: Paul Zimmerman <pauldzim@gmail.com> Acked-by: Volker Rümelin <vr_qemu@t-online.de> Acked-by: Finn Thain <fthain@telegraphics.com.au> Message-Id: <20201006160653.2391972-3-f4bug@amsat.org> Message-Id: <20201007160038.26953-9-alex.bennee@linaro.org>
2020-10-09	contrib/gitdm: Add more academic domains	Philippe Mathieu-Daudé	1	-0/+4
	There is a number of contributions from these academic domains. Add the entries to the gitdm 'academic' domain map. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Dayeol Lee <dayeol@berkeley.edu> Acked-by: Fan Yang <Fan_Yang@sjtu.edu.cn> Acked-by: Xinyu Li <precinct@mail.ustc.edu.cn> Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Message-Id: <20201006160653.2391972-2-f4bug@amsat.org> Message-Id: <20201007160038.26953-8-alex.bennee@linaro.org>
2020-10-09	tests/docker: Add genisoimage to the docker file	Thomas Huth	4	-0/+4
	genisoimage is needed for running the tests/qtest/cdrom-test qtest. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201006174347.152040-1-thuth@redhat.com> Message-Id: <20201007160038.26953-7-alex.bennee@linaro.org>
2020-10-09	cirrus: msys2/mingw speed is up, add excluded target back	Yonggang Luo	1	-2/+1
	The following targets are added back: i386-softmmu,arm-softmmu,ppc-softmmu,mips-softmmu Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201007145300.1197-3-luoyonggang@gmail.com> Message-Id: <20201007160038.26953-6-alex.bennee@linaro.org>
2020-10-09	cirrus: Fixing and speedup the msys2/mingw CI	Yonggang Luo	1	-45/+67
	Use the Cirrus cache to cache msys2. The msys2 installation follows https://github.com/msys2/setup-msys2 The first msys2 install is time consuming, so increase timeout_in to 90m according to https://cirrus-ci.org/faq/#instance-timed-out Apply the patch from https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg00072.html [AJB: renamed printenv_script to setup_script] Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201007145300.1197-2-luoyonggang@gmail.com> Message-Id: <20201007160038.26953-5-alex.bennee@linaro.org>
2020-10-09	hw/ide: restore replay support of IDE	Alex Bennée	1	-2/+2
	A recent change to weak reset handling broke replay due to the use of aio_bh_schedule_oneshot instead of the replay aware replay_bh_schedule_oneshot_event. Fixes: 55adb3c456 ("ide: cancel pending callbacks on SRST") Suggested-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: John Snow <jsnow@redhat.com> Acked-by: John Snow <jsnow@redhat.com> Message-Id: <20201007160038.26953-4-alex.bennee@linaro.org>
2020-10-09	hw/misc/mips_cpc: Start vCPU when powered on	Philippe Mathieu-Daudé	1	-0/+1
	In commit 102ca9667d we set "start-powered-off" on all vCPUs included in the CPS (Coherent Processing System) but forgot to start the vCPUs when they are powered on in the CPC (Cluster Power Controller). This fixes the following tests: $ avocado run tests/acceptance/machine_mips_malta.py (1/3) test_mips_malta_i6400_framebuffer_logo_1core: PASS (3.67 s) (2/3) test_mips_malta_i6400_framebuffer_logo_7cores: INTERRUPTED: Test interrupted by SIGTERM (30.22 s) (3/3) test_mips_malta_i6400_framebuffer_logo_8cores: INTERRUPTED: Test interrupted by SIGTERM (30.25 s) RESULTS : PASS 1 \| ERROR 0 \| FAIL 0 \| SKIP 0 \| WARN 0 \| INTERRUPT 2 \| CANCEL 0 Fixes: 102ca9667d ("mips/cps: Use start-powered-off CPUState property") Reported-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201007113942.2523866-1-f4bug@amsat.org> Message-Id: <20201007160038.26953-3-alex.bennee@linaro.org>
2020-10-09	configure: fix performance regression due to PIC objects	Paolo Bonzini	1	-0/+1
	Because most files in QEMU are grouped into static libraries, Meson conservatively compiles them with -fPIC. This is overkill and produces slowdowns up to 20% on some TCG tests. As a stopgap measure, use the b_staticpic option to limit the slowdown to --enable-pie. https://github.com/mesonbuild/meson/pull/7760 will allow us to use b_staticpic=false and let Meson do the right thing. Reported-by: Ahmed Karaman <ahmedkrmn@outlook.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200924092314.1722645-57-pbonzini@redhat.com> Message-Id: <20201007160038.26953-2-alex.bennee@linaro.org>
2020-10-09	error: Use error_fatal to simplify obvious fatal errors (again)	Markus Armbruster	3	-21/+7
	Patch created mechanically by rerunning: $ spatch --in-place --sp-file scripts/coccinelle/use-error_fatal.cocci \ --macro-file scripts/cocci-macro-file.h --use-gitgrep . Variables now unused dropped manually. Cc: Eric Auger <eric.auger@redhat.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200722084048.1726105-5-armbru@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2020-10-09	error: Remove NULL checks on error_propagate() calls (again)	Markus Armbruster	3	-15/+5
	Patch created mechanically by rerunning: $ spatch --sp-file scripts/coccinelle/error_propagate_null.cocci \ --macro-file scripts/cocci-macro-file.h \ --use-gitgrep . Cc: Jens Freimann <jfreimann@redhat.com> Cc: Hailiang Zhang <zhang.zhanghailiang@huawei.com> Cc: Juan Quintela <quintela@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200722084048.1726105-4-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2020-10-09	block: Convert 'block_resize' to coroutine	Kevin Wolf	3	-9/+11
	block_resize performs some I/O that could potentially take quite some time, so use it as an example for the new 'coroutine': true annotation in the QAPI schema. bdrv_truncate() requires that we're already in the right AioContext for the BlockDriverState if called in coroutine context. So instead of just taking the AioContext lock, move the QMP handler coroutine to the context. Call blk_unref() only after switching back because blk_unref() may only be called in the main thread. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-15-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	block: Add bdrv_lock()/unlock()	Kevin Wolf	2	-0/+41
	Inside of coroutine context, we can't directly use aio_context_acquire() for the AioContext of a block node because we already own the lock of the current AioContext and we need to avoid double locking to prevent deadlocks. This provides helper functions to lock the AioContext of a node only if it's not the same as the current AioContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-14-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
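The "lock only if it is a different context" rule can be modelled in a few lines of Python. This is a sketch, not the block layer's C helpers: the `AioContext` class here is a stand-in holding a plain non-recursive lock, and `node_lock`/`node_unlock` are hypothetical names. The caller is assumed to already hold the lock of `current_ctx`, so acquiring it again would deadlock.

```python
import threading


class AioContext:
    """Stand-in for an AioContext: just a name and a non-recursive lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


def node_lock(node_ctx, current_ctx):
    """Take the node's context lock unless it is the one we already hold."""
    if node_ctx is not current_ctx:
        node_ctx.lock.acquire()


def node_unlock(node_ctx, current_ctx):
    """Release only what node_lock() acquired."""
    if node_ctx is not current_ctx:
        node_ctx.lock.release()
```

Because the lock is non-recursive, calling `node_lock(ctx, ctx)` while holding `ctx.lock` returns immediately instead of blocking forever, which is the double-locking case the helpers are meant to avoid.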
2020-10-09	block: Add bdrv_co_enter()/leave()	Kevin Wolf	2	-0/+40
	Add a pair of functions to temporarily move the current coroutine to the AioContext of a given BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-13-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	util/async: Add aio_co_reschedule_self()	Kevin Wolf	2	-0/+40
	Add a function that can be used to move the currently running coroutine to a different AioContext (and therefore potentially a different thread). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201005155855.256490-12-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	hmp: Add support for coroutine command handlers	Kevin Wolf	3	-7/+35
	Often, QMP command handlers are not only called to handle QMP commands, but also from a corresponding HMP command handler. In order to give them a consistent environment, optionally run HMP command handlers in a coroutine, too. The implementation is a lot simpler than in QMP because for HMP, we still block the VM while the coroutine is running. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20201005155855.256490-11-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	qmp: Move dispatcher to a coroutine	Kevin Wolf	7	-46/+214
	This moves the QMP dispatcher to a coroutine and runs all QMP command handlers that declare 'coroutine': true in coroutine context so they can avoid blocking the main loop while doing I/O or waiting for other events. For commands that are not declared safe to run in a coroutine, the dispatcher drops out of coroutine context by calling the QMP command handler from a bottom half. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201005155855.256490-10-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
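A loose Python asyncio analogy of this dispatch rule is sketched below. It is not QEMU's dispatcher: the command table, the handler names and the use of `loop.call_soon()` as a stand-in for a bottom half are all assumptions of this sketch. Handlers flagged as coroutine-safe are awaited directly; the others are called from a plain callback outside coroutine context while the dispatcher waits for the result.

```python
import asyncio


async def qmp_dispatch(commands, name):
    """Run a handler in coroutine context only if it is flagged as safe."""
    handler, coroutine_safe = commands[name]
    if coroutine_safe:
        return await handler()

    loop = asyncio.get_running_loop()
    done = loop.create_future()

    def bottom_half():
        # Runs as an ordinary callback, i.e. outside coroutine context.
        try:
            done.set_result(handler())
        except Exception as exc:
            done.set_exception(exc)

    loop.call_soon(bottom_half)
    return await done


async def block_resize_like():
    """Hypothetical coroutine-safe handler that may yield during I/O."""
    await asyncio.sleep(0)
    return "resized"


def query_like():
    """Hypothetical handler not declared coroutine-safe."""
    return "status"


COMMANDS = {
    "block_resize_like": (block_resize_like, True),
    "query_like": (query_like, False),
}
```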
2020-10-09	qapi: Add a 'coroutine' flag for commands	Kevin Wolf	15	-14/+73
	This patch adds a new 'coroutine' flag to QMP command definitions that tells the QMP dispatcher that the command handler is safe to be run in a coroutine. The documentation of the new flag pretends that this flag is already used as intended, which it isn't yet after this patch. We'll implement this in another patch in this series. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-9-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	monitor: Make current monitor a per-coroutine property	Kevin Wolf	5	-13/+33
	This way, a monitor command handler will still be able to access the current monitor, but when it yields, all other code will correctly get NULL from monitor_cur(). This uses a hash table to map the coroutine pointer to the current monitor of that coroutine. Outside of coroutine context, we associate the current monitor with the leader coroutine of the current thread. Approaches to implement some form of coroutine local storage directly in the coroutine core code have been considered and discarded because they didn't end up being much more generic than the hash table and their performance impact on coroutines not using coroutine local storage was unclear. As the block layer uses a coroutine per I/O request, this is a fast path and we have to be careful. It's safest to just stay out of this path with code only used by the monitor. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20201005155855.256490-8-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	qmp: Call monitor_set_cur() only in qmp_dispatch()	Kevin Wolf	6	-13/+20
	The correct way to set the current monitor for a coroutine handler will be different than for a blocking handler, so monitor_set_cur() needs to be called in qmp_dispatch(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-7-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	qmp: Assert that no other monitor is active	Kevin Wolf	1	-1/+4
	monitor_qmp_dispatch() is never supposed to be called in the context of another monitor, so assert that monitor_cur() is NULL instead of saving and restoring it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-6-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	hmp: Update current monitor only in handle_hmp_command()	Kevin Wolf	2	-10/+5
	The current monitor is updated relatively early in the command handling code even though only the command handler actually needs it. The current monitor will become coroutine-local later, so we can only update it when we know in which coroutine the command will be executed. Move it to handle_hmp_command() where this information will be available. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20201005155855.256490-5-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	monitor: Use getter/setter functions for cur_mon	Kevin Wolf	21	-46/+73
	cur_mon really needs to be coroutine-local as soon as we move monitor command handlers to coroutines and let them yield. As a first step, just remove all direct accesses to cur_mon so that we can implement this in the getter function later. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-4-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	monitor: Add Monitor parameter to monitor_get_cpu_index()	Kevin Wolf	5	-14/+14
	Most callers actually don't have to rely on cur_mon, but already know for which monitor they call monitor_get_cpu_index(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201005155855.256490-3-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	monitor: Add Monitor parameter to monitor_set_cpu()	Kevin Wolf	3	-7/+7
	Most callers actually don't have to rely on cur_mon, but already know for which monitor they call monitor_set_cpu(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20201005155855.256490-2-kwolf@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-10-09	specs/ppc-spapr-numa: update with new NUMA support	Daniel Henrique Barboza	1	-8/+227
	This update provides more in depth information about the choices and drawbacks of the new NUMA support for the spapr machine. Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-6-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09	spapr_numa: consider user input when defining associativity	Daniel Henrique Barboza	1	-1/+109
	A new function called spapr_numa_define_associativity_domains() is created to calculate the associativity domains and change the associativity arrays considering user input. This is how the associativity domain between two NUMA nodes A and B is calculated: - get the distance D between them - get the corresponding NUMA level 'n_level' for D. This is done via a helper called spapr_numa_get_numa_level() - all associativity arrays were initialized with their own numa_ids, and we're calculating the distance in node_id ascending order, starting from node id 0 (the first node retrieved by numa_state). This will have a cascade effect in the algorithm because the associativity domains that node 0 defines will be carried over to other nodes, and node 1 associativities will be carried over after taking node 0 associativities into account, and so on. This happens because we'll assign assoc_src as the associativity domain of dst as well, for all NUMA levels beyond and including n_level. The PPC kernel expects the associativity domains of the first node (node id 0) to be always 0 [1], and this algorithm will guarantee that by default. Ultimately, all of this results in a best effort approximation for the actual NUMA distances the user input in the command line. Given the nature of how PAPR itself interprets NUMA distances versus the expectations raised by how ACPI SLIT works, there might be better algorithms but, in the end, they'll also result in another way to approximate what the user really wanted. To keep this commit message no longer than it already is, the next patch will update the existing documentation in ppc-spapr-numa.rst with more in-depth details and design considerations/drawbacks. [1] https://lore.kernel.org/linuxppc-dev/5e8fbea3-8faf-0951-172a-b41a2138fbcf@gmail.com/ Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-5-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
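The cascade described above can be sketched in Python under two loudly stated assumptions that are not taken from the commit message. First, the distance thresholds in `numa_level()` are purely illustrative; the real mapping lives in spapr_numa_get_numa_level(). Second, "all NUMA levels beyond and including n_level" is read here as positions n_level down to 1 of a four-entry array. Treat this as a model of the described procedure, not as the QEMU implementation.

```python
MAX_LEVEL = 4  # the commit series mentions four NUMA levels (0x4 .. 0x1)


def numa_level(distance):
    """Map a distance to a NUMA level. Thresholds are hypothetical."""
    if distance <= 10:
        return 4
    if distance <= 20:
        return 3
    if distance <= 40:
        return 2
    if distance <= 80:
        return 1
    return 0  # too far apart: no shared domain


def define_associativity_domains(distances):
    """Return assoc[node][level - 1] for a square distance matrix."""
    n = len(distances)
    # Every array starts out filled with the node's own id.
    assoc = [[node] * MAX_LEVEL for node in range(n)]
    for src in range(n):  # ascending node id, starting at node 0
        for dst in range(src + 1, n):
            level = numa_level(distances[src][dst])
            # dst inherits src's domain at n_level and every coarser level.
            for i in range(level, 0, -1):
                assoc[dst][i - 1] = assoc[src][i - 1]
    return assoc
```

With three nodes where node 1 is at distance 20 from node 0 and node 2 is at distance 40 from both, node 0 keeps domain 0 at every level, node 1 shares node 0's domains at levels 3 to 1, and node 2 shares them only at levels 2 and 1.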
2020-10-09	spapr_numa: change reference-points and maxdomain settings	Daniel Henrique Barboza	1	-8/+35
	This is the first guest visible change introduced in spapr_numa.c. The previous settings of both reference-points and maxdomains were too restrictive, but enough for the existing associativity we're setting in the resources. We'll change that in the following patches, populating the associativity arrays based on user input. For those changes to be effective, reference-points and maxdomains must be more flexible. After this patch, we'll have 4 distinct levels of NUMA (0x4, 0x3, 0x2, 0x1) and maxdomains will allow for any type of configuration the user intends to do - under the scope and limitations of PAPR itself, of course. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-4-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-10-09	spapr_numa: forbid asymmetrical NUMA setups	Daniel Henrique Barboza	1	-0/+34
	The pSeries machine does not support asymmetrical NUMA configurations. This doesn't make much of a difference since we're not using user input for pSeries NUMA setup, but this will change in the next patches. To avoid breaking existing setups, gate this change by checking for legacy NUMA support. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-3-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
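A symmetry check of this kind might look like the following Python sketch. The function name and the `legacy_numa` parameter are hypothetical; the parameter models the gating mentioned above, so that legacy setups are left alone.

```python
def check_numa_symmetry(distances, legacy_numa=False):
    """Raise if distance(src, dst) differs from distance(dst, src).

    Sketch only: `distances` is a square matrix of node distances, and
    the check is skipped entirely for legacy NUMA setups.
    """
    if legacy_numa:
        return
    n = len(distances)
    for src in range(n):
        for dst in range(src + 1, n):
            if distances[src][dst] != distances[dst][src]:
                raise ValueError(
                    f"asymmetrical NUMA distance between nodes {src} and {dst}"
                )
```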
2020-10-09	spapr: add spapr_machine_using_legacy_numa() helper	Daniel Henrique Barboza	2	-0/+14
	The changes to come to NUMA support are all guest visible. In theory we could just create a new 5_1 class option flag to avoid the changes to cascade to 5.1 and under. The reality is that these changes are only relevant if the machine has more than one NUMA node. There is no need to change guest behavior that has been around for years needlessly. This new helper will be used by the next patches to determine whether we should retain the (soon to be) legacy NUMA behavior in the pSeries machine. The new behavior will only be exposed if: - machine is pseries-5.2 and newer; - more than one NUMA node is declared in NUMA state. Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20201007172849.302240-2-danielhb413@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
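The two conditions above reduce to a small predicate. The Python sketch below is only an illustration: representing the machine version as a `(major, minor)` tuple and the function name `using_legacy_numa` are assumptions of this sketch, not how the C helper is written.

```python
def using_legacy_numa(machine_version, num_numa_nodes):
    """Return True when the legacy NUMA behaviour should be kept.

    The new behaviour applies only to pseries-5.2 and newer machines
    that declare more than one NUMA node.
    """
    return machine_version < (5, 2) or num_numa_nodes <= 1
```

For example, a pseries-5.1 machine with four nodes and a pseries-5.2 machine with one node both stay on the legacy behaviour, while a pseries-5.2 machine with two nodes does not.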