Diffstat (limited to 'results/classifier/zero-shot/006/semantic')
9 files changed, 1859 insertions, 0 deletions
diff --git a/results/classifier/zero-shot/006/semantic/12360755 b/results/classifier/zero-shot/006/semantic/12360755 new file mode 100644 index 000000000..62cae74a7 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/12360755 @@ -0,0 +1,301 @@ +semantic: 0.911 +device: 0.902 +graphic: 0.899 +other: 0.886 +boot: 0.818 +vnc: 0.810 +socket: 0.805 +KVM: 0.770 +network: 0.738 + +[Qemu-devel] [BUG] virtio-net linux driver fails to probe on MIPS Malta since 'hw/virtio-pci: fix virtio behaviour' + +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? + +Cheers +James +signature.asc +Description: +Digital signature + +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. 
+ +Michael, do you have any idea? + +Thanks, +Marcel +Cheers +James + +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +> +On 03/17/2017 11:57 PM, James Hogan wrote: +> +> Hi, +> +> +> +> I've bisected the following failure of the virtio_net linux v4.10 driver +> +> to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: +> +> +> +> virtio_net virtio0: virtio: device uses modern interface but does not have +> +> VIRTIO_F_VERSION_1 +> +> virtio_net: probe of virtio0 failed with error -22 +> +> +> +> To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). +> +> +> +> It appears that adding ",disable-modern=on,disable-legacy=off" to the +> +> virtio-net -device makes it work again. +> +> +> +> I presume this should really just work out of the box. Any ideas why it +> +> isn't? +> +> +> +> +Hi, +> +> +> +This is strange. This commit changes virtio devices from legacy to virtio +> +"transitional". +> +(your command line changes it to legacy) +> +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +> +side +> +there is nothing new. +> +> +Michael, do you have any idea? +> +> +Thanks, +> +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. + +> +> Cheers +> +> James +> +> + +On 03/20/2017 05:43 PM, Michael S. 
Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. + +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Sure, + +Thanks, +Marcel +Cheers +James + +On 03/20/2017 05:43 PM, Michael S. Tsirkin wrote: +On Mon, Mar 20, 2017 at 05:21:22PM +0200, Marcel Apfelbaum wrote: +On 03/17/2017 11:57 PM, James Hogan wrote: +Hi, + +I've bisected the following failure of the virtio_net linux v4.10 driver +to probe in QEMU v2.9.0-rc1 emulating a MIPS Malta machine: + +virtio_net virtio0: virtio: device uses modern interface but does not have +VIRTIO_F_VERSION_1 +virtio_net: probe of virtio0 failed with error -22 + +To QEMU commit 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour"). + +It appears that adding ",disable-modern=on,disable-legacy=off" to the +virtio-net -device makes it work again. 
+ +I presume this should really just work out of the box. Any ideas why it +isn't? +Hi, + + +This is strange. This commit changes virtio devices from legacy to virtio +"transitional". +(your command line changes it to legacy) +Linux 4.10 supports virtio modern/transitional (as far as I know) and on QEMU +side +there is nothing new. + +Michael, do you have any idea? + +Thanks, +Marcel +My guess would be firmware mishandling 64 bit BARs - we saw such +a case on sparc previously. As a result you are probably reading +all zeroes from features register or something like that. +Marcel, could you send a patch making the bar 32 bit? +If that helps we know what the issue is. +Hi James, + +Can you please check if the below patch fixes the problem? +Please note it is not a solution. + +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +index f9b7244..5b4d429 100644 +--- a/hw/virtio/virtio-pci.c ++++ b/hw/virtio/virtio-pci.c +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +Error **errp) + } + + pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +- PCI_BASE_ADDRESS_SPACE_MEMORY | +- PCI_BASE_ADDRESS_MEM_PREFETCH | +- PCI_BASE_ADDRESS_MEM_TYPE_64, ++ PCI_BASE_ADDRESS_SPACE_MEMORY, + &proxy->modern_bar); + + proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); + + +Thanks, +Marcel + +Hi Marcel, + +On Tue, Mar 21, 2017 at 04:16:58PM +0200, Marcel Apfelbaum wrote: +> +Can you please check if the below patch fixes the problem? +> +Please note it is not a solution. 
+> +> +diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c +> +index f9b7244..5b4d429 100644 +> +--- a/hw/virtio/virtio-pci.c +> ++++ b/hw/virtio/virtio-pci.c +> +@@ -1671,9 +1671,7 @@ static void virtio_pci_device_plugged(DeviceState *d, +> +Error **errp) +> +} +> +> +pci_register_bar(&proxy->pci_dev, proxy->modern_mem_bar_idx, +> +- PCI_BASE_ADDRESS_SPACE_MEMORY | +> +- PCI_BASE_ADDRESS_MEM_PREFETCH | +> +- PCI_BASE_ADDRESS_MEM_TYPE_64, +> ++ PCI_BASE_ADDRESS_SPACE_MEMORY, +> +&proxy->modern_bar); +> +> +proxy->config_cap = virtio_pci_add_mem_cap(proxy, &cfg.cap); +Sorry for the delay trying this, I was away last week. + +No, it doesn't seem to make any difference. + +Thanks +James +signature.asc +Description: +Digital signature + diff --git a/results/classifier/zero-shot/006/semantic/14887122 b/results/classifier/zero-shot/006/semantic/14887122 new file mode 100644 index 000000000..1212284cf --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/14887122 @@ -0,0 +1,263 @@ +semantic: 0.928 +device: 0.919 +socket: 0.914 +graphic: 0.910 +other: 0.890 +vnc: 0.871 +network: 0.855 +boot: 0.831 +KVM: 0.814 + +[BUG][RFC] CPR transfer Issues: Socket permissions and PID files + +Hello, + +While testing CPR transfer I encountered two issues. The first is that the +transfer fails when running with pidfiles due to the destination qemu process +attempting to create the pidfile while it is still locked by the source +process. The second is that the transfer fails when running with the -run-with +user=$USERID parameter. This is because the destination qemu process creates +the UNIX sockets used for the CPR transfer before dropping to the lower +permissioned user, which causes them to be owned by the original user. The +source qemu process then does not have permission to connect to it because it +is already running as the lesser permissioned user. 
+ +Reproducing the first issue: + +Create a source and destination qemu instance associated with the same VM where +both processes have the -pidfile parameter passed on the command line. You +should see the following error on the command line of the second process: + +qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource +temporarily unavailable + +Reproducing the second issue: + +Create a source and destination qemu instance associated with the same VM where +both processes have -run-with user=$USERID passed on the command line, where +$USERID is a different user from the one launching the processes. Then attempt +a CPR transfer using UNIX sockets for the main and cpr sockets. You should +receive the following error via QMP: +{"error": {"class": "GenericError", "desc": "Failed to connect to 'cpr.sock': +Permission denied"}} + +I provided a minimal patch that works around the second issue. + +Thank you, +Ben Chaney + +--- +include/system/os-posix.h | 4 ++++ +os-posix.c | 8 -------- +util/qemu-sockets.c | 21 +++++++++++++++++++++ +3 files changed, 25 insertions(+), 8 deletions(-) + +diff --git a/include/system/os-posix.h b/include/system/os-posix.h +index ce5b3bccf8..2a414a914a 100644 +--- a/include/system/os-posix.h ++++ b/include/system/os-posix.h +@@ -55,6 +55,10 @@ void os_setup_limits(void); +void os_setup_post(void); +int os_mlock(bool on_fault); + ++extern struct passwd *user_pwd; ++extern uid_t user_uid; ++extern gid_t user_gid; ++ +/** +* qemu_alloc_stack: +* @sz: pointer to a size_t holding the requested usable stack size +diff --git a/os-posix.c b/os-posix.c +index 52925c23d3..9369b312a0 100644 +--- a/os-posix.c ++++ b/os-posix.c +@@ -86,14 +86,6 @@ void os_set_proc_name(const char *s) +} + + +-/* +- * Must set all three of these at once. 
+- * Legal combinations are unset by name by uid +- */ +-static struct passwd *user_pwd; /* NULL non-NULL NULL */ +-static uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ +-static gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ +- +/* +* Prepare to change user ID. user_id can be one of 3 forms: +* - a username, in which case user ID will be changed to its uid, +diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c +index 77477c1cd5..987977ead9 100644 +--- a/util/qemu-sockets.c ++++ b/util/qemu-sockets.c +@@ -871,6 +871,14 @@ static bool saddr_is_tight(UnixSocketAddress *saddr) +#endif +} + ++/* ++ * Must set all three of these at once. ++ * Legal combinations are unset by name by uid ++ */ ++struct passwd *user_pwd; /* NULL non-NULL NULL */ ++uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ ++gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ ++ +static int unix_listen_saddr(UnixSocketAddress *saddr, +int num, +Error **errp) +@@ -947,6 +955,19 @@ static int unix_listen_saddr(UnixSocketAddress *saddr, +error_setg_errno(errp, errno, "Failed to bind socket to %s", path); +goto err; +} ++ if (user_pwd) { ++ if (chown(un.sun_path, user_pwd->pw_uid, user_pwd->pw_gid) < 0) { ++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", +path); ++ goto err; ++ } ++ } ++ else if (user_uid != -1 && user_gid != -1) { ++ if (chown(un.sun_path, user_uid, user_gid) < 0) { ++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", +path); ++ goto err; ++ } ++ } ++ +if (listen(sock, num) < 0) { +error_setg_errno(errp, errno, "Failed to listen on socket"); +goto err; +-- +2.40.1 + +Thank you Ben. I appreciate you testing CPR and shaking out the bugs. +I will study these and propose patches. + +My initial reaction to the pidfile issue is that the orchestration layer must +pass a different filename when starting the destination qemu instance. When +using live update without containers, these types of resource conflicts in the +global namespaces are a known issue. 
+ +- Steve + +On 3/14/2025 2:33 PM, Chaney, Ben wrote: +Hello, + +While testing CPR transfer I encountered two issues. The first is that the +transfer fails when running with pidfiles due to the destination qemu process +attempting to create the pidfile while it is still locked by the source +process. The second is that the transfer fails when running with the -run-with +user=$USERID parameter. This is because the destination qemu process creates +the UNIX sockets used for the CPR transfer before dropping to the lower +permissioned user, which causes them to be owned by the original user. The +source qemu process then does not have permission to connect to it because it +is already running as the lesser permissioned user. + +Reproducing the first issue: + +Create a source and destination qemu instance associated with the same VM where +both processes have the -pidfile parameter passed on the command line. You +should see the following error on the command line of the second process: + +qemu-system-x86_64: cannot create PID file: Cannot lock pid file: Resource +temporarily unavailable + +Reproducing the second issue: + +Create a source and destination qemu instance associated with the same VM where +both processes have -run-with user=$USERID passed on the command line, where +$USERID is a different user from the one launching the processes. Then attempt +a CPR transfer using UNIX sockets for the main and cpr sockets. You should +receive the following error via QMP: +{"error": {"class": "GenericError", "desc": "Failed to connect to 'cpr.sock': +Permission denied"}} + +I provided a minimal patch that works around the second issue. 
+ +Thank you, +Ben Chaney + +--- +include/system/os-posix.h | 4 ++++ +os-posix.c | 8 -------- +util/qemu-sockets.c | 21 +++++++++++++++++++++ +3 files changed, 25 insertions(+), 8 deletions(-) + +diff --git a/include/system/os-posix.h b/include/system/os-posix.h +index ce5b3bccf8..2a414a914a 100644 +--- a/include/system/os-posix.h ++++ b/include/system/os-posix.h +@@ -55,6 +55,10 @@ void os_setup_limits(void); +void os_setup_post(void); +int os_mlock(bool on_fault); + ++extern struct passwd *user_pwd; ++extern uid_t user_uid; ++extern gid_t user_gid; ++ +/** +* qemu_alloc_stack: +* @sz: pointer to a size_t holding the requested usable stack size +diff --git a/os-posix.c b/os-posix.c +index 52925c23d3..9369b312a0 100644 +--- a/os-posix.c ++++ b/os-posix.c +@@ -86,14 +86,6 @@ void os_set_proc_name(const char *s) +} + + +-/* +- * Must set all three of these at once. +- * Legal combinations are unset by name by uid +- */ +-static struct passwd *user_pwd; /* NULL non-NULL NULL */ +-static uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ +-static gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ +- +/* +* Prepare to change user ID. user_id can be one of 3 forms: +* - a username, in which case user ID will be changed to its uid, +diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c +index 77477c1cd5..987977ead9 100644 +--- a/util/qemu-sockets.c ++++ b/util/qemu-sockets.c +@@ -871,6 +871,14 @@ static bool saddr_is_tight(UnixSocketAddress *saddr) +#endif +} + ++/* ++ * Must set all three of these at once. 
++ * Legal combinations are unset by name by uid ++ */ ++struct passwd *user_pwd; /* NULL non-NULL NULL */ ++uid_t user_uid = (uid_t)-1; /* -1 -1 >=0 */ ++gid_t user_gid = (gid_t)-1; /* -1 -1 >=0 */ ++ +static int unix_listen_saddr(UnixSocketAddress *saddr, +int num, +Error **errp) +@@ -947,6 +955,19 @@ static int unix_listen_saddr(UnixSocketAddress *saddr, +error_setg_errno(errp, errno, "Failed to bind socket to %s", path); +goto err; +} ++ if (user_pwd) { ++ if (chown(un.sun_path, user_pwd->pw_uid, user_pwd->pw_gid) < 0) { ++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", +path); ++ goto err; ++ } ++ } ++ else if (user_uid != -1 && user_gid != -1) { ++ if (chown(un.sun_path, user_uid, user_gid) < 0) { ++ error_setg_errno(errp, errno, "Failed to change permissions on socket %s", +path); ++ goto err; ++ } ++ } ++ +if (listen(sock, num) < 0) { +error_setg_errno(errp, errno, "Failed to listen on socket"); +goto err; +-- +2.40.1 + diff --git a/results/classifier/zero-shot/006/semantic/70294255 b/results/classifier/zero-shot/006/semantic/70294255 new file mode 100644 index 000000000..ec81c83b1 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/70294255 @@ -0,0 +1,1066 @@ +semantic: 0.858 +socket: 0.858 +device: 0.857 +graphic: 0.857 +other: 0.852 +network: 0.846 +vnc: 0.837 +boot: 0.811 +KVM: 0.806 + +[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang + +hi: + +yes.it is better. + +And should we delete + + + + +#ifdef WIN32 + + QIO_CHANNEL(cioc)->event = CreateEvent(NULL, FALSE, FALSE, NULL) + +#endif + + + + +in qio_channel_socket_accept? + +qio_channel_socket_new already have it.
+ + + + + + + + + + + + +Original Mail + + + +From: address@hidden +To: 王广10165992 +Cc: address@hidden address@hidden address@hidden address@hidden +Date: 2017-03-22 15:03 +Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang + + + + + +Hi, + +On 2017/3/22 9:42, address@hidden wrote: +> diff --git a/migration/socket.c b/migration/socket.c +> +> +> index 13966f1..d65a0ea 100644 +> +> +> --- a/migration/socket.c +> +> +> +++ b/migration/socket.c +> +> +> @@ -147,8 +147,9 @@ static gboolean +socket_accept_incoming_migration(QIOChannel *ioc, +> +> +> } +> +> +> +> +> +> trace_migration_socket_incoming_accepted() +> +> +> +> +> +> qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming") +> +> +> + qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN) +> +> +> migration_channel_process_incoming(migrate_get_current(), +> +> +> QIO_CHANNEL(sioc)) +> +> +> object_unref(OBJECT(sioc)) +> +> +> +> +> Is this patch ok? +> + +Yes, i think this works, but a better way maybe to call +qio_channel_set_feature() +in qio_channel_socket_accept(), we didn't set the SHUTDOWN feature for the +socket accept fd, +Or fix it by this: + +diff --git a/io/channel-socket.c b/io/channel-socket.c +index f546c68..ce6894c 100644 +--- a/io/channel-socket.c ++++ b/io/channel-socket.c +@@ -330,9 +330,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, + Error **errp) + { + QIOChannelSocket *cioc +- +- cioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)) +- cioc->fd = -1 ++ ++ cioc = qio_channel_socket_new() + cioc->remoteAddrLen = sizeof(ioc->remoteAddr) + cioc->localAddrLen = sizeof(ioc->localAddr) + + +Thanks, +Hailiang + +> I have test it . The test could not hang any more.
+> +> +> +> +> +> +> +> +> +> +> +> +> Original Mail +> +> +> +> From: address@hidden +> To: address@hidden address@hidden +> Cc: address@hidden address@hidden address@hidden +> Date: 2017-03-22 09:11 +> Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang +> +> +> +> +> +> On 2017/3/21 19:56, Dr. David Alan Gilbert wrote: +> > * Hailiang Zhang (address@hidden) wrote: +> >> Hi, +> >> +> >> Thanks for reporting this, and i confirmed it in my test, and it is a bug. +> >> +> >> Though we tried to call qemu_file_shutdown() to shutdown the related fd, in +> >> case COLO thread/incoming thread is stuck in read/write() while do +failover, +> >> but it didn't take effect, because all the fd used by COLO (also migration) +> >> has been wrapped by qio channel, and it will not call the shutdown API if +> >> we didn't qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN). +> >> +> >> Cc: Dr. David Alan Gilbert address@hidden +> >> +> >> I doubted migration cancel has the same problem, it may be stuck in write() +> >> if we tried to cancel migration. +> >> +> >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, +Error **errp) +> >> { +> >> qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing") +> >> migration_channel_connect(s, ioc, NULL) +> >> ... ... +> >> We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN) above, +> >> and the +> >> migrate_fd_cancel() +> >> { +> >> ... ... +> >> if (s->state == MIGRATION_STATUS_CANCELLING && f) { +> >> qemu_file_shutdown(f) --> This will not take effect. No ? +> >> } +> >> } +> > +> > (cc'd in Daniel Berrange). +> > I see that we call qio_channel_set_feature(ioc, +QIO_CHANNEL_FEATURE_SHUTDOWN) at the +> > top of qio_channel_socket_new so I think that's safe isn't it?
+> > +> +> Hmm, you are right, this problem is only exist for the migration incoming fd, +thanks. +> +> > Dave +> > +> >> Thanks, +> >> Hailiang +> >> +> >> On 2017/3/21 16:10, address@hidden wrote: +> >>> Thank you. +> >>> +> >>> I have test aready. +> >>> +> >>> When the Primary Node panic,the Secondary Node qemu hang at the same +place. +> >>> +> >>> Incorrding +http://wiki.qemu-project.org/Features/COLO +,kill Primary Node +qemu will not produce the problem,but Primary Node panic can. +> >>> +> >>> I think due to the feature of channel does not support +QIO_CHANNEL_FEATURE_SHUTDOWN. +> >>> +> >>> +> >>> when failover,channel_shutdown could not shut down the channel. +> >>> +> >>> +> >>> so the colo_process_incoming_thread will hang at recvmsg. +> >>> +> >>> +> >>> I test a patch: +> >>> +> >>> +> >>> diff --git a/migration/socket.c b/migration/socket.c +> >>> +> >>> +> >>> index 13966f1..d65a0ea 100644 +> >>> +> >>> +> >>> --- a/migration/socket.c +> >>> +> >>> +> >>> +++ b/migration/socket.c +> >>> +> >>> +> >>> @@ -147,8 +147,9 @@ static gboolean +socket_accept_incoming_migration(QIOChannel *ioc, +> >>> +> >>> +> >>> } +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> trace_migration_socket_incoming_accepted() +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> qio_channel_set_name(QIO_CHANNEL(sioc), +"migration-socket-incoming") +> >>> +> >>> +> >>> + qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN) +> >>> +> >>> +> >>> migration_channel_process_incoming(migrate_get_current(), +> >>> +> >>> +> >>> QIO_CHANNEL(sioc)) +> >>> +> >>> +> >>> object_unref(OBJECT(sioc)) +> >>> +> >>> +> >>> +> >>> +> >>> My test will not hang any more.
+> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> Original Mail +> >>> +> >>> +> >>> +> >>> From: address@hidden +> >>> To: 王广10165992 address@hidden +> >>> Cc: address@hidden address@hidden +> >>> Date: 2017-03-21 15:58 +> >>> Subject: Re: [Qemu-devel] Reply: Re: [BUG]COLO failover hang +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> Hi,Wang. +> >>> +> >>> You can test this branch: +> >>> +> >>> +https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk +> >>> +> >>> and please follow wiki ensure your own configuration correctly. +> >>> +> >>> +http://wiki.qemu-project.org/Features/COLO +> >>> +> >>> +> >>> Thanks +> >>> +> >>> Zhang Chen +> >>> +> >>> +> >>> On 03/21/2017 03:27 PM, address@hidden wrote: +> >>> > +> >>> > hi. +> >>> > +> >>> > I test the git qemu master have the same problem.
+> >>> > +> >>> > (gdb) bt +> >>> > +> >>> > #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, +> >>> > niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461 +> >>> > +> >>> > #1 0x00007f658e4aa0c2 in qio_channel_read +> >>> > (address@hidden, address@hidden "", +> >>> > address@hidden, address@hidden) at io/channel.c:114 +> >>> > +> >>> > #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>, +> >>> > buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at +> >>> > migration/qemu-file-channel.c:78 +> >>> > +> >>> > #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at +> >>> > migration/qemu-file.c:295 +> >>> > +> >>> > #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, +> >>> > address@hidden) at migration/qemu-file.c:555 +> >>> > +> >>> > #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at +> >>> > migration/qemu-file.c:568 +> >>> > +> >>> > #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at +> >>> > migration/qemu-file.c:648 +> >>> > +> >>> > #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, +> >>> > address@hidden) at migration/colo.c:244 +> >>> > +> >>> > #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized +> >>> > out>, address@hidden, +> >>> > address@hidden) +> >>> > +> >>> > at migration/colo.c:264 +> >>> > +> >>> > #9 0x00007f658e3e740e in colo_process_incoming_thread +> >>> > (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577 +> >>> > +> >>> > #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0 +> >>> > +> >>> > #11 0x00007f65881983ed in clone () from /lib64/libc.so.6 +> >>> > +> >>> > (gdb) p ioc->name +> >>> > +> >>> > $2 = 0x7f658ff7d5c0 "migration-socket-incoming" +> >>> > +>
>>> > (gdb) p ioc->features Do not support QIO_CHANNEL_FEATURE_SHUTDOWN +> >>> > +> >>> > $3 = 0 +> >>> > +> >>> > +> >>> > (gdb) bt +> >>> > +> >>> > #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90, +> >>> > condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137 +> >>> > +> >>> > #1 0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at +> >>> > gmain.c:3054 +> >>> > +> >>> > #2 g_main_context_dispatch (context=<optimized out>, +> >>> > address@hidden) at gmain.c:3630 +> >>> > +> >>> > #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213 +> >>> > +> >>> > #4 os_host_main_loop_wait (timeout=<optimized out>) at +> >>> > util/main-loop.c:258 +> >>> > +> >>> > #5 main_loop_wait (address@hidden) at +> >>> > util/main-loop.c:506 +> >>> > +> >>> > #6 0x00007fdccb526187 in main_loop () at vl.c:1898 +> >>> > +> >>> > #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized +> >>> > out>) at vl.c:4709 +> >>> > +> >>> > (gdb) p ioc->features +> >>> > +> >>> > $1 = 6 +> >>> > +> >>> > (gdb) p ioc->name +> >>> > +> >>> > $2 = 0x7fdcce1b1ab0 "migration-socket-listener" +> >>> > +> >>> > +> >>> > May be socket_accept_incoming_migration should +> >>> > call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?? +> >>> > +> >>> > +> >>> > thank you.
+> >>> > +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > Original Mail +> >>> > address@hidden +> >>> > address@hidden +> >>> > address@hidden@huawei.com> +> >>> > *Date:* 2017-03-16 14:46 +> >>> > *Subject:* *Re: [Qemu-devel] COLO failover hang* +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > On 03/15/2017 05:06 PM, wangguang wrote: +> >>> > > am testing QEMU COLO feature described here [QEMU +> >>> > > Wiki]( +http://wiki.qemu-project.org/Features/COLO +). +> >>> > > +> >>> > > When the Primary Node panic,the Secondary Node qemu hang. +> >>> > > hang at recvmsg in qio_channel_socket_readv. +> >>> > > And I run { 'execute': 'nbd-server-stop' } and { "execute": +> >>> > > "x-colo-lost-heartbeat" } in Secondary VM's +> >>> > > monitor,the Secondary Node qemu still hang at recvmsg . +> >>> > > +> >>> > > I found that the colo in qemu is not complete yet. +> >>> > > Do the colo have any plan for development? +> >>> > +> >>> > Yes, We are developing. You can see some of patch we pushing. +> >>> > +> >>> > > Has anyone ever run it successfully? Any help is appreciated! +> >>> > +> >>> > In our internal version can run it successfully, +> >>> > The failover detail you can ask Zhanghailiang for help.
+> >>> > Next time if you have some question about COLO, +> >>> > please cc me and zhanghailiang address@hidden +> >>> > +> >>> > +> >>> > Thanks +> >>> > Zhang Chen +> >>> > +> >>> > +> >>> > > +> >>> > > +> >>> > > +> >>> > > centos7.2+qemu2.7.50 +> >>> > > (gdb) bt +> >>> > > #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0 +> >>> > > #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized +out>, +> >>> > > iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, +errp=0x0) at +> >>> > > io/channel-socket.c:497 +> >>> > > #2 0x00007f3e03329472 in qio_channel_read (address@hidden, +> >>> > > address@hidden "", address@hidden, +> >>> > > address@hidden) at io/channel.c:97 +> >>> > > #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>, +> >>> > > buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at +> >>> > > migration/qemu-file-channel.c:78 +> >>> > > #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at +> >>> > > migration/qemu-file.c:257 +> >>> > > #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, +> >>> > > address@hidden) at migration/qemu-file.c:510 +> >>> > > #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at +> >>> > > migration/qemu-file.c:523 +> >>> > > #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at +> >>> > > migration/qemu-file.c:603 +> >>> > > #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, +> >>> > > address@hidden) at migration/colo.c:215 +> >>> > > #9 0x00007f3e0327250d in colo_wait_handle_message +(errp=0x7f3d62bfaa48, +> >>> > > checkpoint_request=<synthetic pointer>, f=<optimized out>) at +> >>> > > migration/colo.c:546 +> >>> > > #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at +> >>> > >
migration/colo.c:649 +> >>> > > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0 +> >>> > > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc.so.6 +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > > -- +> >>> > > View this message in context: +http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html +> >>> > > Sent from the Developer mailing list archive at Nabble.com. +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > +> >>> > -- +> >>> > Thanks +> >>> > Zhang Chen +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > +> >>> +> >> +> > -- +> > Dr. David Alan Gilbert / address@hidden / Manchester, UK +> > +> > . +> > +> + +On 2017/3/22 16:09, address@hidden wrote: +hi: + +yes.it is better. + +And should we delete +Yes, you are right. +#ifdef WIN32 + + QIO_CHANNEL(cioc)->event = CreateEvent(NULL, FALSE, FALSE, NULL) + +#endif + + + + +in qio_channel_socket_accept? + +qio_channel_socket_new already have it.
+ + + + + + + + + + + + +Original Mail + + + +From: address@hidden +To: Wang Guang 10165992 +Cc: address@hidden address@hidden address@hidden address@hidden +Date: 2017-03-22 15:03 +Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [BUG]COLO failover hang + + + + + +Hi, + +On 2017/3/22 9:42, address@hidden wrote: +> diff --git a/migration/socket.c b/migration/socket.c +> +> +> index 13966f1..d65a0ea 100644 +> +> +> --- a/migration/socket.c +> +> +> +++ b/migration/socket.c +> +> +> @@ -147,8 +147,9 @@ static gboolean +socket_accept_incoming_migration(QIOChannel *ioc, +> +> +> } +> +> +> +> +> +> trace_migration_socket_incoming_accepted() +> +> +> +> +> +> qio_channel_set_name(QIO_CHANNEL(sioc), "migration-socket-incoming") +> +> +> + qio_channel_set_feature(QIO_CHANNEL(sioc), QIO_CHANNEL_FEATURE_SHUTDOWN) +> +> +> migration_channel_process_incoming(migrate_get_current(), +> +> +> QIO_CHANNEL(sioc)) +> +> +> object_unref(OBJECT(sioc)) +> +> +> +> +> Is this patch ok? +> + +Yes, i think this works, but a better way maybe to call +qio_channel_set_feature() +in qio_channel_socket_accept(), we didn't set the SHUTDOWN feature for the +socket accept fd, +Or fix it by this: + +diff --git a/io/channel-socket.c b/io/channel-socket.c +index f546c68..ce6894c 100644 +--- a/io/channel-socket.c ++++ b/io/channel-socket.c +@@ -330,9 +330,8 @@ qio_channel_socket_accept(QIOChannelSocket *ioc, + Error **errp) + { + QIOChannelSocket *cioc +- +- cioc = QIO_CHANNEL_SOCKET(object_new(TYPE_QIO_CHANNEL_SOCKET)) +- cioc->fd = -1 ++ ++ cioc = qio_channel_socket_new() + cioc->remoteAddrLen = sizeof(ioc->remoteAddr) + cioc->localAddrLen = sizeof(ioc->localAddr) + + +Thanks, +Hailiang + +> I have test it . The test could not hang any more.
+> +> +> +> +> +> +> +> +> +> +> +> +> Original Mail +> +> +> +> From: address@hidden +> To: address@hidden address@hidden +> Cc: address@hidden address@hidden address@hidden +> Date: 2017-03-22 09:11 +> Subject: Re: [Qemu-devel] Reply: Re: Reply: Re: [BUG]COLO failover hang +> +> +> +> +> +> On 2017/3/21 19:56, Dr. David Alan Gilbert wrote: +> > * Hailiang Zhang (address@hidden) wrote: +> >> Hi, +> >> +> >> Thanks for reporting this, and i confirmed it in my test, and it is a bug. +> >> +> >> Though we tried to call qemu_file_shutdown() to shutdown the related fd, in +> >> case COLO thread/incoming thread is stuck in read/write() while do +failover, +> >> but it didn't take effect, because all the fd used by COLO (also migration) +> >> has been wrapped by qio channel, and it will not call the shutdown API if +> >> we didn't qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN). +> >> +> >> Cc: Dr. David Alan Gilbert address@hidden +> >> +> >> I doubted migration cancel has the same problem, it may be stuck in write() +> >> if we tried to cancel migration. +> >> +> >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, +Error **errp) +> >> { +> >> qio_channel_set_name(QIO_CHANNEL(ioc), "migration-fd-outgoing") +> >> migration_channel_connect(s, ioc, NULL) +> >> ... ... +> >> We didn't call qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN) above, +> >> and the +> >> migrate_fd_cancel() +> >> { +> >> ... ... +> >> if (s->state == MIGRATION_STATUS_CANCELLING && f) { +> >> qemu_file_shutdown(f) --> This will not take effect. No ? +> >> } +> >> } +> > +> > (cc'd in Daniel Berrange). +> > I see that we call qio_channel_set_feature(ioc, +QIO_CHANNEL_FEATURE_SHUTDOWN) at the +> > top of qio_channel_socket_new so I think that's safe isn't it?
+> > +> +> Hmm, you are right, this problem is only exist for the migration incoming fd, +thanks. +> +> > Dave +> > +> >> Thanks, +> >> Hailiang +> >> +> >> On 2017/3/21 16:10, address@hidden wrote: +> >>> Thank you. +> >>> +> >>> I have test aready. +> >>> +> >>> When the Primary Node panic,the Secondary Node qemu hang at the same +place. +> >>> +> >>> Incorrding +http://wiki.qemu-project.org/Features/COLO +,kill Primary Node +qemu will not produce the problem,but Primary Node panic can. +> >>> +> >>> I think due to the feature of channel does not support +QIO_CHANNEL_FEATURE_SHUTDOWN. +> >>> +> >>> +> >>> when failover,channel_shutdown could not shut down the channel. +> >>> +> >>> +> >>> so the colo_process_incoming_thread will hang at recvmsg. +> >>> +> >>> +> >>> I test a patch: +> >>> +> >>> +> >>> diff --git a/migration/socket.c b/migration/socket.c +> >>> +> >>> +> >>> index 13966f1..d65a0ea 100644 +> >>> +> >>> +> >>> --- a/migration/socket.c +> >>> +> >>> +> >>> +++ b/migration/socket.c +> >>> +> >>> +> >>> @@ -147,8 +147,9 @@ static gboolean +socket_accept_incoming_migration(QIOChannel *ioc, +> >>> +> >>> +> >>> } +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> trace_migration_socket_incoming_accepted() +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> qio_channel_set_name(QIO_CHANNEL(sioc), +"migration-socket-incoming") +> >>> +> >>> +> >>> + qio_channel_set_feature(QIO_CHANNEL(sioc), +QIO_CHANNEL_FEATURE_SHUTDOWN) +> >>> +> >>> +> >>> migration_channel_process_incoming(migrate_get_current(), +> >>> +> >>> +> >>> QIO_CHANNEL(sioc)) +> >>> +> >>> +> >>> object_unref(OBJECT(sioc)) +> >>> +> >>> +> >>> +> >>> +> >>> My test will not hang any more.
+> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> Original Mail +> >>> +> >>> +> >>> +> >>> From: address@hidden +> >>> To: Wang Guang 10165992 address@hidden +> >>> Cc: address@hidden address@hidden +> >>> Date: 2017-03-21 15:58 +> >>> Subject: Re: [Qemu-devel] Reply: Re: [BUG]COLO failover hang +> >>> +> >>> +> >>> +> >>> +> >>> +> >>> Hi,Wang. +> >>> +> >>> You can test this branch: +> >>> +> >>> +https://github.com/coloft/qemu/tree/colo-v5.1-developing-COLO-frame-v21-with-shared-disk +> >>> +> >>> and please follow wiki ensure your own configuration correctly. +> >>> +> >>> +http://wiki.qemu-project.org/Features/COLO +> >>> +> >>> +> >>> Thanks +> >>> +> >>> Zhang Chen +> >>> +> >>> +> >>> On 03/21/2017 03:27 PM, address@hidden wrote: +> >>> > +> >>> > hi. +> >>> > +> >>> > I test the git qemu master have the same problem.
+> >>> > +> >>> > (gdb) bt +> >>> > +> >>> > #0 qio_channel_socket_readv (ioc=0x7f65911b4e50, iov=0x7f64ef3fd880, +> >>> > niov=1, fds=0x0, nfds=0x0, errp=0x0) at io/channel-socket.c:461 +> >>> > +> >>> > #1 0x00007f658e4aa0c2 in qio_channel_read +> >>> > (address@hidden, address@hidden "", +> >>> > address@hidden, address@hidden) at io/channel.c:114 +> >>> > +> >>> > #2 0x00007f658e3ea990 in channel_get_buffer (opaque=<optimized out>, +> >>> > buf=0x7f65907cb838 "", pos=<optimized out>, size=32768) at +> >>> > migration/qemu-file-channel.c:78 +> >>> > +> >>> > #3 0x00007f658e3e97fc in qemu_fill_buffer (f=0x7f65907cb800) at +> >>> > migration/qemu-file.c:295 +> >>> > +> >>> > #4 0x00007f658e3ea2e1 in qemu_peek_byte (address@hidden, +> >>> > address@hidden) at migration/qemu-file.c:555 +> >>> > +> >>> > #5 0x00007f658e3ea34b in qemu_get_byte (address@hidden) at +> >>> > migration/qemu-file.c:568 +> >>> > +> >>> > #6 0x00007f658e3ea552 in qemu_get_be32 (address@hidden) at +> >>> > migration/qemu-file.c:648 +> >>> > +> >>> > #7 0x00007f658e3e66e5 in colo_receive_message (f=0x7f65907cb800, +> >>> > address@hidden) at migration/colo.c:244 +> >>> > +> >>> > #8 0x00007f658e3e681e in colo_receive_check_message (f=<optimized +> >>> > out>, address@hidden, +> >>> > address@hidden) +> >>> > +> >>> > at migration/colo.c:264 +> >>> > +> >>> > #9 0x00007f658e3e740e in colo_process_incoming_thread +> >>> > (opaque=0x7f658eb30360 <mis_current.31286>) at migration/colo.c:577 +> >>> > +> >>> > #10 0x00007f658be09df3 in start_thread () from /lib64/libpthread.so.0 +> >>> > +> >>> > #11 0x00007f65881983ed in clone () from /lib64/libc.so.6 +> >>> > +> >>> > (gdb) p ioc->name +> >>> > +> >>> > $2 = 0x7f658ff7d5c0 "migration-socket-incoming" +> >>> > +>
>>> > (gdb) p ioc->features Do not support QIO_CHANNEL_FEATURE_SHUTDOWN +> >>> > +> >>> > $3 = 0 +> >>> > +> >>> > +> >>> > (gdb) bt +> >>> > +> >>> > #0 socket_accept_incoming_migration (ioc=0x7fdcceeafa90, +> >>> > condition=G_IO_IN, opaque=0x7fdcceeafa90) at migration/socket.c:137 +> >>> > +> >>> > #1 0x00007fdcc6966350 in g_main_dispatch (context=<optimized out>) at +> >>> > gmain.c:3054 +> >>> > +> >>> > #2 g_main_context_dispatch (context=<optimized out>, +> >>> > address@hidden) at gmain.c:3630 +> >>> > +> >>> > #3 0x00007fdccb8a6dcc in glib_pollfds_poll () at util/main-loop.c:213 +> >>> > +> >>> > #4 os_host_main_loop_wait (timeout=<optimized out>) at +> >>> > util/main-loop.c:258 +> >>> > +> >>> > #5 main_loop_wait (address@hidden) at +> >>> > util/main-loop.c:506 +> >>> > +> >>> > #6 0x00007fdccb526187 in main_loop () at vl.c:1898 +> >>> > +> >>> > #7 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized +> >>> > out>) at vl.c:4709 +> >>> > +> >>> > (gdb) p ioc->features +> >>> > +> >>> > $1 = 6 +> >>> > +> >>> > (gdb) p ioc->name +> >>> > +> >>> > $2 = 0x7fdcce1b1ab0 "migration-socket-listener" +> >>> > +> >>> > +> >>> > May be socket_accept_incoming_migration should +> >>> > call qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN)?? +> >>> > +> >>> > +> >>> > thank you.
+> >>> > +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > Original Mail +> >>> > address@hidden +> >>> > address@hidden +> >>> > address@hidden@huawei.com> +> >>> > *Date:* 2017-03-16 14:46 +> >>> > *Subject:* *Re: [Qemu-devel] COLO failover hang* +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > On 03/15/2017 05:06 PM, wangguang wrote: +> >>> > > am testing QEMU COLO feature described here [QEMU +> >>> > > Wiki]( +http://wiki.qemu-project.org/Features/COLO +). +> >>> > > +> >>> > > When the Primary Node panic,the Secondary Node qemu hang. +> >>> > > hang at recvmsg in qio_channel_socket_readv. +> >>> > > And I run { 'execute': 'nbd-server-stop' } and { "execute": +> >>> > > "x-colo-lost-heartbeat" } in Secondary VM's +> >>> > > monitor,the Secondary Node qemu still hang at recvmsg . +> >>> > > +> >>> > > I found that the colo in qemu is not complete yet. +> >>> > > Do the colo have any plan for development? +> >>> > +> >>> > Yes, We are developing. You can see some of patch we pushing. +> >>> > +> >>> > > Has anyone ever run it successfully? Any help is appreciated! +> >>> > +> >>> > In our internal version can run it successfully, +> >>> > The failover detail you can ask Zhanghailiang for help.
+> >>> > Next time if you have some question about COLO, +> >>> > please cc me and zhanghailiang address@hidden +> >>> > +> >>> > +> >>> > Thanks +> >>> > Zhang Chen +> >>> > +> >>> > +> >>> > > +> >>> > > +> >>> > > +> >>> > > centos7.2+qemu2.7.50 +> >>> > > (gdb) bt +> >>> > > #0 0x00007f3e00cc86ad in recvmsg () from /lib64/libpthread.so.0 +> >>> > > #1 0x00007f3e0332b738 in qio_channel_socket_readv (ioc=<optimized +out>, +> >>> > > iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, +errp=0x0) at +> >>> > > io/channel-socket.c:497 +> >>> > > #2 0x00007f3e03329472 in qio_channel_read (address@hidden, +> >>> > > address@hidden "", address@hidden, +> >>> > > address@hidden) at io/channel.c:97 +> >>> > > #3 0x00007f3e032750e0 in channel_get_buffer (opaque=<optimized out>, +> >>> > > buf=0x7f3e05910f38 "", pos=<optimized out>, size=32768) at +> >>> > > migration/qemu-file-channel.c:78 +> >>> > > #4 0x00007f3e0327412c in qemu_fill_buffer (f=0x7f3e05910f00) at +> >>> > > migration/qemu-file.c:257 +> >>> > > #5 0x00007f3e03274a41 in qemu_peek_byte (address@hidden, +> >>> > > address@hidden) at migration/qemu-file.c:510 +> >>> > > #6 0x00007f3e03274aab in qemu_get_byte (address@hidden) at +> >>> > > migration/qemu-file.c:523 +> >>> > > #7 0x00007f3e03274cb2 in qemu_get_be32 (address@hidden) at +> >>> > > migration/qemu-file.c:603 +> >>> > > #8 0x00007f3e03271735 in colo_receive_message (f=0x7f3e05910f00, +> >>> > > address@hidden) at migration/colo.c:215 +> >>> > > #9 0x00007f3e0327250d in colo_wait_handle_message +(errp=0x7f3d62bfaa48, +> >>> > > checkpoint_request=<synthetic pointer>, f=<optimized out>) at +> >>> > > migration/colo.c:546 +> >>> > > #10 colo_process_incoming_thread (opaque=0x7f3e067245e0) at +> >>> > >
migration/colo.c:649 +> >>> > > #11 0x00007f3e00cc1df3 in start_thread () from /lib64/libpthread.so.0 +> >>> > > #12 0x00007f3dfc9c03ed in clone () from /lib64/libc..so.6 +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > > -- +> >>> > > View this message in context: +http://qemu.11.n7.nabble.com/COLO-failover-hang-tp473250.html +> >>> > > Sent from the Developer mailing list archive at Nabble.com. +> >>> > > +> >>> > > +> >>> > > +> >>> > > +> >>> > +> >>> > -- +> >>> > Thanks +> >>> > Zhang Chen +> >>> > +> >>> > +> >>> > +> >>> > +> >>> > +> >>> +> >> +> > -- +> > Dr. David Alan Gilbert / address@hidden / Manchester, UK +> > +> > . +> > +> + diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_addsubps b/results/classifier/zero-shot/006/semantic/gitlab_semantic_addsubps new file mode 100644 index 000000000..ba7fa933e --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_addsubps @@ -0,0 +1,33 @@ +semantic: 0.974 +device: 0.758 +other: 0.732 +graphic: 0.700 +vnc: 0.544 +boot: 0.465 +socket: 0.426 +network: 0.393 +KVM: 0.192 + +x86 SSE/SSE2/SSE3 instruction semantic bugs with NaN + +Description of problem +The result of SSE/SSE2/SSE3 instructions with NaN is different from the CPU. From Intel manual Volume 1 Appendix D.4.2.2, they defined the behavior of such instructions with NaN. But I think QEMU did not implement this semantic exactly because the byte result is different. + +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x000000007fffffff; push rax; mov rax, 0x00000000ffffffff; push rax; movdqu XMM1, [rsp];"); + asm("mov rax, 0x2e711de7aa46af1a; push rax; mov rax, 0x7fffffff7fffffff; push rax; movdqu XMM2, [rsp];"); + asm("addsubps xmm1, xmm2"); +} + +Execute and compare the result with the CPU.
This problem happens with other SSE/SSE2/SSE3 instructions specified in the manual, Volume 1 Appendix D.4.2.2. + +CPU xmm1[3] = 0xffffffff + +QEMU xmm1[3] = 0x7fffffff + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_adox b/results/classifier/zero-shot/006/semantic/gitlab_semantic_adox new file mode 100644 index 000000000..be39cec22 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_adox @@ -0,0 +1,46 @@ +semantic: 0.990 +graphic: 0.782 +device: 0.776 +vnc: 0.663 +boot: 0.599 +socket: 0.556 +network: 0.426 +other: 0.286 +KVM: 0.240 + +x86 ADOX and ADCX semantic bug +Description of problem +The result of instruction ADOX and ADCX are different from the CPU. The value of one of EFLAGS is different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("push 512; popfq;"); + asm("mov rax, 0xffffffff84fdbf24"); + asm("mov rbx, 0xb197d26043bec15d"); + asm("adox eax, ebx"); +} + + + +Execute and compare the result with the CPU. This problem happens with ADCX, too (with CF). + +CPU + +OF = 0 + + +QEMU + +OF = 1 + + + + + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_bextr b/results/classifier/zero-shot/006/semantic/gitlab_semantic_bextr new file mode 100644 index 000000000..ec7c57fb1 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_bextr @@ -0,0 +1,35 @@ +semantic: 0.993 +graphic: 0.790 +device: 0.717 +boot: 0.516 +vnc: 0.471 +socket: 0.397 +network: 0.219 +other: 0.099 +KVM: 0.091 + +x86 BEXTR semantic bug +Description of problem +The result of instruction BEXTR is different from the CPU. The value of destination register is different. I think QEMU does not consider the operand size limit.
+ +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x17b3693f77fb6e9"); + asm("mov rbx, 0x8f635a775ad3b9b4"); + asm("mov rcx, 0xb717b75da9983018"); + asm("bextr eax, ebx, ecx"); +} + +Execute and compare the result with the CPU. + +CPU +RAX = 0x5a + +QEMU +RAX = 0x635a775a + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsi b/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsi new file mode 100644 index 000000000..7a43b8907 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsi @@ -0,0 +1,30 @@ +semantic: 0.983 +graphic: 0.873 +device: 0.790 +socket: 0.764 +vnc: 0.756 +boot: 0.678 +network: 0.672 +other: 0.609 +KVM: 0.412 + +x86 BLSI and BLSR semantic bug +Description of problem +The result of instruction BLSI and BLSR is different from the CPU. The value of CF is different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("blsi rax, rbx"); +} + + + +Execute and compare the result with the CPU. The value of CF is exactly the opposite. This problem happens with BLSR, too. + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsmsk b/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsmsk new file mode 100644 index 000000000..db8658526 --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_blsmsk @@ -0,0 +1,37 @@ +semantic: 0.987 +device: 0.743 +graphic: 0.735 +vnc: 0.612 +socket: 0.607 +boot: 0.585 +network: 0.366 +other: 0.269 +KVM: 0.163 + +x86 BLSMSK semantic bug +Description of problem +The result of instruction BLSMSK is different from the CPU. The value of CF is different.
+ +Steps to reproduce + +Compile this code + +void main() { + asm("mov rax, 0x65b2e276ad27c67"); + asm("mov rbx, 0x62f34955226b2b5d"); + asm("blsmsk eax, ebx"); +} + +Execute and compare the result with the CPU. + +CPU + +CF = 0 + + +QEMU + +CF = 1 + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. diff --git a/results/classifier/zero-shot/006/semantic/gitlab_semantic_bzhi b/results/classifier/zero-shot/006/semantic/gitlab_semantic_bzhi new file mode 100644 index 000000000..672c4b22d --- /dev/null +++ b/results/classifier/zero-shot/006/semantic/gitlab_semantic_bzhi @@ -0,0 +1,48 @@ +semantic: 0.920 +graphic: 0.652 +device: 0.589 +vnc: 0.287 +boot: 0.220 +network: 0.203 +socket: 0.198 +other: 0.064 +KVM: 0.064 + +x86 BZHI semantic bug +Description of problem +The result of instruction BZHI is different from the CPU. The value of destination register and SF of EFLAGS are different. + +Steps to reproduce + +Compile this code + + +void main() { + asm("mov rax, 0xb1aa9da2fe33fe3"); + asm("mov rbx, 0x80000000ffffffff"); + asm("mov rcx, 0xf3fce8829b99a5c6"); + asm("bzhi rax, rbx, rcx"); +} + + + +Execute and compare the result with the CPU. + +CPU + +RAX = 0x80000000ffffffff +SF = 1 + + +QEMU + +RAX = 0xffffffff +SF = 0 + + + + + + +Additional information +This bug is discovered by research conducted by KAIST SoftSec. |