Diffstat (limited to 'gitlab/issues_toml/target_missing/host_missing/accel_missing/1601.toml')
| -rw-r--r-- | gitlab/issues_toml/target_missing/host_missing/accel_missing/1601.toml | 88 |
1 files changed, 88 insertions, 0 deletions
diff --git a/gitlab/issues_toml/target_missing/host_missing/accel_missing/1601.toml b/gitlab/issues_toml/target_missing/host_missing/accel_missing/1601.toml
new file mode 100644
index 000000000..888c4123b
--- /dev/null
+++ b/gitlab/issues_toml/target_missing/host_missing/accel_missing/1601.toml
@@ -0,0 +1,88 @@
+id = 1601
+title = "QEMU Guest Agent (qga) high CPU usage (1 core at 100%). May happen with guest-network-get-interfaces. Strace says: EAGAIN (Resource temporarily unavailable)"
+state = "opened"
+created_at = "2023-04-13T16:11:33.310Z"
+closed_at = "n/a"
+labels = ["Guest Agent"]
+url = "https://gitlab.com/qemu-project/qemu/-/issues/1601"
+host-os = "Fedora 37"
+host-arch = "x86_64"
+qemu-version = "QEMU emulator version 7.0.0 (qemu-7.0.0-15.fc37)"
+guest-os = "Fedora 37"
+guest-arch = "x86_64"
+description = """I have a VM that has the QEMU guest agent installed. I use the QGA to get information periodically about the network interfaces. Meaning, I execute the `guest-network-get-interfaces` in a period around 1-2 seconds each.
+
+After a while (maybe a day or so) the QGA seems to lock up with the CPU at 100% in 1 core. It does not reply to more commands, and restarting the service sometimes doesn't work, so a hard reboot it is.
+
+`dmesg` doesn't show anything useful/relevant. When attempting to edit the `qemu-guest-agent.service` and append `/usr/bin/strace` to it, I can get this in a loop:
+
+```
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+strace[114154]: write(4, "{\\"return\\": [{\\"name\\": \\"lo\\", \\"ip-a"..., 2047) = -1 EAGAIN (Resource temporarily unavailable)
+```
+
+I don't have more knowledge to debug this further. I can help to provide more info if some guidance is provided.
+
+**Don't know if it helps/affects**, but the guest VM is running Docker with around 10 containers or so, so when QGA works, I get around 18 network interfaces, counting loopback, docker `veth`s and `br` interfaces."""
+reproduce = """1. Create a VM with Fedora 37
+2. Install the QEMU Guest Agent
+3. Call `guest-network-get-interfaces` in a loop every 1-2 seconds (after it finishes) through QGA using the unix socket using the provided python script, called as: `python qga.py --socket /run/test-vm-108.qga '{ "execute": "guest-network-get-interfaces" }'`
+4. Eventually, the guest agent will lock up at 100% CPU usage on 1 core"""
+additional = """Python script used to call QGA:
+```
+import argparse
+import socket
+import sys
+
+def main():
+    buf_size = 1024
+    timeout_secs = .5
+
+    parser = argparse.ArgumentParser()
+    parser.add_argument('--socket', required=True, help='Path to Unix socket')
+    parser.add_argument('request', help='Request to send')
+    args = parser.parse_args()
+
+    unix_socket_path = args.socket
+    request = args.request
+
+    try:
+        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
+            sock.settimeout(timeout_secs)
+            sock.connect(unix_socket_path)
+
+            request_bytes = request.encode('utf-8')
+            sock.sendall(request_bytes)
+
+            response_bytes = b''
+            received_bytes = sock.recv(buf_size)
+            response_bytes += received_bytes
+
+            sock.setblocking(False)
+            while True:
+                try:
+                    received_bytes = sock.recv(buf_size)
+                    if not received_bytes:
+                        break
+                    response_bytes += received_bytes
+                except (BlockingIOError, TimeoutError):
+                    break
+                except (FileNotFoundError, ConnectionRefusedError):
+                    sock.close()
+                    sys.exit()
+
+            response = response_bytes.decode('utf-8').strip()
+            print(response)
+
+    except (TimeoutError, FileNotFoundError, BlockingIOError, ConnectionRefusedError):
+        sys.exit()
+
+if __name__ == "__main__":
+    main()
+```"""