|
files: 0.829
device: 0.798
performance: 0.794
boot: 0.786
vnc: 0.721
permissions: 0.715
semantic: 0.713
PID: 0.710
network: 0.701
graphic: 0.694
other: 0.654
debug: 0.644
KVM: 0.626
socket: 0.605
file transfer over cifs to 64bit guest corrupts large files
qemu 4.0 compiled from source.
vm called by
qemu-system-x86_64 -cpu qemu64 -smp 4 -m 4G -drive file=/data/images/slack14.2_64bit_test.qcow2,format=qcow2 -cdrom /mnt/smb1/slackware/iso/slackware64-14.2-install-dvd.iso -boot c -net nic,macaddr=02:00:00:11:11:17,model=i82551 -net bridge,br=br0 -enable-kvm -k en-gb -display vnc=:3 -monitor telnet:localhost:7103,server,nowait,nodelay
Copying large files (e.g. 2.4 GB), or reading them on a cifs mount in the guest, causes corruption every time. For smaller files (40-60 MB), corruption occurs more than 50% of the time. Tested by comparing md5sum on the cifs server, or on the host machine, against md5sum on the guest vm.
Corruption is seen only with a 64bit guest using cifs with the i82551 emulated network device,
i.e. a 32bit guest using cifs with the i82551 emulated network device gives no corruption.
changing the emulated device to vmxnet3 removes the data corruption (see below)
qemu-system-x86_64 -cpu qemu64 -smp 4 -m 4G -drive file=/data/images/slack14.2_64bit_test.qcow2,format=qcow2 -cdrom /mnt/smb1/slackware/iso/slackware64-14.2-install-dvd.iso -boot c -net nic,macaddr=02:00:00:11:11:17,model=vmxnet3 -net bridge,br=br0 -enable-kvm -k en-gb -display vnc=:3 -monitor telnet:localhost:7103,server,nowait,nodelay
This corruption is repeatable, i.e. I created a new vm, started it using the first command line above, installed 64bit linux, mounted the cifs share, copied a 2.4 GB file to /tmp and then ran md5sum "filecopied".
The md5sum is different every time. Copying the same file to the host, or to a 32bit guest with the same virtual network device and bridge, gives correct md5sums. The host physical network adapter is
lspci|grep Ether
1e:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 11)
physically connected via gigabit ethernet to cifs server (via gigabit switch)
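A mismatching md5sum only says that the copy is wrong, not where. As a rough sketch (not something used in the original report), a block-by-block comparison of a known-good copy against a corrupted one could show whether the damage clusters at particular offsets. The function name `differing_blocks` and the 4096-byte block size are assumptions chosen for illustration.

```python
BLOCK_SIZE = 4096  # assumed comparison granularity; any size can be used


def differing_blocks(good_path, suspect_path, block_size=BLOCK_SIZE):
    """Return the byte offsets of all blocks that differ between two files."""
    offsets = []
    with open(good_path, "rb") as good, open(suspect_path, "rb") as suspect:
        offset = 0
        while True:
            a = good.read(block_size)
            b = suspect.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                offsets.append(offset)  # also catches a length mismatch
            offset += block_size
    return offsets
```

Called as `differing_blocks("/path/to/good", "/tmp/filecopied")`, it returns an empty list when the files are identical.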
On Fri, Jun 05, 2020 at 12:30:39PM -0000, timsoft wrote:
> Public bug reported:
Not a solution but some comments:
1. As a sanity-check you could try "nc <guest-ip> 1234 </path/to/file" on
the host and "nc -l -p 1234 >/tmp/file" in the guest. Netcat simply
sends/receives data over a TCP connection (it's a much simpler test
than CIFS). Is the checksum okay?
2. I don't know the CIFS network protocol, but if Wireshark can dissect
it then you could compare the flows between the vmxnet3 and the
i82551. This is only feasible if Wireshark can produce an unencrypted
conversation and the CIFS protocol doesn't have many protocol header
fields that differ between two otherwise identical sessions.
3. virtio-net is the most widely used and high-performance NIC model.
Other emulated NIC models are mainly there for very old guests that
lack virtio guest drivers.
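Where netcat is not available, the sanity check from point 1 could be approximated with a few lines of Python. This is only a sketch of the same idea (one plain TCP stream, then compare checksums); the helper names `md5_of`, `receive_file` and `send_file` are made up for illustration.

```python
import hashlib
import socket
import threading  # only needed when both ends run in one process


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def receive_file(listener, out_path):
    """Accept one connection and write all received bytes (like `nc -l`)."""
    conn, _ = listener.accept()
    with conn, open(out_path, "wb") as out:
        while True:
            data = conn.recv(65536)
            if not data:
                break  # sender closed the connection
            out.write(data)


def send_file(host, port, path):
    """Stream a file over one TCP connection (like `nc host port <file`)."""
    with socket.create_connection((host, port)) as sock, open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            sock.sendall(chunk)
```

In the guest, `receive_file` would be given a socket that is bound and listening on the chosen port (1234 in the netcat example); on the host, `send_file` is called with the guest IP. Comparing `md5_of` on both sides then plays the role of md5sum.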
Thanks for the suggestion. I tried using netcat (nc) to transfer a large file from host to guest, and also from fileserver to guest, with the problematic i82551 emulated network adapter on the host, and the files transferred reliably (correct md5sum in 3 out of 3 attempts).
I also tried md5sum of the same file mounted on the guest fs as before, and it still corrupts the data.
This seems to imply there is something in the cifs implementation which reacts adversely with this particular combination of virtual network hardware; the fact that it works with the vmxnet3 emulated card would support that conclusion.
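To put a number on "still corrupts", the same file on the cifs mount could be hashed several times and the digests tallied. The sketch below is an illustration only; `digest_counts` is a hypothetical helper. Note that repeated reads in the guest may be served from the page cache, so remounting the share or dropping caches between runs may be needed for each run to go over the network.

```python
import hashlib
from collections import Counter


def digest_counts(path, runs=5, chunk_size=1 << 20):
    """Hash the file `runs` times and count how often each MD5 digest occurs."""
    counts = Counter()
    for _ in range(runs):
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        counts[digest.hexdigest()] += 1
    return counts
```

A healthy read path yields a single digest with a count equal to `runs`; several distinct digests indicate that the data changes between reads.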
On Wed, Jun 17, 2020 at 02:55:55PM -0000, timsoft wrote:
> Thanks for the suggestion. I tried using netcat (nc) to transfer a large file from host to guest, and also from fileserver to guest, with the problematic i82551 emulated network adapter on the host, and the files transferred reliably (correct md5sum in 3 out of 3 attempts).
> I also tried md5sum of the same file mounted on the guest fs as before, and it still corrupts the data.
> This seems to imply there is something in the cifs implementation which reacts adversely with this particular combination of virtual network hardware; the fact that it works with the vmxnet3 emulated card would support that conclusion.
I'm not sure if someone will look into it because the eepro100
(i82551) NIC device is old and not used much nowadays.
However, if someone does decide to investigate and wants to brainstorm
debugging ideas or needs help, feel free to contact me.
The QEMU project is currently moving its bug tracking to another system.
For this we need to know which bugs are still valid and which could be
closed already. Thus we are setting older bugs to "Incomplete" now.
If the bug has already been fixed in the latest upstream version of QEMU,
then please close this ticket as "Fix released".
If it is not fixed yet and you think that this bug report here is still
valid, then you have two options:
1) If you already have an account on gitlab.com, please open a new ticket
for this problem in our new tracker here:
https://gitlab.com/qemu-project/qemu/-/issues
and then close this ticket here on Launchpad (or let it expire
automatically after 60 days). Please mention the URL of this bug ticket on
Launchpad in the new ticket on GitLab.
2) If you don't have an account on gitlab.com and don't intend to get
one, but still would like to keep this ticket opened, then please switch
the state back to "New" within the next 60 days (otherwise it will get
closed as "Expired"). We will then eventually migrate the ticket
automatically to the new system (but you won't be the reporter of the bug
in the new system and thus won't get notified on changes anymore).
Thank you and sorry for the inconvenience.
[Expired for QEMU because there has been no activity for 60 days.]
|