IO delays on live migration lv initialization
Description of problem:
Hi,

When I live migrate a VM via Proxmox and the destination is an LVM thin pool, I see that the disk is first zero-initialized at the start of the copy.

This causes the thin volume to become 100% allocated right away, which then needs to be discarded afterwards. Not ideal, but ...
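For reference, the over-allocation is easy to confirm on the destination host with lvs, and to undo from inside the guest with fstrim (the commands below are just a sketch against my setup):

```
# On the destination host: Data% of the migrated volume jumps to 100
lvs -o lv_name,pool_lv,data_percent

# Inside the guest afterwards: hand the zeroed blocks back to the pool
fstrim -av
```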

The more annoying thing is that this initialization step saturates disk IO: in iotop I see it writing over 1000 MB/s. The nasty side effect is that other VMs on that host are negatively affected. The host is not completely locked up, I can still ssh in and look around, but storage-intensive workloads see noticeable delays, with e.g. HTTP requests timing out, and even a simple ls command, normally instant, can take 10+ seconds.
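That number came from watching iotop during a migration; something like the following is enough to reproduce the observation (standard iotop options):

```
# -o: only processes currently doing IO, -P: per process, -a: accumulated
iotop -oPa
# the incoming QEMU process shows sustained writes of ~1000 MB/s
```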


I've previously reported this on the [proxmox forum](https://forum.proxmox.com/threads/io-delays-on-live-migration-lv-initialization.132296/#post-582050), but the conclusion there was that this is QEMU behavior.

> The zeroing happens, because of what QEMU does when the discard option is enabled:


When I disable discard for the VM disk, the volume is not pre-initialized during migration, but giving up discard defeats the purpose of having an LVM thin pool.
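For the test I flipped the disk option; a sketch of how that can be done with qm set, assuming the VM ID 301 and the scsi0 line from the config below (discard defaults to ignore in Proxmox):

```
# Disable discard on the disk; migration then skips the pre-initialization
qm set 301 --scsi0 thin_pool_hwraid:vm-301-disk-0,discard=ignore,format=raw,size=16192M
```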

For the (disk) migration itself I can set a bandwidth limit ... could we have something similar for the initialization step?
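The existing limit I mean is the per-operation bwlimit in datacenter.cfg (also available as --bwlimit on qm migrate); a hypothetical initialization limit would sit next to it. As a sketch of the current state:

```
# /etc/pve/datacenter.cfg -- existing per-operation limits, in KiB/s
bwlimit: migration=102400,move=102400
# nothing here covers the zero-initialization phase on the target
```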


Even better would be to not initialize at all when using LVM thin. As far as I understand it, new blocks allocated by LVM thin should always be empty (read back as zeros).
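That understanding can be checked on the pool itself: thin pools carry a zeroing attribute that controls whether newly provisioned blocks are zeroed before first use, visible via lvs (field names per the lvm docs; pool names are whatever your setup uses):

```
# List thin pools and whether they zero newly allocated blocks themselves
lvs -o lv_name,segtype,zero
```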
Steps to reproduce:
1. Live migrate a VM with a large disk.
2. Watch iotop on the destination host: just before the disk contents are transferred, you will see far more write IO than the network could possibly deliver.
3. Check another VM on the destination host: reading from disk will be significantly slower than normal.
Additional information:
An example VM config:
```
agent: 1,fstrim_cloned_disks=1
balloon: 512
bootdisk: scsi0
cores: 6
ide2: none,media=cdrom
memory: 8196
name: ...
net0: virtio=...,bridge=...
numa: 0
onboot: 1
ostype: l26
scsi0: thin_pool_hwraid:vm-301-disk-0,discard=on,format=raw,size=16192M
scsi1: thin_pool_hwraid:vm-301-disk-1,discard=on,format=raw,size=26G
scsihw: virtio-scsi-pci
serial0: socket
smbios1: uuid=...
sockets: 1
```