VS Hot Migration Fails after Upgrade to 6.0
Issue
After upgrading to version 6.0, hot migration fails for VSs with the following message:
Running: virsh migrate --live --unsafe byyjnxgstgywrm qemu+ssh://192.168.163.136:22/system?no_verify=1 tcp:192.168.163.136
error: internal error: process exited while connecting to monitor: 2018-10-30T11:02:45.478516Z qemu-kvm: unable to map backing store for guest RAM: Cannot allocate memory
Fatal: Execution of virsh migrate --live --unsafe byyjnxgstgywrm qemu+ssh://192.168.163.136:22/system?no_verify=1 tcp:192.168.163.136 failed
Environment
OnApp version 6.0
Resolution 1
Case 1
If you do not have any VSs migrated, the issue can be fixed in the following way:
Check the value of HugePages_Total on the destination compute resource:
# cat /proc/meminfo | grep HugePages_Total
HugePages_Total: 0
A value of 0 indicates that huge pages are disabled and hot migration is impossible.
Reserve some resources for the huge pages. Before VS migration, reserve the huge pages for the qemu process:
• Identify the size of a huge page by checking Hugepagesize in the /proc/meminfo file. For example:
# cat /proc/meminfo | grep Hugepagesize
Hugepagesize: 2048 kB
• Calculate the number of huge pages to reserve using the formula:
HPAGES=$(( GUEST_SIZE_IN_MB / 2 + 10 ))
Where:
GUEST_SIZE_IN_MB - the total amount of memory, in MB, of the VSs to be migrated. The divisor 2 is the size of a single huge page in MB.
For example:
You need to migrate two virtual servers with 1 GB of RAM each. Divide the amount of huge page memory that you need by the size of a huge page and add 10:
HPAGES = (1024 + 1024) / 2 + 10 = 1034
• Allocate the memory as huge pages by writing the number of huge pages to the /proc/sys/vm/nr_hugepages file:
# echo 1034 > /proc/sys/vm/nr_hugepages
Check that the huge page values are configured accordingly:
# cat /proc/meminfo | grep -i huge
AnonHugePages: 112640 kB
HugePages_Total: 1034
HugePages_Free: 1034
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
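The Case 1 calculation can be sketched as a small shell helper. This is only an illustration: the function name calc_hpages and its argument order are hypothetical, and it assumes the huge page size is passed in kB, as reported by Hugepagesize in /proc/meminfo.

```shell
#!/bin/sh
# calc_hpages PAGE_KB SIZE_MB...   (hypothetical helper, not part of OnApp)
# Prints the number of huge pages to reserve: the total guest RAM divided
# by the huge page size, plus the margin of 10 pages used in the formula.
calc_hpages() {
    page_kb=$1
    shift
    total_mb=0
    for size_mb in "$@"; do
        total_mb=$((total_mb + size_mb))
    done
    echo $((total_mb * 1024 / page_kb + 10))
}

# Two guests with 1024 MB each and 2048 kB huge pages
calc_hpages 2048 1024 1024   # prints 1034

# On a compute resource, the page size could be read from /proc/meminfo
# and the result written to nr_hugepages (run as root):
#   page_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo)
#   calc_hpages "$page_kb" 1024 1024 > /proc/sys/vm/nr_hugepages
```

Passing the page size explicitly avoids hard-coding the 2 MB divisor, so the same helper works if Hugepagesize reports a different value.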
Case 2
If you have already migrated some VSs and need to migrate one more, check how many huge pages are still free and, if necessary, add the missing number to the reserve. In this case, do the following:
Check the number of huge pages that are currently free:
# cat /proc/meminfo | grep HugePages_Free
HugePages_Free: 802
Multiply HugePages_Free by 2 (the size of a single huge page in MB). Example:
802 * 2 = 1604
A virtual server is migrated with no issues if its RAM does not exceed approximately 1600 MB.
If the virtual server RAM exceeds this value, the number of HugePages_Total must be increased:
# cat /proc/meminfo | grep HugePages_Total
HugePages_Total: 1029
# cat /proc/meminfo | grep HugePages_Free
HugePages_Free: 802
Calculate the new HugePages_Total value. Example: a VS with 2 GB of RAM requires 1024 free huge pages, but only 802 are available:
1024 - 802 = 222
222 is the number of pages that must be added to the reserve.
Set a new value:
New value for HugePages_Total = HugePages_Total + 222 + 10 = 1261
Here, 10 additional huge pages are kept as a reserve.
Run:
# echo 1261 > /proc/sys/vm/nr_hugepages
# cat /proc/meminfo | grep HugePages_Total
HugePages_Total: 1261
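The Case 2 arithmetic can also be expressed as a short shell function. This is a sketch under stated assumptions: the name new_hugepages_total and its arguments are hypothetical, and the page size defaults to 2 MB as in the example above.

```shell
#!/bin/sh
# new_hugepages_total TOTAL FREE VS_RAM_MB [PAGE_MB]   (hypothetical helper)
# Prints the HugePages_Total value needed to migrate one more VS.
# If enough pages are already free, the current total is printed unchanged;
# otherwise the missing pages plus a reserve of 10 are added.
new_hugepages_total() {
    total=$1
    free=$2
    vs_ram_mb=$3
    page_mb=${4:-2}
    needed=$((vs_ram_mb / page_mb))
    if [ "$needed" -le "$free" ]; then
        echo "$total"
    else
        echo $((total + needed - free + 10))
    fi
}

# Values from the example: 1029 total, 802 free, VS with 2048 MB of RAM
new_hugepages_total 1029 802 2048   # prints 1261
```

The printed value is what would be written to /proc/sys/vm/nr_hugepages on the destination compute resource.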
Resolution 2
You can specify how large the pool of huge pages can grow in the /proc/sys/vm/nr_overcommit_hugepages file if more huge pages than /proc/sys/vm/nr_hugepages are requested by applications. Specifying any non-zero value in this file indicates the number of "surplus" huge pages from the kernel's normal page pool that the hugetlb subsystem is allowed to obtain when the persistent huge page pool is exhausted. As these surplus huge pages become unused, they are freed back to the kernel's normal page pool.
Check the huge page overcommit value on the destination compute resource:
# cat /proc/sys/vm/nr_overcommit_hugepages
0
A value of 0 indicates that huge page overcommit is disabled.
Identify the size of a huge page by checking the Hugepagesize entry in the /proc/meminfo file. Example:
# cat /proc/meminfo | grep Hugepagesize
Hugepagesize: 2048 kB
Set the huge page overcommit to a value large enough to meet your peak huge page requests. This gives you dynamic huge page allocation in your environment.
For example:
You have a compute resource with 32 GB (32768 MB) of RAM, so you can set the huge page overcommit value to almost 80% of RAM (approximately 26000 MB). Divide this value by 2 (the size of a single huge page in MB). Use the following command:
# echo 13000 > /proc/sys/vm/nr_overcommit_hugepages
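The overcommit value from the example can be derived with a short calculation. The helper below is a minimal sketch: the name overcommit_pages and its parameters are hypothetical, and the percentage is applied to whatever RAM figure you pass in.

```shell
#!/bin/sh
# overcommit_pages RAM_MB PERCENT PAGE_KB   (hypothetical helper)
# Prints the number of huge pages covering PERCENT of RAM_MB,
# for a huge page size of PAGE_KB as reported by Hugepagesize.
overcommit_pages() {
    ram_mb=$1
    percent=$2
    page_kb=$3
    echo $((ram_mb * percent / 100 * 1024 / page_kb))
}

# The rounded figure from the example: 26000 MB with 2048 kB pages
overcommit_pages 26000 100 2048   # prints 13000

# Exactly 80% of 32768 MB gives a slightly larger value
overcommit_pages 32768 80 2048    # prints 13107
```

Either result could then be written to /proc/sys/vm/nr_overcommit_hugepages as shown in the command above.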
Cause
In previous versions, VSs could be booted with huge pages enabled. In OnApp 5.10, huge pages were removed, so the resources cannot be allocated on the destination compute resource. To enable transparent huge pages for CentOS 6 and CentOS 7 CloudBoot KVM compute resources, add the following command to the custom config file:
echo always > /sys/kernel/mm/transparent_hugepage/enabled
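To confirm the setting took effect, you can read the same sysfs file back: it lists the available modes and marks the active one with square brackets. The snippet below is a small sketch for extracting the active mode; the function name thp_mode is hypothetical.

```shell
#!/bin/sh
# thp_mode   (hypothetical helper)
# Reads the contents of /sys/kernel/mm/transparent_hugepage/enabled on
# standard input and prints the mode enclosed in square brackets.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Example with a sample line in the format the kernel uses
echo 'always madvise [never]' | thp_mode   # prints never

# On a compute resource:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

After the command from the custom config file has run, the helper is expected to print always.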