Integrated Storage Auto Healing

OnApp introduces auto healing, an auto-scheduling option to repair degraded vdisks. This functionality can be used only if there are no serious issues with Integrated Storage. The following conditions must be met (they can be checked using the compute zone diagnostics):

  • No disks with partial memberlist found 
  • No disks with no stripe replicas found 
  • No disks with no redundancy found
  • No partially online disks found 
  • No disks in other degraded states found
  • No partial nodes found 
  • No inactive nodes found 
  • No nodes with delayed ping found 
  • No nodes with high utilization found 
  • No out of space nodes found 
  • No inactive controllers found 
  • No unreferenced NBDs found 
  • No reused NBDs found 
  • No dangling device mappers found
  • No disks with inactive cache
  • No stale cache volumes

It is recommended to disable auto healing before upgrading Integrated Storage.



Configure auto healing for data store



To enable auto healing for a data store:

  1. Go to your Control Panel > Admin > Settings menu.
  2. Click the Data Stores icon. You'll see a list of the data stores on your system.
  3. Click the Actions button next to the data store you want to change, then click Edit.
  4. Move the Auto Healing slider to the right to enable auto healing. 
  5. Click the Save Data Store button to finish.

To disable auto healing for a data store:

  1. Go to your Control Panel > Admin > Settings menu.
  2. Click the Data Stores icon. You'll see a list of the data stores on your system.
  3. Click the Actions button next to the data store you want to change, then click Edit.
  4. Move the Auto Healing slider to the left to disable auto healing. 
  5. Click the Save Data Store button to finish.

The auto healing script runs every hour on each Integrated Storage data store marked as auto_healing. For each data store where it is enabled, auto healing repairs degraded disks one by one. Auto healing will not proceed under the following conditions:

  • if diagnostics fails
  • if there are active repair or rebalance transactions
  • if there are no degraded disks in datastore
  • if a disk's last repair transaction failed within the last 24 hours (auto healing will not try to repair that disk)
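
The skip conditions above can be read as a small decision procedure. The following Python sketch only illustrates that reading; it is not OnApp's implementation. The `Disk` and `DataStore` classes, their field names, and the `next_disk_to_repair` function are hypothetical, and treating "last 24 hours" as a strict cooldown window is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

# Assumption: a failed repair blocks retries for a full 24 hours.
RETRY_COOLDOWN = timedelta(hours=24)


@dataclass
class Disk:
    """Hypothetical model of a vdisk, for illustration only."""
    name: str
    degraded: bool
    last_failed_repair: Optional[datetime] = None


@dataclass
class DataStore:
    """Hypothetical model of an Integrated Storage data store."""
    auto_healing: bool
    diagnostics_ok: bool
    active_transactions: bool  # a repair or rebalance is in progress
    disks: List[Disk] = field(default_factory=list)


def next_disk_to_repair(store: DataStore, now: datetime) -> Optional[Disk]:
    """Return the next disk to repair, or None if this hourly run is skipped."""
    if not store.auto_healing:
        return None
    if not store.diagnostics_ok:
        return None
    if store.active_transactions:
        return None
    for disk in store.disks:
        if not disk.degraded:
            continue
        failed_at = disk.last_failed_repair
        if failed_at is not None and now - failed_at < RETRY_COOLDOWN:
            continue  # repair failed recently, leave this disk alone
        return disk  # disks are repaired one at a time
    return None  # no degraded disk is eligible
```

Returning a single disk per call mirrors the one-by-one repair behaviour described above: the next hourly run would pick up the following eligible disk.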



Emails about auto healing events



You will receive the following email notifications about the auto healing process:

  • hourly emails about degraded vdisks
  • if auto healing is impossible because of issues with Integrated Storage, you will receive an email with the following text: "Degraded vdisks found, but there are problems with Integrated Storage and Auto Healing will not start until you log in and investigate/repair the problems."
  • if auto healing is running, you will receive an email with the following text: "Degraded vdisks found", and auto healing will start processing the list of degraded vdisks.
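
As a rough illustration of which message to expect, the sketch below maps the two cases above to the quoted email texts. The function name `notification_text` and its parameters are hypothetical, and returning no message when there are no degraded vdisks is an assumption rather than documented behaviour.

```python
from typing import Optional

# Email texts as quoted in this section.
BLOCKED_TEXT = (
    "Degraded vdisks found, but there are problems with Integrated Storage "
    "and Auto Healing will not start until you log in and investigate/repair "
    "the problems."
)
STARTED_TEXT = "Degraded vdisks found"


def notification_text(degraded_count: int, storage_healthy: bool) -> Optional[str]:
    """Return the expected email text, or None when nothing is degraded."""
    if degraded_count == 0:
        return None  # assumption: no degraded vdisks means no email
    if storage_healthy:
        return STARTED_TEXT  # auto healing starts processing the vdisks
    return BLOCKED_TEXT  # Integrated Storage issues block auto healing
```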