During the backup process, we take a snapshot of the virtual disk being backed up. The snapshot occupies 10% of the virtual disk size in the data store. To make a backup, you therefore need free space of at least 10% of the virtual disk size on the nodes where the virtual disk is allocated.
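
As a rough pre-check, the 10% rule can be expressed as a small calculation. The sketch below uses a hypothetical helper, snapshot_space_gb (not part of OnApp), and assumes the disk size is given in whole gigabytes; it rounds the result up.

```shell
#!/bin/sh
# Hypothetical helper: minimum free space (in GB) needed for a backup
# snapshot, using the 10% rule described above. Rounds up to a whole GB.
snapshot_space_gb() {
    disk_gb=$1
    # Integer ceiling of disk_gb * 10 / 100
    echo $(( (disk_gb * 10 + 99) / 100 ))
}

# Example: a 250 GB virtual disk needs at least 25 GB free
snapshot_space_gb 250
```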



Zombie Disks

Zombie disks are disks that do not show up in the OnApp database. Typically, they are snapshots of a VS's disk that OnApp takes to make backups. They become zombie disks when a backup process completes or fails and the system cannot remove them.

Clean Up Zombie Disk

To remove zombie disks, use the "Some zombie disks found" clean-up option in the Disk Health section at Storage > label > Diagnostics.

If that fails, backend cleanup will be needed. This can be done from any compute resource or backup server in the zone.

  1. Look up the zombie disk info:
onappstore diskinfo uuid=<identifier> readable=true

Then look up the "parent" line:

parent = xxxxxxxxxxxx

If the disk has the parent parameter, then it is a backup snapshot (zombie disk), and it is safe to remove it.
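
The parent check can also be scripted. This is a minimal sketch, assuming the diskinfo output contains a line shaped like the "parent = xxxxxxxxxxxx" sample above; has_parent is a hypothetical helper name, not an onappstore command.

```shell
#!/bin/sh
# Hypothetical helper: reads diskinfo output on stdin and succeeds (exit 0)
# only if it contains a non-empty "parent = ..." line.
# Assumption: the line looks like the "parent = xxxxxxxxxxxx" sample above.
has_parent() {
    grep -Eq '^[[:space:]]*parent[[:space:]]*=[[:space:]]*[^[:space:]]+'
}

# Usage on a compute resource (replace <identifier> with the disk identifier):
#   onappstore diskinfo uuid=<identifier> readable=true | has_parent \
#       && echo "parent found: backup snapshot" \
#       || echo "no parent line found"
```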


2. Once you have found the snapshot identifier, you can turn it off:

onappstore offline uuid=<identifier>


Turning off the disk may fail with an error. 


Sample error:

[root@0.0.0.00 ~]# onappstore offline uuid=xxxxxxxxxxx
result=FAILURE error=onappstore offlineVDisk xxxxxxxxx failed on frontend 2956989790 with error map: [] and optional error: API call failed for a subset of nodes. Failures: [('2956989790', u'Failed to detect device mapper node [/dev/mapper/4msa57bjlxkciz] disappear. out:, err:None')] completion_time=51

You can find the possible solution here.

In about 90% of cases, the disk fails to go offline because of stale mounts, device mappers, or stuck processes, which can be found and removed by following the link provided.

If the stuck process has /bdev in its process line, it is best to open a support ticket.

The error log above includes the identifier of the compute resource or backup server where the issue is: frontend 2956989790
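
If you have the error text saved, the identifier can be pulled out automatically. The sketch below assumes the message follows the format of the sample error above; frontend_id is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads an onappstore error message on stdin and prints
# the numeric identifier that follows "failed on frontend".
# Assumption: the message follows the format of the sample error above.
frontend_id() {
    sed -n 's/.*failed on frontend \([0-9][0-9]*\).*/\1/p'
}

# Example with a shortened copy of the sample error; prints 2956989790
echo 'result=FAILURE error=onappstore offlineVDisk xxxxxxxxx failed on frontend 2956989790 with error map: []' | frontend_id
```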

To find which resource this is, run the following command on all compute resources/backup servers:

onappstore getid | grep <frontend identifier>

Example: onappstore getid | grep 2956989790

This will output something like this on the correct resource:

>unicasthosts= backends= ipaddr=10.10.10.10 result=SUCCESS uuid=xxxxxxxxxx completion_time=0

Then SSH to that resource (unless you are already on it) and check for stuck processes, device mappers, or mounts, as described in the document.
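
Running the getid check by hand on every resource can be tedious, so the search could be scripted along these lines. This is only a sketch: find_frontend_host is a hypothetical helper, and it assumes key-based root SSH access to each resource and a list of addresses that you supply yourself.

```shell
#!/bin/sh
# Hypothetical helper: runs "onappstore getid" on each listed host over SSH
# and prints the first host whose output contains the frontend identifier.
# Assumptions: key-based root SSH access; host addresses supplied by you.
find_frontend_host() {
    id=$1
    shift
    for host in "$@"; do
        # -n keeps ssh from reading stdin inside the loop
        if ssh -n "root@$host" onappstore getid 2>/dev/null | grep -q -- "$id"; then
            echo "$host"
            return 0
        fi
    done
    # No listed host reported the identifier
    return 1
}

# Usage (addresses are placeholders):
#   find_frontend_host 2956989790 10.10.10.10 10.10.10.11
```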


3. Once you have cleaned up all the stuck processes keeping the vDisk active, you may turn off the disk:

onappstore offline uuid=<identifier>


4. Remove the zombie disk with the following command:

onappstore delete uuid=<identifier>