During the backup process, we take a snapshot of the virtual disk to be backed up. The snapshot occupies 10% of the virtual disk size in the data store. Thus, to make a backup, you need free space of at least 10% of the virtual disk size on the nodes where the virtual disk is allocated.
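As a quick illustration of the 10% rule, the following sketch computes the minimum free space for a given disk size. The function name snapshot_space_gb is hypothetical, and working in whole gigabytes rounded up is an assumption made for simplicity.

```shell
# Minimum free space (in GB, rounded up) needed for a backup snapshot,
# based on the 10% figure stated above.
snapshot_space_gb() {
    disk_gb=$1
    # Integer ceiling of disk_gb * 10 / 100.
    echo $(( (disk_gb * 10 + 99) / 100 ))
}

snapshot_space_gb 250   # a 250 GB virtual disk needs at least 25 GB free
```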
Zombie disks are disks that do not show up in the OnApp database. Typically they are snapshots taken of a VS's disk that OnApp uses to make backups. They become zombie disks after a backup process completes or fails, and the system cannot remove them.
Clean Up Zombie Disk
To remove zombie disks, you can use the "Some zombie disks found clean up option (Delete ALL)" in the Disk Health section.
If that fails, backend cleanup will be needed. This can be done from any compute resource or backup server in the zone.
1. Look up the zombie disk info:
onappstore diskinfo uuid=<identifier> readable=true
Then look up the "parent" line:
parent = xxxxxxxxxxxx
If the disk has the parent parameter, then it is a backup snapshot (zombie disk), and it is safe to remove it.
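The parent check can also be scripted. This is a minimal sketch that assumes the diskinfo output contains a line in the "parent = value" form shown above; the helper name has_parent is hypothetical.

```shell
# Exit status 0 if the diskinfo text read from stdin contains a non-empty
# "parent = ..." line, non-zero otherwise.
has_parent() {
    grep -Eq '^[[:space:]]*parent[[:space:]]*=[[:space:]]*[^[:space:]]'
}

# Sample line taken from the step above, for illustration only.
if printf '%s\n' 'parent = xxxxxxxxxxxx' | has_parent; then
    echo "backup snapshot: has a parent"
else
    echo "no parent line found"
fi
```

On a compute resource or backup server, you would pipe the real output into it, for example: onappstore diskinfo uuid=<identifier> readable=true | has_parent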
2. Once you have found the snapshot identifier, you can turn it off:
onappstore offline uuid=<identifier>
Turning off the disk may fail with an error.
onappstore offline uuid=xxxxxxxxxxx
result=FAILURE error=onappstore offlineVDisk xxxxxxxxx failed on frontend 2956989790 with error map:  and optional error: API call failed for a subset of nodes. Failures: [('2956989790', u'Failed to detect device mapper node [/dev/mapper/4msa57bjlxkciz] disappear. out:, err:None')] completion_time=51
You can find a possible solution here.
About 90% of the time, the disk fails to go offline because of stale mounts, device mappers, or stuck processes, which you can find and remove by following the link provided.
The error log above gives you the identifier of the compute resource or backup server where the issue is: it is the number after "failed on frontend" (2956989790 in this example).
To find which resource this is, run the following command on all compute resources/backup servers:
onappstore getid | grep <frontend identifier>
Example: onappstore getid | grep 2956989790
This will output something like this on the correct resource:
unicasthosts= backends= ipaddr=10.10.10.10 result=SUCCESS uuid=xxxxxxxxxx completion_time=0
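If you prefer not to copy the frontend identifier by hand, it can be extracted from the error text. This sketch assumes the error message has the "failed on frontend <number>" form shown in the log above; the helper name frontend_id is hypothetical.

```shell
# Print the number that follows "failed on frontend" in text read from stdin.
frontend_id() {
    sed -n 's/.*failed on frontend \([0-9][0-9]*\).*/\1/p'
}

# Shortened copy of the error from the log above.
err='result=FAILURE error=onappstore offlineVDisk xxxxxxxxx failed on frontend 2956989790 with error map:'
printf '%s\n' "$err" | frontend_id   # prints 2956989790
```

The printed value can then be used in the lookup command from this step, for example: onappstore getid | grep 2956989790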
Then SSH to that resource (or stay where you are, if you are already on it) and check for stuck processes, device mappers, or mounts, as described in the document.
3. Once you have cleaned up all the stuck processes keeping the vDisk active, you may turn off the disk:
onappstore offline uuid=<identifier>
4. Remove the zombie disk with the following command:
onappstore delete uuid=<identifier>
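The whole procedure can be sketched as a single shell function. This is an illustration, not an official OnApp tool: the function name remove_zombie_disk is hypothetical, and it assumes that diskinfo prints a "parent = ..." line for snapshots and that a failed offline prints result=FAILURE, as in the examples above. It stops at the first problem so that you can clean up stuck processes, device mappers, or mounts manually before retrying.

```shell
# Sketch: verify, take offline, and delete one zombie disk.
# Uses only the onappstore commands shown in the steps above.
remove_zombie_disk() {
    uuid=$1

    # Step 1: continue only if the disk has a parent, i.e. it is a snapshot.
    if ! onappstore diskinfo uuid="$uuid" readable=true |
            grep -q '^ *parent *= *[^ ]'; then
        echo "refusing: $uuid has no parent line" >&2
        return 1
    fi

    # Steps 2-3: take the disk offline; stop if the command reports a failure
    # (assumption: failures contain result=FAILURE, as in the log above).
    out=$(onappstore offline uuid="$uuid")
    case $out in
        *result=FAILURE*)
            echo "$out" >&2
            return 1
            ;;
    esac

    # Step 4: remove the zombie disk.
    onappstore delete uuid="$uuid"
}
```

Call it with the snapshot identifier, for example: remove_zombie_disk <identifier>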