Question


How do I remove backup snapshots on a compute resource without getting an LVM I/O error?


Environment


All OnApp versions

LVM data store


Answer


OnApp performs backups in one of the following three ways, depending on the virtual server's operating system and the compute resource type. Incremental backups are not covered here:

Linux Backups on XEN Compute Resource (LVM Logical Volume Is Formatted With Filesystem)

  1.  An LVM snapshot is created. In /dev/mapper, the devices appear as follows:

    /dev/mapper/onapp-aaaaaaaa/ccccccccccc-real 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd-cow
  2. The LVM snapshot is mounted to a temporary directory.
  3. The data is tarred to a backup file (rsynced for incremental backups).
  4. The LVM snapshot is unmounted.
  5. The LVM snapshot is deleted.


Linux Backups on a KVM Compute Resource (Partition Is Created on LVM Logical Volume and This Partition Is Formatted With Filesystem)

  1. An LVM snapshot is created. In /dev/mapper, the devices appear as follows:

    /dev/mapper/onapp-aaaaaaaa/ccccccccccc-real 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd-cow
  2. kpartx -a -p X ... is used to create a device for the partition. In /dev/mapper, the device appears as follows:

    /dev/mapper/backup-ddddddddX1
  3. The partition is mounted to a temporary directory.
  4. The data is tarred to a backup file (rsynced for incremental backups).
  5. The partition is unmounted.
  6. kpartx -d -p X ... is used to delete the partition device.
  7. The LVM snapshot is deleted.


Windows Backups on XEN and KVM Compute Resource (Partition Is Created on LVM Logical Volume and This Partition Is Formatted With Filesystem)

  1. An LVM snapshot is created. In /dev/mapper, the devices appear as follows:

    /dev/mapper/onapp-aaaaaaaa/ccccccccccc-real 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd 
    /dev/mapper/onapp-aaaaaaaa/backup-dddddddd-cow
  2. kpartx -a -p X ... is used to create a device for the partition. In /dev/mapper, the device appears as follows:

    /dev/mapper/backup-ddddddddX1
  3. ntfsclone backs up data to the file.
  4. kpartx -d -p X ... is used to delete the partition device.
  5. The LVM snapshot is deleted.
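
The Linux on KVM sequence above can be sketched as a dry run that only prints commands and executes nothing. This is an illustration of the listed steps, not OnApp's actual implementation: the function name, the 1G snapshot size, the /mnt mount point, and the tar options are assumptions made for the example.

```shell
# Dry-run sketch of the "Linux on KVM" backup steps listed above.
# It only prints commands; nothing is created, mounted, or removed.
# All arguments, the 1G snapshot size, and the mount point are hypothetical.
print_kvm_backup_plan() {
    vg="$1"      # volume group, e.g. onapp-aaaaaaaa
    lv="$2"      # logical volume of the virtual server disk
    snap="$3"    # snapshot name, e.g. backup-dddddddd
    out="$4"     # backup file to write
    mnt="/mnt/$snap"                                  # assumed temporary directory
    echo "lvcreate -s -L 1G -n $snap /dev/$vg/$lv"    # 1. create the LVM snapshot
    echo "kpartx -a -p X /dev/$vg/$snap"              # 2. create the partition device
    echo "mkdir -p $mnt"
    echo "mount /dev/mapper/${snap}X1 $mnt"           # 3. mount the partition
    echo "tar -C $mnt -cf $out ."                     # 4. tar the data to a backup file
    echo "umount $mnt"                                # 5. unmount the partition
    echo "kpartx -d -p X /dev/$vg/$snap"              # 6. delete the partition device
    echo "lvremove -f /dev/$vg/$snap"                 # 7. delete the LVM snapshot
}
```

Reading the plan from bottom to top gives the order in which leftovers of an interrupted backup have to be cleaned up: unmount, delete the partition device, then remove the snapshot.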


To detect and remove the backup snapshots:

Detection of Used Snapshots

Running the following command on a compute resource or backup server shows the running backup processes. As described above, tar, ntfsclone, and rsync are the main processes involved in a backup:

# ps ax|grep -e 'tar' -e 'ntfsclone' -e 'rsync'
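
A plain grep for 'tar' also matches unrelated processes whose command line merely contains those letters, as well as the grep process itself. The following is a minimal sketch of a stricter filter that compares the exact command name. The function name filter_backup_processes is hypothetical and not part of OnApp.

```shell
# Reads "PID COMMAND" lines on stdin and prints only the processes whose
# command name is exactly tar, ntfsclone, or rsync.
filter_backup_processes() {
    awk '$2 == "tar" || $2 == "ntfsclone" || $2 == "rsync" { print $1, $2 }'
}

# Hypothetical usage on a compute resource or backup server:
#   ps -eo pid=,comm= | filter_backup_processes
```

An empty result suggests that no backup is currently running, so a leftover snapshot is less likely to be in use.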

To avoid an LVM I/O error after a backup snapshot is removed, the snapshot should be deleted on the compute resource where it is active.

Detection of Active Snapshots

Running this command on a compute resource or backup server will show all the snapshots that are currently marked active on the server:

# lvscan | grep ACTIVE | grep Snapshot
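
To get only the device paths, the lvscan output can be parsed with awk. This is a sketch that assumes the usual lvscan line layout, with the state first, then the volume type, then the path in single quotes. The function name list_active_snapshots is hypothetical. The match is case-insensitive and deliberately skips lines marked inactive.

```shell
# Reads lvscan output on stdin and prints the device path (the text between
# single quotes) of every snapshot that is marked active.
list_active_snapshots() {
    awk -F"'" '
        # $1 is everything before the first quote: state and volume type
        tolower($1) ~ /(^|[^a-z])active/ && tolower($1) ~ /snapshot/ { print $2 }
    '
}

# Hypothetical usage:
#   lvscan | list_active_snapshots
```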

The snapshots can be removed with:

# lvremove /dev/onapp-aaaaaaa/backup-bbbbbbb

lvremove can fail if the device is busy.

Sometimes the lvscan command above does not show active snapshots. In that case, run:

ls -al /dev/mapper

and look for the ...X1 and backup-... devices to get a complete list.
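
The listing can be narrowed to backup-related entries with a small filter. This sketch assumes that the relevant names either start with backup- (partition devices such as ...X1) or contain -backup- (snapshot devices, where LVM doubles the hyphens inside volume names). The function name filter_backup_devices is hypothetical.

```shell
# Reads device-mapper names (one per line) on stdin and prints the ones that
# look like backup snapshot devices or their partition mappings.
filter_backup_devices() {
    grep -E '(^|-)backup-'
}

# Hypothetical usage:
#   ls /dev/mapper | filter_backup_devices
```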

If lvremove failed because of a busy device error, you can do the following:

Linux VS on XEN Compute Resource

The snapshot /dev/onapp-aaaaaaa/backup-bbbbbbb may still be mounted. To check, run:

# mount

If the snapshot appears in the output, try unmounting it with:

# umount /dev/onapp-aaaaaaa/backup-bbbbbbb

Then, run lvremove again.
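
The mount table may list the snapshot under its /dev/mapper name, with doubled hyphens, rather than under /dev/onapp-aaaaaaa/backup-bbbbbbb. The sketch below searches for both spellings. It assumes input in /proc/mounts format, and the function name find_snapshot_mounts is hypothetical.

```shell
# $1 = snapshot LV name, e.g. backup-bbbbbbb
# Reads a mount table in /proc/mounts format on stdin and prints
# "device mountpoint" for every entry whose device matches the snapshot.
find_snapshot_mounts() {
    name="$1"
    # device-mapper spelling: every hyphen in the LV name is doubled
    dm_name="$(printf '%s' "$name" | sed 's/-/--/g')"
    awk -v a="$name" -v b="$dm_name" \
        'index($1, a) || index($1, b) { print $1, $2 }'
}

# Hypothetical usage:
#   find_snapshot_mounts backup-bbbbbbb < /proc/mounts
```

The second column of the output is the mount point, which can be passed to umount instead of the device path.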

Linux VS on KVM Compute Resource, or Windows VS

The partition device was not deleted (/dev/mapper/backup-bbbbbbbX1 still exists). Delete it with:

# kpartx -d -p X /dev/onapp-aaaaaaa/backup-bbbbbbb

Then, /dev/mapper/backup-bbbbbbbX1 should disappear, and you can run lvremove again.

Removing Backup Snapshots With I/O Errors

The steps below should be used only if the steps above do not work.

Typically, the errors look as follows:

  /dev/onapp-kagaedbrcmye6q/backup-tl3h1shnd3cre9: read failed after 0 of 4096 at 53687025664: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-tl3h1shnd3cre9: read failed after 0 of 4096 at 53687083008: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-tl3h1shnd3cre9: read failed after 0 of 4096 at 0: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-tl3h1shnd3cre9: read failed after 0 of 4096 at 4096: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-t6izl9vaqcar0o: read failed after 0 of 4096 at 751619211264: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-t6izl9vaqcar0o: read failed after 0 of 4096 at 751619268608: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-t6izl9vaqcar0o: read failed after 0 of 4096 at 0: Input/output error 
  /dev/onapp-kagaedbrcmye6q/backup-t6izl9vaqcar0o: read failed after 0 of 4096 at 4096: Input/output error 
  /dev/mapper/backup-t6izl9vaqcar0oX1: read failed after 0 of 4096 at 751616065536: Input/output error 
  /dev/mapper/backup-t6izl9vaqcar0oX1: read failed after 0 of 4096 at 751616122880: Input/output error 
  /dev/mapper/backup-t6izl9vaqcar0oX1: read failed after 0 of 4096 at 0: Input/output error 
  /dev/mapper/backup-t6izl9vaqcar0oX1: read failed after 0 of 4096 at 4096: Input/output error

To remove the errors, run:

# dmsetup remove backup-bbbbbbbbbX1 
# dmsetup remove onapp--aaaaaaaaa-backup--bbbbbbbbbb 
# dmsetup remove onapp--aaaaaaaaa-backup--bbbbbbbbbb-cow

Make sure that you run dmsetup remove on the backup snapshot devices only and not on any other device. Otherwise, it can cause data loss or corruption. If you are unsure, please contact the OnApp Support team.
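
To reduce the risk of removing the wrong device, the device-mapper name can be derived from the volume group and snapshot names and checked before it is passed to dmsetup remove. The sketch below relies on the LVM naming rule visible in the commands above: hyphens inside each name are doubled, and the two names are joined with a single hyphen. Both function names are hypothetical, and the pattern check is an assumption about how backup devices are named.

```shell
# Builds the device-mapper name for an LVM volume from its VG and LV names.
lv_to_dm_name() {
    vg="$(printf '%s' "$1" | sed 's/-/--/g')"
    lv="$(printf '%s' "$2" | sed 's/-/--/g')"
    printf '%s-%s\n' "$vg" "$lv"
}

# Succeeds only for names that look like backup snapshot devices:
# partition mappings (backup-...X1) or snapshot volumes (...-backup--...).
is_backup_dm_name() {
    case "$1" in
        backup-*X[0-9]*|*-backup--*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical usage:
#   name="$(lv_to_dm_name onapp-aaaaaaaaa backup-bbbbbbbbbb)"
#   is_backup_dm_name "$name" && echo "candidate for dmsetup remove: $name"
```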