Replacing an Integrated Storage (IS) disk drive is a step-by-step procedure that swaps one drive for another. You can replace an IS disk drive either via the UI or via the CLI.

Before you begin

Ensure that all content on the storage node is redundant and that the node does not hold the last copy of any existing virtual disk. You can check this using one of the following methods:

  • Run the Storage Health Check, which is available on the Diagnostics page of your Control Panel UI.
  • Run the getdegradedvdisks command directly on a backup server or on the compute resource where the drive should be replaced.

After you check the content on the storage node, follow one of the procedures below.


Replace IS Disk Drive via UI


  1. Forget the storage node that corresponds to the disk drive you need to replace. Go to your Control Panel > Integrated Storage > Nodes menu. Click the Actions icon for the selected node and click Forget.
  2. Unassign the disk drive from the Integrated Storage Controller:
    • Go to Settings > Compute Resources icon and click a label of the destination compute resource.
    • Click Tools and then click Manage Devices.
    • On the Devices page, click Edit Device Configuration.
    • In the Disks section, click Unassigned next to Status for a destination disk.
    • Click Next and then click Finish.
  3. Replace the disk physically on the compute resource. If the compute resource does not support live disk hot-plugging at the physical layer, the vDisks on its other storage nodes must be redundant, the virtual servers must be migrated off the compute resource, and the compute resource must be put into maintenance mode.
  4. Assign the new disk drive to the Integrated Storage Controller:
    • Go to Settings > Compute Resources icon and click the label of a destination compute resource.
    • Click Tools and then click Manage Devices.
    • On the Devices page, click Edit Device Configuration.
    • In the Disks section, click Assign and select the Format checkbox next to Status for a destination disk.
    • Click Next and then click Finish.

Sometimes when you try to assign drives to Storage Controllers, the transaction fails with an error such as result=FAILURE error=Connection refused by the API on node 10.200.18.3. As a result, the drives are missing inside the Storage Controller.

Issue

[root@10.0.20.31 ~]# diskhotplug assign 2 0 /dev/sde
result=FAILURE error=Connection refused by the API on node 10.200.18.3.
[root@10.0.20.31 ~]# diskhotplug assign 2 1 /dev/sdf
result=FAILURE error=Connection refused by the API on node 10.200.18.3.
[root@10.0.20.31 ~]# diskhotplug list
Controller 0
 Slot 0 - /dev/sda (SCSIid:36000c29ed0e339c68949d027388da6d2_6000c29ed0e339c68949d027388da6d2,NodeID:3664104930)
 Slot 1 - /dev/sdb (SCSIid:36000c29b6b9d4979512e409b7c2af7b7_6000c29b6b9d4979512e409b7c2af7b7,NodeID:3860830766)
Controller 2
 Slot 0 - /dev/sde (SCSIid:3500a075125445421_194825445421,NodeID:3713670630)
 Slot 1 - /dev/sdf (SCSIid:3500a075125444f04_194825444F04,NodeID:3926860425)


Resolution

Rebooting the compute resource may fix the issue. However, you can resolve it without a reboot as follows.

First, make sure that all vDisks are healthy, because the affected storage VS/Controller has to be stopped.


[root@10.0.20.31 ~]# virsh list | grep STORAGENODE2
 41 STORAGENODE2 running

[root@10.0.20.31 ~]# virsh shutdown STORAGENODE2
Domain STORAGENODE2 is being shutdown

[root@10.0.20.31 ~]# virsh undefine STORAGENODE2
Domain STORAGENODE2 has been undefined

[root@10.0.20.31 ~]# mv /tmp/ST* /root/

[root@10.0.20.31 ~]# mv /onappstore/VMconfigs/NODE2-STORAGENODE2 /root/

[root@10.0.20.31 ~]# /usr/pythoncontroller/regenerateHotplugMetadata 
[root@10.0.20.31 ~]# diskhotplug list
Controller 0
 Slot 0 - /dev/sda (SCSIid:36000c29ed0e339c68949d027388da6d2_6000c29ed0e339c68949d027388da6d2,NodeID:3664104930)
 Slot 1 - /dev/sdb (SCSIid:36000c29b6b9d4979512e409b7c2af7b7_6000c29b6b9d4979512e409b7c2af7b7,NodeID:3860830766)

[root@10.0.20.31 ~]# dmsetup ls | grep STORAGEDEV2
STORAGEDEV2_SLOT0 (253:10)
STORAGEDEV2_SLOT1 (253:11)
[root@10.0.20.31 ~]# dmsetup remove STORAGEDEV2_SLOT0
[root@10.0.20.31 ~]# dmsetup remove STORAGEDEV2_SLOT1
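
When a controller has many slots, picking its device-mapper entries out of the dmsetup ls output by eye is error-prone. The following is a minimal sketch of a filter for that output. It assumes the STORAGEDEV<controller>_SLOT<slot> naming seen in the transcript above; the function name controller_dm_names is hypothetical and is not part of OnApp.

```shell
# Print the device-mapper names that belong to one storage controller.
# Reads `dmsetup ls` output from stdin; $1 is the controller number.
# Assumption: names follow the STORAGEDEV<controller>_SLOT<slot> pattern.
controller_dm_names() {
    awk -v prefix="STORAGEDEV${1}_SLOT" 'index($1, prefix) == 1 { print $1 }'
}

# Possible usage on the compute resource (review the list before removing):
#   dmsetup ls | controller_dm_names 2
```

The filter only prints names, so the removal itself stays a separate, deliberate step.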


Reduce the number of drives listed in the /onappstore/onappstore.conf file to the number that one Storage Controller can hold. For example, if your Storage Controller can hold two drives, leave only two drives in the file.
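
To confirm how many drives the file currently lists, you can count the entries on its disks= line. This is a rough sketch that assumes the line has the single-line, comma-separated disks=[...] form used in this article; count_conf_disks is a hypothetical helper name.

```shell
# Count the entries on the `disks=[...]` line of a config read from stdin.
# Assumption: the list is comma-separated and sits on one line.
# Prints nothing if the input has no line starting with "disks=".
count_conf_disks() {
    awk '
        /^disks=/ {
            line = $0
            sub(/^disks=\[/, "", line)   # drop the leading "disks=["
            sub(/\].*$/, "", line)       # drop the closing "]" and anything after it
            if (line == "") { print 0 } else { print split(line, ids, ",") }
            exit
        }
    '
}

# Possible usage on the compute resource:
#   count_conf_disks < /onappstore/onappstore.conf
```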

[root@10.76 ~]# cat /onappstore/onappstore.conf | grep ^disks
disks=[36000c2928a8464f0482024ebd06d035c_6000c2928a8464f0482024ebd06d035c,36000c2983b801914978d2f896e5c8e0d_6000c2983b801914978d2f896e5c8e0d]

Initialize this Storage Controller with the following command:

[root@10.0.20.31 ~]# diskhotplug initNewController
Complete: added new controllers [1]

[root@10.0.20.31 ~]# diskhotplug list
Controller 0
 Slot 0 - /dev/sda (SCSIid:36000c29ed0e339c68949d027388da6d2_6000c29ed0e339c68949d027388da6d2,NodeID:3664104930)
 Slot 1 - /dev/sdb (SCSIid:36000c29b6b9d4979512e409b7c2af7b7_6000c29b6b9d4979512e409b7c2af7b7,NodeID:3860830766)
Controller 1
 Slot 0 - EMPTY
 Slot 1 - EMPTY

Then you can restore the original onappstore.conf file from the /.rw/ directory.

Cause

The error happens because the sequence numbering of the Storage Controllers no longer matches their IP addresses. In this example, 'Controller 1' is skipped, so the system expects 'Controller 2' to have the 10.200.18.3 IP address. However, it only has the 10.200.18.2 IP address, so the sequence is broken.
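
A skipped controller number of this kind can be spotted directly in the diskhotplug list output. The sketch below reports every index missing from the sequence 0..max. It only inspects the "Controller N" header lines and makes no assumption about how IP addresses are derived; missing_controllers is a hypothetical helper name.

```shell
# Print each controller index that is missing from the sequence 0..max.
# Reads `diskhotplug list` output from stdin and looks only at the
# "Controller <n>" header lines.
missing_controllers() {
    awk '
        BEGIN { max = -1 }
        /^Controller [0-9]+/ {
            n = $2 + 0
            seen[n] = 1
            if (n > max) max = n
        }
        END {
            for (i = 0; i <= max; i++)
                if (!(i in seen)) print i
        }
    '
}

# Possible usage on the compute resource:
#   diskhotplug list | missing_controllers
```

For the output shown in the Issue section, where only Controller 0 and Controller 2 are listed, this prints 1.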



When you are finished, the storage node should be recognized by the Storage Controller. Telnet to the Storage Controller and check mount to see if xvdX or vdX is mounted correctly. If the node is mounted correctly, its OnApp UUID will be reported to other compute resources. After all compute resources are updated, you can add the node to the datastore.
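
If the mount output on the Storage Controller is long, a small filter can narrow it to the relevant entries. This is a sketch only: it assumes the usual Linux "device on mountpoint type fstype (options)" line format, which this article does not confirm for the Storage Controller, and storage_mounts is a hypothetical helper name.

```shell
# Print mount entries whose source device is /dev/xvdX or /dev/vdX.
# Reads `mount` output from stdin; exits non-zero when no entry matches.
# Assumption: lines look like "<device> on <mountpoint> type <fs> (<opts>)".
storage_mounts() {
    grep -E '^/dev/(xvd|vd)[a-z]+[0-9]* '
}

# Possible usage, with the mount output saved to a file first:
#   storage_mounts < mount-output.txt
```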

To add the node to a destination data store:

  1. Go to Integrated Storage > Data Stores.
  2. Click Edit next to the target data store.
  3. Select a checkbox next to the node that you want to add and click Save.

Replace IS Disk Drive via CLI


  1. Forget the storage node that corresponds to the disk drive you need to replace. Run the following command to forget the node for all virtual disks:

    forgetfromall forgetlist=<UUID>
  2. Unassign the disk drive from the Integrated Storage Controller by running the diskhotplug utility directly on the compute resource:

    HV> diskhotplug list #shows storage controllers and slots of disks
    e.g. Slot 0 - /dev/sda (SCSIid:Z2AAL2M41BC14_Z2AAL2M4,NodeID:1337660081)


    The NodeID corresponds to the OnApp UUID and the SCSI ID corresponds to the result of onapp_scsi_id.
    You can check which Storage Controller the config maps to in the storagenode config files at /onappstore/VMconfigs/...

    HV> diskhotplug unassign <Controller> <slot> #deselects the disk drive, closes down the paths
  3. Replace the disk physically on the compute resource.
    If the compute resource does not support live disk hot-plugging at the physical layer, the vDisks on its other storage nodes must be redundant, the virtual servers must be migrated off the compute resource, and the compute resource must be put into maintenance mode.
  4. Assign the new disk drive to the Integrated Storage Controller, using the formatandconfigure and diskhotplug utilities.

    HV> formatandconfigure /dev/<sdX>
    HV> diskhotplug assign <controller> <free-slot> /dev/<sdX>


    When you are finished, the storage node should be recognized by the Storage Controller. Telnet to the Storage Controller and check mount to see if xvdX or vdX is mounted correctly. If the node is mounted correctly, its OnApp UUID will be reported to other compute resources. After all compute resources are updated, you can add the node to the datastore.

  5. Add the node to the datastore by running the following command on the current compute resource or on a backup server:

    onappstore addnode uuid=<DS UUID> member_uuid=<NODE UUID>

    Where:
    DS UUID - the IS data store identifier
    NODE UUID - the node identifier
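
Step 2 of this procedure needs the controller and slot numbers of the drive being replaced. Because the NodeID printed by diskhotplug list corresponds to the OnApp UUID, the lookup can be scripted. The following is a minimal sketch that assumes the output layout shown in this article; find_slot_by_nodeid is a hypothetical helper name.

```shell
# Print "<controller> <slot> <device>" for the slot whose NodeID equals $1.
# Reads `diskhotplug list` output from stdin; prints nothing if not found.
find_slot_by_nodeid() {
    awk -v id="$1" '
        /^Controller [0-9]+/ { ctrl = $2 }
        /Slot [0-9]+ - / && index($0, "NodeID:" id ")") {
            # Fields after "Slot": <slot number>, "-", <device>
            for (i = 1; i <= NF; i++)
                if ($i == "Slot") { print ctrl, $(i + 1), $(i + 3); exit }
        }
    '
}

# Possible usage on the compute resource:
#   diskhotplug list | find_slot_by_nodeid <NodeID>
```

The first two values printed are the <Controller> and <slot> arguments expected by diskhotplug unassign.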


See Also: