If you have major problems at the OS root level, OpenStack allows you to put your instance into a rescue state, which can help you access your instance and rectify the problems, before restoring it to its active state.

How does it work?

When put into a rescue state, your instance is rebooted using a rescue image (i.e. the rescue image is mounted as the first boot source). This rescue image should not contain the problem configuration that is causing issues for the instance when booted normally. This means that you can boot and connect to your rescued instance and correct the problems that are preventing you from operating it normally. After correcting these problems, you can revert your instance to its normal active state.

Which rescue image is used?

Currently, the Civo platform only supports the default OpenStack rescue image (i.e. you cannot specify which rescue image to use). OpenStack’s current default image is the Ubuntu base image.

Steps to rescue/unrescue your instance

  1. Navigate to your instances page.
  2. Select the instance in question to view its details.
  3. Shut down the instance by clicking the 'Shutdown' button at the top right of the page, and click 'ok' on the confirmation box.
  4. Wait until the instance shows as 'shutdown'.
  5. Click the ‘Rescue’ button at the top right of the page, and click 'ok' on the confirmation box.
  6. Wait until the status shows as ‘in rescued state’.
  7. You can now connect to your rescued instance via SSH using your normal SSH details.

To revert to an active state, click the ‘Unrescue’ button in the top right of the same page.
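
When you connect in step 7, your SSH client may warn that the host key has changed, because the rescue system is a different image from the one your instance normally boots. The sketch below is one hedged way to handle this: it builds an ssh command that stores the rescue host key in a separate known-hosts file, so the entry for your normal instance is left alone. The function name rescue_ssh_command, the file name known_hosts.rescue and the address 203.0.113.10 are placeholders, not part of the Civo platform.

```shell
#!/bin/sh
# Sketch only: rescue_ssh_command and known_hosts.rescue are illustrative names.

# Print an ssh command that records the rescue system's host key in a
# separate known-hosts file instead of the default ~/.ssh/known_hosts.
rescue_ssh_command() {
    ip=$1
    user=${2:-root}   # the steps in this guide connect as root
    echo "ssh -o UserKnownHostsFile=$HOME/.ssh/known_hosts.rescue $user@$ip"
}

# Example (203.0.113.10 is a placeholder address):
#   rescue_ssh_command 203.0.113.10
```

Run the printed command from your own machine. Once the instance is unrescued, your usual ssh command should work as before, since the default known-hosts file was never modified.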

What kind of problems can be fixed using rescue mode?

Some examples of issues that can be fixed easily by utilising rescue mode are:

Forgotten root password

  1. Put the problem instance into rescue mode.
  2. SSH into the rescued instance as root.
  3. Determine the name/location of your original disk by running lsblk. The output might look something like this:

    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    vda    253:0    0  2.2G  0 disk 
    └─vda1 253:1    0  2.2G  0 part /
    vdb    253:16   0   25G  0 disk 
    └─vdb1 253:17   0   25G  0 part 
    

    This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted.

  4. Create a directory in your rescue disk in which to mount your original disk, e.g. mkdir /mnt/real-disk.

  5. Mount your original disk to the new directory in the rescue disk. Given the output above this would be mount /dev/vdb1 /mnt/real-disk.

  6. Change your root directory to the newly mounted location. In our case this would be sudo chroot /mnt/real-disk.

  7. Run sudo passwd root to receive a prompt to change your password.
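
The steps above can be collected into a short script. This is a minimal sketch, assuming you are logged in as root on the rescue system and that the device and mount point match the example output (/dev/vdb1 and /mnt/real-disk). The function names and the DRY_RUN switch are illustrative only; with DRY_RUN=1 the commands are printed instead of executed, which lets you review them first.

```shell
#!/bin/sh
# Sketch only: function names and the DRY_RUN switch are illustrative.

# Print /dev paths of partitions that have no mount point.
# Expects the output of `lsblk -P -o NAME,TYPE,MOUNTPOINT` on stdin.
unmounted_partitions() {
    grep 'TYPE="part"' | grep 'MOUNTPOINT=""' |
        sed -E 's|^NAME="([^"]+)".*|/dev/\1|'
}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the original disk, change root's password inside it, then unmount.
reset_root_password() {
    dev=$1
    mnt=${2:-/mnt/real-disk}
    run mkdir -p "$mnt" &&
    run mount "$dev" "$mnt" &&
    run chroot "$mnt" passwd root &&
    run umount "$mnt"
}

# Example, as root on the rescue system:
#   lsblk -P -o NAME,TYPE,MOUNTPOINT | unmounted_partitions
#   DRY_RUN=1 reset_root_password /dev/vdb1   # review the commands
#   reset_root_password /dev/vdb1             # run them
```

Passing passwd root as an argument to chroot runs it inside the original disk and returns you to the rescue system afterwards, so the disk can be unmounted before you unrescue the instance.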

Blocked out by firewall

  1. Put the problem instance into rescue mode.
  2. SSH into the rescued instance as root.
  3. Determine the name/location of your original disk by running lsblk. The output might look something like this:

    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
    vda    253:0    0  2.2G  0 disk 
    └─vda1 253:1    0  2.2G  0 part /
    vdb    253:16   0   25G  0 disk 
    └─vdb1 253:17   0   25G  0 part 
    

    This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted.

  4. Create a directory in your rescue disk in which to mount your original disk, e.g. mkdir /mnt/real-disk.

  5. Mount your original disk to the new directory in the rescue disk. Given the output above this would be mount /dev/vdb1 /mnt/real-disk.

  6. Change your root directory to the newly mounted location. In our case this would be sudo chroot /mnt/real-disk.

  7. Remove the problem firewall rule or rules in the normal way. For example, you could run sudo ufw disable to ensure the problem firewall does not start on boot.
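
If running ufw inside the chroot is not practical, a different approach is to edit the firewall's configuration file on the mounted disk directly. The sketch below assumes that ufw keeps its start-on-boot flag as an ENABLED=yes or ENABLED=no line in etc/ufw/ufw.conf, and that the original disk is mounted at /mnt/real-disk as in the steps above. The function name disable_ufw_on_disk is illustrative.

```shell
#!/bin/sh
# Sketch only: disable_ufw_on_disk is an illustrative name.
# Assumption: ufw reads its start-on-boot flag from an ENABLED=yes|no line
# in etc/ufw/ufw.conf on the original disk.

disable_ufw_on_disk() {
    conf="${1:-/mnt/real-disk}/etc/ufw/ufw.conf"
    if [ ! -f "$conf" ]; then
        echo "no ufw config found at $conf" >&2
        return 1
    fi
    # Set the flag to "no"; the previous file is kept as ufw.conf.bak.
    sed -i.bak 's/^ENABLED=.*/ENABLED=no/' "$conf"
}

# Example, as root on the rescue system with the disk already mounted:
#   disable_ufw_on_disk /mnt/real-disk
```

This only stops the firewall from starting on boot; the rules themselves are untouched, so you can correct them and re-enable the firewall once you can log in normally again.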

Corrupted file system

  1. Put the problem instance into rescue mode, which can be done via the Civo web interface (if your instance is in an error state, you’ll first need to contact our engineers to reactivate it).
  2. SSH into the rescued instance as root.
  3. Determine the name, location and file system of your original disk by running lsblk -f. The output might look something like this:

    NAME   FSTYPE LABEL           UUID                                 MOUNTPOINT
    vda                                                                
    └─vda1 ext4   cloudimg-rootfs b69fcaa8-51c4-4097-b089-4973e52fd39b /
    vdb                                                                
    └─vdb1 ext4   cloudimg-rootfs b69fcaa8-51c4-4097-b089-4973e52fd39b 
    

    This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted. It also shows our file system to be of type ‘ext4’.

  4. If your file system is ‘ext4’ you can now try the fsck command to examine the file system and attempt to repair any errors (note, it’s important that your disk is still unmounted when running fsck). Given our output above, the full command would be fsck -fy /dev/vdb1. If that works, the instance should be fixed and simply needs to be rebooted. If it doesn’t work, or your file system is not ‘ext4’, proceed to step 5.

  5. Create a directory in your rescue disk in which to mount your original disk, e.g. mkdir /mnt/real-disk.

  6. Mount your original disk to the new directory in the rescue disk. The required command will differ slightly depending on your type of file system:

    • For an ‘ext4’ file system, and given the output above, this would be mount -o sb=131072,ro,noload /dev/vdb1 /mnt/real-disk (note, sb=131072 refers to the specific super block to use, and will be the same for all OpenStack instances.)
    • For an ‘XFS’ file system, this would be mount -o nouuid,ro,norecover /dev/vdb1 /mnt/real-disk.
  7. Launch a new instance of the same size and template as your rescued one (if Ubuntu, use root user with password, or if CentOS, use centos user with password) and with no public networking (on the 'Create New Instance' page, in the networking section, ensure the option for 'private networking only' is checked). We are going to copy all files from the broken instance onto this new one.

  8. If your new instance uses CentOS, SSH to the instance (from your rescued instance) with centos@ and set root's password to the default one using sudo -i ; passwd. You may also need to check that /etc/ssh/sshd_config has PermitRootLogin set to yes and run service sshd restart.

  9. Copy all files from the broken instance over to a new instance using the rsync command. In our case this looks like the following:

    rsync --sparse -aAXv /mnt/real-disk/ --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","lost+found","/boot/*","/usr/src/*","/lib/modules/*"} root@10.97.93.15:/
    

    Things to note about this command:

    • '/mnt/real-disk/' is the location of the file system we want to copy
    • we are excluding all these files because they are system files that we do not want to copy
    • '10.97.93.15:/' is the internal IP of our newly created instance, followed by a colon, and then '/' being the destination location on the newly created instance where we want the files to be copied to.
    • after files are copied, you may see an error: "rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1183) [sender=3.1.1]". This is expected, as some files are not able to be copied.
  10. Move the public IP over to the newly created instance (as described in the learn guide: Setting Up High-Availability Between Instances).
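
The decision made in steps 4 to 6 (try fsck first, then fall back to a read-only mount whose options depend on the file system) can be sketched as two small helpers. The mount options are the ones quoted in step 6. The interpretation of the fsck exit status follows the fsck(8) manual page, where the status is a bit mask: 1 means errors were corrected, 2 means a reboot is suggested, 4 means errors were left uncorrected, and higher values mean fsck itself failed. The function names are illustrative.

```shell
#!/bin/sh
# Sketch only: recovery_mount_opts and fsck_repaired are illustrative names.

# Map a file system type (as reported by `lsblk -f`) to the read-only
# mount options quoted in step 6.
recovery_mount_opts() {
    case "$1" in
        ext4)    echo "sb=131072,ro,noload" ;;
        xfs|XFS) echo "nouuid,ro,norecover" ;;
        *)       echo "unsupported file system: $1" >&2; return 1 ;;
    esac
}

# Succeed if an fsck exit status means the file system is now clean.
# Statuses 0 to 3 combine "no errors", "errors corrected" and
# "reboot suggested"; 4 and above mean errors remain or fsck failed.
fsck_repaired() {
    [ "$1" -lt 4 ]
}

# Example, as root on the rescue system with /dev/vdb1 unmounted:
#   fsck -fy /dev/vdb1
#   if ! fsck_repaired $?; then
#       mkdir -p /mnt/real-disk
#       mount -o "$(recovery_mount_opts ext4)" /dev/vdb1 /mnt/real-disk
#   fi
```

Before running the full copy in step 9, it can also help to add rsync's --dry-run option to the same command, which lists what would be transferred without writing anything to the new instance.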