If you have major problems at the OS root level, OpenStack allows you to put your instance into a rescue state, which can help you access your instance and rectify the problems, before restoring it to its active state.
How does it work?
When put into a rescue state, your instance is rebooted from a rescue image (i.e. the rescue image is mounted as the first boot source). This rescue image should not contain the problem configuration that is causing issues when the instance boots normally. This means that you can boot and connect to your rescued instance and correct the problems that are preventing you from operating it normally. After correcting these problems, you can revert your instance to its normal active state.
Which rescue image is used?
Currently, the Civo platform only supports the default OpenStack rescue image (i.e. you cannot specify which rescue image to use). OpenStack’s current default image is the Ubuntu base image.
Steps to rescue/unrescue your instance
- Navigate to your instances page.
- Select the instance in question to view its details.
- Shut down the instance by clicking the 'Shutdown' button at the top right of the page, and click 'ok' on the confirmation box.
- Wait until the instance shows as 'shutdown'.
- Click the ‘Rescue’ button at the top right of the page, and click 'ok' on the confirmation box.
- Wait until the status shows as ‘in rescued state’.
- You can now connect to your rescued instance via SSH using your normal SSH details.
To revert to an active state, click the ‘Unrescue’ button in the top right of the same page.
What kind of problems can be fixed using rescue mode?
Some examples of issues that can be fixed easily by utilising rescue mode are:
Forgotten root password
- Put the problem instance into rescue mode.
- SSH into the rescued instance as root.
- Determine the name/location of your original disk by running:
  lsblk
  The output might look something like this:
  NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  vda    253:0    0  2.2G  0 disk
  └─vda1 253:1    0  2.2G  0 part /
  vdb    253:16   0   25G  0 disk
  └─vdb1 253:17   0   25G  0 part
  This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted.
- Create a directory in your rescue disk in which to mount your original disk, e.g.:
  mkdir /mnt/real-disk
- Mount your original disk to the new directory in the rescue disk. Given the output above this would be:
  mount /dev/vdb1 /mnt/real-disk
- Change your root directory to the newly mounted location. In our case this would be:
  sudo chroot /mnt/real-disk
- To receive a prompt to change your password, run:
  sudo passwd root
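Picking out the original disk from the lsblk output can also be scripted. The following is a minimal sketch, not part of any Civo tooling: the helper name `unmounted_parts` is hypothetical, and it assumes the key="value" output that `lsblk -Pno NAME,TYPE,MOUNTPOINT` produces, where a partition that is not mounted has an empty MOUNTPOINT value.

```shell
#!/bin/sh
# Hypothetical helper (not a Civo or OpenStack command).
# Reads the output of `lsblk -Pno NAME,TYPE,MOUNTPOINT` on stdin and prints
# the device path of every partition whose MOUNTPOINT value is empty.
unmounted_parts() {
  sed -n 's|^NAME="\([^"]*\)" TYPE="part" MOUNTPOINT=""$|/dev/\1|p'
}

# Usage on the rescued instance, as root:
#   lsblk -Pno NAME,TYPE,MOUNTPOINT | unmounted_parts
```

If this prints more than one device, check the sizes with a plain lsblk before mounting anything.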
Blocked out by firewall
- Put the problem instance into rescue mode.
- SSH into the rescued instance as root.
- Determine the name/location of your original disk by running:
  lsblk
  The output might look something like this:
  NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
  vda    253:0    0  2.2G  0 disk
  └─vda1 253:1    0  2.2G  0 part /
  vdb    253:16   0   25G  0 disk
  └─vdb1 253:17   0   25G  0 part
  This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted.
- Create a directory in your rescue disk in which to mount your original disk, e.g.:
  mkdir /mnt/real-disk
- Mount your original disk to the new directory in the rescue disk. Given the output above this would be:
  mount /dev/vdb1 /mnt/real-disk
- Change your root directory to the newly mounted location. In our case this would be:
  sudo chroot /mnt/real-disk
- Remove the problem firewall rule or rules in the normal way. For example, to ensure the problem firewall does not start on boot, you could run:
  sudo ufw disable
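Before unrescuing, it can be useful to confirm that the firewall really is disabled on the original disk. The sketch below is only an illustration: the helper name `ufw_enabled_on_boot` is hypothetical, and it assumes that ufw records its start-on-boot setting as an `ENABLED=` line in `etc/ufw/ufw.conf` under the mounted root. If your system uses a different firewall, this check does not apply.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a ufw command).
# Given the directory where the original disk is mounted, prints:
#   yes      if etc/ufw/ufw.conf contains ENABLED=yes
#   no       if the file exists without that line
#   unknown  if the file cannot be found
ufw_enabled_on_boot() {
  conf="$1/etc/ufw/ufw.conf"
  if [ ! -f "$conf" ]; then
    echo "unknown"
  elif grep -q '^ENABLED=yes' "$conf"; then
    echo "yes"
  else
    echo "no"
  fi
}

# Usage from the rescue system, outside the chroot:
#   ufw_enabled_on_boot /mnt/real-disk
```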
Corrupted file system
- Put the problem instance into rescue mode, which can be done via the Civo web interface (if your instance is in an error state, you’ll first need to contact our engineers to reactivate it).
- SSH into the rescued instance as root.
- Determine the name, location and file system of your original disk by running:
  lsblk -f
  The output might look something like this:
  NAME   FSTYPE LABEL           UUID                                 MOUNTPOINT
  vda
  └─vda1 ext4   cloudimg-rootfs b69fcaa8-51c4-4097-b089-4973e52fd39b /
  vdb
  └─vdb1 ext4   cloudimg-rootfs b69fcaa8-51c4-4097-b089-4973e52fd39b
  This shows that 'vda1' is our rescue disk and is mounted at '/', and 'vdb1' is our original disk and is not mounted. It also shows our file system to be of type ‘ext4’.
- If your file system is ‘ext4’, you can now try the fsck command to examine the file system and attempt to repair any errors (note: it’s important that your disk is still unmounted when running fsck). Given our output above, the full command would be:
  fsck -fy /dev/vdb1
  If that works, the instance should be fixed and simply needs to be rebooted. If it doesn’t work, or your file system is not ‘ext4’, continue with the remaining steps below.
- Create a directory in your rescue disk in which to mount your original disk, e.g.:
  mkdir /mnt/real-disk
- Mount your original disk to the new directory in the rescue disk. The required command will differ slightly depending on your type of file system:
  - For an ‘ext4’ file system, and given the output above, this would be:
    mount -o sb=131072,ro,noload /dev/vdb1 /mnt/real-disk
    (note: sb=131072 refers to the specific superblock to use, and will be the same for all OpenStack instances.)
  - For an ‘XFS’ file system, this would be:
    mount -o nouuid,ro,norecover /dev/vdb1 /mnt/real-disk
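The choice between "fsck fixed it" and "fall back to a read-only mount" can be sketched as two small helpers. This is a rough illustration only: the function names `fsck_repaired` and `rescue_mount_opts` are hypothetical, the mount options are copied from the steps above, and the status check assumes fsck's documented exit status, which is a bit mask (0 = no errors, 1 = errors corrected, 2 = reboot advised, 4 = errors left uncorrected, 8 and above = operational or usage failures).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given fsck exit status means the file
# system ended up clean. Statuses 0 to 3 combine "no errors", "errors
# corrected" and "reboot advised"; 4 or above means something was left
# unrepaired or fsck itself failed.
fsck_repaired() {
  [ "$1" -lt 4 ]
}

# Hypothetical helper: prints the read-only mount options used in this guide
# for the given file system type, or fails for a type the guide does not cover.
rescue_mount_opts() {
  case "$1" in
    ext4) echo "sb=131072,ro,noload" ;;
    xfs)  echo "nouuid,ro,norecover" ;;
    *)    return 1 ;;
  esac
}

# Usage on the rescued instance, as root (device and path as in this guide):
#   fsck -fy /dev/vdb1
#   fsck_repaired "$?" || mount -o "$(rescue_mount_opts ext4)" /dev/vdb1 /mnt/real-disk
```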
- Launch a new instance of the same size and template as your rescued one (if Ubuntu, use the root user with a password; if CentOS, use the centos user with a password) and with no public networking (on the 'Create New Instance' page, in the networking section, ensure the option for 'private networking only' is checked). We are going to copy all files from the broken instance onto this new one.
- If your new instance uses CentOS, SSH to the instance (from your rescued instance) with
  centos@
  and set root's password to the default one using:
  sudo -i ; passwd
  You may also need to check that /etc/ssh/sshd_config has PermitRootLogin set to yes, and then run:
  service sshd restart
- Copy all files from the broken instance over to the new instance using the rsync command. In our case this looks like the following:
  rsync --sparse -aAXv /mnt/real-disk/ --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","lost+found","/boot/*","/usr/src/*","/lib/modules/*"} root@10.97.93.15:/
Things to note about this command:
- '/mnt/real-disk/' is the location of the file system we want to copy
- we are excluding all these files because they are system files that we do not want to copy
- '10.97.93.15:/' is the internal IP of our newly created instance, followed by a colon, and then '/' being the destination location on the newly created instance where we want the files to be copied to.
- after files are copied, you may see an error: "rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1183) [sender=3.1.1]". This is expected, as some files are not able to be copied.
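Two details of the rsync command are easy to miss. The `--exclude={...}` form relies on bash brace expansion, which turns the comma-separated list into one `--exclude=` option per pattern. And exit code 23 is rsync's "partial transfer due to error" status, which this guide treats as expected. The sketch below spells both out in plain sh; the helper names `exclude_args` and `rsync_status_acceptable` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints one --exclude=PATTERN option per argument, one
# per line. This mirrors what bash brace expansion does with --exclude={a,b}.
exclude_args() {
  for pattern in "$@"; do
    printf '%s%s\n' '--exclude=' "$pattern"
  done
}

# Hypothetical helper: succeeds for the rsync exit codes this guide accepts,
# 0 (success) and 23 (partial transfer due to error), and fails otherwise.
rsync_status_acceptable() {
  case "$1" in
    0|23) return 0 ;;
    *)    return 1 ;;
  esac
}

# Usage:
#   exclude_args '/dev/*' '/proc/*'     # shows the expanded options
#   rsync ... ; rsync_status_acceptable "$?" || echo "copy failed"
```

Any other non-zero exit code (for example a connection failure) is worth investigating before moving the public IP.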
- Move the public IP over to the newly created instance (as described in the learn guide: Setting Up High-Availability Between Instances).