With the World Cup now in full swing, I wanted to talk about the MVP of our open-source software team, our star player, the one and only Ceph. Of all the software we use within the Civo platform, Ceph is probably my favourite. It is one of the most fundamental pieces (if not the most fundamental piece) of a large and complex hosting platform. As with a football team, a lot of the "players" in our system are temperamental divas: a tiny misconfiguration here, a small loss of network connectivity there, and they have a complete tantrum. Not Ceph.
No matter what you throw at it, Ceph quietly and consistently does an excellent job. For example, last year we needed to replace all of the disks in two of our OpenStack platforms so that they could be used elsewhere. This meant changing every single disk in the entire storage cluster. We had to remove a set of its OSDs (Object Storage Daemons), remove the disks, wait for it to recover, replace the disks, add in the new OSDs, then rinse and repeat until we had done the entire cluster. Did it complain? Well, a little (we had some minor warnings), but other than that it just got on with it, with almost zero noticeable degradation in service and with almost no supervision or intervention required.
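To make that loop a bit more concrete, here is a minimal sketch of the rolling replacement in Python. It is a toy model only: `ToyCluster`, `rolling_replace` and the "recovery ticks" are all hypothetical names made up for illustration, nothing here talks to a real Ceph cluster, and it is not the tooling we actually used.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """A toy stand-in for a storage cluster. This is NOT a Ceph client."""
    osds: dict           # OSD id -> "old" or "new" disk (hypothetical model)
    recovering: int = 0  # pretend amount of recovery work still to do

    def remove_osds(self, ids):
        for osd_id in ids:
            del self.osds[osd_id]
        self.recovering = len(ids)  # removing OSDs triggers recovery

    def add_osds(self, ids):
        for osd_id in ids:
            self.osds[osd_id] = "new"
        self.recovering = len(ids)  # adding OSDs triggers rebalancing

    def is_healthy(self):
        return self.recovering == 0

    def tick(self):
        # One unit of pretend recovery work per tick.
        self.recovering = max(0, self.recovering - 1)


def wait_until_healthy(cluster):
    """Block until the toy cluster reports healthy; return ticks waited."""
    ticks = 0
    while not cluster.is_healthy():
        cluster.tick()
        ticks += 1
    return ticks


def rolling_replace(cluster, batch_size):
    """Replace every disk, one batch of OSDs at a time."""
    ids = sorted(cluster.osds)
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        cluster.remove_osds(batch)   # remove a set of OSDs (and their disks)
        wait_until_healthy(cluster)  # wait for the cluster to recover
        cluster.add_osds(batch)      # add the OSDs back on new disks
        wait_until_healthy(cluster)  # let it settle before the next batch
    return cluster
```

The point of the sketch is the shape of the loop: never touch the next batch until the cluster has recovered from the previous one.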
Upgrading a distributed system to the latest version is, more often than not, a very time-consuming, complex process that requires a fair amount of manual "hand-holding", even with modern automation tooling. Again, not with Ceph. All of our upgrades have just worked beautifully, with minimal fuss and without any major hiccups. You have to push it very hard in order for it to break. Even then, whilst it can go into a bad state, it does so to protect your data, and provided you give it some time and allow it to recover, eventually it will.
For those of you who don't know, Ceph is a high-performance storage solution that provides three things: object storage, block storage and a distributed file system. So when you fire up an instance on Civo, the disk of your VM is backed by Ceph. If you create a volume and attach it to an instance, that volume is stored in Ceph. Ensuring the availability and integrity of data within the platform is critical, and Ceph takes several steps to ensure this. One of them is that it creates 3 copies of your data, which are distributed within the cluster. So should a disk fail (and they do), your data is safe; if a motherboard component or the motherboard itself fails, your data is safe; if a rack power supply fails... you get the idea. What is even more amazing is that this is all done transparently.
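If you want a feel for why spreading 3 copies around protects you, here is a small Python sketch. It is a simplification under stated assumptions: it treats a rack as the unit of failure, the rack names are made up, and it picks racks with a simple hash ranking (rendezvous hashing) rather than Ceph's own placement algorithm, CRUSH. The functions `place` and `survives` are hypothetical helpers for illustration only.

```python
import hashlib

REPLICAS = 3  # number of copies, as described above


def place(obj_name, racks, replicas=REPLICAS):
    """Pick `replicas` distinct racks for an object, deterministically.

    Toy placement via rendezvous hashing; NOT how Ceph's CRUSH works.
    """
    if replicas > len(racks):
        raise ValueError("need at least as many racks as replicas")

    def score(rack):
        # Rank each rack by a hash of (object name, rack name).
        return hashlib.sha256(f"{obj_name}:{rack}".encode()).hexdigest()

    return sorted(racks, key=score)[:replicas]


def survives(placement, failed_racks):
    """True if at least one copy sits on a rack that has not failed."""
    return any(rack not in failed_racks for rack in placement)


if __name__ == "__main__":
    racks = ["rack-a", "rack-b", "rack-c", "rack-d"]  # hypothetical names
    copies = place("vm-disk-1", racks)
    for rack in racks:
        # Losing any single rack still leaves at least two copies.
        print(rack, "fails -> data available:", survives(copies, {rack}))
```

Because every copy lands on a different rack, no single disk, motherboard or rack power supply can take out all three at once.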
When I speak about Ceph, I speak very highly. The engineering team behind it have done a superb job; it is a pleasure to use and I, for one, am grateful. We love Ceph, and you should too.