We spoke with three members of the SRE team, Alex Blyth, Zulhilmi Zainudin, and Ian Banks, to learn more about their role at Civo. Through this series, we aim to provide you with an overview of the different roles we have at Civo and what advice our team has. You can discover more about our team in our “day in the life of a Go Dev” and “day in the life of an Intern” blogs.
What has your journey with Civo looked like?
Alex: I started at Civo as a Cloud Engineer, which was a similar role to my previous SRE position. When I started at Civo, I spent a lot of time working on the hardware side of things, then I moved into the SRE role at the start of 2022.
Zulh: I’ve been a developer for many years and have tried various technologies, including blockchain, prior to joining Civo as a Developer before transitioning to an SRE. Being in an SRE position with development experience under my belt has made me like a pilot who knows which knobs can be toggled on or off, how much I need to ramp, and whether it’s safe for me to go manual or autopilot. It’s not always science. Sometimes it’s just a gut feeling, coming from past experiences.
Ian: I've been working at Civo since the start, initially as a Cloud Engineer and then I moved into the SRE team. As I have been with the team throughout every stage of the development, it makes me proud of what we have done and excited for what we have planned in the future. The early stages of being at Civo coincided with the beginning of the pandemic in 2020, when everyone transitioned to working at home. This was a big change, but one that has ultimately encouraged me to communicate with my colleagues more. Although spread around the world, I personally think that the company feels closer in some way than when we all worked in the same office. On a technical level, I have gained a great deal of knowledge and have been able to use past experience towards my current role.
What does the role of an SRE involve?
Alex: The SRE team is responsible for making sure the Civo services are running smoothly and any issues that come up are fixed. This could be reacting to alerts as they come in, or working with customers to help solve problems they are having. We also work closely with other teams in the company, such as helping the developers deploy new features, or working with the support team to solve customer issues.
Zulh: The SRE team are responsible for a range of tasks such as making sure all systems are operational, happy and healthy, regularly revising the remaining / spare hardware across all regions, and dealing with platform hardware/software upgrades and scheduled maintenance. Other responsibilities include writing documentation for ops-related processes so that we can have transferable knowledge across teammates (especially new joiners), or peer reviewing each other’s work and providing constructive feedback when needed to make sure only quality code gets pushed to production.
Ian: The main day-to-day responsibilities include monitoring and maintaining our internal systems and infrastructure along with general customer service. This could be responding to alerts, fixing issues, or planning and carrying out maintenance tasks such as upgrades. The SRE team will often work directly with enterprise customers to ensure that their services run smoothly and assist with any queries or issues they may have. Alongside this, we also work closely with other teams at Civo to investigate bugs and improvements for systems and services, then plan and execute the required changes on our production regions.
What does your typical working day look like?
Alex: I normally start the working day with a stand-up call along with the rest of the SRE team. We discuss what we're currently working on and if we're having any issues, or need any help with tasks from other engineers. Afterward, I'll often be involved with calls for one of our Enterprise customers, working directly with their engineers to plan projects involving Civo. If needed, I spend time working with a Junior Engineer or Intern to help them with any projects they're working on. The majority of the day is spent working on projects, which could include anything from deploying new products, to tuning alerts, or creating observability dashboards. Before finishing for the day, I'll look over the monitoring and make sure any alerts or issues are cleared down before the next shift takes over.
Zulh: When I start my work day, I’m in charge of all platform incidents raised during that period as “on-call SRE”. In the event that something really strange shows up, I’ll usually notify someone from the team via PagerDuty so we can team up to look at the issue together. This allows the SRE team to take advantage of the 8-hour time zone difference (my working day starts whilst the rest of my team in the UK are sleeping). If no incidents occur, I perform my work as usual, and sometimes I will pair with the development team to help them debug platform-related issues and give them some advice (from an ops point of view) to ensure our software is easy to operate and manage.
Ian: The SRE team are located around the world in different time zones, so communication and coordination are important. At the start of my working day the SRE team has a stand-up call to discuss our work from the previous day, review support escalations and alerts which have arisen, and plan work for the coming day. To make sure we are always hitting our targets we have a list of priority goals, and will record progress towards them, discuss any blockers, share ideas, and possibly plan to collaborate during the day.
Throughout the day I will work on issues created in our internal GitLab. These could relate to any of the tasks discussed previously such as developing new services, improving our observability systems, fixing bugs, or planning scheduled maintenance. Many of these tasks also involve producing documentation to share knowledge with the team. Before any work is committed into our main repositories it must be reviewed by team members, something I will do multiple times a day.
What are some current projects your team is working on?
Alex: The main project I'm personally working on is a major upgrade to an enterprise customer's solution, which involves working closely with their engineering team. Other projects the team are working on involve deploying new features for Civo, including IPv6 support and a secret upcoming product.
Zulh: One of the main projects we are working on includes IPv6 support for Civo Instances and Kubernetes service. Whilst there are many more, I won’t reveal them just yet!
How does your team work together in a global business?
Alex: Working in a global business does have challenges, such as getting everyone together for meetings, but it also has a lot of advantages for teams like ours. Having people working in America, Europe, and Asia allows us to have engineers working at all times of the day, reducing the on-call stress.
Zulh: Slack is our main communication channel where every team has their own channel to chat. To avoid forgetting what we need to do, we track all our work in GitLab by tracking issues. For big changes, we are practicing the Request For Comments (RFC) methodology where everyone gets the chance to write their own RFC to add/improve something. Every RFC is linked to an issue and that’s where all the discussion happens for that specific RFC. And when the RFC is accepted, that becomes a team consensus to pull together to make it a reality and ship it. With that said, that means we practice asynchronous communication a lot. The values of remote work culture and the global team really shine here.
What have been some challenges you have faced in your role?
Alex: There is so much to learn! I started working at Civo quite near the very beginning, and everything was changing very quickly, but I was surrounded by a great team that helped me get up to speed.
Zulh: Time zone differences can be tough. However, my team is understanding and respects my time by not disturbing me outside of my typical working day. When the time zone difference gets in the way of a discussion, I’ll write my question down, post it on Slack, and let others post their inputs. Once I start working the following day, I pick up from there and post any follow-up questions that I might have. This means I am able to take the opportunity when everyone is online to ask questions that need instant decisions, allowing my next day to flow well when they are not around.
Ian: I think the main challenge is the variety of technical knowledge required throughout the team. There is always something new to learn, but this also makes the role interesting and enjoyable. To ease this challenge we maintain good documentation throughout the team, and will often collaborate to tackle difficult technical tasks.
How have you found the 4-day work week has impacted your role?
Alex: I've enjoyed working a 4-day week since we introduced it. I feel a lot more focused Monday to Thursday, and a lot more rested at the weekend. Having 3 days off allows me to feel properly rested and get a lot more done over the weekend. Obviously, not all companies work a 4-day week, so our customers may need some support on Fridays, but I feel we manage that well within the team to ensure everyone still gets the same benefits.
Zulh: I have really enjoyed the 4-day work week so far. I get the chance to spend my Fridays exploring things that interest me and learning new skills. Sometimes I use it to spend time with my family or travel to visit friends and family. Then, if I have nothing planned on Sunday, I’ll make use of it to do some reading/research for work that I will need on Monday. No Monday blues means a productive start to the week!
Ian: When initially transitioning to a 4-day working week I felt I needed to try and fit in as much work as possible, and it began to feel overwhelming. However, once I adjusted to it, I found that I could be more focused and productive during my working week, avoiding burnout, and then during the 3-day weekend, I could get personal tasks done, relax, and enjoy time with my family.
Do you have any advice for someone who wants to become an SRE?
Alex: Learn about as many technologies as you can, ask a lot of questions, and don't be afraid to make mistakes. You learn a lot more trying to fix something than when everything is working perfectly.
Zulh: My first piece of advice would be to always stay hungry and foolish. Know how to learn and unlearn by reading or watching tutorials by others in the industry. But, when in doubt, don’t be afraid to ask for help. It might not be smooth sailing at first, but it will be, slowly but surely.
Ian: I think that one of the most important qualities for an SRE is the desire to learn. To be successful, you have to continuously learn new skills and be open to, and willing to adopt, new ideas and practices. One of the toughest things to overcome is the fear of making mistakes, yet you will often learn more through mistakes than any other way. These mistakes will ideally come in testing and development environments, making it important to be part of a team who realize that mistakes can happen, but what is important is that we learn from them and do not repeat them.
What makes your role at Civo special?
Alex: The main thing which makes my role at Civo special is the people I’m working with. Everyone is willing to help you out and go the extra mile. Flexible working and the 4-day week ensure you can have a great work/life balance to enjoy other things outside work.
Zulh: When I was a kid, I dreamt of becoming a pilot as I thought it would be cool to see how a human can fly a plane using all the small knobs in front. It never failed to amaze me how much engineering was involved behind the scenes. This made me always keen to learn more about distributed systems, like those inside an airplane and Kubernetes. Throughout my time at Civo, I have always looked forward to seeing the Civo passengers with smiles on their faces on the other side.
Ian: Having been at Civo since the start, I really feel like Civo is a family. Everyone across all teams embraces new starters and encourages everyone to succeed and help Civo thrive. Having worked for other companies under CEO Mark Boost, I have always found the company ethos and working environment to be positive and supportive, but there is something extra around Civo, a sense of family.
I have seen the term ‘Imposter Syndrome’ come up more than usual in the technical industry recently. It is something I have experienced, and it has been discussed widely at Civo. I believe we are pushing boundaries and striving to develop cutting-edge systems, making these types of feelings inevitable. However, with the technical talent and supportive environment at Civo I feel confident I can find solutions to technical challenges and get great satisfaction out of learning new things all the time.
Want to learn more about joining our team at Civo? Check out our careers page for more information on our latest roles.