How to keep track of Kubernetes events in your clusters?

Workloads running in a Kubernetes cluster are dynamic by nature. Pods, ReplicaSets and Deployments come and go over time because they are ephemeral. There are a lot of situations where you want to check what happened in your cluster:

  • to debug historic incidents
  • to debug common tasks like:
    • finding info and events related to Kubernetes resources (like pods, replicasets, deployments, etc) that have been deleted, like:
    • finding info related to pods/replicasets that are replaced by newer pods/replicasets after a deployment update
    • getting details of pods evicted from a lost node
    • getting info about lost Kubernetes nodes, which no longer exist
    • knowing rollout details of older deployments
    • discovering the hosts where pods from a previous deployment were running
    • retrieving timings of pod replacements and their health checks
    • long term behavioral analysis of your workloads running on your Kubernetes cluster
    • and so on...

    Basically, we may need information about all the events happening in a Kubernetes cluster.

    What are Kubernetes Events?
    

    Kubernetes events answer this need. They are a great way to analyze what happened in your cluster, since they capture the events and resource state changes taking place in it. But there are a few drawbacks:

    • Kubernetes events can generally only be accessed using kubectl.
    • The default retention period of Kubernetes events is 1 hour.
    • The retention period can be increased using the --event-ttl flag of kube-apiserver, but doing so can cause issues with the cluster's key-value store (etcd), since every retained event is stored there.
    • There is no built-in way to visualize these events.
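
    As a rough illustration of the kubectl access mentioned above, the sketch below wraps a few common event queries in shell functions. The function names are made up for this example; the flags they pass (--sort-by, --field-selector, --all-namespaces) are standard kubectl options.

```shell
#!/bin/sh
# Hypothetical helper functions; the names are illustrative, not part of kubectl.

# List events in one namespace, oldest first.
# $1: namespace (defaults to "default")
recent_events() {
  kubectl get events -n "${1:-default}" --sort-by=.metadata.creationTimestamp
}

# List events that refer to a single pod.
# $1: namespace, $2: pod name
pod_events() {
  kubectl get events -n "$1" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$2"
}

# List only Warning events across all namespaces.
warning_events() {
  kubectl get events --all-namespaces --field-selector type=Warning
}
```

    For example, `pod_events default my-pod` shows what happened to a pod named my-pod (a placeholder name). Anything older than the retention period will not be returned, which is the limitation described above.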

    Tools for Managing Kubernetes Events

    To address a few of the problems mentioned above, tools like Kubewatch, Eventrouter and Event-exporter have been developed.

    Kubewatch - Kubewatch is a tool to watch Kubernetes events and push notifications to available channels.

    Eventrouter - In Eventrouter, Kubernetes events are captured and routed to a backend sink. A sink can be anything like an Amazon S3 bucket or an Elasticsearch cluster where you can dump all your events. Later, you can create dashboards based on captured events using tools like Kibana or Grafana. It supports multiple sinks.

    Event-exporter - A Prometheus exporter to expose Kubernetes events in Prometheus format, which can then be stored in your Prometheus server, and you can then either create alerts using Alertmanager or create visualization dashboards using Grafana based on these collected events.

    The tools mentioned above are a good way to tackle most of the challenges posed by Kubernetes events. But none of them is a standalone solution: as an end user you still have a lot of work to do, including configuring other tools to store and visualize the events.
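
    To make the "capture and route to a sink" idea concrete, here is a minimal sketch that uses kubectl's watch mode and a local file as a stand-in sink. It is not how Kubewatch, Eventrouter or Event-exporter are implemented; the function name and default file path are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative only: append every new event, as JSON, to a local file.
# Real tools ship events to sinks such as Elasticsearch or S3 instead.
# $1: path of the file acting as the sink (defaults to ./k8s-events.json)
stream_events_to_file() {
  kubectl get events --all-namespaces --watch-only -o json >> "${1:-./k8s-events.json}"
}
```

    The command runs until it is interrupted, and the file grows without bound. Rotation, storage and visualization are exactly the extra work that is left to you with this kind of approach.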

    Storing and visualizing Kubernetes events with Sloop

    Sloop in Use

    What is Sloop?

    Sloop is a standalone solution that can store and visualize Kubernetes events with much less effort from the end user. Sloop monitors Kubernetes, recording histories of events and resource state changes and providing visualizations to aid in debugging past events. It was built by Salesforce.

    What are Sloop's key features?

    • Allows you to find and inspect resources that no longer exist in your Kubernetes cluster
    • Helps in answering almost all the queries mentioned at the beginning of this blog
    • Provides a timeline display that shows rollouts of related resources in updates to Deployments, ReplicaSets and StatefulSets
    • Helps in debugging transient and intermittent errors
    • Allows you to see changes over time in a Kubernetes application
    • Is a self-contained service with no dependencies on distributed storage

    How to install Sloop?

    Sloop can be installed using Helm or as a standalone Docker container.

    Both methods require a running Kubernetes cluster and the KUBECONFIG environment variable set up to point at it. If you have not yet signed up to Civo, you can sign up to apply for our managed Kubernetes beta to try this out for yourself!
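
    If you want to confirm these prerequisites before installing, a small pre-flight check along the following lines may help. The function name is hypothetical, and checking for helm is only relevant if you plan to use the Helm method.

```shell
#!/bin/sh
# Hypothetical pre-flight check; the function name is made up for this post.
check_sloop_prereqs() {
  # KUBECONFIG must be set so kubectl and helm know which cluster to talk to.
  if [ -z "${KUBECONFIG:-}" ]; then
    echo "KUBECONFIG is not set" >&2
    return 1
  fi
  # Tools used by the Helm install steps.
  for tool in kubectl helm; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done
  # Confirm that the cluster actually answers.
  kubectl cluster-info >/dev/null
}
```

    Running `check_sloop_prereqs && echo "ready"` prints "ready" only when every check passes.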

    Helm

    $ git clone https://github.com/salesforce/sloop
    $ cd sloop/helm
    $ kubectl create ns sloop
    $ helm install sloop -n sloop ./sloop
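
    After the install, you may want to check that the release came up. The sketch below assumes the release name and namespace used above (both "sloop"); the wrapper function itself is hypothetical, and the two-minute timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical wrapper around standard helm and kubectl commands.
verify_sloop_install() {
  # Show the status of the Helm release.
  helm status sloop -n sloop || return 1
  # Wait up to two minutes for every pod in the namespace to become Ready.
  kubectl wait --for=condition=Ready pod --all -n sloop --timeout=120s || return 1
  # List what the chart created.
  kubectl get all -n sloop
}
```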
    

    Docker container

    Refer to this document to run Sloop as a standalone Docker container.

    Once Sloop is running in your cluster (for example, after the Helm install above), use kubectl's port-forward function to access the dashboard:

    kubectl port-forward -n sloop service/sloop 8080:80
    

    and visit http://localhost:8080/ to view the dashboard.
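
    If the page does not load, a quick check from the command line can tell you whether the dashboard is reachable at all. This sketch assumes the port-forward above is still running in another terminal; the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical reachability check for the Sloop dashboard.
# $1: dashboard URL (defaults to the address used with the port-forward)
sloop_dashboard_up() {
  # -f makes curl fail on HTTP errors; -sS hides progress but still prints errors.
  curl -fsS -o /dev/null "${1:-http://localhost:8080/}"
}
```

    For example, `sloop_dashboard_up && echo "dashboard reachable"` prints the message only when the request succeeds.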

    Sloop Dashboard

    Sloop provides a timeline of your Kubernetes resources, along with different filters for narrowing down what is shown.

    With Sloop, you can filter Kubernetes resources by time range, namespace, kind of resource (such as pods, PVCs or nodes) and resource name, and sort events using different options. Selecting a particular Kubernetes resource in the timeline shows the events that occurred on that resource at that moment. This helps in reviewing all the past events that happened on that resource in your cluster.

    Sloop Debug Menu

    Sloop also exposes a debug menu where you can see its configuration, internal metrics and different settings. You can also query its internal data store. There are a lot of things to tweak here.

    Conclusion

    For more information, check out the Sloop project on GitHub.

    If you found this guide useful, let us know on Twitter at @civocloud! You can also reach me on Twitter at @milindchawre.