One feature that makes Prometheus such a popular monitoring tool is its pull-based model for retrieving metrics. The Prometheus server scrapes metrics from targeted services at pre-configured intervals and stores them in its time-series database, from where they can be queried. Because the server does the heavy lifting of pulling metrics from a designated endpoint, our applications only need to expose their current values. When our applications get busy generating large volumes of metric data, we can adjust the scrape intervals or scale the Prometheus server without disturbing the applications themselves.

Additionally, by leveraging the Prometheus client libraries, we can create custom metrics, add labels, and expose these metrics via an endpoint. The Prometheus community maintains libraries for several languages, including Go, Python, and Rust, so we can instrument our application code in the language it is written in. There are also various third-party libraries written and maintained by other communities, and where none of these suffices, we can write our own.

Exposing Prometheus metrics from applications

Our sample application, 'civoapp', is a simple app written in Go with two endpoints.

The first endpoint prints “Hello from inside a Civo Kubernetes Cluster” when a GET request is made to the homepage ( / ). The second endpoint prints “You shouldn’t be here” when a GET request is made to our error page ( /error ).

Our app is exposed on port 8080.

package main

import (
    "log"
    "math/rand"
    "net/http"
    "time"
)

func main() {

    homePageFunc := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        time.Sleep(time.Duration(rand.Intn(9)) * time.Second)
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("Hello from inside a Civo Kubernetes Cluster"))
    })

    errorPageFunc := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        time.Sleep(time.Duration(rand.Intn(7)) * time.Second)
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("You shouldn't be here"))
    })

    http.Handle("/", homePageFunc)
    http.Handle("/error", errorPageFunc)
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Now that we have our app, we can decide what kind of metrics we want to expose. We will start by initialising a Go module for the project (the module name below is just an example) and installing the client libraries using 'go get'.

go mod init civoapp
go get github.com/prometheus/client_golang/prometheus
go get github.com/prometheus/client_golang/prometheus/promauto
go get github.com/prometheus/client_golang/prometheus/promhttp

The prometheus package is the core library used to instrument metrics and labels. It also allows us to register the metrics, and the promauto package can optionally create and register metrics in a single step.

Using the promhttp package, we can expose our metrics at the /metrics endpoint.
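
To see how these pieces fit together before we wire them into civoapp, here is a minimal, standalone sketch (not part of our app) in which promauto registers a counter with the default registry and promhttp.Handler() serves everything in that registry:

package main

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// pageHits is created and registered with the default registry in one step.
var pageHits = promauto.NewCounter(prometheus.CounterOpts{
    Name: "example_page_hits_total",
    Help: "Total number of page hits",
})

func main() {
    http.Handle("/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        pageHits.Inc() // count every request to the homepage
        w.Write([]byte("hello"))
    }))
    // promhttp.Handler() exposes the default registry, including Go runtime metrics.
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":8080", nil))
}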

We begin instrumenting metrics in our code with a combination of these handy libraries.

Instrumenting Prometheus in our app

First, we define the metrics we want to collect:

var onlineUsers = prometheus.NewGauge(prometheus.GaugeOpts{
    Name: "civoapp_online_users",
    Help: "Online users",
    ConstLabels: map[string]string{
        "app": "civoapp",
    },
})

var httpRequestsTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
    Name: "civoapp_http_requests_total",
    Help: "Count of all HTTP requests for civoapp",
}, []string{})

var httpDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
    Name: "civoapp_http_request_duration",
    Help: "Duration in seconds of all HTTP requests",
}, []string{"handler"})

onlineUsers is a gauge metric capturing the number of current users of our application and is named "civoapp_online_users".

httpRequestsTotal is a counter metric capturing the number of requests made to the endpoints of our app and is named "civoapp_http_requests_total".

httpDuration is a histogram metric capturing the duration (in seconds) of requests made to our app's endpoints and placing them in buckets. It is named "civoapp_http_request_duration".

After defining our metrics, we register them with a new registry using MustRegister from the prometheus package.

    r := prometheus.NewRegistry()
    r.MustRegister(onlineUsers)
    r.MustRegister(httpRequestsTotal)
    r.MustRegister(httpDuration)

Then we can begin instrumenting the handlers in our code.

    homePage := promhttp.InstrumentHandlerDuration(
        httpDuration.MustCurryWith(prometheus.Labels{"handler": "homePage"}),
        promhttp.InstrumentHandlerCounter(httpRequestsTotal, homePageFunc),
    )

    errorPage := promhttp.InstrumentHandlerDuration(
        httpDuration.MustCurryWith(prometheus.Labels{"handler": "errorPage"}),
        promhttp.InstrumentHandlerCounter(httpRequestsTotal, errorPageFunc),
    )

promhttp.InstrumentHandlerDuration and promhttp.InstrumentHandlerCounter are middleware functions that wrap our handlers and record the measurements for us.

promhttp.InstrumentHandlerDuration instruments our histogram metric, adding a label that identifies the handler, while promhttp.InstrumentHandlerCounter instruments our counter metric.

Finally, the handler returned by promhttp.HandlerFor exposes the metrics in our registry on the /metrics endpoint.

    http.Handle("/metrics", promhttp.HandlerFor(r, promhttp.HandlerOpts{}))

And our app is ready to collect metrics. The complete code looks like this:

package main

import (
    "log"
    "math/rand"
    "net/http"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var onlineUsers = prometheus.NewGauge(prometheus.GaugeOpts{
    Name: "civoapp_online_users",
    Help: "Online users",
    ConstLabels: map[string]string{
        "app": "civoapp",
    },
})

var httpRequestsTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
    Name: "civoapp_http_requests_total",
    Help: "Count of all HTTP requests for civoapp",
}, []string{})

var httpDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
    Name: "civoapp_http_request_duration",
    Help: "Duration in seconds of all HTTP requests",
}, []string{"handler"})

func main() {
    r := prometheus.NewRegistry()
    r.MustRegister(onlineUsers)
    r.MustRegister(httpRequestsTotal)
    r.MustRegister(httpDuration)

    // Simulate a fluctuating number of online users; the short pause stops
    // the goroutine from spinning flat out between updates.
    go func() {
        for {
            onlineUsers.Set(float64(rand.Intn(5000)))
            time.Sleep(time.Second)
        }
    }()

    homePageFunc := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        time.Sleep(time.Duration(rand.Intn(9)) * time.Second)
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("Hello from inside a Civo Kubernetes Cluster"))
    })

    errorPageFunc := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        time.Sleep(time.Duration(rand.Intn(7)) * time.Second)
        w.WriteHeader(http.StatusOK)
        w.Write([]byte("You shouldn't be here"))
    })

    homePage := promhttp.InstrumentHandlerDuration(
        httpDuration.MustCurryWith(prometheus.Labels{"handler": "homePage"}),
        promhttp.InstrumentHandlerCounter(httpRequestsTotal, homePageFunc),
    )

    errorPage := promhttp.InstrumentHandlerDuration(
        httpDuration.MustCurryWith(prometheus.Labels{"handler": "errorPage"}),
        promhttp.InstrumentHandlerCounter(httpRequestsTotal, errorPageFunc),
    )

    http.Handle("/", homePage)
    http.Handle("/error", errorPage)
    http.Handle("/metrics", promhttp.HandlerFor(r, promhttp.HandlerOpts{}))
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Now that our app is ready, we will containerize it for deployment using Docker, starting with the build instructions in a Dockerfile.

# Build stage: compile the Go binary
FROM golang:1.15-alpine AS build

WORKDIR /app
COPY . /app/
RUN go build -o app

# Runtime stage: copy only the compiled binary into a small image
FROM alpine AS runtime
COPY --from=build /app/app /
CMD ./app

Next we will build our image and tag it using the following command. You would need to change the registry from "ehienabs" to your own account:

docker build -t ehienabs/civoapp:v1 .
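
Optionally, before pushing, we can sanity-check the image locally; assuming port 8080 is free on our machine, the /metrics endpoint should respond:

docker run --rm -d -p 8080:8080 ehienabs/civoapp:v1
curl http://localhost:8080/metrics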

Finally, because we want to be able to share our image, we will push it to our repository. Once again, make sure you change the repository name to reflect your account.

docker push ehienabs/civoapp:v1

We are ready to deploy our application.

Deploying our application on Kubernetes

Kubernetes is a container orchestration platform. It is a portable, extensible, open-source platform for managing containerized applications. We will deploy our application in a Civo managed Kubernetes cluster.

Civo’s cloud-native infrastructure services are powered by Kubernetes and use the lightweight Kubernetes distribution K3s for superfast launch times.

Prerequisites

To get started, we will need the following:

  • A Civo account and an API key
  • The Civo CLI installed locally
  • kubectl installed locally
  • Docker, for building and pushing the application image
  • Helm, which we will use later to install Grafana

After setting up the Civo command line with our API key using the instructions in the repository, we can create our cluster using the following command:

civo kubernetes create civo-cluster

Our Kubernetes cluster ‘civo-cluster’ is created.
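
Before we can run kubectl commands against the new cluster, we need its kubeconfig. With the Civo CLI, one way to do this is to save the cluster's config and switch to its context (the context name normally matches the cluster name):

civo kubernetes config civo-cluster --save
kubectl config use-context civo-cluster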

Civo cluster dashboard page showing a running cluster

Installing Prometheus in our Kubernetes cluster

Next, we will prepare our cluster for monitoring by installing Prometheus. We will do so by installing the Prometheus Operator.

The Prometheus Operator simplifies the creation, configuration, and management of Prometheus instances on Kubernetes.

We will start by creating a namespace called ‘monitoring’ where all our monitoring resources will reside.

kubectl create ns monitoring 

Now we can install the Prometheus Operator.

kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/bundle.yaml -n monitoring

The Prometheus Operator deploys the following Custom Resource Definitions:

  • Prometheus
  • Alertmanager
  • ServiceMonitor
  • PodMonitor
  • Probe
  • PrometheusRule
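
We can confirm that these CRDs exist in the cluster with a quick check:

kubectl get crds | grep monitoring.coreos.com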

Prometheus components listed in terminal output

We can now define our Prometheus instance using the Prometheus custom resource, which the Operator deploys as a StatefulSet. In its spec we describe how we want Prometheus to operate: for example, the namespaces to pull metrics from, the service monitors to select, the service account to use, and so on. Save the following as prometheus.yaml:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  podMonitorSelector: {}
  resources:
    requests:
      memory: 400Mi

We need to assign some permissions to our deployment so that Prometheus can discover and scrape targets across the cluster.

First, we will create a service account, serviceaccount.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus

Then we create a ClusterRole listing the permissions we would like to grant Prometheus. This is rbac.yaml:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
- nonResourceURLs: ["/metrics", "/metrics/cadvisor"]
  verbs: ["get"]

Finally we will put it all together by binding the roles to our service account using ClusterRoleBinding. Save the following as rolebinding.yaml:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring

We create the Prometheus instance, service account, cluster role, and role binding by applying all of the files above in one go. Passing -n monitoring places the namespaced resources in the 'monitoring' namespace, matching the namespace referenced in our ClusterRoleBinding.

kubectl apply -f prometheus.yaml -f serviceaccount.yaml -f rbac.yaml -f rolebinding.yaml -n monitoring
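
Once the Operator has reconciled the Prometheus resource into a StatefulSet, we should see a pod named prometheus-prometheus-0 running in the monitoring namespace:

kubectl get pods -n monitoring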

Finally, we can access the Prometheus user interface from our web browser.

To expose our Prometheus service using port-forwarding, we will run the following command:

kubectl port-forward prometheus-prometheus-0 9090:9090 -n monitoring

We can reach Prometheus at http://localhost:9090/

Prometheus dashboard page for our running cluster

Now that our Cluster is prepared to monitor our app, we can deploy it!

Deploying our application

A Kubernetes Deployment allows us to define the lifecycle, resources, update strategy, and so on for our application, and ensures that the live state matches the desired state.

Our app deployment is as follows, which can be saved as deployment.yaml. Make sure you change the image: line to match the container image you created earlier.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: civoapp-deployment
  labels:
    app: civoapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: civoapp
  template:
    metadata:
      labels:
        app: civoapp
    spec:
      containers:
      - name: civoapp
        image: ehienabs/civoapp:v1
        imagePullPolicy: Always
        ports:
        - containerPort: 8080

Next we expose our app using a load balancer service, so we can reach our application from outside the cluster. Save the following as service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: civoapp-service
  labels:
    app: civoappsvc
spec:
  selector:
    app: civoapp
  ports:
    - name: http
      port: 80
      targetPort: 8080
  type: LoadBalancer

Before applying these manifests, we will create a namespace for our app using the following command:

kubectl create ns civoapp

Then we apply our manifests to the cluster using the following command:

kubectl apply -f deployment.yaml -f service.yaml -n civoapp

After our resources have been created, we can get our service IP address by using the following command:

kubectl get svc -n civoapp

Now we can reach our app using the service IP address.
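
For example, assuming the EXTERNAL-IP column shows the load balancer address of civoapp-service, we can hit the three endpoints from the command line (replace the placeholder with your own IP):

curl http://<EXTERNAL-IP>/
curl http://<EXTERNAL-IP>/error
curl http://<EXTERNAL-IP>/metrics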

web browser accessing the external IP of our cluster and seeing our app return the response

We can reach our error page at the /error endpoint.

Web browser showing "you shouldn't be here"

And finally our metrics at the /metrics endpoint.

Metrics in raw form when accessing through the cluster's public IP

Service discovery with Prometheus

We have our monitoring set up; we have also deployed our application and exposed metrics on the /metrics endpoint. Now we are tasked with letting Prometheus know from where to pull application metrics.

Service Discovery is Prometheus' way of finding endpoints to scrape, and it allows Prometheus to adapt to dynamic environments such as Kubernetes. Instead of relying on manually maintained target lists, Prometheus queries a source of truth (here, the Kubernetes API) and keeps its scrape targets up to date, much like Kubernetes' own reconciliation loop keeps resources updated.

Prometheus Service Discovery allows it to watch for changes in our applications and their environment and update its scrape configurations based on those changes.

Service monitor

When we deploy applications in Kubernetes, we usually run multiple instances. The resulting pods are temporary and are created and destroyed to match the desired state of our Deployment. A Kubernetes Service groups these similar pods and provides a stable IP address, port, name, and routing policy.

We use a ServiceMonitor to define the services whose Endpoints objects (the backing pods) we wish to scrape.

The ServiceMonitor is a configuration object with which we can describe scrape targets, intervals, port names, and so on. The Prometheus Operator automatically updates the Prometheus configuration to include the endpoints we specify. ServiceMonitors use labels for service discovery.

We set up a service monitor for our app with the following configuration. Our Service Monitor tells Prometheus to pull metrics from our app service by selecting its civoappsvc label. Save the following as servicemonitor.yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus
spec:
  selector:
    matchLabels:
      app: civoappsvc
  namespaceSelector:
    any: true
  endpoints:
    - port: http

Next we deploy it to our cluster, using the following command.

kubectl apply -f servicemonitor.yaml

If we go to our Prometheus UI, we can see that our app has been included among the discovered scrape targets.

the /metrics endpoint data included in our Prometheus logs

And from the metrics explorer, we can see our metrics.

Prometheus metrics listing showing our custom app metrics

Using PromQL to run queries, we can get useful information about our application. From the metrics explorer, we can select metrics and view their data.
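
For instance, because our counter has no variable labels, querying the metric name on its own returns a single series containing the total request count across both endpoints:

civoapp_http_requests_total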

Prometheus query interface showing total HTTP requests

In this demo case, our app ‘civoapp’ has received a total of 923 requests to all its endpoints.

PromQL also lets us build more interesting queries by filtering and aggregating on labels. For example, we can find out how many requests to our 'homePage' handler took less than 0.1s by using the histogram's bucket labels, as in the query sketched below.
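
A query along these lines does the job, using the _bucket series that the histogram exposes, the handler label we curried into the middleware, and an le value of 0.1 to select the 0.1-second bucket (the exact expression in the screenshot may differ slightly):

civoapp_http_request_duration_bucket{handler="homePage", le="0.1"}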

Combining query data using PromQL

106 of the requests to our ‘homepage’ took less than 0.1s.

Visualizing metric data with Grafana

Being able to query our metrics data and get results is valuable in itself; however, our engineers' time is usually better spent building features and improving the resilience of our applications and infrastructure than running ad-hoc queries.

Also, we are unlikely to be proactive about the health of our applications if we are only responding after things have gone wrong, defeating the entire purpose of having a monitoring system.

Grafana is open-source visualization and analytics software. It can pull metrics from various sources, run queries against them, and visualize them, making it easy to gain insight and make decisions about our services.

We will install Grafana with Helm by first adding the repository.

helm repo add grafana https://grafana.github.io/helm-charts

We then install the chart using the following command (where ‘grafana’ is our release name).

helm install grafana grafana/grafana
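
We can find the name of the Grafana pod (needed for the port-forward below) by listing the pods from the release; the label selector shown here is the one the chart typically applies:

kubectl get pods -l app.kubernetes.io/name=grafana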

We can access our Grafana service by port-forwarding using the following command, making sure you update the name of the pod to forward traffic into:

kubectl port-forward grafana-5874c8c6cc-h6q4s 3000:3000

We can reach our Grafana service at http://localhost:3000 in our web browser.

Grafana login screen

We can log in using ‘admin’ as the username and retrieve our password with the following command.

kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

And we are logged in.

Grafana default main page

With Grafana, we can visualize data from a variety of sources by adding them as data sources using their URL.

From the settings menu, we click on 'Data sources'.

Grafana settings page showing "add data source"

Grafana will show us a few data sources we can add.

Grafana available data sources

Since Prometheus runs in the same cluster as our Grafana service, the two can communicate over the cluster's internal DNS. We can therefore add Prometheus as a data source using its in-cluster DNS name.
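
In our setup, the Prometheus Operator creates a governing Service called prometheus-operated in the namespace where the Prometheus resource runs, so the data source URL will typically look like this:

http://prometheus-operated.monitoring.svc.cluster.local:9090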

adding local DNS connection to Prometheus

After adding our data source, we visualize our data using Grafana panels and dashboards.

Grafana panels are the basic unit of visualizing data with Grafana. When we create a new panel, we can select a data source, the styling, formatting, and type of visualization. Panels also include query editors that we can use to manipulate our data to extract the kind of information we want.

For example, we may want to know the number of online users frequently, and we can use Grafana panels to visualize this data.

We will start by adding a new panel.

New grafana panel screen

Then we add our query into the query editor.

Specifying the query for Grafana in promQL, in this case civoapp_online_users{}

After adding our query, we can choose the type of visualization we want for our data.

Grafana visualisation options showing a speedometer-type reading

With Grafana Dashboards, we can group multiple panels, giving us the capacity to view more information about our app quickly. For example, we may want to know the trends in the usage of our application. Data trends allow us to view the past and predict future behaviors of our application.

For example, the popularity of our app can be measured by how frequently our app is requested. For this, we will use the rate function in our query.
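
The panel query is not spelled out in the screenshots, but a typical per-second request-rate query over our counter looks like the following (the 5-minute window is an assumption and can be adjusted):

rate(civoapp_http_requests_total[5m])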

First, we populate our dashboard by adding another panel.

Adding a new panel

Then we can add our query and visualization type.

Line graph showing a trend over time

After we apply our changes, we can see our dashboard with both panels.

Dual grafana dashboards side-by-side

Wrapping up

By following this guide, we have instrumented custom metrics in our Go application using the Prometheus client library.

We have also containerized our application, deployed it to our Kubernetes cluster, and configured Prometheus to scrape our metrics from the /metrics endpoint.

We also made our metric data valuable by running PromQL queries and visualizing the results using Grafana.