How to Mitigate Kubernetes Runtime Security Threats

It’s no secret that containers and Kubernetes are among the most widely used cloud native technologies. According to Stack Overflow’s latest survey, 53.9% of its 65,000 respondents use Docker, and 19% use Kubernetes as well.

The mass adoption of containers gives attackers a unique opportunity to take advantage of vulnerabilities at runtime. According to a report by Sysdig, 91% of runtime scans are failing within organizations, indicating that many teams are still in reactive mode, identifying security issues after deployment rather than preventing them proactively.

In this article, we will discuss what runtime security is, how it works, and how you can use it to mitigate threats to your workloads.

What is Runtime Security?

Runtime security is a proactive approach to securing hosts and applications. It involves continuously monitoring workloads for violations of user-defined security policies. As the name suggests, "runtime" refers to protecting your workloads while they are running rather than after the fact.

Why is it important?

When thinking about the impact of runtime security, it is essential to remember that attackers are constantly looking for vulnerabilities in your applications. As such, even if your code passes security scans, this does not guarantee it is safe once deployed.

Runtime security can help prevent various threats such as credential theft, where attackers attempt to extract sensitive information like API keys or passwords from running containers.

It also protects against malware injection attempts, where compromised containers might be used as entry points for introducing malicious code. Additionally, runtime security helps detect privilege escalation attempts where threat actors try to gain unauthorized access to sensitive resources.

Embracing runtime security enables you to take a proactive approach to security incidents.

How does it work?

At a high level, an agent sits on each node in your Kubernetes cluster and monitors events such as the creation of pods, deployments, or ReplicaSets. For example, if a user attempts to create a privileged pod, the agent can block the event, log the action, and send a notification.

Beyond pod creation, runtime security can monitor other critical activities, such as network calls, file creation, and access to sensitive files, as we will demonstrate in the hands-on example with Tetragon.
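
To make the flow above more concrete, here is a minimal sketch in Python of the evaluate, block, and log loop an agent might run. It is only an illustration: the PodEvent and Verdict types, the deny_privileged policy, and the handle function are hypothetical names invented for this example and do not correspond to the API of any real tool.

```python
from dataclasses import dataclass


@dataclass
class PodEvent:
    """Hypothetical pod-creation event as an agent might observe it."""
    name: str
    namespace: str
    privileged: bool = False


@dataclass
class Verdict:
    allowed: bool
    reason: str


def deny_privileged(event: PodEvent) -> Verdict:
    """User-defined policy: refuse pods that request privileged mode."""
    if event.privileged:
        return Verdict(False, f"pod {event.namespace}/{event.name} requests privileged mode")
    return Verdict(True, "no policy violated")


def handle(event: PodEvent, policies, audit_log: list) -> Verdict:
    """Evaluate every policy; block and record the first violation found."""
    for policy in policies:
        verdict = policy(event)
        if not verdict.allowed:
            audit_log.append(verdict.reason)  # stand-in for logging and notification
            return verdict
    return Verdict(True, "no policy violated")


audit_log = []
ok = handle(PodEvent("web", "default"), [deny_privileged], audit_log)
blocked = handle(PodEvent("debug", "default", privileged=True), [deny_privileged], audit_log)
print(ok.allowed, blocked.allowed, audit_log)
```

Real agents observe these events in the kernel or through the Kubernetes API rather than through in-memory objects, but the decision logic follows the same shape: match an event against policies, then act on the first violation.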

How eBPF is powering the Runtime Security space

Central to runtime security is a technology called eBPF (Extended Berkeley Packet Filter). For readers unfamiliar with it, eBPF is a technology that extends the Linux kernel to allow sandboxed programs to run within the operating system. It provides powerful capabilities like syscall monitoring and network tracing, which are critical for runtime security. Explaining eBPF can be tricky; check out this article for a more in-depth explanation.

Why is eBPF relevant?

eBPF is relevant to the runtime security space because of its ability to monitor syscalls and trace network activity. In more practical terms, this means that if a suspicious process tries to escalate privileges, a runtime security agent monitoring the execve syscall can block the attempt and fire an alert. Later in this article, we will touch on a few open source projects using eBPF.
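
As a rough illustration of what a rule over execve events could look like once the events reach user space, consider the Python sketch below. The watchlist, the event dictionaries, and the is_suspicious_exec function are all assumptions made up for this example; real tools express such rules in their own policy languages and can enforce them inside the kernel.

```python
# Hypothetical watchlist of binaries often involved in privilege changes.
WATCHED_BINARIES = {"/usr/bin/sudo", "/usr/bin/su", "/usr/bin/nsenter"}


def is_suspicious_exec(event: dict) -> bool:
    """Flag execve events where a non-root user runs a watched binary."""
    return event["uid"] != 0 and event["binary"] in WATCHED_BINARIES


# Made-up sample events, shaped loosely like exec records.
events = [
    {"pid": 101, "uid": 1000, "binary": "/usr/bin/ls"},
    {"pid": 102, "uid": 1000, "binary": "/usr/bin/sudo"},
    {"pid": 103, "uid": 0, "binary": "/usr/bin/su"},
]

alerts = [event["pid"] for event in events if is_suspicious_exec(event)]
print(alerts)  # only pid 102 matches both conditions
```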

eBPF also excels at network monitoring. Through eXpress Data Path (XDP), eBPF is able to do high-speed packet processing. This allows developers to add packet filtering and redirection with minimal latency.

Top Runtime Security Tools

Tool	Primary Features	Use Cases	Performance Metrics
Tetragon	Real-time policy enforcement, eBPF-based observability, Low overhead monitoring	Protecting Kubernetes workloads at runtime, Enforcing real-time security policies	Minimal latency due to efficient eBPF-based tracing, Scalable to large clusters
Falco	Threat detection and compliance, Extensive integrations (50+ SIEM tools), Pre-defined rules library	Identifying anomalies in workloads, Ensuring compliance with security standards (e.g., PCI-DSS, GDPR)	Medium resource usage; rules can be optimized, Works well with SIEM tools for distributed environments
Tracee	eBPF-powered forensics, Event tracing with rich metadata, Focus on audit and investigation	Analyzing the cause of runtime incidents, Observing suspicious activity retrospectively	Optimized for event logging rather than real-time blocking, Suitable for environments requiring detailed forensic data
KubeArmor	Linux Security Modules (LSM) integration, eBPF-based syscall filtering, IoT and edge workload support	Securing containerized workloads and IoT devices, Extending security to edge computing environments	Low latency for edge use cases, Versatile for diverse workloads, including non-Kubernetes environments

Key Takeaways from the Comparison:

Tetragon is ideal for users who prioritize real-time security and observability with minimal resource overhead.
Falco stands out for its integrations and compliance focus, making it suitable for organizations with extensive regulatory requirements.
Tracee excels in post-event forensics and is valuable for teams focused on understanding and mitigating root causes of incidents.
KubeArmor is a versatile choice for environments with IoT or edge devices, offering extended runtime protection beyond Kubernetes workloads.

Hands-on with Tetragon

Taking things up a notch, let’s explore how you can protect your workloads using Tetragon. To follow along with this section, you will need the tools used in the commands below installed locally: the Civo CLI, Helm, kubectl, and jq.

Create a cluster

If you haven’t already, create a new cluster using the Civo CLI:

civo k3s create --create-firewall --nodes 1 -m --save --switch --wait tetragon-lab

This will provision a one-node cluster named tetragon-lab in your Civo account.

Install Tetragon

To install and deploy Tetragon, run the following commands:

Add the helm repo:

helm repo add cilium https://helm.cilium.io
helm repo update

Install the Tetragon chart:

helm install tetragon ${EXTRA_HELM_FLAGS[@]} cilium/tetragon -n kube-system

Verify the installation:

kubectl rollout status -n kube-system ds/tetragon -w

Output is similar to:

daemon set "tetragon" successfully rolled out

Deploy a sample application

To test Tetragon, we will need an application to run it against. Thankfully, Cilium has a sample deployment we can use:

kubectl create -f https://raw.githubusercontent.com/cilium/cilium/v1.15.3/examples/minikube/http-sw-app.yaml

Verify the pods were created:

kubectl get pods

Output is similar to:

NAME                        READY   STATUS    RESTARTS   AGE
deathstar-b4b8ccfb5-dsrnt   1/1     Running   0          8s
deathstar-b4b8ccfb5-lt6fm   1/1     Running   0          8s
xwing                       1/1     Running   0          7s
tiefighter                  1/1     Running   0          7s

Protect sensitive files

The xwing pod contains a file called default.json which contains some sensitive information. To view the contents of the file, run the following command:

kubectl exec -ti xwing -- bash -c 'cat default.json'

Output is similar to:

{
  "private": [
{ "id": 1, "body": "secret information" }
  ],
  "public": [
{ "id": 1, "body": "public information" }
  ],
  "auth-header-required": [
{ "id": 1, "body": "super secret information" }
  ]
}

To protect access to this file, we can leverage a custom resource definition Tetragon provides called TracingPolicy. A tracing policy allows you to define custom, kernel-level tracing rules that monitor specific system calls or kernel functions and set conditions for when these calls should be traced or acted upon. In our case, we want any attempt to read default.json to be acted upon, i.e., the offending process should be terminated.

Apply the following tracing policy

kubectl apply -f - <<EOF
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
 name: "prevent-default-json-read"
spec:
 kprobes:
 - call: "security_file_permission"
   syscall: false
   return: true
   args:
   - index: 0
     type: "file"
   - index: 1
     type: "int"
   returnArg:
     index: 0
     type: "int"
   returnArgAction: "Post"
   selectors:
   - matchArgs:     
     - index: 0
       operator: "Equal"
       values:
       - "/default.json"
     - index: 1
       operator: "Equal"
       values:
       - "4" # MAY_READ
     matchActions:
     - action: Sigkill
EOF
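
To help read the selector in the policy above, here is a toy Python model of its matching logic. It is a sketch of the idea only, not how Tetragon evaluates policies internally, and the evaluate function is a hypothetical helper. The value 4 is the kernel's MAY_READ permission bit, matching the comment in the policy.

```python
MAY_READ = 0x4  # Linux kernel permission mask bit for read access
SIGKILL_ACTION = "Sigkill"


def evaluate(path: str, mask: int):
    """Toy version of the policy's selector: both arguments must match."""
    path_matches = path == "/default.json"  # matchArgs index 0, operator Equal
    mask_matches = mask == MAY_READ         # matchArgs index 1, operator Equal
    return SIGKILL_ACTION if path_matches and mask_matches else None


print(evaluate("/default.json", 4))  # read of the protected file: action fires
print(evaluate("/default.json", 2))  # write mask: selector does not match
print(evaluate("/etc/hostname", 4))  # different file: selector does not match
```

Because both matchArgs entries sit under the same selector, they are combined with a logical AND: only a read of /default.json triggers the Sigkill action.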

With the policy applied, let’s try and access the file once more:

kubectl exec -ti xwing -- bash -c 'cat default.json'

Output is similar to:

command terminated with exit code 137
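
The exit code itself is a useful hint. By shell convention, a process killed by a signal reports 128 plus the signal number, and SIGKILL is signal 9 on Linux, which gives 137. The small Python sketch below decodes such codes; describe_exit_code is a hypothetical helper written for this article and assumes it is run on Linux.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit code into a readable description."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            pass  # not a known signal number on this platform
    return f"exited with status {code}"


print(describe_exit_code(137))
print(describe_exit_code(1))
```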

We can verify that the command was indeed terminated by Tetragon by taking a look at the corresponding event it generated:

kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | jq

Output is similar to:

{
  "process_exec": {
    "process": {
    "exec_id": "azNzLXRldHJhZ29uLWxhYi1mZGVjLWY2NWM0Zi1ub2RlLXBvb2wtNTE5My1zamtsZToxMDM0MjU4MDYzMTE3MjE6MzIyMg==",
    "pid": 3222,
    "uid": 0,
    "cwd": "/",
    "binary": "/usr/bin/bash",
    "arguments": "-c \"cat default.json\"",
    "flags": "execve rootcwd clone",
    "start_time": "2024-09-15T11:39:11.959890661Z",
    "auid": 4294967295,
    "pod": {
        "namespace": "default",
        "name": "xwing",
        "container": {
        "id": "containerd://430b112a9a05082c7fb8d4c0c012e4e2cf40d147c68fca54e57fbbb78b5da1a1",
        "name": "spaceship",
        "image": {
            "id": "quay.io/cilium/json-mock@sha256:5aad04835eda9025fe4561ad31be77fd55309af8158ca8663a72f6abb78c2603",
            "name": "sha256:adcc2d0552708b61775c71416f20abddad5fd39b52eb4ac10d692bd19a577edb"
        },
        "start_time": "2024-09-14T17:39:24Z",
        "pid": 36
        },
        "pod_labels": {
        "app.kubernetes.io/name": "xwing",
        "class": "xwing",
        "org": "alliance"
        },
        "workload": "xwing",
        "workload_kind": "Pod"
    },
    "docker": "430b112a9a05082c7fb8d4c0c012e4e",
    "parent_exec_id": "azNzLXRldHJhZ29uLWxhYi1mZGVjLWY2NWM0Zi1ub2RlLXBvb2wtNTE5My1zamtsZTozODYzNDg1NTQxNjkzMDoyODU1Nw==",
    "tid": 3222
    }
  },
  "node_name": "k3s-tetragon-lab-fdec-f65c4f-node-pool-5193-sjkle",
  "time": "2024-09-15T11:39:11.959888807Z"
}

We get a rich response which includes information about the time, node, pod, and arguments which were used to execute the command.
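
If you want to post-process these events rather than read them by eye, a few lines of Python are enough. The sketch below works on a trimmed copy of the event shown above; summarize and decode_exec_id are hypothetical helpers. One observation, which is an assumption based on this sample rather than a documented guarantee: the exec_id appears to be a Base64 string that combines the node name, a kernel timestamp, and the pid.

```python
import base64
import json

# Trimmed copy of the event shown above; only the fields used below are kept.
raw_event = """
{
  "process_exec": {
    "process": {
      "exec_id": "azNzLXRldHJhZ29uLWxhYi1mZGVjLWY2NWM0Zi1ub2RlLXBvb2wtNTE5My1zamtsZToxMDM0MjU4MDYzMTE3MjE6MzIyMg==",
      "pid": 3222,
      "binary": "/usr/bin/bash",
      "arguments": "-c \\"cat default.json\\"",
      "pod": {"namespace": "default", "name": "xwing"}
    }
  },
  "node_name": "k3s-tetragon-lab-fdec-f65c4f-node-pool-5193-sjkle",
  "time": "2024-09-15T11:39:11.959888807Z"
}
"""


def summarize(event: dict) -> str:
    """Build a one-line summary: time, node, pod, and the command that ran."""
    process = event["process_exec"]["process"]
    pod = process["pod"]
    return (f'{event["time"]} {event["node_name"]} '
            f'{pod["namespace"]}/{pod["name"]} '
            f'{process["binary"]} {process["arguments"]}')


def decode_exec_id(exec_id: str) -> str:
    """Decode the Base64 exec_id into its readable form."""
    return base64.b64decode(exec_id).decode("utf-8")


event = json.loads(raw_event)
summary = summarize(event)
decoded_id = decode_exec_id(event["process_exec"]["process"]["exec_id"])
print(summary)
print(decoded_id)
```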

Note: While protecting sensitive files using tracing policies is effective, it's crucial to incorporate broader security practices to ensure comprehensive protection. Regularly scan for misconfigurations or insecure container images in your environment. Tools like Checkov or Kube-hunter can help identify vulnerabilities in configurations, while Trivy can scan container images for known vulnerabilities. Integrating these tools into your CI/CD pipeline ensures proactive identification and remediation of risks before runtime.

Clean up

If you followed this section of the tutorial, you might want to delete the resources we provisioned.

Uninstall Tetragon:

helm uninstall tetragon -n kube-system

Delete the Kubernetes cluster:

civo k3s rm tetragon-lab

Conclusion

Runtime security is reshaping how teams protect their workloads, emphasizing proactive measures to mitigate threats in real time. Tools like Tetragon, Falco, Tracee, and KubeArmor each bring unique strengths to runtime security, allowing developers to tailor their strategies based on their environment and priorities. Whether you’re securing Kubernetes workloads, addressing compliance requirements, or analyzing post-incident forensics, these tools provide robust solutions to modern runtime security challenges.

For those exploring container security further, technologies like eBPF are at the forefront of innovation, powering advanced solutions not only in runtime security but also across service meshes and supply chain security.

Interested in learning more about container security? Here are some ideas:

Check out this article on how eBPF is helping out in the service mesh space.
Supply chain attacks are a big threat to containers; learn how you can defend your images using sigstore tooling.

Boemo Mmopelwa