So you’re interested in extending the functionality of Kubernetes? One powerful way to do this is by using the Kubernetes API to introduce your own custom resources.

Enter the world of CustomResourceDefinitions (CRDs). CRDs allow you to define new types of resources that your Kubernetes cluster can manage, much like the built-in resources such as Pods, Services, and Deployments.

Once you've defined a CRD, you can create, get, list, watch, update, patch, and delete instances of your custom resource just as you would with native Kubernetes resources. This capability is particularly useful when you need Kubernetes to manage objects that aren't natively supported, be it a database, a configuration component, or any other service or application component. While Kubernetes offers a rich set of features out of the box, CRDs ensure that you're not restricted to just what Kubernetes natively supports.

This tutorial walks you through extending the Kubernetes API with CRDs, giving you the knowledge and tools to create powerful custom resources tailored to your needs.

Prerequisites

To follow along with this tutorial, you will need the following things:

Note: This tutorial assumes basic knowledge of Kubernetes concepts such as pods, deployments, and services. Familiarity with Docker and Git is also helpful.

How to create CustomResourceDefinitions

Creating a CRD is pretty straightforward. You'll need to define a YAML file with kind: CustomResourceDefinition and information about spec.group, spec.names, spec.scope, and spec.versions (each version carries its own validation schema).

For example:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name of the CustomResourceDefinition, following the format: <plural-name>.<group-name>
  name: crontabs.stable.example.com
spec:
  # The API group that will be used for the CustomResourceDefinition: /apis/<group-name>/<version>/
  group: stable.example.com
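  versions:
    - name: v1
      # Indicates whether this version should be served via the API
      served: true
      # Indicates whether this version should be used for storing resources
      storage: true
      # apiextensions.k8s.io/v1 requires a schema for every version
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true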
  # Specifies the scope of the CustomResourceDefinition: "Namespaced" or "Cluster"
  scope: Namespaced
  names:
    # The plural name used in the URL: /apis/<group-name>/<version>/<plural-name>/
    plural: crontabs
    # The singular name used as an alias on the CLI and for display
    singular: crontab
    # The CamelCased singular type for the resource
    kind: CronTab
    # Shorter string(s) to match the resource on the CLI
    shortNames:
      - ct

Once you create the CRD, you can create custom resources that match that definition. Kubernetes will now recognize this new CronTab resource type. The CronTab resources will have the standard Kubernetes API semantics like any other built-in resource.
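
To make the URL comments in the manifest concrete, here is a minimal Go sketch of how the group, version, namespace, and plural name combine into a REST path. The resourcePath helper is hypothetical (it is not part of any Kubernetes library) and simply follows the /apis/<group-name>/<version>/ pattern described above, adding a namespaces/<namespace> segment for namespaced resources.

```go
package main

import (
	"fmt"
	"strings"
)

// resourcePath is a hypothetical helper that assembles the REST path for a
// custom resource. An empty namespace omits the namespaces/<namespace>
// segment; an empty name yields the collection (list) path.
func resourcePath(group, version, namespace, plural, name string) string {
	parts := []string{"", "apis", group, version}
	if namespace != "" {
		parts = append(parts, "namespaces", namespace)
	}
	parts = append(parts, plural)
	if name != "" {
		parts = append(parts, name)
	}
	return strings.Join(parts, "/")
}

func main() {
	// A single CronTab named my-cron in the default namespace.
	fmt.Println(resourcePath("stable.example.com", "v1", "default", "crontabs", "my-cron"))
	// The collection of CronTabs across all namespaces.
	fmt.Println(resourcePath("stable.example.com", "v1", "", "crontabs", ""))
}
```

The object name my-cron is made up for illustration; the group, version, and plural come from the CronTab manifest.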

Advantages of CustomResourceDefinitions

CRDs offer numerous advantages when extending the Kubernetes API. Below are three of the key advantages:

  • Flexibility and Customization: CRDs provide the ability to create unique resource types within Kubernetes clusters, offering flexibility and customization.
  • API Support: CRDs come with full API support, including creating, getting, listing, watching, updating, patching, and deleting custom resources. This allows for modeling any API using CRDs, extending Kubernetes capabilities to create powerful, custom resources aligned with application and infrastructure needs.
  • Effortless Scaling: Custom resources are stored and served by the same API machinery as built-in resources, so they grow with your cluster without separate infrastructure. The adoption of CRDs has led to a vibrant ecosystem within the Kubernetes community.

In short, CRDs are a powerful tool for customizing and extending the Kubernetes API. Their flexibility and comprehensive API support let you shape Kubernetes clusters to meet your own needs.

Defining CustomResourceDefinitions

Defining your own custom resources is what really unlocks the power of Kubernetes. Once you get the hang of it, you'll extend the Kubernetes API in no time.

How to define the structure of a CRD manifest?

To define a CRD, you'll need to create a YAML manifest. The most important fields are:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name:
spec:
  group:
  versions:
    - name:
      served: true
      storage: true
      schema:
        openAPIV3Schema:
  scope: Namespaced # or Cluster
  names:
    plural:
    singular:
    kind:

How to specify the schema and validation rules for custom resources?

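You'll also want to define a schema to validate your custom resources. In apiextensions.k8s.io/v1, each entry under spec.versions carries its own schema field containing an OpenAPI v3 validation schema (the top-level validation field belonged to the older v1beta1 API). For example:

schema:
  openAPIV3Schema:
    type: object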
    properties:
      apiVersion:
        type: string
      kind:
        type: string
      ...
      spec:
        type: object
        properties:
          size:
            type: integer
            minimum: 1

This ensures that whenever an object of this CRD sets spec.size, the value is an integer of at least 1. To make the field mandatory as well, list it under required in the spec schema.

Once you've defined and applied your CRD, you can create custom resources like any other Kubernetes object! Use kubectl apply -f and a YAML file with apiVersion, kind, and metadata fields.

How to leverage annotations and labels in CRDs?

Leverage annotations and labels to enhance and tailor your Custom Resource Definitions (CRDs) in Kubernetes. Annotations add metadata to provide additional information about the CRDs, while labels help in organizing and categorizing the CRDs.

Annotations can document:

  • Purpose
  • Owner
  • Version

This gives context to other developers working with the CRDs.

Labels help you organize CRDs based on properties that matter to you. For example, use labels to filter and sort CRDs by environment or application, making it easier to manage resources in large clusters.

So, use annotations to document and labels to categorize. Then, be consistent with your annotation and label names so everyone understands them. Use clear, descriptive values to make sure they actually provide value.

To add annotations or labels to your CRDs, simply include the respective fields in the YAML manifest alongside other important fields, such as:

  • apiVersion
  • kind
  • metadata

Below is an example demonstrating how to add annotations and labels to a CRD YAML file:

metadata:
  annotations:
    mycompany.com/description: "This CRD defines custom resources for managing XYZ"
    mycompany.com/owner: "Rahul"
  labels:
    environment: production
    app: frontend

In this example, we've added annotations to describe the CRD and specify the owner. Additionally, we've assigned labels to indicate that the resources belong to the production environment and are related to the front-end application.

By using annotations and labels effectively, you can enhance the manageability and organization of your CRDs, making it easier for teams to collaborate and maintain a clear understanding of the custom resources in your Kubernetes clusters.
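
As a rough illustration of how label-based filtering works, the Go sketch below implements equality-based matching, the same idea behind a command such as kubectl get crd -l environment=production. The matchesSelector helper and the sample CRD names are assumptions made for this example, not Kubernetes APIs.

```go
package main

import "fmt"

// labeledCRD pairs a CRD name with its labels (illustrative data only).
type labeledCRD struct {
	Name   string
	Labels map[string]string
}

// matchesSelector reports whether labels contains every key/value pair in
// selector. An empty selector matches everything.
func matchesSelector(labels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	crds := []labeledCRD{
		{Name: "widgets.example.com", Labels: map[string]string{"environment": "production", "app": "frontend"}},
		{Name: "gadgets.example.com", Labels: map[string]string{"environment": "staging", "app": "frontend"}},
	}
	selector := map[string]string{"environment": "production"}
	for _, crd := range crds {
		if matchesSelector(crd.Labels, selector) {
			fmt.Println(crd.Name)
		}
	}
}
```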

For more detailed information on each field in the CRD YAML file, refer to the official Kubernetes documentation.

Using CustomResourceDefinitions

Once you've created a CRD, you can start creating instances of your custom resource. This is where the fun begins! For example, let's say you have a CRD for a "Fruit" resource. Here's how you can create an instance of it:

apiVersion: "examples.com/v1"
kind: Fruit
metadata:
  name: apple
spec:
  color: red
  sweetness: 10

Save this to a file like apple.yaml and run:

kubectl apply -f apple.yaml

This will create your first Fruit resource! You can view it with kubectl get fruit:

NAME    AGE
apple   12s

Check out the full details with kubectl describe fruit apple:

Name: apple
Namespace: default
Labels:
Annotations:
API Version: examples.com/v1
Kind: Fruit
Metadata:
  Creation Timestamp: 2020-07-15T17:26:51Z
  Generation: 1
  Resource Version: 74502
  Self Link: /apis/examples.com/v1/namespaces/default/fruits/apple
  UID: 08b0e19c-4f2b-4337-9b57-1d558a7e9ed3
Spec:
  Color: red
  Sweetness: 10
Events:

As you can see, this gives you the full details of your custom resource, including the spec fields you defined.

To update the resource, simply modify the YAML file and apply the changes:

kubectl apply -f apple.yaml

Likewise, to delete it:

kubectl delete -f apple.yaml

This covers the basics of creating, viewing, updating, and deleting custom resources. By defining a CRD, you've extended the Kubernetes API and can now manage your own custom resources just like built-in ones!

If you encounter issues while working with CRDs, consider checking the Kubernetes troubleshooting guide for more information.

Validation and Management of Custom Resources

So, you've created your custom resource definition and deployed it in your Kubernetes cluster. Now what? Your CRD is only helpful if you can properly validate, manage, and secure the custom resources (CRs) that use it.

Specifying validation rules

When you define a CRD, you can specify validation rules for the custom resources that use it. For example, you might want to require certain fields, validate that values are in a given range, have a specific format, or match a regular expression. This example assumes a custom resource that uses a spec.replicaCount field.

To add validation to your CRD (using the apiextensions.k8s.io/v1 API, available since Kubernetes v1.16), you'll use the schema field of each version. Required fields are listed on the object that contains them, so to require the spec.replicaCount field and validate that its value is between 1 and 10, you'd use:

schema:
  openAPIV3Schema:
    type: object
    required: ["spec"]
    properties:
      spec:
        type: object
        required: ["replicaCount"]
        properties:
          replicaCount:
            type: integer
            minimum: 1
            maximum: 10

Now, if you try to create a CR without the replicaCount field or with a value outside the 1-10 range, the API server will reject it.
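
The Go sketch below mimics the check described above so you can see the two rules (required, and within 1 to 10) side by side. It is only an illustration: the real validation is performed by the API server from the schema, and the Spec type, validateSpec function, and error wording are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Spec mirrors the spec block from the schema. ReplicaCount is a pointer so
// that an omitted field (nil) can be told apart from an explicit value.
type Spec struct {
	ReplicaCount *int
}

// validateSpec is a hypothetical stand-in for the schema check: replicaCount
// is required and must lie between 1 and 10 inclusive.
func validateSpec(s Spec) error {
	if s.ReplicaCount == nil {
		return errors.New("spec.replicaCount is required")
	}
	if n := *s.ReplicaCount; n < 1 || n > 10 {
		return fmt.Errorf("spec.replicaCount must be between 1 and 10, got %d", n)
	}
	return nil
}

// intPtr returns a pointer to n, which keeps the examples below short.
func intPtr(n int) *int { return &n }

func main() {
	for _, s := range []Spec{{}, {ReplicaCount: intPtr(0)}, {ReplicaCount: intPtr(3)}} {
		fmt.Println(validateSpec(s))
	}
}
```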

Specifying default values

You can also specify default values for specific fields in your CRD. For example, you might want to set a default spec.replicaCount of 3. To do so, set the default field on the property itself, nested under the openAPIV3Schema (type: object declarations are omitted here for brevity):

openAPIV3Schema:
  properties:
    spec:
      properties:
        replicaCount:
          type: integer
          default: 3

Now, if a custom resource is created with a spec that omits replicaCount, the API server sets it to 3.
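
The defaulting behavior can be pictured with a small Go sketch: a value is filled in only when the field was omitted, and explicit values are left alone. The Spec type and applyDefaults function are hypothetical; in a real cluster the API server applies defaults from the schema, so you would not write this yourself.

```go
package main

import "fmt"

// Spec uses a pointer so that "not set" (nil) differs from any explicit value.
type Spec struct {
	ReplicaCount *int
}

// defaultReplicaCount matches the default of 3 used in the text.
const defaultReplicaCount = 3

// applyDefaults fills in replicaCount only when the field was omitted.
func applyDefaults(s *Spec) {
	if s.ReplicaCount == nil {
		n := defaultReplicaCount
		s.ReplicaCount = &n
	}
}

func main() {
	omitted := Spec{}
	applyDefaults(&omitted)
	fmt.Println("omitted field becomes:", *omitted.ReplicaCount)

	five := 5
	explicit := Spec{ReplicaCount: &five}
	applyDefaults(&explicit)
	fmt.Println("explicit field stays:", *explicit.ReplicaCount)
}
```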

Applying RBAC rules

Like with built-in Kubernetes resources, you'll want to apply role-based access control (RBAC) rules to your custom resources. You can create ClusterRoles and ClusterRoleBindings to grant permissions to create, read, update, and delete the CRs that use your CRD.

For example, to grant full access to Widget custom resources (assuming the CRD's API group is widgets.example.com), you'd create a ClusterRole like this:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: widget-admin
rules:
- apiGroups: ["widgets.example.com"]
  resources: ["widgets"]
  verbs: ["*"]

Then create a ClusterRoleBinding to bind that role to users, groups, or service accounts:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: widget-admins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: widget-admin
subjects:
- kind: User
  name: rahul@abc.com

Tools and frameworks for managing and deploying CRDs

Managing and deploying CRDs in Kubernetes can be made easier using various tools and frameworks. Here are some of the options:

  • Kustomize: Allows you to customize and manage your CRD manifests through simple configuration files. It simplifies deploying multiple resources with customizations declared in a centralized way. Kustomize is built into kubectl (kubectl apply -k).
  • Helm: A package manager for Kubernetes that lets you define, install, and upgrade applications, including their CRDs and custom resources, as packages called charts. It handles dependencies between charts. Note that Helm 3 installs CRDs placed in a chart's crds/ directory but does not upgrade or delete them, so plan CRD upgrades separately.
  • The Operator SDK: Helps you build Kubernetes operators - programs that manage custom resources. It provides tools and libraries to simplify operator development, deployment, and management. Operators automate tasks involving your custom resources.
  • Kubebuilder: A framework for building custom controllers for your CRDs. It generates the boilerplate code needed for your controllers and provides an API for creating, updating, and deleting custom resources. This simplifies building controllers for your own CRDs.
  • kubectl: The Kubernetes command line tool allows you to deploy and manage your CRDs using commands like create, get, apply, and delete. It provides a simple interface for working with your custom resources directly from the command line.

Building Custom Kubernetes Controllers with CRDs

So you've created a CRD and want to do something useful with it, like deploy an application. Kubernetes controllers are the key. Controllers watch for changes to resources and then act on them. They're how you'll build functionality around your new CRD.

For example, say you've created a CRD to represent CoffeeShops. You'll want a controller to actually deploy a CoffeeShop application when a CoffeeShop resource is created. Here's how you can build a controller to do that:

First, you'll need a deployment template for your application, like a Deployment and Service. Define that in a YAML file.

Next, build the controller. It will watch for CoffeeShop resources and deploy the template when one is created. Here's a basic controller in Go:

import (
    "context"
    "fmt"
    "time"

    // Import the necessary Kubernetes packages
    appsv1 "k8s.io/api/apps/v1"          // used by the reconciliation helpers later in this tutorial
    "k8s.io/apimachinery/pkg/api/errors" // used by the reconciliation helpers later in this tutorial
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    utilruntime "k8s.io/apimachinery/pkg/util/runtime"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/apimachinery/pkg/watch"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/client-go/util/workqueue"

    // Import the CRD API types and the clientset generated for them
    // Please replace both placeholder paths with the actual packages of your CRD
    samplev1 "your-api-group/your-api-version"
    sampleclientset "your-generated-clientset"

    // Import the necessary informer package
    "k8s.io/client-go/tools/cache"
)

// CoffeeShopController reconciles CoffeeShop objects
type CoffeeShopController struct {
    clientset    *kubernetes.Clientset     // client for built-in resources (Deployments, Services, ...)
    sampleClient sampleclientset.Interface // generated client for CoffeeShop resources
    queue        workqueue.RateLimitingInterface
    informer     cache.SharedIndexInformer
}

// NewCoffeeShopController creates a new instance of the CoffeeShopController
func NewCoffeeShopController(kubeconfigPath string) (*CoffeeShopController, error) {
    config, err := clientcmd.BuildConfigFromFlags("", kubeconfigPath)
    if err != nil {
        return nil, err
    }

    clientset, err := kubernetes.NewForConfig(config)
    if err != nil {
        return nil, err
    }

    // The built-in clientset knows nothing about CoffeeShops, so a clientset
    // generated for the CRD is needed as well
    sampleClient, err := sampleclientset.NewForConfig(config)
    if err != nil {
        return nil, err
    }

    // Create an informer to watch for changes to CoffeeShop resources
    // Replace `SampleV1` with the accessor for the API group/version of your CRD
    informer := cache.NewSharedIndexInformer(
        &cache.ListWatch{
            ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
                return sampleClient.SampleV1().CoffeeShops(metav1.NamespaceAll).List(context.TODO(), options)
            },
            WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
                return sampleClient.SampleV1().CoffeeShops(metav1.NamespaceAll).Watch(context.TODO(), options)
            },
        },
        &samplev1.CoffeeShop{},
        0, // resyncPeriod, set to 0 to disable periodic resync
        cache.Indexers{},
    )

    queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

    // The event handlers only enqueue the namespace/name key of the object;
    // the workers started in Run do the actual reconciliation
    informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            if key, err := cache.MetaNamespaceKeyFunc(obj); err == nil {
                queue.Add(key)
            }
        },
        UpdateFunc: func(oldObj, newObj interface{}) {
            if key, err := cache.MetaNamespaceKeyFunc(newObj); err == nil {
                queue.Add(key)
            }
        },
        DeleteFunc: func(obj interface{}) {
            if key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj); err == nil {
                queue.Add(key)
            }
        },
    })

    return &CoffeeShopController{
        clientset:    clientset,
        sampleClient: sampleClient,
        queue:        queue,
        informer:     informer,
    }, nil
}

// Run starts the CoffeeShop controller and blocks until stopCh is closed
func (r *CoffeeShopController) Run(stopCh <-chan struct{}) {
    defer utilruntime.HandleCrash()
    defer r.queue.ShutDown()

    // Start the informer and wait until the cache is synced
    go r.informer.Run(stopCh)
    if !cache.WaitForCacheSync(stopCh, r.informer.HasSynced) {
        utilruntime.HandleError(fmt.Errorf("failed to sync informer cache"))
        return
    }

    // Start the worker goroutines to process the items in the queue
    for i := 0; i < 2; i++ {
        go wait.Until(r.runWorker, time.Second, stopCh)
    }

    <-stopCh
}

// runWorker processes items from the queue
func (r *CoffeeShopController) runWorker() {
    for r.processNextItem() {
    }
}

// processNextItem retrieves the next item from the queue and reconciles the CoffeeShop resource
func (r *CoffeeShopController) processNextItem() bool {
    key, quit := r.queue.Get()
    if quit {
        return false
    }
    defer r.queue.Done(key)

    // Convert the key to a namespace/name object
    namespace, name, err := cache.SplitMetaNamespaceKey(key.(string))
    if err != nil {
        r.queue.Forget(key)
        utilruntime.HandleError(fmt.Errorf("unable to split key %s: %v", key, err))
        return true
    }

    // Retrieve the CoffeeShop resource from the informer's cache
    obj, exists, err := r.informer.GetIndexer().GetByKey(key.(string))
    if err != nil {
        r.queue.AddRateLimited(key)
        utilruntime.HandleError(fmt.Errorf("failed to get CoffeeShop %s/%s from cache: %v", namespace, name, err))
        return true
    }

    // If the CoffeeShop resource does not exist anymore, handle deletion
    if !exists {
        r.handleDeletedCoffeeShop(namespace, name)
        r.queue.Forget(key)
        return true
    }

    // Otherwise, reconcile the CoffeeShop resource
    r.handleCoffeeShop(obj.(*samplev1.CoffeeShop))
    r.queue.Forget(key)
    return true
}

// handleDeletedCoffeeShop handles the deletion of a CoffeeShop resource
func (r *CoffeeShopController) handleDeletedCoffeeShop(namespace, name string) {
    fmt.Printf("CoffeeShop deleted: %s/%s\n", namespace, name)
    // Handle the deletion of the CoffeeShop resource
}

// handleCoffeeShop handles the reconciliation of a CoffeeShop resource
func (r *CoffeeShopController) handleCoffeeShop(coffeeShop *samplev1.CoffeeShop) {
    fmt.Printf("Reconciling CoffeeShop: %s/%s\n", coffeeShop.Namespace, coffeeShop.Name)
    // Reconcile the CoffeeShop resource (e.g., update or validate its status)
    // Add your custom logic here
    if err := r.deployCoffeeShopApplication(coffeeShop); err != nil {
        utilruntime.HandleError(fmt.Errorf("failed to deploy CoffeeShop %s/%s: %v", coffeeShop.Namespace, coffeeShop.Name, err))
    }
}

// deployCoffeeShopApplication deploys the CoffeeShop application using a template
func (r *CoffeeShopController) deployCoffeeShopApplication(coffeeShop *samplev1.CoffeeShop) error {
    // Deploy the coffeeshop application using a template (e.g., Deployment, Service, etc.)
    // You can define the template in a separate YAML file and create it here using clientset.AppsV1().Deployments(namespace).Create()
    fmt.Printf("Deploying coffeeshop application for CoffeeShop: %s/%s\n", coffeeShop.Namespace, coffeeShop.Name)
    return nil
}

This controller will watch for new CoffeeShop resources and deploy your coffeeshop application when one is created. It will ensure the desired state (a deployment exists) matches the observed state. The controller achieves this by reconciling the CoffeeShop resource, which involves validating its specifications, creating or updating the corresponding deployment, and managing the application's lifecycle.

How to reconcile the desired state with the observed state?

To ensure your resources align with your desired configuration, you'll compare the current state of the resources in the cluster with the desired state outlined in your CoffeeShop resource.

// handleCoffeeShop handles the reconciliation of a CoffeeShop resource
func (r *CoffeeShopController) handleCoffeeShop(coffeeShop *samplev1.CoffeeShop) {
    fmt.Printf("Reconciling CoffeeShop: %s/%s\n", coffeeShop.Namespace, coffeeShop.Name)

    // Get the current deployment for the CoffeeShop resource, if it exists
    deployment, err := r.clientset.AppsV1().Deployments(coffeeShop.Namespace).Get(context.TODO(), coffeeShop.Name, metav1.GetOptions{})
    if err != nil {
        if errors.IsNotFound(err) {
            // Deployment does not exist, create it
            err = r.createDeployment(coffeeShop)
            if err != nil {
                // Handle the error
                return
            }
            return
        }
        // Handle the error
        return
    }

    // Compare the desired state (spec) with the observed state (current deployment)
    if !r.isDeploymentUpdated(coffeeShop, deployment) {
        // Desired state is already achieved, nothing to do
        return
    }

    // Update the deployment if necessary
    err = r.updateDeployment(coffeeShop, deployment)
    if err != nil {
        // Handle the error
        return
    }
}

// createDeployment creates a new deployment for the CoffeeShop resource
func (r *CoffeeShopController) createDeployment(coffeeShop *samplev1.CoffeeShop) error {
    // Create a new deployment using the desired state defined in the CoffeeShop resource
    // You can define the deployment template in a separate YAML file and create it here using clientset.AppsV1().Deployments(namespace).Create()
    fmt.Printf("Creating deployment for CoffeeShop: %s/%s\n", coffeeShop.Namespace, coffeeShop.Name)
    return nil
}

// updateDeployment updates the existing deployment for the CoffeeShop resource
func (r *CoffeeShopController) updateDeployment(coffeeShop *samplev1.CoffeeShop, deployment *appsv1.Deployment) error {
    // Update the existing deployment to match the desired state defined in the CoffeeShop resource
    // You can modify the deployment spec and use clientset.AppsV1().Deployments(namespace).Update() to apply the changes
    fmt.Printf("Updating deployment for CoffeeShop: %s/%s\n", coffeeShop.Namespace, coffeeShop.Name)
    return nil
}

// isDeploymentUpdated compares the desired state with the observed state of the deployment
func (r *CoffeeShopController) isDeploymentUpdated(coffeeShop *samplev1.CoffeeShop, deployment *appsv1.Deployment) bool {
    // Compare the desired state (spec) of the CoffeeShop resource with the observed state (current deployment)
    // You can check various fields of the deployment to determine if an update is needed
    // Return true if the deployment needs to be updated, false otherwise
    return false
}

Step 1: Check for Existing Deployment

Begin by determining if a deployment for the coffeeshop app already exists. If not, create one based on the CoffeeShop resource specifications.

Step 2: Compare Current vs. Desired State

Assess the current deployment against the CoffeeShop resource's specifications. This includes checking the number of app instances, the image version in use, and any associated labels or environment variables.

Step 3: Update if Necessary

If the current state matches the desired state, no action is needed.

If discrepancies exist, update the deployment to align with the CoffeeShop resource's specifications. This ensures any modifications, such as a new image version, are applied.

Step 4: Behind-the-Scenes Functions

The provided functions (createDeployment, updateDeployment, and isDeploymentUpdated) handle the detailed tasks, such as interfacing with the Kubernetes API.

By consistently reconciling the actual deployment with the desired state, this process guarantees the desired configuration is maintained, accommodating any changes over time.
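
The decision at the heart of steps 1 to 3 can be isolated as a pure function, which also makes it easy to unit test without a cluster. The sketch below is a simplified illustration: the DesiredState and ObservedState types, their Replicas and Image fields, and the decide function are assumptions made for this example rather than part of the controller code above.

```go
package main

import "fmt"

// DesiredState is what the CoffeeShop resource asks for (illustrative fields).
type DesiredState struct {
	Replicas int32
	Image    string
}

// ObservedState is what currently exists in the cluster.
type ObservedState struct {
	Exists   bool
	Replicas int32
	Image    string
}

// Action names the step the controller should take next.
type Action string

const (
	ActionCreate Action = "create"
	ActionUpdate Action = "update"
	ActionNone   Action = "none"
)

// decide compares desired and observed state and picks the next action.
func decide(desired DesiredState, observed ObservedState) Action {
	if !observed.Exists {
		return ActionCreate // Step 1: no deployment yet
	}
	if observed.Replicas != desired.Replicas || observed.Image != desired.Image {
		return ActionUpdate // Steps 2 and 3: drift detected
	}
	return ActionNone // desired state already achieved
}

func main() {
	desired := DesiredState{Replicas: 3, Image: "coffeeshop:v2"}
	fmt.Println(decide(desired, ObservedState{}))
	fmt.Println(decide(desired, ObservedState{Exists: true, Replicas: 3, Image: "coffeeshop:v1"}))
	fmt.Println(decide(desired, ObservedState{Exists: true, Replicas: 3, Image: "coffeeshop:v2"}))
}
```

Keeping the comparison free of API calls means the same function gives the same answer every time it runs, which is what makes repeated reconciliation safe.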

How to handle errors and retries in custom controllers?

When building custom controllers in Kubernetes, error handling and retries are essential to make your controller robust. Here are some tips:

  • Use a retry library: Kubernetes itself uses an exponential backoff retry strategy for many of its operations. You should do the same for your controller. There are good retry libraries available in most languages to handle this for you. Some of the most popular retry libraries are:

    In Python:
      • retrying: a library that allows you to add retry logic to your functions or methods.
      • tenacity: another library for retrying code until a condition is met using configurable decorators or a context manager.

    In JavaScript:
      • node-retry: a Node.js package that simplifies the process of retrying a function until it succeeds. It supports customizable retry strategies, including exponential backoff.
      • async-retry: a popular library that provides retries with exponential backoff for asynchronous functions.


  • Handle specific errors: Certain errors may require different actions. For example, if an API call returns a "not found" error, you may want to try creating the resource instead of retrying. Handle these specific error cases in your logic.
  • Set retry limits: Don't retry forever. Set an upper limit on the number of retries to avoid getting stuck in an infinite loop. After exhausting all retries, log an error and move on.
  • Add retry logic at the right level: You can retry at the inner API call level, or at a higher level after a full reconcile loop. Decide based on your specific use case.
  • Catch all errors: Have a catch-all handler for unexpected errors. Log the error and retry, or move to the next reconciled item, depending on your needs. This ensures no error brings down your entire controller.

Advanced Topics in CRDs

Once you have your CRD designed and implemented, it’s time to consider some of the more advanced topics around CRD usage. There are a few key areas you’ll want to keep in mind as your CRDs become a core part of your Kubernetes infrastructure.

Operator Frameworks

Operator frameworks, like the Operator Framework (home of the Operator SDK) and Kubebuilder, make building Kubernetes operators - and by extension custom resources and CRDs - significantly easier.

These frameworks handle much of the boilerplate involved in building operators and come with useful tools for generating CRDs, handling versioning, building UIs, and more. Using an operator framework is highly recommended if you plan to build multiple operators or want to simplify your development process.

CRD Versioning

As your CRDs see wider use, you'll need a strategy for versioning them to handle API changes and evolution. There are a few common versioning strategies:

  1. Switch to a new version outright (e.g. from mycrd.example.com/v1 to mycrd.example.com/v2) and stop serving the old one. This requires updating all clients to the new version at the same time.
  2. Use Kubernetes API versioning and add a new API version (e.g. mycrd.example.com/v1beta2) to the CRD's versions list while continuing to serve the old one. Clients can update at their own pace.
  3. Use a "sidecar" CRD and release a new CRD (e.g. mycrd2.example.com) alongside the original. Again, clients can update at their own pace.

The preferred strategy depends on how often you need to make breaking changes and how many clients need to be updated. In general, using API versioning and sidecar CRDs allows for more gradual upgrades and less disruption.

Cross-Namespace/Cross-Cluster Usage

If your CRDs will be used across namespaces or clusters, there are a few additional considerations:

  • RBAC: Ensure you have appropriate ClusterRoles and RoleBindings in place to grant access to the CRD across namespaces.
  • Finalizers: Use finalizers to handle the cleanup of resources on deletion. A finalizer blocks removal of the CR until your controller has finished cleaning up, including dependent resources that live in another namespace or cluster.
  • Webhooks: When using webhooks, be aware that the webhook server needs to be accessible from all namespaces/clusters using the CRD.
  • Monitoring: Set up monitoring and alerts to track CR usage across namespaces and clusters. Errors or issues could arise in any namespace, so monitoring needs to account for that.

CRD Security

Ensuring CRD security and proper access control is crucial for a safe and stable Kubernetes cluster. Here are a few key steps you can take:

  • Use RBAC to control access to your CRDs: Create specific roles and role bindings that grant only the necessary permissions for users and service accounts to access and modify your CRDs. This fine-grained access control helps limit the impact of compromised credentials.
  • Implement admission webhooks: Admission webhooks allow you to validate custom resources when they are created or updated. You can enforce rules and schemas to ensure only valid data is stored in your custom resources.
  • Secure etcd: Custom resources are stored in the cluster's etcd alongside built-in objects, so restrict access to etcd to prevent unauthorized reads of CR data at rest. Encrypting etcd data provides an additional layer of security.
  • Leverage TLS: Enable TLS for all API communication to encrypt data in transit and authenticate clients. Custom resources are served by the Kubernetes API server, so make sure it, and any webhook servers you run for your CRDs, require TLS.
  • Audit CRD access: Monitor and audit all CRD access through your audit logs. Review the logs periodically to detect any abnormal or unauthorized access patterns.

With these security practices, you can safely expose your CRDs through the Kubernetes API and enable users and applications to interact with your custom resources in a controlled and monitored manner.

Monitoring and metrics for CRDs: Why and how to set them up

Monitoring and metrics are essential for any application, and Custom Resource Definitions are no exception. Setting up proper monitoring and metrics for your CRDs can help you:

  • Detect issues early: By tracking metrics like request counts, error rates, and latency, you can spot performance problems or outages as they start to happen. This allows you to fix issues before users are impacted.
  • Track usage: Metrics on things like the total number of resources and API requests give you insight into how your CRDs are being used. This data can help guide optimization efforts.
  • Optimize performance: Performance metrics expose bottlenecks and excessive resource usage, pointing you towards areas for optimization.
  • Understand impact of changes: Comparing metrics before and after changes to your CRDs (like schema updates) shows how those changes affect performance and usage.

To instrument your CRDs, you'll need to:

  • Add metrics to your controller: Export metrics for things like reconcile count, latency, and error rate from your controller code. The Kubernetes API server, which serves your custom resources, already exposes its own request metrics.
  • Install a monitoring agent: Use a tool like Prometheus to scrape the metrics from your controller and the API server and store the data.
  • Set up alerts: Configure alerting rules in your monitoring system to notify you about issues based on your important metrics.
  • Expose metrics in the CRD spec: Consider including metrics in your CRD spec so users can monitor individual resources they create.

With the proper metrics and monitoring in place, you'll have the insight you need to ensure your Custom Resource Definitions are performing as expected and meeting the needs of your users.

Best Practices and Tips for CRDs

So you’ve designed your first CRD and are ready to deploy it. Congratulations! Now it’s time to think about best practices to ensure your CRD is robust, scalable, and easy to maintain. Here are some tips to keep in mind:

Design for Backward Compatibility

Think about versioning your CRD from the start. Listing each API version under spec.versions lets clients choose which version of the resource they want to use. This way, you can evolve your CRD without upgrading all clients at once.

For example, you might have:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  group: stable.example.com
  # scope, names, and each version's schema are omitted here for brevity
  versions:
  - name: v1
    served: true
    storage: true
  - name: v2
    served: true
    storage: false

Here, both v1 and v2 are served (the API serves them), but only v1 is the storage version (objects of this version are persisted to storage).
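
Exactly one version must be marked as the storage version, and it is easy to break that rule when adding versions. The Go sketch below shows a pre-flight check you might run over your own manifests before applying them; the Version type and validateVersions function are written for this example and only cover the storage rule.

```go
package main

import "fmt"

// Version mirrors the name, served, and storage fields of a spec.versions entry.
type Version struct {
	Name    string
	Served  bool
	Storage bool
}

// validateVersions checks that exactly one version is the storage version.
func validateVersions(versions []Version) error {
	storage := 0
	for _, v := range versions {
		if v.Storage {
			storage++
		}
	}
	if storage != 1 {
		return fmt.Errorf("expected exactly one storage version, found %d", storage)
	}
	return nil
}

func main() {
	versions := []Version{
		{Name: "v1", Served: true, Storage: true},
		{Name: "v2", Served: true, Storage: false},
	}
	fmt.Println(validateVersions(versions))
}
```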

Write Resilient Controllers

Your CRD is only as good as the controller managing it. Some tips for writing resilient controllers:

  • Handle CRD errors gracefully. Wrap CRD calls in retry.RetryOnConflict to handle update conflicts.
  • Add validation to your CRD and handle invalid objects. Return a 422 Unprocessable Entity response for invalid requests.
  • Add monitoring and alerting. Monitor your controller pods and CRD API calls. Alert if there are errors or latency.
  • Add logging. Log CRD events, errors, and latency to aid debugging.
  • Handle CRD deletion. Have a plan for handling CRD deletion to avoid orphaned resources.

Test, Test, Test

Thoroughly testing your CRD and controller is critical. Some testing strategies:

  • Write unit tests for your controller logic. Test edge cases and error paths.
  • Create e2e tests using a tool like Kubebuilder to spin up a test cluster and exercise your CRD API.
  • Enable the CRD Validation webhook and write tests to ensure invalid CRs are rejected.
  • Perform load and chaos testing to uncover scaling issues before deploying to production.

Summary

So there you have it, a comprehensive guide to extending the Kubernetes API with CRDs. You now have the knowledge and tools to create powerful custom resources tailored to your needs. Whether you want to represent a new object in your application or integrate it with an external service, CRDs provide a straightforward way to enhance Kubernetes for your use case.

With all this newfound knowledge, what will you build? The possibilities are endless.

Now go forth and extend that API!

Additional resources

If you want to know more about Custom Resource Definitions and extending the Kubernetes API, take a look at these resources: