Part 5

Custom Resource Definitions

Custom Resource Definitions (CRDs) are a way to extend Kubernetes with our own Resources. We've used a large number of them already, e.g. in part 4 we used Application with ArgoCD.

They're so integral part of using Kubernetes that it's a good idea to learn how to make one ourselves.

Before we can get started we need to figure out what we want to create. So let's create a resource that can be used to create countdowns. The resource will be called "Countdown". It will have some length and some delay between executions. The execution - what happens each time the delay has elapsed - is left up to an image. So that someone using our CRD can create a countdown that e.g. posts a message to Twitter each time has ticked down.

As a template I'll use one provided by the docs.

resourcedefinition.yaml

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: countdowns.stable.dwk
spec:
  # group name to use for REST API: /apis/<group>/<version>
  group: stable.dwk
  # either Namespaced or Cluster
  scope: Namespaced
  names:
    # kind is normally the CamelCased singular type. Your resource manifests use this.
    kind: Countdown
    # plural name to be used in the URL: /apis/<group>/<version>/<plural>
    plural: countdowns
    # singular name to be used as an alias on the CLI and for display
    singular: countdown
    # shortNames allow shorter string to match your resource on the CLI
    shortNames:
    - cd
  # list of versions supported by this CustomResourceDefinition
  versions:
    - name: v1
      # Each version can be enabled/disabled by Served flag.
      served: true
      # One and only one version must be marked as the storage version.
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                length:
                  type: integer
                delay:
                  type: integer
                image:
                  type: string
      additionalPrinterColumns:
        - name: Length
          type: integer
          description: The length of the countdown
          jsonPath: .spec.length
        - name: Delay
          type: integer
          description: The length of time (ms) between executions
          jsonPath: .spec.delay

Now we can create our own Countdown:

countdown.yaml

apiVersion: stable.dwk/v1
kind: Countdown
metadata:
  name: doomsday
spec:
  length: 20
  delay: 1200
  image: jakousa/dwk-app10:sha-84d581d

And then:

$ kubectl apply -f countdown.yaml
  countdown.stable.dwk/doomsday created

$ kubectl get cd
  NAME        LENGTH   DELAY
  doomsday    20       1200

Now we have a new resource. Next, let's create a new custom controller that'll start a pod that runs a container from the image and makes sure countdowns are destroyed. This will require some coding.

For the implementation, I decided to use a Job, a resource familiar to us from part 2. Pods created by Jobs are intended to run once until completion. However, neither the completed Jobs nor the Pods are removed automatically. Those are preserved so that the execution logs can be reviewed after job execution.

Our controller has to do 3 things:

  • Create a Job from a Countdown
  • Reschedule Jobs until the number of executions defined in Countdown (the length) has been completed.
  • Clean all Jobs and Pods after the execution

To implement the controller we need to do some low-level stuff and access the Kuberneter directly using the REST APIs.

By listening to the Kubernetes API at /apis/stable.dwk/v1/countdowns?watch=true we will receive an ADDED for every Countdown object in the cluster. Then creating a job can be done by parsing the data from the message and POSTing a valid payload to /apis/batch/v1/namespaces/<namespace>/jobs.

For jobs, we'll listen to /apis/batch/v1/jobs?watch=true and wait for MODIFIED event where the success state is set to true and update the labels for the jobs to store the status. To delete a Job and its Pod we can send a delete request to /api/v1/namespaces/<namespace>/pods/<pod_name> and /apis/batch/v1/namespaces/<namespace>/jobs/<job_name>

And finally, we can remove the countdown with a delete request to /apis/stable.dwk/v1/namespaces/<namespace>/countdowns/<countdown_name>.

A version of this controller can be found here. It has a readily built image jakousa/dwk-app10-controller:sha-4256579. We cannot just deploy it as it won't have access to the APIs. For this, we will need to define suitable access.

RBAC

RBAC (Role-based access control) is an authorization method that allows us to define access for individual users, service accounts or groups by giving them roles. For our use case, we will define a ServiceAccount resource.

serviceaccount.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: countdown-controller-account

and then specify the serviceAccountName for the deployment

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: countdown-controller-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: countdown-controller
  template:
    metadata:
      labels:
        app: countdown-controller
    spec:
      serviceAccountName: countdown-controller-account
      containers:
        - name: countdown-controller
          image: jakousa/dwk-app10-controller:sha-4256579

Next is defining the role and its rules. There are two types of roles: ClusterRole and Role. Roles are namespace-specific whereas ClusterRoles can access all of the namespaces - in our case, the controller will access all countdowns in all namespaces so a ClusterRole will be required.

The rules are defined with the apiGroup, resource and verbs. For example, the jobs was /apis/batch/v1/jobs?watch=true so it's in the apiGroup "batch" and resource "jobs" and the verbs see documentation. Core API group is an empty string "" like in the case of pods.

clusterrole.yaml

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: countdown-controller-role
rules:
- apiGroups: [""]
  # at the HTTP level, the name of the resource for accessing Pod
  # objects is "pods"
  resources: ["pods"]
  verbs: ["get", "list", "delete"]
- apiGroups: ["batch"]
  # at the HTTP level, the name of the resource for accessing Job
  # objects is "jobs"
  resources: ["jobs"]
  verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: ["stable.dwk"]
  resources: ["countdowns"]
  verbs: ["get", "list", "watch", "create", "delete"]

And finally bind the ServiceAccount and the role. There are two types of bindings as well: ClusterRoleBinding and RoleBinding. If we used a RoleBinding with a ClusterRole we would be able to restrict access to a single namespace. For example, if permission to access secrets is defined to a ClusterRole and we gave it via RoleBinding to a namespace called "test", they would only be able to access secrets in the namespace "test" - even though the role is a "ClusterRole".

In our case ClusterRoleBinding is required since we want the controller to access all of the namespaces from the namespace it's deployed in, in this case, namespace "default".

clusterrolebinding.yaml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: countdown-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: countdown-controller-role
subjects:
- kind: ServiceAccount
  name: countdown-controller-account
  namespace: default

After deploying all of that we can check logs after applying a countdown. (You may have to delete the pod to have it restart in case it didn't have access and it got stuck)

$ kubectl logs countdown-controller-dep-7ff598ffbf-q2rp5
  > app10@1.0.0 start /usr/src/app
  > node index.js

  Scheduling new job number 20 for countdown doomsday to namespace default
  Scheduling new job number 19 for countdown doomsday to namespace default
  ...
  Countdown ended. Removing countdown.
  Doing cleanup
You have reached the end of this section! Continue to the next section: