Part 1

Introduction to Storage

Two things are known to be especially difficult with Kubernetes. The first is networking. Thankfully, we can avoid most of the networking difficulties unless we set up our own cluster. If you're interested, you can watch the webinar "Kubernetes and Networks: Why is This So Dang Hard?", but we'll skip most of the topics discussed in the video. The other difficult thing is storage.

In Part 1 we will look into a very basic method of using storage and return to this topic later. While almost everything else in Kubernetes is dynamic, moving between nodes and replicating with ease, storage does not have the same flexibility. "Why Is Storage On Kubernetes So Hard?" gives us a wide-angle view of the difficulties and the different options we have to overcome them.

There are multiple types of volumes and we'll get started with two of them.

Simple Volume

Where volumes in Docker and docker-compose essentially mean persistent storage, here that is not the case. emptyDir volumes are shared filesystems inside a pod, which means that their lifecycle is tied to the pod: when the pod is destroyed, the data is lost. In addition, moving the pod to another node will destroy the contents of the volume, as the space is reserved on the node the pod is running on. Even with these limitations, an emptyDir volume can be used as a cache, since it persists between container restarts, or to share files between two containers in a pod.

Before we can get started with this, we need an application that shares data with another application. In this case, it will work as a method to share simple log files with each other. We'll need to develop the apps:

App 1 will check if /usr/src/app/files/image.jpg exists and if not download a random image and save it as image.jpg. Any HTTP request will trigger a new image generation.

App 2 will check for /usr/src/app/files/image.jpg and show it if it is available.
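The file-sharing contract between the two apps can be sketched as follows. This is a simplified, hypothetical illustration (the function names and the injected `fetch` callable are not from the actual apps); the only thing it shares with the real code is the path `/usr/src/app/files/image.jpg`, which matches the mountPath used in the deployment below.

```python
import os

def ensure_image(path, fetch):
    """App 1's logic: download an image only if one is not already present.

    `fetch` stands in for the actual HTTP download of a random image,
    injected here so the logic can be shown without a network dependency.
    """
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(fetch())
    return path

def read_image(path):
    """App 2's logic: return the image bytes if available, None otherwise."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    return None
```

Both functions operate on the same path; inside the pod this works only because both containers mount the same volume at /usr/src/app/files.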

They share a deployment so that both of them are inside the same pod. My version is available here. The example includes an Ingress and a Service to access the application.

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: images-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: images
  template:
    metadata:
      labels:
        app: images
    spec:
      volumes: # Define volume
        - name: shared-image
          emptyDir: {}
      containers:
        - name: image-finder
          image: jakousa/dwk-app3-image-finder:e11a700350aede132b62d3b5fd63c05d6b976394
          volumeMounts: # Mount volume
          - name: shared-image
            mountPath: /usr/src/app/files
        - name: image-response
          image: jakousa/dwk-app3-image-response:e11a700350aede132b62d3b5fd63c05d6b976394
          volumeMounts: # Mount volume
          - name: shared-image
            mountPath: /usr/src/app/files

As the display is dependent on the volume, we can confirm that it works by accessing image-response and getting the image. The provided Ingress uses the previously opened port 8081: http://localhost:8081

Note that all data is lost when the pod goes down.

Persistent Volumes

This type of storage is what you probably had in mind when we started talking about volumes. Unfortunately, we're quite limited in our options here; we will return to PersistentVolumes briefly in Part 2 and again in Part 3 with GKE.

The reason for the difficulty is that you should not store data with the application or create a dependency on the filesystem from the application. Kubernetes supports cloud-provider storage very well, and you can also run your own storage system. During this course we are not going to run our own storage system, as that would be a huge undertaking; most likely, "in real life", you are going to use something hosted by a cloud provider. This topic could fill a part of its own, but let's scratch the surface and try something you can use to run a cluster at home.

A local volume is a PersistentVolume that binds a path from the node to use as a storage. This ties the volume to the node.

For the PersistentVolume to work you first need to create the local path on the node we are binding it to. Since our k3d cluster runs via Docker, let's create a directory at /tmp/kube in the k3d-k3s-default-agent-0 container:

$ docker exec k3d-k3s-default-agent-0 mkdir -p /tmp/kube

persistentvolume.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi # Could be e.g. 500Gi. A small amount is used to preserve space when testing locally
  volumeMode: Filesystem # This declares that it will be mounted into pods as a directory
  accessModes:
  - ReadWriteOnce
  local:
    path: /tmp/kube
  nodeAffinity: # This is only required for local volumes; it defines which nodes can access the volume
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - k3d-k3s-default-agent-0

As this volume is bound to that node, avoid using local volumes like this in production.

The local type we're using now cannot be dynamically provisioned. A new PersistentVolume needs to be defined only rarely, for example when a new physical disk is added to your personal cluster. After that, a PersistentVolumeClaim is used to claim a part of the storage for an application. If we create multiple PersistentVolumeClaims, only one can bind to the PersistentVolume; the rest will stay in a Pending state, waiting for a suitable PersistentVolume.
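As a toy illustration (this is not the real controller logic, just a simplified model of the behavior described above), a claim binds to a volume whose storage class matches and whose capacity is large enough, and each volume can satisfy only one claim:

```python
# Toy model of PersistentVolumeClaim binding: each claim either binds to a
# matching free volume or stays "Pending". Not the real Kubernetes controller.
def bind_claims(volumes, claims):
    """volumes: name -> (storage_class, capacity_gi)
    claims: name -> (storage_class, request_gi)
    Returns: claim name -> bound volume name or "Pending"."""
    free = dict(volumes)
    result = {}
    for name, (storage_class, request_gi) in claims.items():
        match = next(
            (v for v, (cls, cap) in free.items()
             if cls == storage_class and cap >= request_gi),
            None,
        )
        if match is not None:
            del free[match]  # a volume can be bound to only one claim
            result[name] = match
        else:
            result[name] = "Pending"
    return result
```

With a single 1Gi "manual" volume, a second identical claim has nothing left to bind to and stays Pending, which mirrors what you would see with `kubectl get pvc`.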

persistentvolumeclaim.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Modify the previously introduced deployment to use it:

deployment.yaml

...
    spec:
      volumes:
        - name: shared-image
          persistentVolumeClaim:
            claimName: image-claim
      containers:
        - name: image-finder
          image: jakousa/dwk-app3-image-finder:e11a700350aede132b62d3b5fd63c05d6b976394
          volumeMounts:
          - name: shared-image
            mountPath: /usr/src/app/files
        - name: image-response
          image: jakousa/dwk-app3-image-response:e11a700350aede132b62d3b5fd63c05d6b976394
          volumeMounts:
          - name: shared-image
            mountPath: /usr/src/app/files

And apply it

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-hy/material-example/master/app3/manifests/deployment-persistent.yaml

With the previous Service and Ingress we can access the app at http://localhost:8081. To confirm that the data is persistent we can run:

$ kubectl delete -f https://raw.githubusercontent.com/kubernetes-hy/material-example/master/app3/manifests/deployment-persistent.yaml
  deployment.apps "images-dep" deleted
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-hy/material-example/master/app3/manifests/deployment-persistent.yaml
  deployment.apps/images-dep created

And the same file is available again.

If you are interested in learning more about running your own storage, you can check out the material linked here.


You have reached the end of this section! Continue to the next section.