Using Kubernetes Persistent Volume to store persistent data

In one of our projects we are running a Rails application on a Kubernetes cluster. Kubernetes is a proven tool for managing and deploying Docker containers in production.

In Kubernetes, containers run inside pods, and pods are managed using deployments. A deployment holds the specification of its pods and is responsible for running them with the specified resources. When a pod is restarted or a deployment is deleted, the data on the pod is lost. We need to retain data beyond the pod's lifecycle, so that it survives when the pod or deployment is destroyed.

We use docker-compose in development. In docker-compose, linking a host directory to a container directory works out of the box. We wanted a similar mechanism for linking volumes in Kubernetes. Kubernetes offers various types of volumes. We chose a persistent volume backed by AWS EBS storage, and we used a persistent volume claim sized to the needs of the application.

As per the definition of a Persistent Volume (PV), cluster administrators must first create the storage in order for Kubernetes to mount it.

Our Kubernetes cluster is hosted on AWS. We created AWS EBS volumes, which can be used to back persistent volumes.

Let's create a sample volume using the AWS CLI and try to use it in a deployment.

aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2

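This creates a 20 GiB volume in the us-east-1a availability zone. An EBS volume can only be attached to instances in the same availability zone, so pick the zone where the cluster's nodes run. We need to note the VolumeId once the volume is created.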

$ aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp2
{
    "AvailabilityZone": "us-east-1a",
    "Encrypted": false,
    "VolumeType": "gp2",
    "VolumeId": "vol-123456we7890ilk12",
    "State": "creating",
    "Iops": 100,
    "SnapshotId": "",
    "CreateTime": "2017-01-04T03:53:00.298Z",
    "Size": 20
}
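Instead of copying the VolumeId out of the JSON by hand, the AWS CLI can print just that field. The sketch below is one way to do it, not the setup we actually used: `create_ebs_volume` and `wait_for_volume` are hypothetical helper names, while `--query`, `--output text` and `aws ec2 wait volume-available` are standard AWS CLI features.

```shell
# Hypothetical helper: create a gp2 volume and print only its VolumeId.
# $1 = availability zone, $2 = size in GiB.
create_ebs_volume() {
  aws ec2 create-volume \
    --availability-zone "$1" \
    --size "$2" \
    --volume-type gp2 \
    --query VolumeId \
    --output text
}

# Hypothetical helper: block until the volume has left the "creating" state.
# $1 = volume ID.
wait_for_volume() {
  aws ec2 wait volume-available --volume-ids "$1"
}
```

Calling `create_ebs_volume us-east-1a 20` would then print the new volume ID on a single line, which can be stored in a shell variable and passed to `wait_for_volume`.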

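Now let's create a persistent volume template, test-pv.yml, that creates a persistent volume backed by this EBS storage. Replace <your-volume-id> with the VolumeId noted earlier. Note that an EBS volume can be attached to only one node at a time, so ReadWriteOnce is the access mode EBS actually supports; the accessModes value in the template is used to match claims to volumes and does not by itself make the volume usable from several nodes.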

kind: PersistentVolume
apiVersion: v1
metadata:
  name: test-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    volumeID: <your-volume-id>
    fsType: ext4
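The <your-volume-id> placeholder can also be filled in from the command line rather than by editing the file. This is a minimal sketch that assumes the template is saved with the placeholder text exactly as written; `fill_volume_id` is a hypothetical helper name.

```shell
# Hypothetical helper: print a template with <your-volume-id> replaced.
# $1 = EBS volume ID, $2 = path to the template file.
# The template file itself is left unchanged.
fill_volume_id() {
  sed "s|<your-volume-id>|$1|" "$2"
}
```

For example, `fill_volume_id vol-123456we7890ilk12 test-pv.yml | kubectl create -f -` would send the filled-in manifest straight to kubectl, which reads from standard input when the file name is `-`.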

Once we had the template for the persistent volume, we used kubectl to create it. kubectl is the command line tool for interacting with a Kubernetes cluster.

$ kubectl create -f test-pv.yml
persistentvolume "test-pv" created

Once the persistent volume is created, you can check it using the following command.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM               REASON    AGE
test-pv    10Gi       RWX           Retain          Available                                 7s

Now that our persistent volume is in the Available state, we can claim it by creating a persistent volume claim.

We can define the persistent volume claim using the following template, test-pvc.yml.

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

Let's create the persistent volume claim using the above template.

$ kubectl create -f test-pvc.yml
persistentvolumeclaim "test-pvc" created

After creating the persistent volume claim, our persistent volume changes from the Available state to the Bound state.

$ kubectl get pv
NAME       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS     CLAIM               REASON    AGE
test-pv    10Gi       RWX           Retain          Bound      default/test-pvc              2m

$ kubectl get pvc
NAME        STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
test-pvc    Bound     test-pv   10Gi       RWX           1m
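In a script, the same check can be made without reading the table by eye. The sketch below polls the volume's phase through kubectl's jsonpath output; `pv_phase` and `wait_for_bound` are hypothetical helper names, and the 30 attempts with a 2 second pause are arbitrary values chosen for illustration.

```shell
# Hypothetical helper: print the phase of a persistent volume
# (for example Available or Bound). $1 = volume name.
pv_phase() {
  kubectl get pv "$1" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: poll until the volume reports Bound,
# giving up after roughly 60 seconds. $1 = volume name.
wait_for_bound() {
  attempts=0
  while [ "$attempts" -lt 30 ]; do
    if [ "$(pv_phase "$1")" = "Bound" ]; then
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 2
  done
  echo "volume $1 was not bound in time" >&2
  return 1
}
```

Running `wait_for_bound test-pv` before creating the deployment would stop a script early if the claim never binds.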

Now that the persistent volume claim is available on our Kubernetes cluster, let's use it in a deployment.

Deploying Kubernetes application

We will use the following deployment template, test-pv-deployment.yml.

# extensions/v1beta1 was removed in Kubernetes 1.16; newer clusters use apps/v1, which also requires spec.selector.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test-pv
  labels:
    app: test-pv
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: test-pv
        tier: frontend
    spec:
      containers:
        - image: <your-repo>/<your-image-name>:latest
          name: test-pv
          imagePullPolicy: Always
          env:
            - name: APP_ENV
              value: staging
            - name: UNICORN_WORKER_PROCESSES
              value: "2"
          volumeMounts:
            - name: test-volume
              mountPath: "/<path-to-my-app>/shared/data"
          ports:
            - containerPort: 80
      imagePullSecrets:
        - name: registrypullsecret
      volumes:
        - name: test-volume
          persistentVolumeClaim:
            claimName: test-pvc

Now launch the deployment using the following command.

$ kubectl create -f test-pv-deployment.yml
deployment "test-pv" created

Once the deployment is up and running, everything written to the shared directory is stored on the persistent volume. If the pod or the deployment crashes for any reason, the data is retained on the persistent volume, and we can reuse the same claim when we launch the application deployment again.
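One way to check this is to write a marker file into the mounted directory, delete the pod, and read the file back from the replacement pod. The helpers below are a sketch under assumptions: `current_pod`, `write_marker`, `read_marker` and the file name marker.txt are hypothetical, the container image is assumed to provide `sh` and `cat`, and the mount path must be whatever was set as mountPath in the deployment.

```shell
# Hypothetical helper: name of the first pod with the deployment's
# app=test-pv label.
current_pod() {
  kubectl get pods -l app=test-pv -o jsonpath='{.items[0].metadata.name}'
}

# Hypothetical helper: write a marker file into the mounted directory.
# $1 = pod name, $2 = mount path inside the container.
write_marker() {
  kubectl exec "$1" -- sh -c "echo persisted > '$2/marker.txt'"
}

# Hypothetical helper: read the marker file back.
# $1 = pod name, $2 = mount path inside the container.
read_marker() {
  kubectl exec "$1" -- cat "$2/marker.txt"
}
```

A possible sequence is to call `write_marker` on the pod returned by `current_pod`, remove that pod with `kubectl delete pod`, wait for the replacement with `kubectl rollout status deployment/test-pv`, and then call `read_marker` on the new pod. If the volume is mounted correctly, the file should still be there.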

This met our goal of retaining data across deployments and pod restarts.
