Intro

In Kubernetes, managing storage is a distinct problem from managing compute. Pods are ephemeral: they come and go often, so the on-disk files in a container are ephemeral too. What happens when you have data that must outlive the pod itself? Kubernetes exposes the Persistent Volume and Persistent Volume Claim APIs to solve this problem.

In this article, I am going to tell you the story of how Kubernetes lost data for one of our enterprise clients, how we dealt with it, and what we all learned from the failure so that you don't have to repeat the same mistakes.

The Setup

So we’ve seen that you can persist data in Kubernetes using Persistent Volumes and Persistent Volume Claims. The idea is to move data out of the pod and into what we call volumes, so that it can exist independently of any pod. Even though we call it a volume, what it really gives you is state.

With Kubernetes, a cluster admin creates a pool of Persistent Volumes. A Persistent Volume is the base abstraction for a piece of storage: it has a size and it is backed by something like NFS, AWS Elastic Block Store, Azure Managed Disks, Google Persistent Disk, etc.

A developer who needs storage submits a Persistent Volume Claim, which is a request for a certain amount of storage, and then references that claim in their pod. Kubernetes then matches the claim to a suitable volume.

Here is the manifest for a Persistent Volume Claim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: slow
  selector:
    matchLabels:
      release: "stable"
    matchExpressions:
      - { key: environment, operator: In, values: [dev] }
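A pod then references the claim by name. Here is a minimal sketch of such a pod (the pod name, container image, and mount path are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data   # where the volume shows up inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: myclaim   # the claim defined above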

Here is the manifest for an NFS-backed Persistent Volume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  mountOptions:
    - hard
    - nfsvers=4.2
  nfs:
    path: /tmp
    server: 178.22.0.7
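Once both objects exist in the cluster, you can check that the claim has bound to the volume; both should show a Bound status:

$ kubectl get pv,pvc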

The Challenge

The first thing users usually learn the hard way is that they should not mess with Persistent Volumes and Persistent Volume Claims.

In this case, the user tried to migrate some data from their testing cluster to the production cluster. They literally took the YAML manifests of the PV and PVC and restored them on the new cluster.

On the testing cluster, the user exported the PV and PVC objects:

$ kubectl get pv -o yaml > pvs.yaml
$ kubectl get pvc -o yaml > pvcs.yaml

Then, on the production cluster, they restored them:

$ kubectl apply -f pvs.yaml
$ kubectl apply -f pvcs.yaml

The Event

What followed surprised them: Kubernetes deleted all their data on the storage backend, and it also deleted the PV object.

The Root Cause

Why did this happen?


They had a PVC bound to a PV that pointed to some storage on the storage backend. Then they took the PV object and restored it in the new cluster.

The restored PV was still marked as bound, but it was bound to a PVC that did not exist at that point, since they hadn't restored the claim yet.

When Kubernetes finds a volume bound to a claim that does not exist, it executes the volume's reclaim policy.

In this case, the reclaim policy was Delete, and Kubernetes deleted the data because the situation looked exactly as if there had been a Persistent Volume Claim and somebody had deleted it.

From the API server's point of view it is exactly the same situation: Kubernetes has no way to distinguish a half-finished migration from a user deleting a PVC.
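The key detail is the claimRef that Kubernetes writes into a bound PV's spec. The exported YAML carries it along, so the restored PV still points at a PVC UID that only ever existed in the old cluster. A sketch of what that looks like (the UID is illustrative):

spec:
  ...
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: myclaim
    namespace: default
    uid: 5c7d3f2a-1b4e-4a9e-9f0c-0d1e2f3a4b5c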

The Fix

How do you fix this data loss pattern?

Well, you can't fix it!

This is how Kubernetes works. If there is a Persistent Volume bound to a Persistent Volume Claim that does not exist, Kubernetes executes the reclaim policy.

So what you should do as a user to avoid this error when doing backups is simple: do not play with your Persistent Volumes. Instead, use a dedicated migration tool such as Velero (formerly Heptio Ark, now maintained by VMware), which can migrate objects between clusters.
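As a rough sketch, a Velero-based migration looks like this, assuming Velero is installed on both clusters and pointed at a shared backup storage location (the backup name and namespace are illustrative):

On the testing cluster:

$ velero backup create app-backup --include-namespaces myapp

On the production cluster:

$ velero restore create --from-backup app-backup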

But just in case you do want to play with Persistent Volumes, please use the Retain reclaim policy, so that Kubernetes will not delete the data and you can restore it later. The reclaim policy tells the cluster what to do with a volume after it has been released of its claim. Volumes can either be retained, recycled, or deleted.
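For example, you can switch an existing volume to Retain before doing anything risky with it (pv001 is the volume from the example above):

$ kubectl patch pv pv001 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'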

Conclusion

What we learned is that you should not play with Persistent Volumes unless you know exactly what you are doing, or you risk losing all your data. If you must do it, use the Retain reclaim policy. We also recommend using Kalc to auto-discover issues in your cluster. Kalc calculates the probability of failure in your Kubernetes cluster and provides intelligent reports. You can then use this information to resolve any offending configuration.