Intro

Many companies are pursuing digital transformation by optimizing application development for the cloud. Kubernetes patterns are widely used for building cloud-native apps with Kubernetes as the runtime platform. Kubernetes eliminates most of the manual processes involved in deploying and scaling containerized applications: it automates scheduling containers across a cluster, scaling them, and managing their health, using the same toolset on-premises and in the cloud.

The Setup

To deploy an application to a Kubernetes cluster, you must first build an image and push it to a private or public registry; only then can you reference it in a Kubernetes Pod. Kubernetes has native support for the following registries:

  • Amazon Elastic Container Registry (ECR)
  • Azure Container Registry (ACR)
  • Google Container Registry (GCR)
  • IBM Cloud Container Registry
  • Oracle Cloud Infrastructure Registry (OCIR)

Here is a Pod spec that pulls a private image and therefore needs access to your Docker credentials, supplied through imagePullSecrets:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-image
spec:
  containers:
  - name: nginx
    image: <your-private-image>
    imagePullPolicy: Always
  imagePullSecrets:
  - name: <secret-name>
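
The imagePullSecrets entry above refers to a Secret that has to exist in the same namespace as the Pod. A minimal sketch of creating one with kubectl create secret docker-registry follows. The function name create_pull_secret and the REGISTRY_PASSWORD environment variable are assumptions made for illustration; they are not part of the original setup.

```shell
# Hypothetical helper around `kubectl create secret docker-registry`.
# The password is read from REGISTRY_PASSWORD (an assumed variable name)
# so that it is not typed directly on the command line.
create_pull_secret() {
  secret_name="$1"   # must match the name listed under imagePullSecrets
  server="$2"        # host name of your private registry
  username="$3"

  if [ -z "${REGISTRY_PASSWORD:-}" ]; then
    echo "REGISTRY_PASSWORD is not set" >&2
    return 1
  fi

  kubectl create secret docker-registry "$secret_name" \
    --docker-server="$server" \
    --docker-username="$username" \
    --docker-password="$REGISTRY_PASSWORD"
}
```

Calling create_pull_secret <secret-name> <registry-host> <user> with REGISTRY_PASSWORD exported creates the Secret in the current namespace; the Pod spec can then reference it by name.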

The Challenge

Much of the complexity of Kubernetes comes from troubleshooting: working out what went wrong means gathering information at several levels. At some point, you'll run into an issue where Kubernetes fails to pull a container image. The reason for this error could be that:

  • Kubernetes doesn’t have permissions to pull that image
  • The image doesn’t exist
  • The image tag is incorrect
  • There are network connectivity issues
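
These causes tend to surface as different messages in the Events section of kubectl describe pod <pod-name>. As a rough sketch, the function below maps such a message to one of the causes in the list. The name classify_pull_error is hypothetical, and the matched substrings are assumptions based on messages commonly returned by registries and container runtimes; the exact wording varies, so treat the patterns as a starting point rather than a complete list.

```shell
# Hypothetical helper: guess the cause of a failed image pull from its message.
classify_pull_error() {
  # Lower-case the message so that matching is case-insensitive.
  msg="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$msg" in
    # Registries often report a missing repository as "access denied" too,
    # so this branch can also mean that the image does not exist.
    *unauthorized*|*"access denied"*|*"authentication required"*)
      echo "permissions" ;;
    *"manifest unknown"*|*"not found"*)
      echo "image-or-tag" ;;
    *"no such host"*|*timeout*|*"connection refused"*)
      echo "network" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, passing the Failed event message copied from kubectl describe pod to classify_pull_error prints permissions, image-or-tag, network, or unknown.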

The Event

During a webinar titled "Your Application Deserves Better than Kubernetes Ingress: Istio vs. Kubernetes", Andrew Lee, a Technical Instructor at Mirantis, was preparing to demonstrate how to expose an app using NodePort. After applying his deployment file:

$ kubectl apply -f goapp-deployment.yaml

Lee went on to inspect his Pods only to see that all Pods had the status of ImagePullBackOff.

$ kubectl get pods
NAME                     READY   STATUS
goapp-deployment-pod-1   0/1     ImagePullBackOff
goapp-deployment-pod-2   0/1     ImagePullBackOff
goapp-deployment-pod-3   0/1     ImagePullBackOff

The Root Cause

This was a strange error, because the same deployment spec had worked multiple times before the demo. The most likely cause of an ImagePullBackOff error is that Kubernetes is trying to pull the image from the wrong repository. A less likely scenario is that Docker Hub itself is down.

After the webinar, he rechecked his cluster:

$ kubectl get pods
NAME                     READY   STATUS
goapp-deployment-pod-1   1/1     Running
goapp-deployment-pod-2   1/1     Running
goapp-deployment-pod-3   1/1     Running

Everything was running normally. He then checked the Docker Hub status page, and there he found the root cause.

Docker Hub Registry and Docker Hub Builds had gone down at exactly the time he was conducting the webinar.
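
A quick way to tell a registry outage apart from a problem inside the cluster is to probe the registry's HTTP API directly. The sketch below assumes curl is installed and that the registry implements the Docker Registry HTTP API V2, where an unauthenticated request to /v2/ normally returns 200 or 401. The function name registry_reachable is hypothetical. A reachable endpoint does not prove that pulls will succeed, since an outage can affect only part of the service, so this complements the status page rather than replacing it.

```shell
# Hypothetical helper: succeed if the registry's /v2/ endpoint answers at all.
registry_reachable() {
  host="$1"   # registry host name, e.g. registry-1.docker.io for Docker Hub
  # -w prints only the HTTP status code; curl prints 000 when no response
  # arrives. `|| true` keeps a failed request from aborting under `set -e`.
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "https://$host/v2/" || true)"
  case "$code" in
    200|401) return 0 ;;   # the API answered (401 means "log in first")
    *)       return 1 ;;   # no answer or an unexpected status
  esac
}
```

Running registry_reachable <registry-host> || echo "registry unreachable" before a demo gives an early warning that pulls may fail.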

The Fix

To fix this issue, he had two options:

  1. Rebuild the image from source
  2. Use a backup environment to show the end state

Lee went for the second option because it was easier and took less time. This allowed him to stay on topic within the allotted time.

Conclusion

Failures occur where you least expect them. From this event, we learn that it's better to use a local Docker registry or a highly available (HA) Docker registry to mitigate the risk of failure. You should also have a backup image prepared in case you run into any issues.
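
One way to act on this is to copy the demo image into a registry you control before the event and point the deployment at that copy. The sketch below assumes the Docker CLI is available and that you are already logged in to the target registry. The function name mirror_image and the example image and registry names in the comments are hypothetical.

```shell
# Hypothetical helper: copy an image into a registry you control.
# Assumes the source image name has no registry host prefix of its own.
mirror_image() {
  source_image="$1"      # e.g. myuser/goapp:1.0 (hypothetical)
  target_registry="$2"   # e.g. registry.internal:5000 (hypothetical)
  target_image="$target_registry/$source_image"

  # Stop at the first failing step; print the new image name on success.
  docker pull "$source_image" &&
    docker tag "$source_image" "$target_image" &&
    docker push "$target_image" &&
    echo "$target_image"
}
```

The printed name is what the image field of the Pod or Deployment spec would then reference, so that a Docker Hub outage no longer affects the pull.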

If you are looking for a Kubernetes solution that will ease the burden of resolving compliance issues in your cluster, look no further than Kalc. Kalc is an invaluable tool for Kubernetes Admins/Operators. It uses AI/Machine Learning to alert you to any bad configuration in your cluster that can cause service degradation or, in a worst-case scenario, a total service outage.