Intro

Day-to-day Kubernetes problems tend to fall into a few familiar groups: poor performance, slow responses, server breakdowns, and network connection failures. The ones that look insignificant yet rack our brains are deletion failures: a process that will not end, a file that cannot be deleted, a driver that refuses to uninstall. The logic behind such issues is more complex than one would expect.

In this post, I'd like us to go over an open issue related to Kubernetes cluster namespaces as discussed here: https://github.com/kubernetes/kubernetes/issues/60807

The Setup

A namespace is a mechanism for grouping and isolating Kubernetes cluster resources. In Kubernetes you are supposed to keep related resources in the same namespace to prevent unnecessary impact on unrelated resources. A namespace can hold pods, services, replication controllers, etc. It is basically a virtual cluster backed by the physical cluster.

When you provision a Kubernetes cluster, it will initialize with at least these three namespaces (newer versions also add kube-node-lease, which holds node heartbeat leases):

  1. default - The default namespace for objects with no defined namespace
  2. kube-system - The namespace for objects created by the Kubernetes system
  3. kube-public - A namespace readable by all users, mostly reserved for cluster usage
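
To see which namespaces your own cluster started with, something like the following sketch should do. The function names are hypothetical helpers of mine, and I'm assuming kubectl is already pointed at the cluster you care about.

```shell
# List the names of all namespaces in the current cluster context.
# Each line looks like "namespace/default".
list_namespaces() {
  kubectl get namespaces -o name
}

# List the resource types that live inside a namespace
# (pods, services, and so on), as opposed to cluster-wide ones.
list_namespaced_kinds() {
  kubectl api-resources --namespaced=true -o name
}
```

Calling list_namespaces on a fresh cluster should print the namespaces from the list above.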

A namespace is meant to be a way to divide cluster resources between multiple users.

You should delete the namespaces that are no longer in use.

The Challenge

As your Kubernetes services start to grow in number, simple tasks start to get more complicated. For example, two teams sharing a namespace cannot create services or deployments with the same name. Also, if you have thousands of Kubernetes Pods, just listing them all takes some time, let alone actually administering them.

The default namespace is great for getting started and for small clusters. However, it’s not ideal for use in a large production cluster. This is because it’s very easy for a team to accidentally overwrite or disrupt another team’s services without even realizing it. Instead, you should create multiple namespaces and use them to segment your services into manageable chunks.
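
As a rough sketch of that segmentation, the helpers below create a namespace per team and deploy into it. The function names, the team-a/team-b namespaces, and the web/nginx values in the usage note are all made up for illustration.

```shell
# Create a namespace for a team.
# $1 = namespace name
create_team_namespace() {
  kubectl create namespace "$1"
}

# Create a deployment inside a given namespace.
# $1 = namespace, $2 = deployment name, $3 = container image
deploy_in_namespace() {
  kubectl create deployment "$2" --image="$3" --namespace="$1"
}
```

Running create_team_namespace team-a and create_team_namespace team-b, followed by deploy_in_namespace team-a web nginx and deploy_in_namespace team-b web nginx, gives each team its own deployment called web without the two colliding.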

The Event

Currently, Kubernetes users are running into an issue when deleting a namespace (for example dev) with the following command:

$ kubectl delete namespace dev

The namespace gets stuck in the Terminating state and is never actually deleted.

When it's time to clear the project out of your Kubernetes cluster, the namespace stays in the Terminating life cycle phase for days, perhaps even weeks. Running the delete command again returns this error:

Error from server (Conflict): Operation cannot be fulfilled on namespaces "dev": The system is ensuring all content is removed from this namespace. Upon completion, this namespace will automatically be purged by the system.
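
Before reaching for a fix, it helps to confirm what state the namespace is actually in. Here is a small sketch with two hypothetical helper functions; the JSONPath fields (status.phase and spec.finalizers) are part of the Namespace object itself.

```shell
# Print the life cycle phase of a namespace ("Active" or "Terminating").
# $1 = namespace name
namespace_phase() {
  kubectl get namespace "$1" -o jsonpath='{.status.phase}'
}

# Print the finalizers that still have to be cleared before
# the namespace can be removed.
# $1 = namespace name
namespace_finalizers() {
  kubectl get namespace "$1" -o jsonpath='{.spec.finalizers}'
}
```

For a stuck namespace, namespace_phase dev should report Terminating, and namespace_finalizers dev should still list the kubernetes finalizer.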

The Fix

Solution 1

You can dig deeper into your cluster to try to unearth the reason why the namespace could be stuck. This command will show you what resources remain in the namespace:

$ kubectl api-resources --verbs=list --namespaced -o name \
| xargs -n 1 kubectl get --show-kind --ignore-not-found -n <namespace>

Once you find those and resolve or remove them, the namespace will be cleaned up.
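
On reasonably recent clusters, the namespace object also records why its deletion is blocked in status.conditions. The sketch below prints each condition type with its message; the function name is hypothetical, and on older clusters the output may simply be empty.

```shell
# Print one line per deletion condition: the condition type, a tab,
# and the human-readable message explaining what is still pending.
# $1 = namespace name
namespace_conditions() {
  kubectl get namespace "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'
}
```

The messages usually name the resource types or finalizers that remain, which narrows down what to look for in the listing above.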

Solution 2

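A quicker fix is to remove the kubernetes finalizer from the namespace. Keep in mind that this forces the deletion without resolving whatever was blocking it, so leftover resources may be orphaned. To get rid of a stuck namespace, I've put together a little script that saves its configuration to bar.json and forces the deletion.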

#!/bin/bash
# Take the first argument from the command invocation and assign it to a local variable
k8s_delete_ns=$1
# Show the lines of the current namespace configuration that mention the namespace
kubectl get namespaces -o json | grep "${k8s_delete_ns}"
# Write the namespace configuration to bar.json
kubectl get namespace "${k8s_delete_ns}" -o json > bar.json
# Open bar.json in vi and remove any items listed in the spec.finalizers array
vi bar.json
# Send the edited configuration to the namespace's finalize endpoint.
# This assumes the API server is reachable without authentication at 127.0.0.1:8080,
# for example through "kubectl proxy --port=8080" running in another terminal
curl -H "Content-Type: application/json" -X PUT --data-binary @bar.json "http://127.0.0.1:8080/api/v1/namespaces/${k8s_delete_ns}/finalize"
# Wait for namespace deletion to process
sleep 12
# List the namespaces that remain after the update
kubectl get namespaces
echo "...done."

If all goes as expected the namespace should now be absent from your namespaces list.
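
If you'd rather skip the manual editing step, the same idea can be sketched without an editor or a proxy. This version assumes jq is installed (the script above does not need it) and uses kubectl replace --raw to reach the finalize endpoint; the function name is my own.

```shell
# Clear spec.finalizers on a namespace and submit the result to the
# namespace's finalize endpoint through kubectl itself.
# Assumes jq is available on the PATH.
# $1 = namespace name
force_finalize_namespace() {
  ns="$1"
  kubectl get namespace "$ns" -o json \
    | jq '.spec.finalizers = []' \
    | kubectl replace --raw "/api/v1/namespaces/${ns}/finalize" -f -
}
```

You would call it as force_finalize_namespace dev. Because the request goes through kubectl, it reuses your existing credentials instead of relying on an open local port.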

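The key thing to note here is the resource you are modifying. In our case it is a namespace, but pods, deployments, services, etc. can get stuck in the Terminating state in the same way. The same idea applies to them, although the script needs adapting: those resources keep their finalizers under metadata.finalizers and do not have a finalize endpoint.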
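
For resources other than namespaces, a minimal sketch of the equivalent step is a merge patch that sets metadata.finalizers to null. The function name and its arguments are hypothetical; kubectl patch with --type=merge is the standard command.

```shell
# Remove all finalizers from a single namespaced resource so that a
# pending deletion can complete.
# $1 = namespace, $2 = resource kind (for example pod), $3 = resource name
clear_resource_finalizers() {
  kubectl patch "$2" "$3" --namespace="$1" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}
```

For example, clear_resource_finalizers dev pod my-pod would release a pod called my-pod (a made-up name) in the dev namespace. As with namespaces, this skips whatever cleanup the finalizer was waiting for.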

Conclusion

In this article, we have learned that as the number of microservices and teams using Kubernetes in your organization starts to increase, it’s recommended that you use namespaces to make Kubernetes more manageable. We have also seen that cleaning up a namespace with the command:

$ kubectl delete namespace dev

might not always work as expected and can leave the namespace stuck in the Terminating state, so I showed you two ways to get it unstuck.

At Kalc, we hold the firm belief that agility requires safety. To move fast, you need safety mechanisms to help you catch issues in your cluster before they even have a chance to wreak havoc. This is why we came up with the Kubernetes calculator, which is a Swiss Army Knife for validating your cluster manifests. The tool has functions for validating the correctness of Kubernetes objects and reporting what is wrong with them if they aren’t valid.