How does it work
Request a demo
Overcoming a Data Center Outage
In today's cloud world, you no longer worry only about machine failure; you also have to account for data center failure. UPS system failure, cybercrime, human error, natural disasters, and faulty generators rank as the top five culprits of data center outages. In November 2019, AWS, Microsoft Azure, and Google Cloud all experienced ...
February 12, 2020
How To Delete A Namespace Stuck At Terminating State
Among the issues most users encounter every day with Kubernetes, the most common ones include poor performance, slow response, server breakdown, and network connection failure. However, the issue that seems insignificant but racks our brains is deletion failure, for example, process ending failure, file deletion failure, and driver ...
February 8, 2020
I Can't Believe I have All The Node's Resources
AWS and other leading public cloud companies have made infrastructure as a service and its subscription-based model quite attractive. As more and more companies buy into the benefits of this business model, they are taking the next step, which is to migrate their applications from datacenters composed of ...
February 3, 2020
How ARP Cache Overflows and DNS Timeouts Brought Tinder Down
Kubernetes is the latest and greatest technology for taking containers to the next level. If you've been using Docker and Kubernetes for a little while and your website or application makes it to the big leagues and suddenly draws a lot of traffic, you need a way to scale up really fast.
January 28, 2020
A Postmortem of a Service Mesh Speeding Accident
Cloud-native applications are often designed as a batch of distributed microservices, which run in containers. Today Kubernetes has become the go-to solution for deploying and orchestrating containerized applications. It has a rich set of APIs that abstract away the underlying hardware infrastructure by acting as a distributed operating system
January 24, 2020
Data on Persistent Volume wiped after kubelet restart
Kubernetes adoption is exploding because it is a great platform for running your applications. Kubernetes itself is a stateful platform, so there is data associated with it. As with all data processing applications, you need to provide some data protection capabilities.
January 20, 2020
A Strange Case of Data Loss During Migration
In Kubernetes, managing storage is a distinct problem from managing compute. This is because pods are ephemeral: they come and go quite often. Therefore, on-disk files in a container are also ephemeral. So what happens if you have data that you must persist even though the pod itself goes down?
January 14, 2020
How a Cluster Autoscaler might actually save your life?
In Kubernetes we have the concept of Pods. A Pod is a group of one or more containers (such as Docker containers) on the same host. A good example is when you have an application server in a Pod and you also have monitoring and logging containers co-located on the same host.
January 10, 2020
Having DNS lookup failures for services in your cluster?
When we look at customers and the problems they encounter using Kubernetes, one of the most prominent issues is related to CoreDNS (the DNS server). Before CoreDNS, we had kube-dns. CoreDNS became GA in 1.11, so if you created clusters after 1.11, you are getting CoreDNS by default.
January 5, 2020
Is your pod status shown as pending?
When we talk to customers about containerizing and modernizing their applications, we always ask them why they want to use Kubernetes. Most of the time Kubernetes ends up being the answer, but we want to emphasize that Kubernetes is not a golden hammer. In this article, I intend to cover the number one way in which Kubernetes on AWS EKS has failed.
December 30, 2019
What Happens When Something Goes Wrong With EKS IAM Roles
The rapid adoption of Kubernetes has led to an increase in outages covering entire company operations. Recently SourceClear experienced a Kubernetes outage which lasted two days and affected multiple teams. SourceClear is a software security company which uses data science and machine learning to help developers use open source safely by analyzing the libraries
December 11, 2019
Where Did All My Pods Go?
Enterprises using Kubernetes often need to autoscale their resources based on more than just CPU usage, for example concurrent persistent connections or queue length. This post walks you through an incident where one of our customers enabled autoscaling for their application and one day all their Pods disappeared.
November 15, 2019
CoreDNS Autopath Failure For External Name Services
Datadog is a monitoring service for cloud-based workflows offering Kubernetes insights through metrics, traces, logs, dashboards, etc. They are positioned as a Cloud Native service provider. Through their SaaS-based analytics platform,
November 4, 2019
How To Break a Cassandra Cluster
Apache Cassandra is a highly scalable free and open source NoSQL database, achieving great performance on multi-node setups with no single point of failure. Cassandra supports replication across multiple data centers and offers lower latency for users and the ability to survive regional outages.
December 24, 2019
Kubernetes Jobs and the Sidecar Problem
Imagine that you have a large computation to perform and, once the computation is done, you want Kubernetes to stop the Pods automatically. Simply put, we are talking about running Pods temporarily until a Job is completed
October 31, 2019
Job being constantly recreated despite RestartPolicy: Never
Universe.com, a division within Ticketmaster, is shaping the future of the event industry using Kubernetes. They provide meaningful, real-life experiences to people around the globe through a world-class event ticketing platform.
October 19, 2019
The Case of the Infected Cluster
Today's distributed systems need to be resilient. Resilience, in short, means that ideally a user does not notice a random failure at all, or can at least continue to use the degraded application. On Monday 9 July 2018,
November 10, 2019
Debugging DNS Failure On Pods Looking Up External Resources
Docker makes building containers remarkably easy. The downside of this simplicity is that it's easy to build huge containers full of things you don't need - including security holes. By using a smaller, specialized base image such as Alpine, you can significantly minimize the attack surface.
October 13, 2019
Challenges With Running PostgreSQL On Kubernetes
Containers have become the next big thing in infrastructure software. However, to take full advantage of containers you need to know how to turn them into production services. This is where Kubernetes shines, as an orchestrator of your containerized applications.
October 8, 2019
The Developer Guide to Taking a Kubernetes Cluster Down
At Kalc we have a lot of experience, gained either from customers or from our time at Fortune 500 companies, and we are concerned about all the mistakes you might make with Kubernetes. In some ways, this is motivated by the fact that we have all these new schedulers, this immutably styled infrastructure that we are all striving towards
October 3, 2019
Managing Kubernetes Clusters on AWS Using Kops
Containers are a well-established way of packaging an application. Kubernetes has also gotten out of the early-adopters phase. Today it is a widely held view that Kubernetes is a cost-effective, ready-made solution that enterprise customers can trust.
September 27, 2019
Setting Up Your EKS Cluster for Scale
Many organizations are modernizing their existing applications to become more agile and innovate faster. Architectural patterns like microservices enable teams to independently test services and continuously push applications to delivery environments.
July 25, 2019
3000 El Camino Real, Building 4, Suite 200
Palo Alto, California, 94306, USA
Tel: +1 (650) 388-9499