Reliability and high availability are key features of Kubernetes, but even the most resilient systems can fail. Applications crash, hardware breaks, and nodes go offline. These failures can have damaging and unpredictable consequences for organizations, especially those that are unprepared.
We’ll explore how to improve the availability and reliability of Kubernetes clusters using the discipline of Chaos Engineering. You will also have an opportunity to ask questions of our experts during our live Q&A segment.
In this webinar:
Ana Margarita is a senior chaos engineer at Gremlin, where she helps companies avoid outages by running proactive chaos engineering experiments. Before Gremlin, she worked at companies of various sizes, including Google, Uber, SFEFCU, and a Miami-based startup. Ana is an internationally recognized speaker and has spoken at AWS re:Invent, KubeCon, DockerCon, DevOpsDays, AllDayDevOps, Write/Speak/Code, and many others.