powered by:
Techstrong Learning
Debug Faster by Capturing Crash States on AWS Fargate & Amazon EKS
On Demand

Amazon EKS and AWS Fargate simplify cloud infrastructure management, allowing developers to focus on application development. However, these abstractions also make debugging production issues more challenging, particularly in regulated environments. Debugging becomes even more difficult with complex distributed systems and event-driven architectures, because the hands-off management and ephemeral nature of these services limit visibility and remove evidence of issues.

To improve application debugging on EKS and Fargate, developers are commonly advised to:

  • Learn more about Kubernetes (so much for that abstraction and time savings!)
  • Implement pervasive logging and centralized collection
  • Create container-level health checks
  • Implement tracing and correlation IDs to map those cross-system interactions
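
As a rough illustration of the logging and correlation ID recommendations above, here is a minimal Python sketch using only the standard library. It is not taken from the session: the logger name `orders`, the `handle_request` function, and the JSON field names are hypothetical, and a real service would typically read the ID from an incoming request header and ship the log lines to a centralized collector.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so a collector can index the fields."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": record.correlation_id,
        })


def build_logger(stream):
    # "orders" is a hypothetical service name used only for this sketch.
    logger = logging.getLogger("orders")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(CorrelationFilter())
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def handle_request(logger, incoming_id=None):
    """Reuse the caller's ID if one was passed along, otherwise mint one."""
    token = correlation_id.set(incoming_id or uuid.uuid4().hex)
    try:
        logger.info("order received")
        return correlation_id.get()
    finally:
        # Restore the previous value so IDs do not leak between requests.
        correlation_id.reset(token)
```

Passing the same ID to every downstream call lets log lines from different services be joined on `correlation_id` after the containers that produced them are gone.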

What’s missing, however, is a way for developers to gain access to the state of problematic containers so they can view heap, thread and TCP dumps and stack traces.
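
To make "thread dumps and stack traces" concrete, the following is a small sketch of what capturing that state can look like from inside a Python process. It is an assumption-laden illustration, not the mechanism shown in the session: the function name `capture_thread_dump` is hypothetical, it covers only Python threads in the current process, and persisting the result to durable storage before the container is replaced is left out.

```python
import sys
import threading
import traceback


def capture_thread_dump():
    """Return the current stack of every live thread as one text report."""
    # Map thread identifiers to readable names for the report headers.
    names = {t.ident: t.name for t in threading.enumerate()}
    sections = []
    # sys._current_frames() returns the topmost frame of each running thread.
    for ident, frame in sys._current_frames().items():
        header = f"Thread {names.get(ident, 'unknown')} (id={ident})"
        stack = "".join(traceback.format_stack(frame))
        sections.append(f"{header}\n{stack}")
    return "\n".join(sections)
```

A report like this is only useful if it is written somewhere that outlives the container, which is the gap the session sets out to address.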

In this session, we’ll show how you can expedite debugging of applications running on EKS and Fargate with PagerDuty Process Automation. We’ll demonstrate how to automate preservation of ephemeral container states during incident response. You’ll see how incident responders can collect necessary information for developers to create permanent fixes as part of their remediation efforts.

Key Takeaways:

  • The advantages Amazon EKS and AWS Fargate offer developers who create and deploy complex distributed applications.
  • How these same advantages also cause challenges when it comes time to debug issues.
  • How PagerDuty Process Automation helps operators quickly triage, diagnose and remediate incidents in applications running on Amazon EKS and AWS Fargate.
  • How automated diagnosis of incidents can also automatically capture and persist ephemeral container states to help developers rapidly identify root causes and create a fix faster.

Jeremiah Lodise
Sr Solutions Consultant - PagerDuty
Jeremiah is a field engineer focused on helping PagerDuty customers apply automation and AIOps to speed up operations, resolve incidents faster, and improve the lives of engineers by automating away toil. Before PagerDuty, he worked as a field engineer helping customers eliminate risk in identity and fraud in online transactions, and spent several years working with GIS systems.