Cloud problems are no longer handled only by operations teams.
Today, developers, SREs, and platform engineers all share the responsibility of keeping systems reliable. When something goes wrong, teams need faster and smarter ways to find and fix issues.
This session introduces HolmesGPT, an open-source AI assistant for cloud and Kubernetes troubleshooting.
It analyzes logs, metrics, and alerts to help teams quickly identify root causes, and it suggests possible fixes, much like an AI teammate who is on call 24/7.
For Beginners
Understand what HolmesGPT is and how this open-source AI tool helps analyze cloud and Kubernetes problems.
Learn how to connect basic monitoring tools and start using HolmesGPT to investigate issues automatically.
For Maintainers
Learn how HolmesGPT reduces manual troubleshooting effort and supports on-call engineering teams.
Explore ways to integrate HolmesGPT into existing monitoring and incident response workflows using open standards and tools.
For Everyone
Understand how open-source AI agents are transforming cloud operations and problem-solving.
Access community-driven resources, examples, and demos to practice AI-powered troubleshooting with free and open-source (FOSS) tools.