From Chaos to Clarity: AI-Driven Kubernetes Log Analysis

Rejected

Session Description

If you’re an SRE wrestling with Kubernetes or OpenShift, you know logs are a double-edged sword: they can be both a lifeline and a nightmare. Everyone sets up Prometheus or some basic alerting, but then what? You’re stuck wading through a swamp of pod logs, node errors, and event dumps, trying to figure out why your cluster’s freaking out while your #ops Slack channel fills up with messages.

In this talk, I’ll show you how I tackled that chaos by building an AI-powered Kubernetes Log Analyzer. It lets you query cluster issues in plain English, like asking an AI assistant, "What went wrong with the abc deployment or the xyz pod?" It skips the slog of manual log hunting and aims to deliver fast, clear answers.

Technical Insights:

- How I specialized an LLM on Kubernetes log patterns, e.g. “memory limit hit” or “node failure”, to catch issues.

- Turning logs into vectors and storing them in a vector database, so the retrieval system can run fast, cluster-specific searches.

- Using the generation step to blend retrieved log data with metrics, customized for Kubernetes logs.
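
To make the retrieval idea above concrete, here is a minimal sketch, not the talk's actual implementation. It assumes a toy token-count "embedding" and an in-memory list in place of a real embedding model and vector database; the sample log lines and the names `embed`, `cosine`, `INDEX`, and `search` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical sample log lines for illustration only.
LOGS = [
    "pod payments-7d9f container app OOMKilled memory limit exceeded",
    "node worker-3 status NotReady kubelet stopped posting node status",
    "pod checkout-5c8b readiness probe failed connection refused",
]

def embed(text):
    """Toy embedding: lowercase token counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# In-memory stand-in for a vector database: (vector, original line) pairs.
INDEX = [(embed(line), line) for line in LOGS]

def search(query, k=2):
    """Return the k log lines most similar to the plain-English query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[0]), reverse=True)
    return [line for _, line in ranked[:k]]

top = search("which pod hit its memory limit", k=1)
print(top)
```

In a real setup the token counts would presumably be replaced by dense vectors from an embedding model, and the list by a vector database, but the query-then-rank flow stays the same.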

Takeaways: In this talk I will walk you through transformer concepts using Hugging Face libraries and share why I picked certain techniques, such as RAG, for specializing the LLM for log analysis. Expect a live demo and a GitHub repo with the code, yours to grab and tweak.
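
As a rough illustration of the RAG idea, the sketch below shows how retrieved log lines and metrics might be combined into a single prompt before it is handed to a language model. The function name `build_prompt`, the prompt wording, and the sample values are assumptions made for this example; the model call itself is left out.

```python
def build_prompt(question, log_lines, metrics):
    """Assemble a prompt from retrieved logs, metrics, and the user's question."""
    log_text = "\n".join(f"- {line}" for line in log_lines)
    metric_text = "\n".join(f"- {name}: {value}" for name, value in metrics.items())
    return (
        "You are assisting with Kubernetes troubleshooting.\n"
        "Answer using only the logs and metrics below.\n\n"
        f"Logs:\n{log_text}\n\n"
        f"Metrics:\n{metric_text}\n\n"
        f"Question: {question}\n"
    )

# Hypothetical inputs: in practice the logs would come from the retrieval step.
prompt = build_prompt(
    "What went wrong with the xyz pod?",
    ["pod xyz container app OOMKilled memory limit exceeded"],
    {"container_memory_usage_bytes": 536870912},
)
print(prompt)
```

Keeping the prompt assembly in one small function makes it easy to swap the retrieval backend or the model without touching the rest of the pipeline.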

Key Takeaways

None

Reviews

Approvability: 100 %

Approvals: 2

Rejections: 0

Not Sure: 1

This talk seems to cover a good real-world use case for LLMs: log parsing, which is generally a trivial thing to figure out.

Reviewer #1

Approved

No reference project related to the proposal is mentioned in the bio or reference link.

Reviewer #2

Not Sure

Reviewer #3

Approved


Bharat Rajani