KubeGraf

Name: KubeGraf
Availability: OnlineOnly
Author: Orkastor

Autonomous AI SRE platform for Kubernetes that detects incidents, performs evidence-backed root cause analysis, and delivers dry-run validated SafeFix™ remediations — all running locally without cloud dependency.

At a Glance

Pricing

Free tier available

Forever free, no account needed.

Pro: $229/yr

Team: $567/yr

Enterprise: Custom/contact

Available On

Windows

macOS

Linux

Web

API

Developer: Orkastor · London · Est. 2026

Listed Mar 2026

About KubeGraf

KubeGraf is an autonomous, always-on AI SRE platform built exclusively for Kubernetes. It detects incidents such as CrashLoopBackOff, OOMKilled, and probe failures, correlates signals from multiple sources (logs, metrics, traces, events), and proposes evidence-backed SafeFix™ remediations that are validated by dry run and applied only after human approval. Powered by OrkasAI, KubeGraf is local-first: cluster data stays in your environment.
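As a rough illustration of the detection step, the sketch below classifies a pod object into some of the incident types named above. It reads the standard Kubernetes pod status fields (`containerStatuses[].state.waiting.reason` and `containerStatuses[].lastState.terminated.reason`). The function name `classify_pod` and the sample data are hypothetical, and this is not KubeGraf's actual implementation. Probe failures surface as Kubernetes events rather than in these fields, so they are not covered here.

```python
from __future__ import annotations

from typing import Any

# Waiting reasons that Kubernetes reports for common failure modes.
WAITING_INCIDENTS = {"CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull"}


def classify_pod(pod: dict[str, Any]) -> list[str]:
    """Return incident labels found in a pod object (as from `kubectl get pod -o json`)."""
    incidents: list[str] = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # A container stuck in a restart or pull loop is in the "waiting" state.
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") in WAITING_INCIDENTS:
            incidents.append(f"{cs['name']}: {waiting['reason']}")
        # An OOM kill is recorded as the reason of the previous termination.
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            incidents.append(f"{cs['name']}: OOMKilled")
    return incidents


# Hypothetical sample: one failing container and one healthy sidecar.
sample_pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "api",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            },
            {"name": "sidecar", "state": {"running": {}}, "lastState": {}},
        ]
    }
}

print(classify_pod(sample_pod))
```

A real detector would also need events, metrics, and logs to separate cause from symptom; status fields alone only say what state a container is in.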

SafeFix™ Remediation: For every recommended fix, generates a YAML diff preview, blast radius analysis, a confidence score, and a one-command rollback before you apply anything.
Evidence-Based Root Cause Analysis: Correlates logs, Kubernetes events, metrics, traces, and recent deployments into a reproducible evidence chain with confidence scores, rather than a black-box verdict.
Dry-Run Validation: Simulates every fix through kubectl diff integration before execution and shows the exact changes and potential side effects without modifying the cluster.
Anomaly Fingerprinting: Builds fingerprints of recurring failure patterns so that similar incidents are recognized automatically, cutting diagnosis time on repeat failures.
Multi-Cluster Management: Investigate and remediate incidents across multiple clusters from a single interface without losing investigation context.
BYOK AI Engine: Bring your own API key for OpenAI, Anthropic, Gemini, or Ollama; AI calls go directly from your machine to your provider, so KubeGraf never sees your key or queries.
Terminal UI + Web Dashboard: Use the keyboard-driven TUI during live incidents and the browser-based web dashboard for post-mortems and trend analysis.
Knowledge Bank: A local SQLite database stores all incident history; search by pod, namespace, error type, or fix, and export reports for post-mortems.
RBAC-Aware Operations: Respects your cluster's RBAC policies; suggested fixes adapt to what your user is actually permitted to apply.
Full Audit Trail: Every analysis, recommendation, and applied fix is logged with timestamps and user context for compliance and post-mortems.
GitOps Integration: Syncs fixes to Git for ArgoCD or Flux to reconcile; supports Helm, Istio, Cilium, Nginx, and major Kubernetes platforms and distributions (EKS, GKE, AKS, OpenShift, K3s).
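The dry-run step described in the list above can be approximated with plain `kubectl diff`, which compares a manifest against the live cluster without applying it. The following is a minimal sketch under that assumption, not KubeGraf's own code; `interpret_diff_exit` and `preview_fix` are hypothetical names. It relies on the documented `kubectl diff` exit codes: 0 for no differences, 1 for differences found, and greater than 1 for an error.

```python
from __future__ import annotations

import subprocess


def interpret_diff_exit(code: int) -> str:
    """Map a `kubectl diff` exit code to an outcome label."""
    if code == 0:
        return "no-change"  # manifest already matches the live object
    if code == 1:
        return "changes"    # a diff was produced; review before applying
    return "error"          # kubectl or the diff program failed


def preview_fix(manifest_path: str) -> tuple[str, str]:
    """Run `kubectl diff -f <manifest>` and return (outcome, text to show the user)."""
    proc = subprocess.run(
        ["kubectl", "diff", "-f", manifest_path],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 is a normal result here, not a failure
    )
    outcome = interpret_diff_exit(proc.returncode)
    text = proc.stderr if outcome == "error" else proc.stdout
    return outcome, text
```

Calling `preview_fix("fix.yaml")` requires `kubectl` on the PATH and a reachable cluster; an approval step would then apply the manifest only when the outcome is `"changes"` and a human has reviewed the diff.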

Pricing

Free

Forever free, no account needed.

Full kubectl terminal + all workload views
SafeFix Engine + graph-based incident detection
Knowledge Bank + custom app deployment
GitOps — sync fixes to Git (ArgoCD / Flux)
25 BYOK AI investigations / mo

Pro

Popular

1 seat, billed $229/year ($19/mo equivalent). Save 34%.

$229/yr

billed annually

Everything in Free · 1 seat
Unlimited BYOK AI investigations
ML Insights — anomaly predictions & timeline
DB export & import (portable encrypted backup)

Team

3-seat minimum at $189/seat/yr ($567/yr base). Save 34%.

$567/yr

billed annually

Everything in Pro · 3–50 seats
Priority email support
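The Team figures above follow from simple per-seat arithmetic. The sketch below reproduces them, assuming every seat beyond the three-seat minimum is billed at the same $189 per year (the listing states the per-seat rate and the 3 to 50 seat range but not any volume discount). The function name `team_annual_cost` is illustrative.

```python
TEAM_PRICE_PER_SEAT = 189  # USD per seat per year, from the Team tier
TEAM_MIN_SEATS = 3
TEAM_MAX_SEATS = 50


def team_annual_cost(seats: int) -> int:
    """Annual Team cost in USD, assuming a flat per-seat price."""
    if not TEAM_MIN_SEATS <= seats <= TEAM_MAX_SEATS:
        raise ValueError(
            f"Team tier covers {TEAM_MIN_SEATS}-{TEAM_MAX_SEATS} seats, got {seats}"
        )
    return seats * TEAM_PRICE_PER_SEAT


print(team_annual_cost(3))   # 567, the listed base price
print(team_annual_cost(10))  # 1890
```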

Enterprise

Custom pricing — tailored to your needs.

Custom

contact sales

Unlimited seats
Self-hosted or cloud deploy
Custom BYOK AI volume + key management
Custom SLA + priority escalation
Dedicated success manager

Capabilities

Key Features

Autonomous incident detection (CrashLoopBackOff, OOMKilled, ImagePullBackOff, probe failures)
SafeFix™ dry-run validated remediations with YAML diff preview
Evidence-based root cause analysis with confidence scores
Multi-source signal correlation (logs, metrics, traces, events)
Anomaly fingerprinting for recurring failure patterns
BYOK AI engine (OpenAI, Anthropic, Gemini, Ollama)
Terminal UI and web dashboard
Local-first architecture — zero data exfiltration
Knowledge Bank with SQLite incident history
Multi-cluster management
RBAC-aware operations
Full audit trail
GitOps integration (ArgoCD, Flux)
One-command rollback
Human-in-the-loop approval for all changes
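One common way to recognize recurring failure patterns, such as the anomaly fingerprinting listed above, is to normalize an error message so that volatile parts (IDs, counters) disappear and then hash the result. The sketch below shows that general idea only; the listing does not describe how KubeGraf builds its fingerprints, and the `fingerprint` function and sample messages are hypothetical.

```python
import hashlib
import re


def fingerprint(reason: str, message: str) -> str:
    """Return a short, stable fingerprint for an incident reason and message."""
    normalized = message.lower()
    # Replace long hexadecimal identifiers (hashes, container IDs) with a placeholder.
    normalized = re.sub(r"\b[0-9a-f]{8,}\b", "<id>", normalized)
    # Replace remaining digit runs (restart counts, sizes, ports) with a placeholder.
    normalized = re.sub(r"\d+", "<n>", normalized)
    digest = hashlib.sha256(f"{reason}|{normalized}".encode("utf-8")).hexdigest()
    return digest[:12]


first = fingerprint("OOMKilled", "Container api exceeded memory limit 512Mi, restart 3")
repeat = fingerprint("OOMKilled", "Container api exceeded memory limit 512Mi, restart 7")
other = fingerprint("CrashLoopBackOff", "Container api exceeded memory limit 512Mi, restart 3")

print(first == repeat)  # True: only the restart counter differs
print(first == other)   # False: the incident reason differs
```

Because the fingerprint ignores numbers entirely, two incidents that differ only in a memory limit would also collide; a production scheme would need to decide which fields are signal and which are noise.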

Integrations

AWS EKS

Google GKE

Azure AKS

Rancher

OpenShift

K3s

Helm

ArgoCD

Flux

Istio

Cilium

Nginx

Prometheus

OpenTelemetry

Grafana

OpenAI

Anthropic

Gemini

Ollama

API Available

Resources

Website · Docs · GitHub · llms.txt

Topics

Autonomous Systems · Container Orchestration · Observability Platforms

Alternatives

Metoro · Plurai · Future AGI
