    EveryDev.ai
    KubeGraf

    Autonomous Systems

    AI SRE platform for Kubernetes that detects incidents, performs automated root cause analysis, and applies HMAC-signed safe fixes in under 5 minutes.

    At a Glance

    Pricing
    Free tier available
    Trial available

    Forever free plan for a single cluster with basic monitoring. No credit card required.

    14-day free trial on all paid Cloud plans. No credit card required.

    Starter: $66/mo
    Starter (Annual): $66/mo
    Pro: $207/mo
    +4 more plans

    Available On

    Web
    CLI
    API

    Topics

    Autonomous Systems · DevOps Infrastructure · Monitoring Tools

    Alternatives

    Kubewall · CloudShip AI · Readout
    Developer
    Orkastor · London · Est. 2026

    Updated May 2026

    About KubeGraf

    KubeGraf is an AI SRE platform for Kubernetes built by Orkastor, designed to detect, diagnose, and fix cluster incidents autonomously. It combines an in-cluster Go agent with a SaaS control plane and a six-agent AI pipeline to take incidents from detection to remediation in a median of 4 minutes and 21 seconds. The platform is available as both a cloud-hosted SaaS and a self-hosted binary, with an enterprise tier supporting air-gapped deployments.

    What It Is

    KubeGraf sits above your existing observability stack — Prometheus, Grafana, OpenTelemetry — and acts as an autonomous SRE layer. Rather than surfacing more dashboards and alerts, it reasons over telemetry, proposes evidence-backed remediations, and applies them under a human approval gate. The core product is an in-cluster agent (22 MB, no privileged access required, no inbound ports) that reads cluster state in read-only mode by default, normalizes it into a knowledge graph, and streams allowlisted snapshots to the control plane. Raw secrets and config map values are never transmitted.
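
The allowlisting described above can be illustrated with a minimal sketch. It is not KubeGraf's actual agent code (the agent is written in Go and its internals are not documented here); the field allowlist, the `<redacted>` placeholder, and the `build_snapshot` helper are all assumptions made for illustration.

```python
from copy import deepcopy

# Hypothetical: resource kinds whose payload values must never leave the cluster.
REDACTED_KINDS = {"Secret", "ConfigMap"}
# Hypothetical allowlist of top-level fields that may be streamed.
ALLOWED_FIELDS = {"kind", "metadata", "status", "data"}


def build_snapshot(resource: dict) -> dict:
    """Return a copy of `resource` that is safe to send to the control plane."""
    # Copy only allowlisted fields so the original object is never mutated.
    snapshot = {k: deepcopy(v) for k, v in resource.items() if k in ALLOWED_FIELDS}
    if snapshot.get("kind") in REDACTED_KINDS and "data" in snapshot:
        # Keep the key names (useful for diagnosis) but drop every value.
        snapshot["data"] = {key: "<redacted>" for key in snapshot["data"]}
    return snapshot


secret = {
    "kind": "Secret",
    "metadata": {"name": "db-credentials", "namespace": "prod"},
    "data": {"password": "czNjcjN0"},
    "spec": {"not_allowlisted": True},
}
safe = build_snapshot(secret)
print(safe)
```

The sketch filters by allowlist rather than blocklist, so a field that nobody anticipated is dropped by default instead of being transmitted.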

    The Six-Agent AI Pipeline

    The AI engine — branded OrkasAI — runs six specialized agents in sequence when an incident is detected:

    • Topology — maps the service graph and blast radius
    • RootCause — forms and ranks hypotheses
    • LogReasoner — extracts panic lines and error signatures from container logs
    • TraceWalker — follows distributed traces across services
    • CodeAware — identifies the PR or deploy that introduced the regression
    • Remediation — drafts a YAML patch with evidence citations

    The pipeline reportedly produces a root cause analysis with 94% confidence, backed by 3–7 evidence citations, in a median of 4–7 seconds.
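
The sequential hand-off between the six agents can be sketched as follows. Only the agent names and their order come from the list above; the `Incident` shape, the `make_agent` helper, and the placeholder agent bodies are hypothetical stand-ins, since OrkasAI's real interfaces are not described here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Incident:
    """Shared context passed from one agent to the next (hypothetical shape)."""
    summary: str
    findings: dict = field(default_factory=dict)


def make_agent(name: str) -> Callable[[Incident], Incident]:
    # Placeholder agent: records that it ran and how many earlier findings it could see.
    def run(incident: Incident) -> Incident:
        incident.findings[name] = f"{name} saw {len(incident.findings)} prior finding(s)"
        return incident
    return run


# Names and order follow the list above; the bodies are stand-ins.
PIPELINE = [make_agent(n) for n in (
    "Topology", "RootCause", "LogReasoner", "TraceWalker", "CodeAware", "Remediation",
)]


def run_pipeline(incident: Incident) -> Incident:
    # Each agent receives everything the earlier agents produced.
    for agent in PIPELINE:
        incident = agent(incident)
    return incident


result = run_pipeline(Incident("CrashLoopBackOff in checkout"))
print(list(result.findings))
```

Running the agents in a fixed order means the final Remediation step can cite the findings of all five earlier stages.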

    SafeFix™ and the Approval Model

    SafeFix™ is KubeGraf's remediation mechanism. Every patch is HMAC-SHA256 signed end-to-end, dry-run validated against policy and RBAC, and applied to a 10% canary first. If metrics degrade after the canary rollout, the fix auto-reverts within 30 seconds. Approval can be triggered from Slack, the web dashboard, or the CLI — the engineer sees the exact YAML diff and the evidence chain before clicking. The platform positions this as "the right amount of human in the loop" compared to fully autonomous auto-remediation tools.
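
As a rough illustration of HMAC-SHA256 patch signing, the sketch below uses Python's standard `hmac` and `hashlib` modules. The function names, the example secret, and the sample patch are assumptions for illustration; KubeGraf's actual signing format and key handling are not documented in this listing.

```python
import hashlib
import hmac


def sign_patch(patch_yaml: str, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the patch text."""
    return hmac.new(secret, patch_yaml.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_patch(patch_yaml: str, signature: str, secret: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_patch(patch_yaml, secret)
    return hmac.compare_digest(expected, signature)


secret = b"example-shared-secret"  # hypothetical; a real key would come from a KMS
patch = "spec:\n  containers:\n  - name: api\n    resources:\n      limits:\n        memory: 512Mi\n"
signature = sign_patch(patch, secret)

print(verify_patch(patch, signature, secret))                            # True
print(verify_patch(patch.replace("512Mi", "64Gi"), signature, secret))   # False
```

Any change to the patch after signing, even a single character, invalidates the signature, which is what lets the applying side reject a patch that was altered in transit.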

    Deployment Model and Security

    KubeGraf ships in two modes. The Cloud product deploys a Helm agent into the cluster that reports to app.kubegraf.io, providing a hosted dashboard, incident management, SLOs, and multi-cluster federation. The Local product is a self-hosted binary where all data stays on the operator's infrastructure. The Enterprise tier ships the full control plane as a Helm chart for on-prem or air-gapped clusters, with tokens and HMAC secrets stored in the customer's KMS. LLM provider is configurable — Anthropic Claude and OpenAI ship by default, with AWS Bedrock and Azure OpenAI available, and enterprise customers can point at a private model endpoint.

    Integrations and Platform Support

    KubeGraf connects natively to Amazon EKS, Azure AKS, Google GKE, Helm, Prometheus, GitHub, OpenTelemetry, Slack, and ArgoCD. GitOps sync via ArgoCD and Flux is available on higher tiers. PagerDuty and Opsgenie integrations are available for larger teams. The terminal UI and web dashboard provide two interaction surfaces, and the CLI supports one-click approval workflows.

    Current Status

    KubeGraf is actively developed by Orkastor and has been available for production use since late 2025. The pricing page draws a comparison to Komodor's removal of its free tier in 2024, and the comparison table references publicly available product information dated December 2025. Documentation is versioned under docs-next, signaling active iteration. The platform supports a 14-day free trial on all paid cloud plans with no credit card required.

    Pricing

    FREE

    Free

    Forever free plan for a single cluster with basic monitoring. No credit card required.

    • 1 cluster · 5 nodes · 3 members
    • Cluster health & pod metrics
    • 7-day metric retention
    • kubectl terminal
    • Community support
    TRIAL

    14-Day Free Trial

    14-day free trial on all paid Cloud plans. No credit card required.

    • Full access to selected paid plan features
    • No credit card required
    • Cancel anytime

    Starter

    Incident detection, Slack alerts, and 30-day retention for small teams.

    $66/mo
    billed annually
    $79/mo monthly
    • 3 clusters · 30 nodes · 10 members
    • Everything in Free
    • Incident detection & timeline
    • Slack, email & webhook alerts
    • 30-day metric retention
    • Email support

    Starter (Annual)

    Incident detection, Slack alerts, and 30-day retention — billed annually.

    $66/mo
    billed annually
    $80/mo monthly
    • 3 clusters · 30 nodes · 10 members
    • Everything in Free
    • Incident detection & timeline
    • Slack, email & webhook alerts
    • 30-day metric retention
    • Email support

    Pro

    Popular

    AI root cause analysis, SLOs, auto-remediation, and GitOps for growing teams.

    $207/mo
    billed annually
    $249/mo monthly
    • 10 clusters · unlimited nodes · 25 members
    • Everything in Starter
    • AI root cause analysis
    • Auto-remediation (SafeFix™)
    • SLO monitoring & burn-rate alerts
    • GitOps — ArgoCD / Flux sync
    • 90-day metric retention
    • Priority support

    Pro (Annual)

    AI root cause analysis, SLOs, auto-remediation, and GitOps — billed annually.

    $207.5/mo
    billed annually
    $250/mo monthly
    • 10 clusters · unlimited nodes · 25 members
    • Everything in Starter
    • AI root cause analysis
    • Auto-remediation (SafeFix™)
    • SLO monitoring & burn-rate alerts
    • GitOps — ArgoCD / Flux sync
    • 90-day metric retention
    • Priority support

    Business

    Multi-cluster federation, SCIM, PagerDuty, and unlimited scale.

    $498/mo
    billed annually
    $599/mo monthly
    • Unlimited clusters · members · workspaces
    • Everything in Pro
    • Multi-cluster federation dashboard
    • SCIM / directory sync
    • PagerDuty & Opsgenie integration
    • 1-year metric retention
    • Audit logs & compliance reports
    • Dedicated Slack channel support

    Business (Annual)

    Multi-cluster federation, SCIM, PagerDuty, and unlimited scale — billed annually.

    $499/mo
    billed annually
    $602/mo monthly
    • Unlimited clusters · members · workspaces
    • Everything in Pro
    • Multi-cluster federation dashboard
    • SCIM / directory sync
    • PagerDuty & Opsgenie integration
    • 1-year metric retention
    • Audit logs & compliance reports
    • Dedicated Slack channel support

    Enterprise

    Dedicated infra, SSO/SAML, custom SLA, on-prem and air-gapped deployment. Custom pricing.

    Custom
    contact sales
    • Everything in Business
    • Dedicated database & infra
    • Single Sign-On (SSO / SAML)
    • Custom data retention & residency
    • 99.9% uptime SLA
    • On-prem / air-gapped deployment
    • Dedicated success manager
    • Private LLM / BYOK
    • 1-hour P1 SLA + TAM

    Capabilities

    Key Features

    • Automated root cause analysis with 6-agent AI pipeline
    • SafeFix™ HMAC-SHA256 signed patch generation
    • Canary-first rollout with auto-revert on metric degradation
    • 200ms anomaly detection scanning 100 signals per second
    • One-click approval from Slack, dashboard, or CLI
    • Knowledge graph across logs, traces, deploys, and code
    • SLO monitoring and burn-rate alerts
    • Multi-cluster federation dashboard
    • Air-gapped and on-prem deployment support
    • Terminal UI and web dashboard
    • kubectl terminal access
    • Automated post-mortem generation
    • GitOps integration with ArgoCD and Flux
    • SCIM/directory sync
    • Audit logs and compliance reports
    • Configurable LLM provider (Claude, OpenAI, Bedrock, Azure OpenAI)
    • Local-first mode with zero data exfiltration
    • RBAC-aware dry-run validation before fix application
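
The canary-first rollout with auto-revert listed above can be sketched as two small decisions: how many replicas to patch first, and whether to roll back. The 10% canary share comes from the product description; the error-rate metric, the `tolerance` threshold, and both function names are assumptions for illustration.

```python
def canary_size(total_replicas: int, percent: int = 10) -> int:
    """Number of replicas to patch first: `percent` of the total, rounded up, at least one."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, -(-total_replicas * percent // 100))


def should_revert(baseline_error_rate: float, canary_error_rate: float,
                  tolerance: float = 0.01) -> bool:
    """Revert when the canary's error rate exceeds the baseline by more than `tolerance`."""
    return canary_error_rate > baseline_error_rate + tolerance


print(canary_size(20))              # 2
print(canary_size(5))               # 1
print(should_revert(0.02, 0.08))    # True
print(should_revert(0.02, 0.025))   # False
```

A real gate would likely watch several metrics over a time window; a single error-rate comparison is used here only to keep the decision logic visible.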

    Integrations

    Amazon EKS
    Azure AKS
    Google GKE
    Helm
    Prometheus
    GitHub
    OpenTelemetry
    Slack
    ArgoCD
    Flux
    PagerDuty
    Opsgenie
    Anthropic Claude
    OpenAI
    AWS Bedrock
    Azure OpenAI

    Developer

    Orkastor

    Orkastor builds KubeGraf, an autonomous AI SRE platform for Kubernetes that delivers evidence-backed root cause analysis and SafeFix™ remediations without cloud dependency. The team powers KubeGraf with OrkasAI, a multi-model reasoning engine designed for production-grade incident response. Orkastor focuses on local-first, zero-data-exfiltration tooling for SRE and platform engineering teams.

    Founded 2026
    167-169 Great Portland Street, W1W 5PF
    5 employees

    Similar Tools

    Kubewall

    A single binary Kubernetes dashboard that lets you manage multiple clusters from a single place with no dependencies or configuration required.

    CloudShip AI

    CloudShip AI turns DevOps and FinOps infrastructure complexity into autonomous AI agents, delivering structured operational insights and executive-level clarity from your own infrastructure.

    Readout

    Readout is a macOS dashboard that consolidates your dev environment — repos, costs, sessions, dependencies, and config — into one view instead of multiple terminal tabs.

    Related Topics

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    200 tools

    DevOps Infrastructure

    Platforms and tools for CI/CD pipelines and DevOps practices.

    53 tools

    Monitoring Tools

    AI-enhanced monitoring solutions that provide real-time visibility into system performance, anomaly detection, and predictive alerting for proactive issue resolution.

    65 tools