EveryDev.ai

    KubeGraf

    Autonomous Systems

    AI SRE platform for Kubernetes that detects incidents, performs automated root cause analysis, and applies HMAC-signed SafeFix patches in under 5 minutes.

    At a Glance

    Pricing
    Open Source

    Get started with KubeGraf at no cost

    14-day free trial of the Pro plan. No credit card required.

    Pro: $382/mo billed annually ($449/mo billed monthly)
    Enterprise: custom pricing (contact sales)

    Available On

    Windows
    macOS
    Linux
    Web
    API

    Resources

    Website
    Docs
    GitHub
    llms.txt

    Topics

    Autonomous Systems
    DevOps Infrastructure
    Monitoring Tools

    Alternatives

    Radar by Skyhook
    GitLab Duo
    ninoxAI
    Developer
    Orkastor
    London
    Est. 2026

    Updated May 2026

    About KubeGraf

    KubeGraf is an AI Site Reliability Engineering (SRE) platform built for Kubernetes teams that need to detect, diagnose, and resolve cluster incidents fast. Developed by Orkastor, it combines an in-cluster Go agent with a SaaS control plane and a six-agent AI pipeline to take incidents from detection to safe remediation in a median of 4 minutes 21 seconds. The tool is available as both a SaaS product and a self-hosted/air-gapped enterprise deployment, with v1.0.0 released on March 24, 2026.

    What It Is

    KubeGraf sits above your existing observability stack (Prometheus, Grafana, OpenTelemetry) and adds an autonomous reasoning and remediation layer. Rather than only surfacing dashboards and alerts, it reasons over telemetry, proposes a cryptographically signed remediation with supporting evidence, and applies it only after a human approves. The core product is licensed under Apache 2.0 and the source code is publicly available on GitHub.

    How the Six-Agent Pipeline Works

    When an incident occurs, KubeGraf's AI pipeline activates six specialized agents in sequence:

    • Topology — maps the service graph and blast radius
    • RootCause — forms and ranks hypotheses across logs, traces, and deploys
    • LogReasoner — extracts the panic line or OOM event from container logs
    • TraceWalker — follows the failure across distributed traces
    • CodeAware — identifies the PR or commit that introduced the regression
    • Remediation — drafts a YAML patch with a dry-run preview and HMAC-SHA256 signature
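
The signing step in the last bullet can be illustrated with a minimal Python sketch. KubeGraf's actual patch format, key handling, and verification flow are not described on this page, so the function names `sign_patch` and `verify_patch`, the sample patch, and the hard-coded demo key are all assumptions made for illustration. In the architecture described here the key would stay in the customer's KMS rather than in code.

```python
import hashlib
import hmac


def sign_patch(patch_yaml: str, key: bytes) -> str:
    """Return a hex-encoded HMAC-SHA256 signature over the patch text."""
    return hmac.new(key, patch_yaml.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_patch(patch_yaml: str, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_patch(patch_yaml, key)
    return hmac.compare_digest(expected, signature)


# Hypothetical remediation patch; not taken from KubeGraf output.
PATCH = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
spec:
  template:
    spec:
      containers:
        - name: checkout
          resources:
            limits:
              memory: 512Mi
"""

# Demo key only; a real deployment would fetch or use the key via a KMS.
DEMO_KEY = b"demo-key-not-for-production"

signature = sign_patch(PATCH, DEMO_KEY)

# An untouched patch verifies; a tampered one does not.
assert verify_patch(PATCH, signature, DEMO_KEY)
assert not verify_patch(PATCH.replace("512Mi", "64Gi"), signature, DEMO_KEY)
```

The point of signing is that anyone who later edits the approved patch, even by one character, invalidates the signature unless they also hold the key.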

    The pipeline delivers a root cause with 3–7 evidence citations in a vendor-stated median of 4–7 seconds. Every SafeFix™ patch is applied to a 10% canary first and auto-reverted within 30 seconds if metrics degrade.
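
The canary-first rollout with auto-revert can be sketched as a small control loop. This is an illustration under stated assumptions, not KubeGraf's implementation: the function name `canary_with_auto_revert`, the error-rate metric, the `tolerance` threshold, and the reading of "within 30 seconds" as a 30-second watch window are all hypothetical. Only the 10% canary fraction and the 30-second figure come from the description above.

```python
import time
from typing import Callable, List


def canary_with_auto_revert(
    apply_to_canary: Callable[[float], None],
    revert: Callable[[], None],
    read_error_rate: Callable[[], float],
    baseline_error_rate: float,
    canary_fraction: float = 0.10,  # 10% canary, per the description above
    watch_seconds: int = 30,        # assumed to be the watch window
    tolerance: float = 0.01,        # assumed threshold, not a KubeGraf value
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Apply a patch to a canary slice and revert if the metric degrades."""
    apply_to_canary(canary_fraction)
    for _ in range(watch_seconds):
        if read_error_rate() > baseline_error_rate + tolerance:
            revert()
            return "reverted"
        sleep(1)
    return "healthy"


# Simulated run: the third metric reading degrades, so the patch is reverted.
events: List[str] = []
readings = iter([0.01, 0.015, 0.09])
result = canary_with_auto_revert(
    apply_to_canary=lambda fraction: events.append(f"apply {fraction:.0%}"),
    revert=lambda: events.append("revert"),
    read_error_rate=lambda: next(readings),
    baseline_error_rate=0.01,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(result)  # reverted
print(events)  # ['apply 10%', 'revert']
```

Passing the apply, revert, and metric functions in as parameters keeps the loop testable without a cluster.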

    Security and Data Architecture

    KubeGraf's agent runs inside the cluster and makes only outbound calls, so no inbound ports are required. The agent pushes allowlisted snapshots (never raw secrets or ConfigMap values) to the control plane. HMAC signing keys stay in the customer's KMS (AWS, GCP, Azure, or HashiCorp Vault). The enterprise tier ships the full control plane as a Helm chart for on-premises or air-gapped deployments, with support for bring-your-own LLM endpoints (Anthropic Claude, OpenAI, AWS Bedrock, Azure OpenAI, or any OpenAI-compatible private endpoint).
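
One way to picture an allowlisted snapshot is a filter that copies only named fields out of a resource and drops everything else by default. The sketch below is an assumption-laden illustration: the allowlist contents, the dotted-path format, and the `build_snapshot` function are hypothetical and are not KubeGraf's schema.

```python
from typing import Any, Dict, Set

# Hypothetical allowlist of dotted field paths that may leave the cluster.
ALLOWED_FIELDS: Set[str] = {
    "metadata.name",
    "metadata.namespace",
    "status.phase",
    "status.reason",
}

_MISSING = object()


def _lookup(resource: Dict[str, Any], path: str) -> Any:
    """Walk a dotted path through nested dicts; return _MISSING if absent."""
    node: Any = resource
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def build_snapshot(resource: Dict[str, Any], allowed: Set[str]) -> Dict[str, Any]:
    """Copy only allowlisted fields; anything not listed is never included."""
    snapshot: Dict[str, Any] = {}
    for path in sorted(allowed):
        value = _lookup(resource, path)
        if value is not _MISSING:
            snapshot[path] = value
    return snapshot


# Example pod-like resource with a secret value in its environment.
pod = {
    "metadata": {"name": "checkout-7d9f", "namespace": "shop"},
    "spec": {"containers": [{"env": [{"name": "DB_PASSWORD", "value": "hunter2"}]}]},
    "status": {"phase": "Running"},
}

snapshot = build_snapshot(pod, ALLOWED_FIELDS)
print(snapshot)
```

Because the filter is default-deny, a new sensitive field added to a resource stays inside the cluster until someone explicitly allowlists it.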

    Deployment and Setup Path

    Installation is a single Helm chart install into any Kubernetes cluster, with no privileged access and no hard dependency on Prometheus, Istio, or OpenTelemetry. The agent registers with the control plane within approximately 60 seconds. KubeGraf also ships a CLI binary (available via Homebrew on macOS, curl on Linux, and Scoop on Windows) that launches either a web dashboard at localhost:3000 or a full-featured terminal UI suitable for SSH sessions. Authentication works with all standard Kubernetes auth methods: client certificates, bearer tokens, OIDC, exec-based credential plugins, and the cloud-provider authentication used by GKE, EKS, and AKS.

    Integrations and Platform Support

    KubeGraf connects natively to the tools the vendor describes as the standard Kubernetes observability stack:

    • Cloud providers: Amazon EKS, Azure AKS, Google GKE
    • Package management: Helm
    • Metrics: Prometheus
    • Tracing: OpenTelemetry
    • GitOps: ArgoCD, Flux
    • Alerting: Slack, PagerDuty, Opsgenie, email, webhooks
    • Version control: GitHub

    Update: v1.0.0 Launch

    The GitHub repository shows v1.0.0 was published on March 24, 2026, following a pre-launch announcement in the README that listed the release date as March 23, 2026. The repository was created in November 2025 and last pushed in March 2026. The Apache 2.0 license covers the core agent and CLI codebase. The SaaS control plane at app.kubegraf.io is a separate commercial offering layered on top of the open-source agent.

    Pricing

    TRIAL

    Pro Trial

    14-day free trial of the Pro plan. No credit card required.

    • Full Pro plan access
    • 3 clusters
    • Up to 1,500 pods
    • 20 deep investigations per month
    • 200 normal investigations per month

    Pro

    Popular

    AI-powered Kubernetes reliability for production teams. Self-serve signup, 14-day free trial included.

    $382/mo
    billed annually
    $449/mo monthly
    • 3 clusters
    • Up to 1,500 pods
    • Up to 10 team members
    • 20 deep investigations per month
    • 200 normal investigations per month
    • 90-day data retention
    • Agentic AI root cause analysis
    • AI investigation chat
    • Auto-remediation (SafeFix™)
    • GitOps — ArgoCD & Flux sync
    • SLO & burn-rate alerts
    • Slack, email & webhook alerts
    • PagerDuty & Opsgenie integration
    • kubectl terminal & live exec
    • Multi-cluster dashboards
    • API access & webhooks
    • Priority email support (next business day)
    • Self-serve onboarding

    Enterprise

    Built for regulated & mission-critical Kubernetes at scale. Procurement-ready, dedicated success team, deployment flexibility.

    Custom
    contact sales
    • Unlimited clusters
    • 10,000+ pods, no cap
    • Unlimited team members
    • Unlimited deep investigations
    • Unlimited normal investigations
    • Custom data retention (up to 7 years)
    • SSO / SAML
    • SCIM directory sync
    • On-prem / air-gapped deployment
    • Dedicated database & infrastructure
    • Custom data residency (EU / US / APAC)
    • 7-year WORM audit log export
    • Premium technical support (24/7, <1hr P1)
    • Dedicated Customer Success Manager
    • Included onboarding & training
    • 99.9% uptime SLA & P1 response SLA
    • Marketplace billing available

    Capabilities

    Key Features

    • Automated incident detection (200ms anomaly detection)
    • Six-agent AI root cause analysis pipeline
    • SafeFix™ HMAC-SHA256 signed remediation patches
    • Canary-first rollout with auto-revert on metric degradation
    • One-click approval from Slack, dashboard, or CLI
    • Terminal UI for SSH and low-bandwidth environments
    • Web dashboard for visual monitoring
    • Multi-cluster management and correlation
    • GitOps integration (ArgoCD, Flux)
    • kubectl terminal and live exec
    • SLO and burn-rate alerting
    • AI investigation chat
    • Knowledge graph across logs, traces, and code
    • Air-gapped and on-premises enterprise deployment
    • Bring-your-own LLM endpoint support
    • SSO/SAML and SCIM 2.0 (Enterprise)
    • Audit logs with tamper-evident chain
    • Security posture assessment

    Integrations

    Amazon EKS
    Azure AKS
    Google GKE
    Helm
    Prometheus
    OpenTelemetry
    ArgoCD
    Flux
    GitHub
    Slack
    PagerDuty
    Opsgenie
    AWS Bedrock
    Azure OpenAI
    Anthropic Claude
    OpenAI
    HashiCorp Vault
    Splunk
    Datadog
    Elastic
    API Available

    Developer

    Orkastor

    Orkastor builds KubeGraf, an autonomous AI SRE platform for Kubernetes that delivers evidence-backed root cause analysis and SafeFix™ remediations without cloud dependency. The team powers KubeGraf with OrkasAI, a multi-model reasoning engine designed for production-grade incident response. Orkastor focuses on local-first, zero-data-exfiltration tooling for SRE and platform engineering teams.

    Founded 2026
    167-169 Great Portland Street, W1W 5PF
    5 employees
    1 tool in directory

    Similar Tools

    Radar by Skyhook

    An open-source Kubernetes UI that provides topology, events, Helm, GitOps, image inspection, audits, and MCP for AI agents — all in a single binary or self-hosted in your cluster.

    GitLab Duo

    GitLab Duo is an AI-powered assistant built into the GitLab DevSecOps platform, providing code suggestions, agentic automation, and security insights across the entire software lifecycle.

    ninoxAI

    Open-source, local-first, read-only AI SRE that clusters alert storms into incidents, investigates root cause over live systems, and proposes human-gated fixes.

    Related Topics

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    306 tools

    DevOps Infrastructure

    Platforms and tools for CI/CD pipelines and DevOps practices.

    64 tools

    Monitoring Tools

    AI-enhanced monitoring solutions that provide real-time visibility into system performance, anomaly detection, and predictive alerting for proactive issue resolution.

    70 tools