
    Regent

    LLM Evaluations

    Regent proxies every LLM call in your app to detect behavioral regressions between code versions, posting detailed diff reports directly to pull requests.


    At a Glance

    Pricing
    Free

    Get started for free with Regent's core LLM regression testing features.


    Available On

    Web
    API
    CLI

    Resources

Website
llms.txt

    Topics

LLM Evaluations
Automated Testing
AI Infrastructure

    Alternatives

Ashr
SkillsBench
Atla AI

Developer
Regent
Regent builds LLM regression testing infrastructure for prod…

    Listed Apr 2026

    About Regent

    Regent is an LLM regression testing tool built for production teams shipping AI-powered applications. It proxies every LLM call inside your app — including nested chains, multi-step agents, and parallel calls — and automatically diffs outputs between your current branch and the main branch baseline. Results are posted directly as PR comments, so your team knows exactly what changed before merging.
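The core proxy-and-diff idea described above can be sketched in a few lines. This is a hedged illustration only, not Regent's implementation: `RecordingProxy` and `diff_recordings` are hypothetical names, and a real system would capture far richer metadata per call.

```python
# Illustrative sketch of proxying LLM calls and diffing two recorded runs.
# All names here are hypothetical; Regent's actual capture format is internal.

class RecordingProxy:
    """Wraps an app's LLM call function and records every invocation in order."""

    def __init__(self, llm_fn):
        self._llm_fn = llm_fn
        self.calls = []  # ordered list of {"prompt": ..., "output": ...} records

    def __call__(self, prompt, **kwargs):
        # Forward the call unchanged, then record what went in and out.
        output = self._llm_fn(prompt, **kwargs)
        self.calls.append({"prompt": prompt, "output": output})
        return output


def diff_recordings(baseline, candidate):
    """Return indices of calls whose output changed between two recorded runs."""
    drifted = []
    for i, (old, new) in enumerate(zip(baseline, candidate)):
        if old["output"] != new["output"]:
            drifted.append(i)
    return drifted
```

Because every call is routed through the proxy, nested and multi-step calls are captured in the order they happen, which is what makes a later baseline-vs-branch comparison possible.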

    • Automatic baseline capture: Regent captures your main branch as the golden standard on setup and re-captures it automatically on every merge, keeping baselines always up to date.
    • Full call chain visibility: Unlike tools that only inspect final outputs, Regent captures every intermediate LLM call in a request — nested chains, multi-step agents, and parallel calls are all tracked.
    • PR comments out of the box: Every pull request automatically receives a detailed diff report as a comment, showing exactly which LLM calls drifted and how outputs changed.
    • Zero production traffic: Regent only proxies CI test runs — your live production traffic is never touched or routed through Regent.
    • Scenario-based testing: Define the API endpoints you want to monitor, and Regent reruns those scenarios on every PR to compare against the baseline.
    • Drift detection: Regent highlights specific field-level changes in LLM outputs (e.g., tone, confidence, word count, department routing) so regressions are immediately actionable.
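The field-level drift detection in the last bullet amounts to comparing structured outputs key by key. A minimal sketch, with illustrative field names like `tone` and `department` that are not taken from Regent's actual schema:

```python
# Hedged sketch of field-level drift detection on structured LLM output.
# Field names are examples only, not Regent's real report format.

def field_drift(baseline: dict, current: dict) -> dict:
    """Map each changed field to its (baseline, current) value pair."""
    drift = {}
    for key in baseline.keys() | current.keys():  # union handles added/removed fields
        old, new = baseline.get(key), current.get(key)
        if old != new:
            drift[key] = (old, new)
    return drift
```

Reporting only the changed fields, with before and after values, is what makes a regression immediately actionable rather than a wall of raw output.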

    To get started, connect your GitHub repository, add the Regent workflow file, and define the API endpoint scenarios you want to test. Regent handles baseline capture and diff reporting automatically from there.
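The scenario step above could look something like the following. The shape of `SCENARIOS` and the `run_scenarios` helper are assumptions for illustration; Regent's real workflow file and scenario format are defined by its own documentation.

```python
# Hypothetical scenario list and rerun loop; not Regent's actual config format.

SCENARIOS = [
    {"endpoint": "/api/support/triage", "payload": {"message": "My invoice is wrong"}},
    {"endpoint": "/api/support/triage", "payload": {"message": "Reset my password"}},
]

def run_scenarios(call_endpoint, scenarios):
    """Call each configured endpoint and collect outputs for later diffing."""
    return [call_endpoint(s["endpoint"], s["payload"]) for s in scenarios]
```

On each PR, the same scenarios are rerun against the branch under review, and the collected outputs are diffed against the main-branch baseline.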



    Pricing

Free

    • LLM call proxying in CI
    • Automatic baseline capture
    • PR diff comments
    • Full call chain visibility
    • Scenario-based testing

    Capabilities

    Key Features

    • LLM call proxying
    • Automatic baseline capture on main branch
    • Full call chain visibility (nested chains, multi-step agents)
    • PR diff comments posted automatically
    • Field-level output drift detection
    • Zero production traffic impact
    • Scenario-based endpoint testing
    • GitHub integration

    Integrations

    GitHub
    API Available


    Developer

    Regent Team

    Regent builds LLM regression testing infrastructure for production AI teams. The product proxies LLM calls across entire agent chains and automatically surfaces behavioral drift between code versions. Regent integrates directly into CI/CD workflows via GitHub, posting diff reports to pull requests without touching production traffic.

    1 tool in directory

    Similar Tools


    Ashr

    Ashr is an AI agent evaluation platform that mimics production environments and user behavior to catch agent failures before they reach real users.


    SkillsBench

    An open-source evaluation framework that benchmarks how well AI agent skills work across diverse, expert-curated tasks in high-GDP-value domains.


    Atla AI

    Atla AI is an AI evaluation platform that helps teams assess and improve the quality of large language model outputs.


    Related Topics

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.

    61 tools

    Automated Testing

    AI-powered platforms that automate end-to-end testing processes with intelligent test case generation, execution, and reporting for faster, more reliable software delivery.

    86 tools

    AI Infrastructure

    Infrastructure designed for deploying and running AI models.

    198 tools
    With AI, Everyone is a Dev. EveryDev.ai © 2026