Main Menu
  • Tools
  • Developers
  • Topics
  • Discussions
  • Communities
  • News
  • Blogs
  • Builds
  • Contests
  • Compare
  • Arena
Create
    EveryDev.ai
    Sign inSubscribe
    Home
    Tools

    2,147+ AI tools

    • New
    • Trending
    • Featured
    • Compare
    • Arena
    Categories
    • Agents1228
    • Coding1045
    • Infrastructure455
    • Marketing414
    • Design374
    • Projects340
    • Analytics319
    • Research306
    • Testing200
    • Data171
    • Integration169
    • Security169
    • MCP164
    • Learning146
    • Communication131
    • Prompts122
    • Extensions120
    • Commerce116
    • Voice107
    • DevOps92
    • Web73
    • Finance19
    1. Home
    2. Tools
    3. AgentDoG
    AgentDoG icon

    AgentDoG

    Application Security

    A risk-aware evaluation and guardrail framework for autonomous agents that analyzes full execution trajectories to detect safety risks in AI agent systems.

    Visit Website

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under Apache 2.0 License. Download models from Hugging Face or ModelScope and deploy locally.

    Engagement

    Available On

    API
    CLI
    SDK

    Resources

    WebsiteDocsGitHubllms.txt

    Topics

    Application SecurityAutonomous SystemsLLM Evaluations

    Alternatives

    General AnalysispromptfooCodeWall
    Developer
    AI45LabAI45Lab builds research frameworks and tools for AI safety a…

    Listed May 2026

    About AgentDoG

    AgentDoG is a diagnostic guardrail framework for AI agent safety and security, focusing on trajectory-level risk assessment of autonomous agents. Unlike single-step content moderation or final-output filtering, AgentDoG analyzes the full execution trace of tool-using agents to detect risks that emerge mid-trajectory. It provides fine-grained risk labels across three dimensions—risk source, failure mode, and real-world harm—and outperforms existing approaches on R-Judge, ASSE-Safety, and ATBench benchmarks.

    • Trajectory-Level Monitoring: Evaluates multi-step agent executions spanning observations, reasoning, and actions to catch risks at any point during execution.
    • Taxonomy-Guided Diagnosis: Provides fine-grained risk labels (risk source, failure mode, and real-world harm) with 8 risk-source categories, 14 failure modes, and 10 real-world harm categories.
    • ATBench Dataset: Includes a released benchmark of 500 trajectories (250 safe / 250 unsafe) with ~8.97 turns per trajectory and 1575 unique tools for evaluation.
    • Multiple Model Variants: Fine-tuned guard models available on Hugging Face based on Qwen3-4B, Qwen2.5-7B, and Llama3.1-8B for both binary and fine-grained classification tasks.
    • Flexible Deployment: Supports SGLang and vLLM for OpenAI-compatible API endpoints, as well as direct Transformers inference.
    • Agentic XAI Attribution: Hierarchical framework for explaining internal drivers behind agent actions, decomposing trajectories into pivotal components and fine-grained textual evidence.
    • State-of-the-Art Performance: AgentDoG-4B achieves 91.8% on R-Judge, 80.4% on ASSE-Safety, and 92.8% on ATBench, outperforming LlamaGuard, Qwen3-Guard, and ShieldAgent.
    • Customizable Prompts and Taxonomy: Edit prompt templates and taxonomy labels to adapt the framework to custom agent safety requirements.
    AgentDoG - 1

    Community Discussions

    Be the first to start a conversation about AgentDoG

    Share your experience with AgentDoG, ask questions, or help others learn from your insights.

    Pricing

    OPEN SOURCE

    Open Source

    Fully free and open-source under Apache 2.0 License. Download models from Hugging Face or ModelScope and deploy locally.

    • All guard model variants (4B, 7B, 8B)
    • ATBench benchmark dataset
    • Prompt templates and taxonomy
    • SGLang and vLLM deployment scripts
    • Agentic XAI Attribution framework

    Capabilities

    Key Features

    • Trajectory-level safety evaluation (binary safe/unsafe classification)
    • Fine-grained risk diagnosis (Risk Source, Failure Mode, Real-World Harm)
    • ATBench benchmark dataset with 500 annotated trajectories
    • Multiple fine-tuned guard models (4B, 7B, 8B parameters)
    • SGLang and vLLM deployment support
    • OpenAI-compatible API endpoint
    • Direct Transformers inference support
    • Agentic XAI Attribution framework
    • Interactive HTML heatmap visualization
    • Customizable prompt templates and taxonomy labels
    • Three-stage taxonomy-guided data synthesis pipeline

    Integrations

    Hugging Face
    ModelScope
    SGLang
    vLLM
    Transformers
    Qwen3
    Qwen2.5
    Llama3.1
    API Available
    View Docs

    Reviews & Ratings

    No ratings yet

    Be the first to rate AgentDoG and help others make informed decisions.

    Developer

    AI45Lab

    AI45Lab builds research frameworks and tools for AI safety and autonomous agent evaluation. The lab develops AgentDoG, a diagnostic guardrail framework for trajectory-level risk assessment in agentic systems. Their work focuses on fine-grained safety taxonomy, benchmark datasets, and explainability for tool-using AI agents.

    Read more about AI45Lab
    WebsiteGitHub
    1 tool in directory

    Similar Tools

    General Analysis icon

    General Analysis

    AI security platform that trains adversarial models to break agentic systems through automated red-teaming and vulnerability forecasting.

    promptfoo icon

    promptfoo

    Promptfoo is an AI security testing platform that helps developers and enterprises find and fix vulnerabilities in LLM applications through automated red teaming, guardrails, and evaluations.

    CodeWall icon

    CodeWall

    AI-powered autonomous pentesting platform that continuously attacks your infrastructure, chains real exploits, and delivers verified remediation.

    Browse all tools

    Related Topics

    Application Security

    AI tools for securing software applications and identifying vulnerabilities.

    61 tools

    Autonomous Systems

    AI agents that can perform complex tasks with minimal human guidance.

    173 tools

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.

    63 tools
    Browse all topics
    Back to all tools
    Explore AI Tools
    • AI Coding Assistants
    • Agent Frameworks
    • MCP Servers
    • AI Prompt Tools
    • Vibe Coding Tools
    • AI Design Tools
    • AI Database Tools
    • AI Website Builders
    • AI Testing Tools
    • LLM Evaluations
    Follow Us
    • X / Twitter
    • LinkedIn
    • Reddit
    • Discord
    • Threads
    • Bluesky
    • Mastodon
    • YouTube
    • GitHub
    • Instagram
    Get Started
    • About
    • Editorial Standards
    • Corrections & Disclosures
    • Community Guidelines
    • Advertise
    • Contact Us
    • Newsletter
    • Submit a Tool
    • Start a Discussion
    • Write A Blog
    • Share A Build
    • Terms of Service
    • Privacy Policy
    Explore with AI
    • ChatGPT
    • Gemini
    • Claude
    • Grok
    • Perplexity
    Agent Experience
    • llms.txt
    Theme
    With AI, Everyone is a Dev. EveryDev.ai © 2026
    Discussions