TrueFoundry

Name: TrueFoundry
Availability: OnlineOnly
Author: Ensemble Labs Inc (TrueFoundry)

Enterprise-ready AI Gateway and agentic deployment platform for governing, deploying, scaling, and tracing AI workloads across VPC, on-prem, hybrid, or public cloud environments.

At a Glance

Pricing

Free tier available

For explorers and early builders experimenting with AI workflows. Ideal for prototyping and testing ideas.

Pro: $499/mo

Pro Plus: $2999/mo

Enterprise: Custom (contact sales)

Available On

Web

API

Ensemble Labs Inc (TrueFoundry) | San Francisco, CA | Est. 2021 | $21.3M raised

Listed Mar 2026

About TrueFoundry

TrueFoundry is an enterprise-grade AI Gateway and agentic deployment platform that enables organizations to govern, deploy, scale, and trace AI workloads with full security and compliance. It provides a unified control layer for managing LLMs, MCP servers, AI agents, and model fine-tuning across any infrastructure — on-prem, VPC, air-gapped, or multi-cloud. Named in the Gartner® 2025 Market Guide for AI Gateways, TrueFoundry is trusted by teams at NVIDIA, ResMed, Whatfix, Innovaccer, and others to accelerate AI production timelines while reducing infrastructure costs.

AI Gateway — Centralize LLM access with universal API, virtual models, RBAC, semantic caching, weight/latency/priority-based routing, fallbacks, rate limiting, and budget controls across all model providers.
MCP Gateway — Register, discover, and govern MCP servers with schema validation, access control, metrics, and support for advanced authentication and self-hosted MCPs.
Prompt Lifecycle Management — Version, manage, and monitor prompts with guardrails and partner integrations to ensure repeatable, high-quality agent behavior.
AI Deploy Platform — Host any LLM or custom model using vLLM, TGI, or Triton backends; deploy agents built with LangGraph, CrewAI, AutoGen, or custom frameworks in fully containerized, production-ready environments.
Training & Fine-Tuning — Launch fine-tuning jobs on your own data, track experiments, and push updated checkpoints directly to production in one unified flow.
Full Agent Observability — Trace every step from prompt to tool/model execution with metrics, latency, and outcomes; integrates with Grafana, Datadog, Prometheus via OpenTelemetry.
GPU Orchestration & Autoscaling — Automatically schedule and scale GPU workloads, support NVIDIA MIG and time slicing for fractional GPU sharing, and continuously rightsize infrastructure to reduce cloud waste.
Enterprise Security & Compliance — SOC 2, HIPAA, and GDPR compliant with SSO, granular RBAC, immutable audit logging, real-time policy enforcement, and org-level multi-tenant management.
Flexible Deployment Modes — Deploy as SaaS, VPC/on-prem, air-gapped, or hybrid; data never leaves your domain, ensuring complete sovereignty.
Integrations — Framework-agnostic support for LangGraph, CrewAI, AutoGen, vLLM, TGI, Triton, Grafana, Datadog, Prometheus, and more.
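
The weight-based routing and fallback behavior listed under AI Gateway can be illustrated with a minimal sketch. This is not TrueFoundry's implementation or API: the function names `pick_order` and `route`, the provider names, and the weights are all hypothetical, chosen only to show the general technique of sampling a primary provider by weight and falling back to the others on failure.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple


def pick_order(weights: Dict[str, float], rng: random.Random) -> List[str]:
    """Return all providers in a weighted random order.

    The first entry acts as the primary; the rest are fallbacks.
    Weights must be positive (an assumption of this sketch).
    """
    if any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive")
    remaining = dict(weights)
    order: List[str] = []
    while remaining:
        names = list(remaining)
        choice = rng.choices(names, weights=[remaining[n] for n in names], k=1)[0]
        order.append(choice)
        del remaining[choice]
    return order


def route(
    prompt: str,
    providers: Dict[str, Callable[[str], str]],
    weights: Dict[str, float],
    rng: Optional[random.Random] = None,
) -> Tuple[str, str]:
    """Try providers in weighted order; return (provider_name, response)."""
    rng = rng or random.Random()
    errors: Dict[str, Exception] = {}
    for name in pick_order(weights, rng):
        try:
            return name, providers[name](prompt)
        except Exception as exc:  # any failure triggers the next fallback
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")


# Hypothetical providers: one that always fails, one that echoes the prompt.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("provider unavailable")


def echo_provider(prompt: str) -> str:
    return "echo:" + prompt
```

With `flaky_provider` weighted at 0.9 and `echo_provider` at 0.1, a call to `route` always ends at `echo_provider`, either directly or after the fallback. A real gateway would add timeouts, retries, and health tracking on top of this.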

Pricing

FREE

Developer

For explorers and early builders experimenting with AI workflows. Ideal for prototyping and testing ideas.

50K requests per month
3 users
Universal API
RBAC on models
Virtual models

Pro

For small teams ready to ship real AI features with usage-based predictability. Unlocks higher limits, better performance, and essential governance tools.

$499

per month

1M requests per month
10 users
Universal API
RBAC on models
Virtual models
Self-hosted models
Playground
Multiple gateway endpoints
Simple caching
Logs
Traces
Custom Metadata
MCP Servers (up to 25)
1M tool calls per month
RBAC on MCPs
Virtual MCP Servers
Unlimited saved prompts
Versioning & Variables
SOC 2
SaaS deployment
Production Support
Standard SLA

Pro Plus

Designed for teams that need stricter data controls, advanced account management, and priority SLAs without self-hosting.

$2999

per month

1M requests per month
25 users
Universal API
RBAC on models
Virtual models
Self-hosted models
Playground
Multiple gateway endpoints
Simple caching
Semantic Caching
Control Center
Weight-based Routing
Latency-based Routing
Priority-based Routing
Fallbacks with advanced features
Budget limiting
Rate limiting
Logs with custom retention
Traces
Export to custom storage buckets
Feedback on traces
Custom pricing/model
Cost per team/user/model/application
Metadata filtering
Alerts
Export to other monitoring platforms
MCP Servers (up to 50)
5M tool calls per month
Virtual MCP Servers
Support for advanced authentication
Self-hosted MCPs
Unlimited saved prompts
Partner Guardrails integration
SSO
GDPR, HIPAA Compliance Certificates
Org management
Audit logs
SaaS deployment
Production Support
Standard SLA

Enterprise

For medium and large organizations running AI at scale with strict compliance. Built for advanced governance, security, custom deployment, and mission-critical reliability.

Custom

contact sales

Custom requests per month
Custom users
All Pro Plus features
Custom MCP Servers
Custom tool calls per month
Custom Guardrail Hooks
VPC / On-prem deployment
Air-gapped deployment
Deployment customization
Data Lake Export
Connect multiple storage buckets
Multiple gateway planes
GitOps (infrastructure as code)
Dedicated Onboarding
Priority Support
Enterprise-Grade SLA

Capabilities

Key Features

AI Gateway with universal API
MCP Gateway and agents registry
Prompt lifecycle management
Model hosting with vLLM, TGI, Triton
Training and fine-tuning
Agent deployment (LangGraph, CrewAI, AutoGen)
Full agent observability and tracing
GPU orchestration and autoscaling
Fractional GPU support (MIG and time slicing)
RBAC and SSO
Immutable audit logging
SOC 2, HIPAA, GDPR compliance
Semantic and simple caching
Weight/latency/priority-based routing
Fallbacks and rate limiting
Budget limiting
VPC, on-prem, air-gapped deployment
OpenTelemetry integration
Multi-tenant org management
Real-time policy enforcement
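
Rate limiting, as listed among the features above, is commonly implemented with a token bucket. The sketch below shows that general technique only; it is an assumption for illustration and does not describe how TrueFoundry enforces limits. The class name `TokenBucket` and its parameters are hypothetical.

```python
import time
from typing import Callable


class TokenBucket:
    """Token-bucket limiter: refills at a fixed rate up to a fixed capacity."""

    def __init__(
        self,
        rate_per_sec: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits the budget."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket created with `TokenBucket(rate_per_sec=1.0, capacity=2.0)` admits a burst of two requests and then roughly one per second. Passing a larger `cost` per call is one way to meter by tokens or tool calls rather than by request count.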

Integrations

LangGraph

CrewAI

AutoGen

vLLM

TGI

Triton

Grafana

Datadog

Prometheus

OpenTelemetry

AWS

GCP

Azure

Kubernetes

OpenAI

Anthropic

Hugging Face

API Available
