# TensorZero

> An open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation in a single self-hosted stack.

TensorZero is an open-source LLMOps platform built in Rust and licensed under Apache 2.0. It unifies five capabilities — LLM gateway, observability, evaluation, optimization, and experimentation — into a single self-hosted stack that teams can adopt incrementally. The homepage notes that TensorZero is no longer actively maintained, though the repository remains publicly available on GitHub.

## What It Is

TensorZero is a self-hosted infrastructure layer that sits between your application and every major LLM provider. Rather than requiring teams to stitch together separate tools for routing, logging, fine-tuning, and A/B testing, TensorZero provides all of these as a unified platform. The gateway is written in Rust and the project claims sub-1ms p99 latency overhead at 10,000+ QPS. It exposes an OpenAI-compatible API, so any existing OpenAI SDK (Python, Node, Go, etc.) can point to it with a single `base_url` change.

## Core Architecture

TensorZero is deployed as a single Docker container (the TensorZero Gateway) backed by a user-owned database. The five pillars of the platform are:

- **Gateway:** A unified API that routes to Anthropic, AWS Bedrock, AWS SageMaker, Azure, DeepSeek, Fireworks, GCP Vertex AI, Google AI Studio, Groq, Hyperbolic, Mistral, OpenAI, OpenRouter, SGLang, TGI, Together AI, vLLM, xAI (Grok), and any OpenAI-compatible endpoint (e.g. Ollama).
- **Observability:** Inferences and feedback (metrics, human edits) are stored in the user's own database. OpenTelemetry (OTLP) and Prometheus export are supported.
- **Evaluation:** Supports inference evaluations (unit-test style) and workflow evaluations (integration-test style) via heuristics or LLM judges, runnable from a UI or CLI.
- **Optimization:** Supervised fine-tuning, RLHF, automated prompt engineering (GEPA algorithm), and dynamic in-context learning (DICL) turn production data into a learning flywheel.
- **Experimentation:** Built-in adaptive A/B testing, routing, fallbacks, retries, and load balancing.

## TensorZero Autopilot

The README describes TensorZero Autopilot as an "automated AI engineer" add-on powered by TensorZero. According to the project, Autopilot analyzes LLM observability data, sets up evaluations, optimizes prompts and models, and runs A/B tests automatically. The README states it "dramatically improves the performance of LLM agents across diverse tasks." Autopilot is described as a complementary paid product, while the core TensorZero platform is free and self-hosted.

## Team and Backing

According to the README, the TensorZero team includes a former Rust compiler maintainer, machine learning researchers from Stanford, CMU, Oxford, and Columbia, and the former chief product officer of a decacorn startup. The project announced a $7.3M seed round and received coverage from VentureBeat. The README states TensorZero "is used by companies ranging from frontier AI startups to the Fortune 10 and fuels ~1% of global LLM API spend today" — this is a vendor-published claim.

## Current Status: Archived

The TensorZero website states: "TensorZero remains available on GitHub but is no longer maintained." The GitHub repository is marked as ARCHIVED with a last push date of June 2026. The most recent release was version 2026.6.0, published June 4, 2026. Despite the archival, the full source code, documentation, and examples remain publicly accessible under the Apache 2.0 license.

## Features
- Unified LLM gateway with OpenAI-compatible API
- Sub-1ms p99 latency overhead at 10k+ QPS (Rust-based)
- Support for 18+ LLM providers including Anthropic, OpenAI, AWS Bedrock, GCP Vertex AI, and more
- Structured outputs (JSON), tool use, batch inference, embeddings, multimodal (images, files), and caching
- Routing, retries, fallbacks, and load balancing for high availability
- Self-hosted observability: store inferences and feedback in your own database
- OpenTelemetry (OTLP) and Prometheus metrics export
- Inference and workflow evaluations via heuristics or LLM judges
- Supervised fine-tuning (SFT) and RLHF optimization
- Automated prompt engineering with GEPA algorithm
- Dynamic in-context learning (DICL)
- Adaptive A/B testing and experimentation
- TensorZero Autopilot: automated AI engineer add-on
- GitOps-friendly configuration
- Interactive Playground UI
- Dataset building for optimization and evaluation workflows
- Custom rate limiting with granular scopes
- Auth setup to allow clients to access models without sharing provider API keys

## Integrations
OpenAI SDK, OpenTelemetry, Prometheus, Anthropic, AWS Bedrock, AWS SageMaker, Azure, DeepSeek, Fireworks, GCP Vertex AI, Google AI Studio (Gemini API), Groq, Hyperbolic, Mistral, OpenRouter, SGLang, TGI (Text Generation Inference), Together AI, vLLM, xAI (Grok), Ollama (OpenAI-compatible), Docker

## Platforms
API, CLI, DEVELOPER_SDK

## Pricing
Open Source

## Version
2026.6.0

## Links
- Website: https://www.tensorzero.com
- Documentation: https://www.tensorzero.com/docs
- Repository: https://github.com/tensorzero/tensorzero
- EveryDev.ai: https://www.everydev.ai/tools/tensorzero