
    TruLens

    LLM Evaluations

    Open-source library for evaluating and tracking LLM applications with feedback functions and observability tools.

    At a Glance

    Pricing
    Open Source

    Free open-source library for LLM evaluation and tracking

    Available On

    SDK
    API

    Resources

    Website · Docs · GitHub · llms.txt

    Topics

    LLM Evaluations · Observability Platforms · AI Development Libraries

    Alternatives

    ZeroEval · Ragas · Opik
    Developer
    TruEra · Redwood City, CA · Est. 2019 · $42.3M raised

    Listed Feb 2026

    About TruLens

    TruLens is an open-source Python library designed to evaluate, track, and improve Large Language Model (LLM) applications. It provides developers with tools to measure the quality of LLM outputs through customizable feedback functions, enabling systematic evaluation of AI applications throughout the development lifecycle. The library integrates seamlessly with popular LLM frameworks and offers comprehensive observability features.

    Key Features:

    • Feedback Functions - Define and run custom evaluation metrics to assess LLM outputs for qualities like groundedness, relevance, coherence, and harmlessness. These functions can be powered by LLMs, traditional NLP models, or custom logic (see the sketch after this list).

    • Instrumentation & Tracing - Automatically capture detailed traces of LLM application execution, including prompts, responses, latencies, and intermediate steps for debugging and analysis.

    • Evaluation Dashboard - Visualize evaluation results, compare experiments, and track performance metrics over time through an interactive web-based dashboard.

    • RAG Triad Evaluation - Built-in evaluation framework specifically designed for Retrieval-Augmented Generation (RAG) applications, measuring context relevance, groundedness, and answer relevance.

    • Framework Integrations - Works with LangChain, LlamaIndex, OpenAI, and other popular LLM frameworks out of the box, requiring minimal code changes to instrument existing applications.

    • Leaderboard & Benchmarking - Compare different model configurations, prompts, and retrieval strategies to identify the best-performing setups for your use case.

    • Guardrails Support - Implement safety checks and content moderation through feedback functions that can flag or filter problematic outputs.
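
    The feedback-function and RAG Triad features above translate roughly to the following code. This is a minimal sketch in the style of the trulens_eval API; module paths, provider method names, and the selector path for retrieved context vary between TruLens releases, so treat the specific names here as assumptions and confirm them against the current documentation.

        # Sketch: RAG-triad-style feedback functions (trulens_eval-style API).
        # Names below are assumptions; check the TruLens docs for your installed version.
        from trulens_eval import Feedback, Select
        from trulens_eval.feedback.provider import OpenAI

        provider = OpenAI()  # LLM-powered evaluator; other providers follow the same pattern

        # Answer relevance: score the final answer against the user's question.
        f_answer_relevance = Feedback(provider.relevance).on_input_output()

        # Context relevance: score retrieved chunks against the question.
        # Select.RecordCalls.retrieve.rets assumes your app exposes a retrieve()
        # method; adjust the selector to match your instrumented code.
        f_context_relevance = (
            Feedback(provider.context_relevance)
            .on_input()
            .on(Select.RecordCalls.retrieve.rets)
        )

        # Groundedness: check that the answer is supported by the retrieved context.
        f_groundedness = (
            Feedback(provider.groundedness_measure_with_cot_reasons)
            .on(Select.RecordCalls.retrieve.rets)
            .on_output()
        )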

    To get started with TruLens, install the package with pip install trulens. Import the library into your Python project, wrap your LLM application with TruLens instrumentation, define feedback functions for the metrics you want to track, and run your application to collect evaluation data, as sketched below. The dashboard can be launched locally to explore results and iterate on improvements. TruLens supports both development-time evaluation and production monitoring workflows.
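
    A minimal end-to-end sketch of that flow, assuming a LangChain app named chain and the feedback functions defined in the previous sketch. TruChain, Tru, and run_dashboard are concepts from the TruLens documentation, but exact imports and argument names (for example app_id versus app_name) differ across versions, so verify them before use.

        # Sketch: instrumenting a LangChain app and launching the local dashboard.
        # Assumes `chain` is an existing LangChain runnable and the feedback
        # functions from the previous sketch; names may differ by TruLens version.
        from trulens_eval import Tru, TruChain

        tru = Tru()  # manages the local evaluation database and dashboard

        tru_recorder = TruChain(
            chain,                       # your LangChain application
            app_id="rag_app_v1",
            feedbacks=[f_answer_relevance, f_context_relevance, f_groundedness],
        )

        # Every call made inside the context is traced and scored by the feedbacks.
        with tru_recorder as recording:
            chain.invoke("What does TruLens measure?")

        tru.run_dashboard()  # open the local dashboard to browse traces and scores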


    Pricing

    Open Source

    Free open-source library for LLM evaluation and tracking

    • Full library access
    • Feedback functions
    • Instrumentation and tracing
    • Evaluation dashboard
    • RAG Triad evaluation

    Capabilities

    Key Features

    • Feedback functions for LLM evaluation
    • Automatic instrumentation and tracing
    • Interactive evaluation dashboard
    • RAG Triad evaluation framework
    • Framework integrations (LangChain, LlamaIndex, OpenAI)
    • Leaderboard and benchmarking
    • Guardrails and safety checks
    • Custom evaluation metrics
    • Production monitoring
    • Experiment tracking

    Integrations

    LangChain
    LlamaIndex
    OpenAI
    Hugging Face
    Anthropic
    Bedrock
    Vertex AI
    API Available


    Developer

    TruEra

    TruEra builds AI quality solutions that help organizations develop, test, and monitor machine learning models. The company develops TruLens, an open-source library for evaluating and tracking LLM applications. TruEra's team brings expertise in AI explainability, model monitoring, and responsible AI practices to help developers build trustworthy AI systems.

    Founded 2019
    Redwood City, CA
    $42.3M raised
    57 employees

    Used by

    Standard Chartered
    Intel (Partner)
    Large European Financial Institution
    Website · GitHub
    1 tool in directory

    Similar Tools

    ZeroEval

    Open-source evaluation framework for testing large language models with zero-shot prompting on reasoning and coding tasks.

    Ragas

    Ragas is an open-source framework for evaluating and testing LLM applications, helping teams measure retrieval-augmented generation (RAG) pipeline quality with automated metrics.

    Opik

    Open-source platform for evaluating, testing, and monitoring LLM applications with tracing and observability features.


    Related Topics

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.

    54 tools

    Observability Platforms

    Comprehensive platforms that combine metrics, logs, and traces with AI-powered analytics to provide deep insights into complex distributed systems and application behavior.

    58 tools

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    130 tools