# Ragas

> Ragas is an open-source framework for evaluating and testing LLM applications, helping teams measure retrieval-augmented generation (RAG) pipeline quality with automated metrics.

Ragas is an open-source evaluation framework purpose-built for LLM applications, with a strong focus on retrieval-augmented generation (RAG) pipelines. It provides a suite of automated metrics that measure faithfulness, answer relevancy, context precision, and more, enabling teams to objectively assess and improve their AI systems. Ragas integrates with popular LLM frameworks and supports both unit-test-style evaluations and continuous monitoring in production. It is widely used by AI engineers and researchers who need reliable, reproducible quality signals for their LLM-powered products.

- **RAG Evaluation Metrics**: *Automatically score RAG pipelines on faithfulness, answer relevancy, context recall, context precision, and more using reference-free and reference-based metrics.*
- **LLM-as-a-Judge**: *Leverage LLMs to evaluate generated outputs against ground truth or without reference, reducing the need for manual annotation.*
- **Test Dataset Generation**: *Synthetically generate evaluation datasets from your documents to bootstrap testing without manual labeling.*
- **Integration with LLM Frameworks**: *Works seamlessly with LlamaIndex, LangChain, and other popular orchestration frameworks to evaluate pipelines end-to-end.*
- **CI/CD-Ready Evaluations**: *Run evaluations as part of automated pipelines to catch regressions before they reach production.*
- **Observability & Monitoring**: *Track evaluation metrics over time to monitor model and pipeline quality in production environments.*
- **Customizable Metrics**: *Define and extend custom metrics tailored to your specific use case and domain requirements.*
- **Open Source**: *Freely available on GitHub, with an active community and transparent development.*

To get started, install Ragas via pip, connect it to your LLM provider, and run evaluations on your RAG pipeline outputs using the built-in metric suite or your own custom metrics.

## Features

- RAG pipeline evaluation
- LLM-as-a-Judge scoring
- Synthetic test dataset generation
- Faithfulness metric
- Answer relevancy metric
- Context precision and recall metrics
- CI/CD integration
- Production monitoring
- Custom metric support
- LangChain integration
- LlamaIndex integration

## Integrations

LlamaIndex, LangChain, OpenAI, Hugging Face, AWS Bedrock, Azure OpenAI

## Platforms

API, DEVELOPER_SDK

## Pricing

Open Source

## Links

- Website: https://www.ragas.io
- Documentation: https://docs.ragas.io
- Repository: https://github.com/explodinggradients/ragas
- EveryDev.ai: https://www.everydev.ai/tools/ragas
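To make the idea of a reference-free metric concrete: Ragas's faithfulness metric checks whether each claim in a generated answer is supported by the retrieved context, using an LLM as the judge. The toy sketch below is *not* Ragas's implementation; it is a hypothetical, purely lexical stand-in (score = supported sentences / total sentences) that only illustrates the shape such a metric takes.

```python
# Illustrative sketch only: Ragas computes faithfulness with an LLM judge.
# This toy version approximates the idea with lexical overlap, to show the
# shape of a reference-free metric: score = supported claims / total claims.

def toy_faithfulness(answer: str, contexts: list[str]) -> float:
    """Fraction of answer sentences whose content words all appear in the contexts."""
    context_words = set()
    for ctx in contexts:
        context_words.update(ctx.lower().split())

    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0

    supported = 0
    for sentence in sentences:
        # Crude content-word filter: ignore short function words like "is", "the".
        words = [w for w in sentence.lower().split() if len(w) > 3]
        if words and all(w in context_words for w in words):
            supported += 1
    return supported / len(sentences)


contexts = ["paris is the capital city of france"]
print(toy_faithfulness("paris is the capital of france", contexts))   # → 1.0
print(toy_faithfulness("paris is the capital of germany", contexts))  # → 0.0
```

A real Ragas metric replaces the lexical check with an LLM judgment per claim, which is what lets it score free-form generations without hand-written references.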