# Ragas

> Ragas is an open-source framework for evaluating and testing LLM applications, helping teams measure retrieval-augmented generation (RAG) pipeline quality with automated metrics.

Ragas is an open-source evaluation framework purpose-built for LLM applications, with a strong focus on retrieval-augmented generation (RAG) pipelines. It provides a suite of automated metrics that measure faithfulness, answer relevancy, context precision, and more, enabling teams to objectively assess and improve their AI systems. Ragas integrates with popular LLM frameworks and supports both unit-test-style evaluations and continuous monitoring in production. It is widely used by AI engineers and researchers who need reliable, reproducible quality signals for their LLM-powered products.

- **RAG Evaluation Metrics**: *Automatically score RAG pipelines on faithfulness, answer relevancy, context recall, context precision, and more using reference-free and reference-based metrics.*
- **LLM-as-a-Judge**: *Leverage LLMs to evaluate generated outputs against ground truth or without reference, reducing the need for manual annotation.*
- **Test Dataset Generation**: *Synthetically generate evaluation datasets from your documents to bootstrap testing without manual labeling.*
- **Integration with LLM Frameworks**: *Works seamlessly with LlamaIndex, LangChain, and other popular orchestration frameworks to evaluate pipelines end-to-end.*
- **CI/CD-Ready Evaluations**: *Run evaluations as part of automated pipelines to catch regressions before they reach production.*
- **Observability & Monitoring**: *Track evaluation metrics over time to monitor model and pipeline quality in production environments.*
- **Customizable Metrics**: *Define and extend custom metrics tailored to your specific use case and domain requirements.*
- **Open Source**: *Freely available on GitHub, with an active community and transparent development.*

To get started, install Ragas via pip, connect it to your LLM provider, and run evaluations on your RAG pipeline outputs using the built-in metric suite or your own custom metrics.

## Features

- RAG pipeline evaluation
- LLM-as-a-Judge scoring
- Synthetic test dataset generation
- Faithfulness metric
- Answer relevancy metric
- Context precision and recall metrics
- CI/CD integration
- Production monitoring
- Custom metric support
- LangChain integration
- LlamaIndex integration

## Integrations

LlamaIndex, LangChain, OpenAI, Hugging Face, AWS Bedrock, Azure OpenAI

## Platforms

API, DEVELOPER_SDK

## Pricing

Open Source

## Links

- Website: https://www.ragas.io
- Documentation: https://docs.ragas.io
- Repository: https://github.com/explodinggradients/ragas
- EveryDev.ai: https://www.everydev.ai/tools/ragas
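To make the idea of a reference-free metric concrete: Ragas's faithfulness metric checks whether each claim in a generated answer is supported by the retrieved context, using an LLM as the judge. The toy sketch below is *not* Ragas's implementation; it is a hypothetical, purely lexical stand-in (score = supported sentences / total sentences) that only illustrates the shape such a metric takes.

```python
# Illustrative sketch only: Ragas computes faithfulness with an LLM judge.
# This toy version approximates the idea with lexical overlap, to show the
# shape of a reference-free metric: score = supported claims / total claims.

def toy_faithfulness(answer: str, contexts: list[str]) -> float:
    """Fraction of answer sentences whose content words all appear in the contexts."""
    context_words = set()
    for ctx in contexts:
        context_words.update(ctx.lower().split())

    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0

    supported = 0
    for sentence in sentences:
        # Crude content-word filter: ignore short function words like "is", "the".
        words = [w for w in sentence.lower().split() if len(w) > 3]
        if words and all(w in context_words for w in words):
            supported += 1
    return supported / len(sentences)


contexts = ["paris is the capital city of france"]
print(toy_faithfulness("paris is the capital of france", contexts))   # → 1.0
print(toy_faithfulness("paris is the capital of germany", contexts))  # → 0.0
```

A real Ragas metric replaces the lexical check with an LLM judgment per claim, which is what lets it score free-form generations without hand-written references.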