Ragas
Ragas is an open-source evaluation framework that enables developers to systematically test and monitor LLM-powered Retrieval-Augmented Generation (RAG) applications, replacing qualitative assessment with data-driven metrics.
At a Glance
- AI Startups
- Enterprise Data Science Teams
- LLM Developers
AI Tools by Ragas
Ragas
LLM App Evaluation Framework
Latest News
Ragas joins Y Combinator Winter 2024 batch
LlamaIndex + Ragas Cookbook released for advanced RAG evaluation
OpenAI recommends Ragas for RAG evaluation at DevDay
Joint webinar with LangChain: LangSmith and Ragas integration
Products & Services
An open-source Python framework for evaluating Retrieval Augmented Generation (RAG) pipelines using automated metrics.
Enterprise-grade evaluation and production monitoring services for LLM applications (contact founders for access).
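The metrics-driven approach can be illustrated with a small, dependency-free sketch: a crude "faithfulness"-style score that counts how many answer statements have lexical support in the retrieved context. Ragas itself delegates the support judgment to an LLM; the naive word-overlap check, function names, and threshold below are illustrative assumptions, not the Ragas API.

```python
# Toy sketch of a RAG evaluation metric in the spirit of faithfulness:
# the fraction of answer statements supported by the retrieved context.
# A naive word-overlap test stands in for the LLM judgment the real
# library uses. All names here are illustrative, not the Ragas API.

def supported(statement: str, context: str, threshold: float = 0.6) -> bool:
    """Stand-in for an LLM judgment: a statement counts as supported
    if enough of its words appear in the context."""
    words = {w.lower().strip(".,") for w in statement.split()}
    ctx = {w.lower().strip(".,") for w in context.split()}
    if not words:
        return False
    return len(words & ctx) / len(words) >= threshold

def faithfulness(answer_statements: list[str], context: str) -> float:
    """Fraction of answer statements supported by the context (0.0 to 1.0)."""
    if not answer_statements:
        return 0.0
    hits = sum(supported(s, context) for s in answer_statements)
    return hits / len(answer_statements)

context = "Ragas is an open-source Python framework for evaluating RAG pipelines."
statements = [
    "Ragas is an open-source framework.",
    "Ragas was written in Rust.",  # not supported by the context
]
print(f"faithfulness = {faithfulness(statements, context):.2f}")  # → 0.50
```

A score below 1.0 flags answers that drift beyond the retrieved evidence, which is exactly the kind of signal the framework turns into a regression-testable number.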
Market Position
Ragas is the leading open-source standard for RAG evaluation, recommended by OpenAI and integrated with the major LLM orchestration frameworks (LangChain, LlamaIndex).
Leadership
Founders
Shahul ES
Applied AI researcher and Kaggle Grandmaster. Focuses on NLP and RAG. Co-authored the foundational RAGAs research paper.
Jithin James
Chief maintainer of Ragas. Former Software Engineer at BentoML and Trell. Experienced in MLOps and scalable AI infrastructure.
Executive Team
Shahul ES
Co-founder
Applied AI researcher, Kaggle Grandmaster.
Jithin James
Co-founder
Chief maintainer, former engineer at BentoML.
Founding Story
Ragas was started by Shahul ES and Jithin James after two years of helping various teams improve their AI applications. They realized that most teams relied on 'vibe checks' rather than rigorous metrics for RAG systems and built Ragas to provide a structured, component-wise evaluation standard.
Business Model
Revenue Model
Open core: the core library is free and open source, with enterprise services and monitoring features available to paying customers.
Pricing Tiers
- Open Source (free): Full access to the Python library, metrics, and synthetic data generation tools.
- Enterprise: Additional support, production monitoring, and enterprise-scale evaluation features.
Target Markets
- AI Startups
- Enterprise Data Science Teams
- LLM Developers
Use Cases
- Evaluating RAG pipeline performance
- Generating synthetic datasets for LLM training and testing
- Monitoring hallucinations in production
- A/B testing different LLM prompts and models
Notable Users
- AWS
- Microsoft
- Databricks
- Moody's