Vals AI, Inc.
Vals AI provides independent, standardized benchmarks for evaluating large language models and AI applications on real-world enterprise tasks. The company aims to bridge the gap between AI research advances and practical business use by offering transparent, unbiased evaluations across domains such as legal, finance, healthcare, and coding.
AI Tools by Vals AI, Inc.
- Vals AI: LLM Evaluation Platform
Latest News
- The Winners (and Losers) of This New Vibe-Coding Benchmark Will Surprise You
- Why Nvidia Keeps Backing Would-Be Competitors to OpenAI
- Readme: Human vs machine vs legal machine - A study of AI and legal research
- OpenAI's Less-Flashy Rival Might Have a Better Business Model
Products & Services
- Public enterprise LLM benchmarks that rank model performance across real-world business tasks in finance, legal, coding, and other domains
- Comprehensive evaluation platform for testing LLMs and LLM applications with automated testing, expert review, CI/CD integration, and performance analytics (an illustrative evaluation-loop sketch follows this list)
- Specialized benchmarks for legal, finance, healthcare, tax, and other verticals to evaluate AI model performance on domain-specific tasks
- First-of-its-kind legal AI benchmarking study evaluating AI platforms against human lawyer baselines on real-world legal tasks
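The evaluation platform entry above mentions automated testing and CI/CD integration. As a rough illustration only, not Vals AI's actual API, the following minimal Python sketch shows the general shape of an automated benchmark loop: run each task through a model, grade the output against a reference answer, and report an aggregate score. All names (Task, grade_exact_match, run_benchmark, the sample tasks) are hypothetical.

```python
"""Illustrative sketch only: a minimal automated LLM benchmark harness.

Everything here is hypothetical and is NOT Vals AI's actual API; it simply
shows the shape of an evaluation loop with per-task grading and an
aggregate score.
"""

from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    prompt: str    # input given to the model
    expected: str  # reference answer used for grading


def grade_exact_match(output: str, expected: str) -> float:
    """Return 1.0 for a normalized exact match, else 0.0."""
    return float(output.strip().lower() == expected.strip().lower())


def run_benchmark(model: Callable[[str], str], tasks: list[Task]) -> float:
    """Run every task through the model and return the mean score."""
    scores = [grade_exact_match(model(t.prompt), t.expected) for t in tasks]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Stand-in model: in practice this would call a real LLM endpoint.
    def toy_model(prompt: str) -> str:
        return "yes" if "enforceable" in prompt else "unknown"

    tasks = [
        Task("Is a signed NDA enforceable? Answer yes or no.", "yes"),
        Task("Is an unsigned term sheet binding? Answer yes or no.", "no"),
    ]
    print(f"accuracy = {run_benchmark(toy_model, tasks):.2f}")
```

In a CI/CD setting, a step of this kind would typically run against a candidate model or application on every release and fail the build if the aggregate score regresses; that is the generic workflow the platform description refers to.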
Market Position
Vals AI positions itself as the independent, neutral evaluator of AI models, distinguishing itself from self-reported benchmarks by AI companies. The company focuses on real-world, industry-specific tasks rather than academic benchmarks, and addresses data contamination issues in traditional evaluation methods. Competitors include WitnessAI, Modulos, Armilla AI, Credo AI, and others in the AI governance and evaluation space. Vals AI has established credibility through partnerships with top law firms, AI vendors, and academic institutions.
Leadership
Founders
Rayan Krishnan
Co-Founder & CEO. 24 years old (as of 2025); abandoned Ph.D. plans to start Vals AI after ChatGPT's release. Previously at Palantir, Microsoft, the University of Washington, and SAP Concur. Based at Stanford University.
Langston Nashold
Co-Founder & CTO. Dropped out of an AI-focused master's program at Stanford to pursue Vals AI. Stanford CS; worked in Andrew Ng's AI + Climate Change Lab; previously at Hudson River Trading.
Executive Team
Rayan Krishnan
Co-Founder & CEO
24 years old; previously at Palantir, Microsoft, the University of Washington, and SAP Concur
Langston Nashold
Co-Founder & CTO
Stanford CS, Andrew Ng's AI + Climate Change Lab, previously at Hudson River Trading
Founding Story
Founded in 2023 by Rayan Krishnan and Langston Nashold, who both dropped out of their AI-focused master's program at Stanford University to pursue the idea. Following ChatGPT's release, they recognized a critical gap in the tech industry: the lack of an independent, standardized test for evaluating AI services. They saw the need for a neutral, third-party review system for large language models, one that would address data contamination in existing benchmarks and replace generic tests with industry-specific evaluation.
Business Model
Revenue Model
Enterprise subscriptions and API access to the evaluation platform. The company offers free public benchmarks alongside a paid enterprise platform for running custom evaluations. Revenue comes from AI labs, model developers, enterprise customers, and legal and financial firms that need evaluation services.
Target Markets
- Legal firms and legal service providers
- Financial services and banking institutions
- Healthcare organizations
- AI labs and model developers (OpenAI, Anthropic, Google, etc.)
- Enterprise software companies building AI applications
- Legal technology vendors
Use Cases
- Evaluating LLM suitability for enterprise applications before deployment
- Benchmarking AI models on legal research, case analysis, and contract review
- Testing AI performance on financial analysis and Excel-based tasks
- Measuring accuracy of AI in healthcare and medical applications
- Auditing LLM applications to replace manual review teams
- Model selection and purchasing decisions for enterprises
Notable Customers & Partners
- Anthropic
- OpenAI
- Everlaw