vLLM
To grow vLLM as the world's leading AI inference engine and provide a universal inference layer that makes AI serving fast, cheap, and accessible.
At a Glance
- AI Startups
- Enterprise AI Teams
- Cloud Service Providers
- Open Source AI Community
AI Tools by vLLM
vLLM
Open Source LLM Inference Library
Latest News
Inferact Launches With $150 Million Funding At $800M Valuation To Commercialize vLLM
Investing in Inferact: The Team Behind vLLM
vLLM Project in Talks for Major Funding Round
vLLM Paper on PagedAttention Presented at SOSP 2023
Products & Services
- Open source: vLLM, a library for high-throughput and memory-efficient LLM inference and serving using PagedAttention.
- Commercial: a managed, serverless version of the vLLM inference engine, designed to provide a universal inference layer.
Market Position
Positions itself as the 'universal inference layer', building on the most popular open-source inference engine (vLLM) to compete with proprietary inference stacks.
Leadership
Founders
Simon Mo
Co-creator of vLLM, core maintainer, and former PhD student at UC Berkeley Sky Computing Lab. CEO of Inferact.
Woosuk Kwon
Co-creator of vLLM and PhD candidate at UC Berkeley Sky Computing Lab. CTO of Inferact.
Kaichao You
Core maintainer of vLLM, with a PhD from Tsinghua University. Chief Scientist at Inferact.
Roger Wang
Core maintainer of vLLM, previously a software engineer at NVIDIA, Microsoft, and Amazon.
Ion Stoica
Professor at UC Berkeley, Director of Sky Computing Lab. Co-founder of Databricks and Anyscale. Board member and co-founder of Inferact.
Joseph Gonzalez
Professor at UC Berkeley, co-founder of Databricks and Anyscale. Previously co-founded Turi (acquired by Apple).
Executive Team
Simon Mo
CEO
Co-creator and maintainer of vLLM.
Woosuk Kwon
CTO
Co-creator of vLLM and lead researcher.
Founding Story
Founded by the creators of the vLLM project at UC Berkeley's Sky Computing Lab to commercialize the popular open-source inference engine and provide enterprise-grade infrastructure.
Business Model
Revenue Model
Managed service (SaaS), serverless AI inference platform, and potential enterprise support/licensing.
Pricing Tiers
- Open source: community-supported version.
- Managed: serverless version of vLLM (in roadmap/early access).
Target Markets
- AI Startups
- Enterprise AI Teams
- Cloud Service Providers
- Open Source AI Community
Use Cases
- High-throughput LLM serving
- Cost-efficient AI inference
- Real-time chat and agentic workflows
- Enterprise AI infrastructure
Notable Users
- Meta
- Character.ai
- DoorDash