Datacurve AI
Datacurve is building the data engine for frontier AI, providing high-fidelity, vetted coding data through a bounty-based platform to help labs train superior foundation models.
At a Glance
- AI Foundation Model Labs
- Enterprise Software Teams
- Agentic AI Developers
AI Tools by Datacurve AI
DeepSWE
Coding Agent Benchmark Tool
Latest News
Datacurve raises $15 million Series A led by Chemistry
Datacurve releases DeepSWE, a new standard for agentic coding benchmarks
Datacurve partners with Together AI for DeepSWE model training
Datacurve joins Y Combinator Winter 2024 batch
Products & Services
Bounty-based data platform: a marketplace that connects AI labs with expert engineers to source high-quality coding data.
DeepSWE: an agentic coding benchmark and evaluation standard for software engineering tasks.
Market Position
Datacurve positions itself as a high-quality alternative to Scale AI, focusing specifically on expert-level coding data sourced from engineers through a bounty model.
Leadership
Founders
Serena Ge
Co-founder and CEO. Previously interned at Cohere and worked on LLM data research. Described as an expert in LLM training data.
Charley Lee
Co-founder. Previously interned at Google. Education: University of Waterloo.
Eric Zhang
Co-founder. Software engineer, researcher, and designer. Mentioned in early founding announcements.
Executive Team
Serena Ge
CEO & Co-founder
Prev. Cohere intern, LLM data researcher.
Charley Lee
Co-founder
Prev. Google intern, University of Waterloo.
Founding Story
Datacurve was founded by Serena Ge, Charley Lee, and Eric Zhang. During their time at Cohere and Google, respectively, Ge and Lee observed a critical bottleneck in obtaining high-quality, complex data for training coding LLMs. The founders developed a bounty system to source data from top engineers and automate quality control.
Business Model
Revenue Model
B2B marketplace model that connects AI labs with specialized engineers. Revenue comes from platform fees and data curation services.
Pricing Tiers
Pricing is typically custom, based on data volume, quality requirements, and the complexity of the coding tasks.
Target Markets
- AI Foundation Model Labs
- Enterprise Software Teams
- Agentic AI Developers
Use Cases
- Foundation model training (SFT, RLHF)
- Coding agent development
- Software engineering benchmarks
- Data curation for AI labs
Partners and Customers
- Together AI
- Major foundation-model labs