Datacurve AI

Datacurve is building the data engine for frontier AI, providing high-fidelity, vetted coding data through a bounty-based platform to help labs train superior foundation models.

At a Glance

Tool Views: 38

Headquarters: San Francisco, CA

Established: 2024

Employees: 47

AI Tools by Datacurve AI (1)

DeepSWE

Coding Agent Benchmark Tool

LLM Evaluations, AI Coding Assistants, Agent Harness

Products & Services

Datacurve Bounty Platform

2024

A market-based platform that connects AI labs with expert engineers to source high-quality coding data.

DeepSWE

October 2025

An agentic coding benchmark and evaluation standard for software engineering tasks.

Market Position

Positions itself as a high-quality alternative to Scale AI, focusing specifically on expert-level coding data through a unique bounty hunter model for engineers.

Leadership

Founders

Serena Ge

Co-founder and CEO. Previously interned at Cohere and took part in research. Cited as an expert in LLM training data.

Charley Lee

Co-founder. Previously interned at Google. Education: University of Waterloo.

Eric Zhang

Co-founder. Software engineer, researcher, and designer. Mentioned in early founding announcements.

Executive Team

Serena Ge

CEO & Co-founder

Prev. Cohere intern, LLM data researcher.

Charley Lee

Co-founder

Prev. Google intern, University of Waterloo.

Board of Directors

Mark Goldberg

Board Member (Partner at Chemistry)

Founding Story

Founded by Serena Ge, Charley Lee, and Eric Zhang. The idea grew out of the founders' experiences at Cohere and Google, where they observed a critical bottleneck: obtaining high-quality, complex data for training coding LLMs. They developed a bounty system that sources data from top engineers and automates quality control.

Business Model

Revenue

Approximately $2.9M ARR in 2024, according to some sources; the company was still early stage at the time.

Revenue Model

B2B Marketplace model. Connects AI labs with specialized engineers. Generates revenue through platform fees and data curation services.

Pricing Tiers

Custom / Enterprise

Variable

Pricing is typically custom based on data volume, quality requirements, and complexity of coding tasks.

Private

Target Markets

Industries & Segments

AI Foundation Model Labs
Enterprise Software Teams
Agentic AI Developers

Use Cases

Foundation model training (SFT, RLHF)
Coding agent development
Software engineering benchmarks
Data curation for AI labs

Notable Customers

Together AI
Major foundation-model labs

Quick Facts

Headquarters

San Francisco, CA

Founded

2024

Entity Type

Inc.

Employees

47

Total Funding

$17.7 million

Investors

Chemistry, Y Combinator

Office Locations

San Francisco

Funding History

Series A: $15,000,000

October 2025

Chemistry (Mark Goldberg)

Seed: ~$2,700,000 - $3,200,000

2024

Y Combinator

History & Milestones

October 9, 2025

Raised $15M Series A funding round led by Chemistry.

October 9, 2025

Released DeepSWE, a new agentic coding benchmark.

2024

Company founded in San Francisco.

Early 2024

Joined Y Combinator as part of the W24 batch.

Key Capabilities

Bounty-based sourcing platform

Expert engineer vetting

Automated quality control at scale

High-fidelity coding datasets

Agentic coding evaluation

Integrations & Partnerships

Platform Integrations

Cloud-based data delivery
Integration with foundation model training pipelines

Key Partnerships

Together AI

Vercel (Investors/Employees)

Anthropic (Investors/Employees)

Connect

Website

deepswe.datacurve.ai/

GitHub

datacurve-ai

AI Topics

Datacurve AI focuses on these topics:

LLM Evaluations (1)

AI Coding Assistants (1)

Agent Harness (1)

Latest News

10/09/2025

Datacurve raises $15 million Series A led by Chemistry

techcrunch.com

10/09/2025

Datacurve releases DeepSWE, a new standard for agentic coding benchmarks

x.com

10/01/2025

Datacurve partners with Together AI for DeepSWE model training

together.ai

01/01/2024

Datacurve joins Y Combinator Winter 2024 batch

ycombinator.com
