BentoML
An inference platform built for speed and control, allowing AI teams to deploy any model anywhere with tailored optimization and efficient scaling.
At a Glance
- Enterprise AI teams
- AI/ML developers
- Fintech
- Gaming
AI Tools by BentoML
BentoML
AI Model Inference Platform
Products & Services
Unified open-source model serving framework with support for Python-based service definitions and multi-framework compatibility.
Fully managed AI inference platform offering serverless scaling, integrated observability, and secure Bring Your Own Cloud (BYOC) deployment.
Open-source platform for running any open-source Large Language Model (LLM) in production with optimized performance.
Optimization toolkit designed to improve LLM inference speed and resource efficiency.
Market Position
Described as the 'Hypervisor for AI compute', focusing on unifying the AI software stack from infrastructure to production deployment.
Leadership
Founders
Chaoyu Yang
Co-founder and CEO. Former software engineer at Databricks and early developer in the Apache Spark ecosystem. Currently in a go-to-market (GTM) role at Modular post-acquisition.
Bo Jiang
Co-founder. Previously a software engineer with a strong background in distributed systems and infrastructure. Currently focused on AI research and engineering roles in AI safety and secure systems.
Executive Team
Chaoyu Yang
Founder and CEO
Databricks engineer, Apache Spark developer, University of Washington alumnus.
Bo Jiang
Co-founder
Distributed systems expert, Northwest University alumnus.
Founding Story
BentoML started in 2018 as an open-source framework for model serving. The vision was to simplify and optimize AI model deployment and serving, eventually evolving into a comprehensive platform for managing AI inference at scale.
Business Model
Revenue Model
Usage-based (BentoCloud) and Enterprise subscriptions for managed services.
Pricing Tiers
Hourly pricing based on specific CPU and GPU resources consumed.
Custom pricing for BYOC, VPC isolation, enterprise SLAs, and dedicated support.
Target Markets
- Enterprise AI teams
- AI/ML developers
- Fintech
- Gaming
- Consumer Lending
Use Cases
- Standardizing model packaging
- Enterprise-grade AI deployment
- Self-hosting Large Language Models
- Scaling image generation workflows
- High-performance real-time inference
Notable Customers
- Yext
- Jabali AI
- LINE
- Over 10,000 organizations