Modal icon

Modal

AI Infrastructure

Serverless cloud platform for running and scaling compute-intensive AI and ML workloads, including model inference, training, batch jobs, and notebooks with usage-based compute billing.

At a Glance

Pricing

Free tier available

Get started with Modal at no cost with $30 / month free compute credits and Up to 3 workspace seats.

Team: $250/mo
Enterprise: Custom/contact

Engagement

Available On

Web
API
SDK

About Modal

Modal is a serverless cloud platform designed to run compute-intensive AI and machine learning applications without requiring users to manage infrastructure. It provides an AI-native runtime with fast model initialization and autoscaling, a built-in globally distributed storage layer for high-throughput model and data access, and first-class support for GPU and CPU workloads for inference, training, and batch processing. Modal exposes APIs and SDKs to deploy OpenAI-compatible model endpoints, run notebooks and sandboxes, and orchestrate large-scale batch jobs.

  • AI-native runtime — a serverless runtime engineered for heavy AI workloads with fast startup and autoscaling to handle parallel GPU/CPU jobs.
  • Built-in distributed storage — globally distributed storage optimized for fast model loading and dataset access to reduce I/O bottlenecks.
  • Flexible compute options — per-second GPU and CPU pricing for many GPU types and hardware configurations, enabling fine-grained usage-based billing.
  • Model serving & inference — deploy OpenAI-compatible LLM endpoints and host custom model APIs at scale.
  • Batch, notebooks, and sandboxes — run large-scale batch workflows, interactive notebooks, and isolated sandboxes for development and testing.

To get started, sign up on the web console, consult the documentation for SDK and API examples, and deploy a sample model or notebook to validate autoscaling and compute billing.

Demo Video

Modal Demo Video
Watch on YouTube

Community Discussions

Be the first to start a conversation about Modal

Share your experience with Modal, ask questions, or help others learn from your insights.

Pricing

FREE

Free Plan Available

Get started with Modal at no cost with $30 / month free compute credits and Up to 3 workspace seats.

  • $30 / month free compute credits
  • Up to 3 workspace seats
  • 100 containers + 10 GPU concurrency
  • Crons and web endpoints (limited)
  • Real-time metrics and logs

Team

Designed for teams with $100 / month free compute credits and Unlimited seats and collaboration features.

$250
per month
  • $100 / month free compute credits
  • Unlimited seats
  • 1000 containers + 50 GPU concurrency
  • Unlimited crons and web endpoints
  • Custom domains
  • Static IP proxy
  • Deployment rollbacks
  • Access to Modal Community Slack

Enterprise

Custom compute pricing with advanced security, support, and higher concurrency.

Custom
contact sales
  • Contact sales
  • Volume-based discounts
  • Unlimited seats
  • Higher GPU concurrency
  • Embedded ML engineering services
  • Support via private Slack
  • Audit logs, Okta SSO, and HIPAA
View official pricing

Capabilities

Key Features

  • AI-native serverless runtime with fast model initialization
  • Built-in globally distributed storage layer
  • Per-second usage-based GPU and CPU pricing
  • Deploy OpenAI-compatible model endpoints
  • Support for notebooks, sandboxes, batch jobs, and training
  • APIs and developer SDKs for programmatic control
  • Community Slack support and documentation with examples

Integrations

OpenAI-compatible API
Amazon S3
DuckDB
Slack (community)
API Available
View Docs