Modal
Serverless cloud platform for running and scaling compute-intensive AI and ML workloads (model inference, training, batch jobs, and notebooks) with usage-based compute billing.
About Modal
Modal is a serverless cloud platform designed to run compute-intensive AI and machine learning applications without requiring users to manage infrastructure. It provides an AI-native runtime with fast model initialization and autoscaling, a built-in globally distributed storage layer for high-throughput model and data access, and first-class support for GPU and CPU workloads for inference, training, and batch processing. Modal exposes APIs and SDKs to deploy OpenAI-compatible model endpoints, run notebooks and sandboxes, and orchestrate large-scale batch jobs.
- AI-native runtime — a serverless runtime engineered for heavy AI workloads, with fast startup and autoscaling to handle parallel GPU/CPU jobs (see the SDK sketch after this list).
- Built-in distributed storage — globally distributed storage optimized for fast model loading and dataset access to reduce I/O bottlenecks.
- Flexible compute options — per-second GPU and CPU pricing for many GPU types and hardware configurations, enabling fine-grained usage-based billing.
- Model serving & inference — deploy OpenAI-compatible LLM endpoints and host custom model APIs at scale.
- Batch, notebooks, and sandboxes — run large-scale batch workflows, interactive notebooks, and isolated sandboxes for development and testing.
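For illustration only, the sketch below shows what a small GPU-backed batch job might look like with Modal's Python SDK, assuming the decorator-based API (`modal.App`, `@app.function`, `.remote()`/`.map()`). The app name, GPU type, timeout, and installed package are placeholders, and exact parameters can vary between SDK versions.

```python
import modal

# Declare the container image and its Python dependencies.
# The package listed here is illustrative, not required by Modal itself.
image = modal.Image.debian_slim().pip_install("numpy")

app = modal.App("example-batch", image=image)


@app.function(gpu="A10G", timeout=600)  # GPU type and timeout are placeholder choices
def embed(text: str) -> list[float]:
    # A real workload would load and run a model here; this stub only
    # demonstrates the decorator-based deployment model.
    return [float(len(text))]


@app.local_entrypoint()
def main():
    # .remote() runs one invocation in the cloud; .map() fans a batch out
    # across containers that Modal autoscales to cover the parallel work.
    print(embed.remote("hello"))
    print(list(embed.map(["a", "bb", "ccc"])))
```

Running this with `modal run app.py` executes `main()` locally while the decorated function runs in Modal's cloud; per-second billing applies only while containers are active.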
To get started, sign up on the web console, consult the documentation for SDK and API examples, and deploy a sample model or notebook to validate autoscaling and compute billing.
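As a deliberately tiny version of that first deployment, the sketch below exposes a function as a web endpoint, assuming Modal's documented FastAPI-based endpoint decorator; the app name and response payload are placeholders.

```python
import modal

# FastAPI must be installed in the image for web endpoints.
image = modal.Image.debian_slim().pip_install("fastapi[standard]")

app = modal.App("hello-endpoint", image=image)


@app.function()
@modal.fastapi_endpoint()  # older SDK releases expose this as @modal.web_endpoint()
def hello(name: str = "world") -> dict:
    # Each request is served from a serverless container; Modal scales the
    # container count with traffic and bills for the compute actually used.
    return {"message": f"hello, {name}"}
```

`modal serve app.py` provides a temporary development URL and `modal deploy app.py` publishes a persistent one; sending a few requests and checking the dashboard is a quick way to confirm the autoscaling and billing behavior described above.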

Pricing
Free Plan
Get started with Modal at no cost, with $30/month in free compute credits and up to 3 workspace seats.
- $30 / month free compute credits
- Up to 3 workspace seats
- Up to 100 concurrent containers and 10 concurrent GPUs
- Crons and web endpoints (limited)
- Real-time metrics and logs
Team
Designed for teams, with $100/month in free compute credits, unlimited seats, and collaboration features.
- $100 / month free compute credits
- Unlimited seats
- Up to 1,000 concurrent containers and 50 concurrent GPUs
- Unlimited crons and web endpoints
- Custom domains
- Static IP proxy
- Deployment rollbacks
- Access to Modal Community Slack
Enterprise
Custom compute pricing with advanced security, support, and higher concurrency.
- Contact sales
- Volume-based discounts
- Unlimited seats
- Higher GPU concurrency
- Embedded ML engineering services
- Support via private Slack
- Audit logs, Okta SSO, and HIPAA compliance
Capabilities
Key Features
- AI-native serverless runtime with fast model initialization
- Built-in globally distributed storage layer
- Per-second usage-based GPU and CPU pricing
- Deploy OpenAI-compatible model endpoints (see the client sketch after this list)
- Support for notebooks, sandboxes, batch jobs, and training
- APIs and developer SDKs for programmatic control
- Community Slack support and documentation with examples
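To illustrate the OpenAI-compatible serving item above, an endpoint deployed on Modal can typically be called with the standard `openai` Python client by overriding `base_url`. The URL, model name, and API key below are placeholders, not values Modal provides.

```python
from openai import OpenAI

# Both values below are placeholders: substitute the URL of the endpoint you
# deployed on Modal and the model name it serves.
client = OpenAI(
    base_url="https://your-workspace--your-app.modal.run/v1",
    api_key="EMPTY",  # only needed if your endpoint enforces a key
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```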