# Modular

> AI infrastructure platform with the MAX framework, the Mojo language, and Mammoth for GPU-portable GenAI serving across NVIDIA and AMD hardware.

Modular provides a unified AI infrastructure platform designed to deliver state-of-the-art performance for generative AI workloads across multiple GPU vendors. The platform combines MAX (a GenAI serving framework), Mojo (a high-performance programming language), and Mammoth (a Kubernetes-native control plane for large-scale distributed AI serving), enabling developers to build, optimize, and deploy AI systems with a high degree of hardware portability.

- **MAX Framework** is a GenAI serving framework supporting 500+ open models with customizable, open-source implementations that are portable across NVIDIA and AMD GPUs, delivering up to 70% faster inference than vanilla vLLM.
- **Mojo Language** pairs Python-like syntax with systems-level performance, letting developers write high-performance GPU code without deep CUDA expertise while running up to 12x faster than Python.
- **Mammoth Orchestration** scales AI workloads from a single GPU to unlimited nodes with a Kubernetes-native control plane purpose-built for large-scale distributed AI serving.
- **Hardware Portability** eliminates vendor lock-in through compiler technology that automatically generates optimized kernels for any hardware target, supporting NVIDIA, AMD, and Apple Silicon.
- **Tiny Containers** ships container images under 700MB, roughly 90% smaller than comparable vLLM images, with sub-second cold starts, reducing infrastructure costs and deployment complexity.
- **Open Source Stack** democratizes high-performance AI by open-sourcing the entire stack, including optimized kernels, enabling full customization down to the silicon level.
- **Enterprise Support** includes SOC 2 Type I certification, dedicated engineering contacts, custom SLAs, and flexible deployment options spanning cloud, on-premise, and hybrid configurations.

To get started, install the free Community Edition via Docker, pip, uv, pixi, or conda, then deploy GenAI models locally behind the OpenAI-compatible API (see the example at the end of this page). Browse the model repository at builds.modular.com to find optimized models for your use case.

## Features

- 500+ GenAI model support
- GPU portability across NVIDIA and AMD
- MAX GenAI serving framework
- Mojo programming language
- Mammoth distributed orchestration
- OpenAI API compatibility
- 90% smaller container sizes
- Sub-second cold starts
- Open source kernels
- Multi-cloud deployment
- SOC 2 Type I certified
- Custom kernel development
- Batch inference endpoints
- Dedicated inference endpoints
- Enterprise hybrid deployments

## Integrations

NVIDIA GPUs, AMD GPUs, Apple Silicon, AWS, Docker, Kubernetes, OpenAI API, Hugging Face, PyTorch, LLVM, MLIR, ROCm, CUDA

## Platforms

Windows, macOS, Linux, Android, Web, API

## Pricing

Freemium: free tier available with paid upgrades

## Links

- Website: https://www.modular.com
- Documentation: https://docs.modular.com/max/
- Repository: https://github.com/modular/modular
- EveryDev.ai: https://www.everydev.ai/tools/modular
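
## Example: OpenAI-Compatible Inference

A minimal sketch of the getting-started flow above: once a MAX install or container is serving a model locally, clients can talk to it with the standard `openai` Python SDK, since the endpoint speaks the OpenAI protocol. The base URL, port, and `MODEL_ID` below are illustrative assumptions, not documented values; consult docs.modular.com/max/ for the exact serve command and builds.modular.com for real model names.

```python
# Minimal sketch: querying a locally served MAX model through its
# OpenAI-compatible API using the standard `openai` Python SDK.
# Assumptions (verify against docs.modular.com/max/):
#   - a MAX server is already running at http://localhost:8000/v1
#   - "MODEL_ID" is a hypothetical placeholder for a model from builds.modular.com
from openai import OpenAI

# A local server typically ignores the API key, but the SDK requires one.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="MODEL_ID",  # placeholder; substitute a real model name
    messages=[
        {"role": "user", "content": "Explain GPU portability in one sentence."},
    ],
)
print(response.choices[0].message.content)
```

Because the endpoint is OpenAI-compatible, existing applications built on the OpenAI SDK can usually be pointed at a MAX deployment by changing only `base_url`.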