# Modular

> AI infrastructure platform with the MAX framework, the Mojo language, and Mammoth for GPU-portable GenAI serving across NVIDIA and AMD hardware.

Modular provides a unified AI infrastructure platform designed to deliver state-of-the-art performance for generative AI workloads across multiple GPU vendors. The platform combines MAX (a GenAI serving framework), Mojo (a high-performance programming language), and Mammoth (a Kubernetes-native control plane for large-scale distributed AI serving), enabling developers to build, optimize, and deploy AI systems with a high degree of hardware portability.

- **MAX Framework** is a GenAI serving framework supporting 500+ open models with customizable, open-source implementations that are portable across NVIDIA and AMD GPUs, delivering up to 70% faster inference than vanilla vLLM.
- **Mojo Language** pairs Python-like syntax with systems-level performance, letting developers write high-performance GPU code without deep CUDA expertise while running up to 12x faster than Python.
- **Mammoth Orchestration** scales AI workloads from a single GPU to unlimited nodes with a Kubernetes-native control plane purpose-built for large-scale distributed AI serving.
- **Hardware Portability** eliminates vendor lock-in through compiler technology that automatically generates optimized kernels for any hardware target, supporting NVIDIA, AMD, and Apple Silicon.
- **Tiny Containers** ships container images under 700MB, roughly 90% smaller than comparable vLLM images, with sub-second cold starts, reducing infrastructure costs and deployment complexity.
- **Open Source Stack** democratizes high-performance AI by open-sourcing the entire stack, including optimized kernels, enabling full customization down to the silicon level.
- **Enterprise Support** includes SOC 2 Type I certification, dedicated engineering contacts, custom SLAs, and flexible deployment options spanning cloud, on-premise, and hybrid configurations.

To get started, install the free Community Edition via Docker, pip, uv, pixi, or conda, then deploy GenAI models locally behind the OpenAI-compatible API (see the example at the end of this page). Browse the model repository at builds.modular.com to find optimized models for your use case.

## Features

- 500+ GenAI model support
- GPU portability across NVIDIA and AMD
- MAX GenAI serving framework
- Mojo programming language
- Mammoth distributed orchestration
- OpenAI API compatibility
- 90% smaller container sizes
- Sub-second cold starts
- Open source kernels
- Multi-cloud deployment
- SOC 2 Type I certified
- Custom kernel development
- Batch inference endpoints
- Dedicated inference endpoints
- Enterprise hybrid deployments

## Integrations

NVIDIA GPUs, AMD GPUs, Apple Silicon, AWS, Docker, Kubernetes, OpenAI API, Hugging Face, PyTorch, LLVM, MLIR, ROCm, CUDA

## Platforms

Windows, macOS, Linux, Android, Web, API

## Pricing

Freemium: free tier available with paid upgrades

## Links

- Website: https://www.modular.com
- Documentation: https://docs.modular.com/max/
- Repository: https://github.com/modular/modular
- EveryDev.ai: https://www.everydev.ai/tools/modular
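
## Example: OpenAI-Compatible Inference

A minimal sketch of the getting-started flow above: once a MAX install or container is serving a model locally, clients can talk to it with the standard `openai` Python SDK, since the endpoint speaks the OpenAI protocol. The base URL, port, and `MODEL_ID` below are illustrative assumptions, not documented values; consult docs.modular.com/max/ for the exact serve command and builds.modular.com for real model names.

```python
# Minimal sketch: querying a locally served MAX model through its
# OpenAI-compatible API using the standard `openai` Python SDK.
# Assumptions (verify against docs.modular.com/max/):
#   - a MAX server is already running at http://localhost:8000/v1
#   - "MODEL_ID" is a hypothetical placeholder for a model from builds.modular.com
from openai import OpenAI

# A local server typically ignores the API key, but the SDK requires one.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="MODEL_ID",  # placeholder; substitute a real model name
    messages=[
        {"role": "user", "content": "Explain GPU portability in one sentence."},
    ],
)
print(response.choices[0].message.content)
```

Because the endpoint is OpenAI-compatible, existing applications built on the OpenAI SDK can usually be pointed at a MAX deployment by changing only `base_url`.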