# llamafile

> llamafile lets you distribute and run LLMs with a single self-contained executable file, with no installation required, across most operating systems and CPU architectures.

llamafile is a Mozilla Builders project that collapses the complexity of running large language models into a single-file executable. It combines llama.cpp with Cosmopolitan Libc so that one file runs locally on most operating systems and CPU architectures without any installation. The project also includes **whisperfile**, a single-file speech-to-text tool built on whisper.cpp using the same packaging approach. llamafile is fully open source under the Apache 2.0 license and is actively maintained by Mozilla.ai.

- **Single-file distribution** — *Download one `.llamafile` executable and run it directly; no Python environment, Docker, or package manager needed.*
- **Cross-platform support** — *The same file runs on macOS, Linux, Windows, BSD, and multiple CPU architectures thanks to Cosmopolitan Libc.*
- **Built on llama.cpp** — *Inherits broad model compatibility and GPU acceleration support from the widely used llama.cpp inference engine.*
- **whisperfile included** — *A companion single-file speech-to-text tool built on whisper.cpp for audio transcription and translation, requiring no installation.*
- **Local inference** — *All computation runs on your own hardware; no data is sent to external servers.*
- **Pre-built model files** — *Ready-to-run llamafiles for popular models (e.g., Qwen, LLaVA) are hosted on Hugging Face for immediate download.*
- **Quick start** — *Download a `.llamafile`, mark it executable (`chmod +x`), and run it; Windows users rename the file to add a `.exe` extension instead.*
- **Versioned releases** — *Stable and legacy releases are available on GitHub; pre-built llamafiles indicate which server version they bundle.*
- **Open source** — *Apache 2.0 licensed core; llama.cpp and whisper.cpp modifications are MIT licensed for upstream compatibility.*
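The quick-start steps above can be sketched as a short shell session. `model.llamafile` is a placeholder name, not a real file; substitute any pre-built llamafile downloaded from Hugging Face.

```shell
# Quick start on macOS/Linux/BSD ("model.llamafile" is a placeholder
# for any pre-built llamafile downloaded from Hugging Face):
chmod +x model.llamafile   # mark the downloaded file executable
./model.llamafile          # run it directly; no installation step

# On Windows, rename the file instead of using chmod:
#   ren model.llamafile model.exe
#   .\model.exe
```

Because the executable is self-contained, these two commands are the entire setup on Unix-like systems; there is no interpreter, runtime, or package manager to install first.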

## Features
- Single-file LLM executable
- No installation required
- Cross-platform (Windows, macOS, Linux, BSD)
- Multi-architecture CPU support
- Built on llama.cpp
- whisperfile speech-to-text tool
- Local inference
- Pre-built model files on Hugging Face
- GPU acceleration support
- Open source (Apache 2.0)

## Integrations
llama.cpp, whisper.cpp, Cosmopolitan Libc, Hugging Face

## Platforms
WINDOWS, MACOS, LINUX, API, CLI
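As an illustration of the API surface listed above: a running llamafile serves an OpenAI-compatible HTTP endpoint locally. The sketch below assumes the server's default address of `http://localhost:8080`; the `"model"` value is a placeholder, since the bundled model is used regardless.

```shell
# Query a locally running llamafile server (default port 8080) via its
# OpenAI-compatible chat completions endpoint; adjust host/port to match
# your setup.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "local",
        "messages": [{"role": "user", "content": "Say hello."}]
      }'
```

Because the endpoint follows the OpenAI request schema, existing OpenAI client libraries can typically be pointed at the local server by overriding their base URL.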

## Pricing
Open Source

## Version
0.10.0

## Links
- Website: https://mozilla-ai.github.io/llamafile/
- Documentation: https://mozilla-ai.github.io/llamafile/
- Repository: https://github.com/mozilla-ai/llamafile
- EveryDev.ai: https://www.everydev.ai/tools/llamafile
