# YoMo

> Open-source serverless LLM Function Calling framework for building scalable, geo-distributed AI agents with TypeScript or Go.

YoMo is an open-source serverless LLM Function Calling framework for building scalable, low-latency AI agents. Developed by the yomorun organization, it focuses on geo-distributed edge AI infrastructure, bringing AI inference and tools closer to end users worldwide. The framework itself is written primarily in Rust, while the tools it hosts are written in TypeScript or Go. It is available on GitHub under the Apache License 2.0.

## What It Is

YoMo sits in the AI agent framework category, providing the infrastructure layer for deploying and managing LLM tools (also called "skills" or "serverless functions") that AI agents can call at runtime. Rather than bundling tools into a monolithic application, YoMo lets developers write strongly-typed function handlers in TypeScript or Go, deploy them as serverless units, and expose them to any LLM via a Chat Completions-compatible API or as an MCP server. The framework handles routing, security, and geo-distribution so developers focus on writing the tool logic itself.
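To make the "strongly-typed function handler" idea concrete, here is a minimal TypeScript sketch of what a tool module might look like. The `Argument` and `handler` names come from the description of the tool model; everything else (the currency example, `description`, `convert`, the static rate table) is hypothetical, and a real module would presumably export these symbols in whatever shape the YoMo CLI expects.

```typescript
// Hypothetical tool: only `Argument` and `handler` are names taken from the
// tool model described in this article; the rest is illustrative.
const description = "Convert an amount between two currencies";

// Typed argument schema that the LLM is expected to fill in when it calls the tool.
type Argument = {
  amount: number;
  from: string;
  to: string;
};

// Static rates keep the sketch self-contained; a real tool would query a rates service.
const RATES_TO_USD: Record<string, number> = { USD: 1, EUR: 1.1, GBP: 1.25 };

// Pure conversion logic, kept separate from the handler so it is easy to test.
function convert(args: Argument): string {
  const from = RATES_TO_USD[args.from];
  const to = RATES_TO_USD[args.to];
  if (from === undefined || to === undefined) {
    return `Unsupported currency pair: ${args.from} -> ${args.to}`;
  }
  const converted = (args.amount * from) / to;
  return `${args.amount} ${args.from} = ${converted.toFixed(2)} ${args.to}`;
}

// Entry point the framework would invoke with the arguments produced by the LLM.
async function handler(args: Argument): Promise<string> {
  return convert(args);
}
```

Keeping the schema as a plain type means the compiler rejects a handler that reads a field the LLM was never told about, which is the main benefit of typed tool definitions.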

## Geo-Distributed Architecture

A core design principle of YoMo is moving AI inference and tooling geographically closer to users. The README describes a "Geo-distributed System Architecture" in which agent components can be deployed at edge locations worldwide, reducing latency compared with centralized data-center deployments. The underlying transport uses QUIC (referenced in the GitHub topics), which is well suited to low-latency communication over unreliable edge networks.

## How the Serverless Tool Model Works

Developers define LLM tools as individual TypeScript or Go modules with a typed `Argument` schema and a `handler` function. The YoMo CLI (`yomo init`, `yomo run`, `yomo serve`) manages the full lifecycle:

- `yomo serve -c my-agent.yaml` starts the server with a YAML config pointing at an LLM provider (OpenAI-compatible endpoints, Ollama, etc.)
- `yomo run -n <tool-name>` registers a serverless tool with the running server
- The server exposes a `/v1/chat/completions` endpoint that proxies requests to the configured LLM while automatically injecting registered tools as function-calling capabilities
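From the client's side, the flow above means an ordinary Chat Completions request is enough. The sketch below only builds such a request so that it stays self-contained. The base URL, port, and model name are placeholder assumptions rather than documented YoMo defaults, and `buildChatRequest` is a hypothetical helper.

```typescript
// Minimal message shape for an OpenAI-style Chat Completions request.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type HttpRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

// Assumption: placeholder address for a locally running `yomo serve`,
// not a documented default.
const BASE_URL = "http://localhost:9000";

// Hypothetical helper that assembles the HTTP request for a single user prompt.
function buildChatRequest(
  model: string,
  prompt: string,
  baseUrl: string = BASE_URL,
): HttpRequest {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return {
    url: `${baseUrl}/v1/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // No `tools` field here: per the description above, the server injects
    // the registered tools before forwarding the request to the LLM.
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildChatRequest("my-model", "What is 10 EUR in USD?");
console.log(req.url);
```

On a runtime with a global `fetch`, the request could then be sent by passing `req.method`, `req.headers`, and `req.body` as the options for `req.url`. The point of the design is that the client never lists tools itself.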

TLS v1.3 encryption is applied to every data packet by design, according to the project README.

## MCP and Multi-Model Support

YoMo can expose registered LLM function-calling tools as an MCP (Model Context Protocol) server, making them accessible to MCP-compatible clients. The server configuration includes both a Chat Completions API bridge and an MCP server bridge. The framework is described as model-agnostic: the YAML configuration accepts any OpenAI-compatible `base_url`, so it can be used with models such as Google Gemma served through Ollama or with any other compatible inference endpoint.
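For orientation, MCP clients talk to a server using JSON-RPC 2.0 messages, with `tools/list` to discover tools and `tools/call` to invoke one. The sketch below builds those two messages. The helper names and the `convert_currency` tool are hypothetical, and how YoMo's MCP bridge is addressed (transport, URL) is not covered here. A real client would also complete the MCP initialization handshake first.

```typescript
// Shape of a JSON-RPC 2.0 request as used by MCP.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

// Ask the server which tools it exposes.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Invoke one tool by name with its arguments.
function callToolRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical tool name and arguments, for illustration only.
const call = callToolRequest(2, "convert_currency", {
  amount: 10,
  from: "EUR",
  to: "USD",
});
console.log(JSON.stringify(call));
```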

## Update: v2.0.2

The latest release is v2.0.2, published on June 15, 2026, with the repository last updated on June 18, 2026. The project has been active since July 2020 and has accumulated over 1,900 GitHub stars and 143 forks. The primary language shifted to Rust in recent development, and GitHub topics include `a2a-protocol`, `mcp`, `function-calling`, `quic`, `serverless`, and `geodistributedsystems`, signaling continued investment in edge AI infrastructure and agent interoperability standards.

## Features
- Serverless LLM Function Calling
- Strongly-typed tool definitions in TypeScript and Go
- Geo-distributed edge AI infrastructure
- MCP server support
- Chat Completions API bridge
- TLS v1.3 encryption on all data packets
- YAML-based server configuration
- Multi-model support via OpenAI-compatible endpoints
- YoMo CLI for full tool lifecycle management
- A2A protocol support

## Integrations
OpenAI, Ollama, Google Gemma, MCP (Model Context Protocol), TypeScript, Go

## Platforms
CLI, API

## Pricing
Open Source

## Version
v2.0.2

## Links
- Website: https://yomo.run
- Documentation: https://yomo.run/introduction
- Repository: https://github.com/yomorun/yomo
- EveryDev.ai: https://www.everydev.ai/tools/yomo