    raullenchai

Mission: To provide the fastest local AI engine for Apple Silicon, enabling high-performance LLM inference with native tool calling and prompt caching.


At a Glance

Tools Listed: 1
Products: 1
Capabilities: 6
Discussions: 0
Headquarters: San Francisco Bay Area, CA
Established: 2025
Employees: 1
    Focus Areas
    Local Inference
    AI Infrastructure
    LLM Orchestration

AI Tools by raullenchai (1)

    Rapid-MLX

    Local AI Inference for Apple Silicon

Topics: Local Inference, AI Infrastructure, LLM Orchestration

    Discussions

    No discussions yet


    Latest News

May 6, 2026: Rapid-MLX v0.6.15 released with post-v0.6.14 batch and codex review fixes. (github.com)

May 4, 2026: Introduction of the single-command merge-readiness pipeline for automated PR grading. (github.com)

April 13, 2026: Launch of the Model-Harness Index (MHI) and Homebrew distribution support. (github.com)

March 23, 2026: Rapid-MLX adopts Apache 2.0 license and formalizes community contribution templates. (github.com)

Products & Services (1)

Rapid-MLX (March 21, 2026)

    A high-performance local AI inference engine optimized for Apple Silicon, featuring 17 tool parsers, prompt caching, and reasoning separation.
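As an illustration of what "reasoning separation" involves: reasoning models such as DeepSeek-R1 emit chain-of-thought text before the final answer, and the serving engine splits the two so clients can show or hide the reasoning. A minimal sketch, assuming the common `<think>...</think>` tag convention; Rapid-MLX's actual parsing rules may differ.

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate <think>...</think> reasoning from the final answer.

    Assumes the model wraps its chain of thought in <think> tags,
    as DeepSeek-R1-style models commonly do; this is an illustrative
    convention, not Rapid-MLX's documented behavior.
    """
    parts = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return "\n".join(p.strip() for p in parts), answer

raw = "<think>2+2 is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(raw)
# reasoning == "2+2 is 4."; answer == "The answer is 4."
```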

    Market Position

    2-4x faster than Ollama and llama.cpp on Apple Silicon. Only engine supporting day-0 features for MoE models like DeepSeek V4 Flash on Mac.

    Leadership

    Founders


    Raullen Chai

Ph.D. in Cryptography from the University of Waterloo. Former Lead of Crypto R&D and Engineering Security at Uber. Former engineer at Google and Oracle. Co-founder and CEO of IoTeX/MachineFi Lab.

    Executive Team

Raullen Chai, Founder and Lead Developer

    Cybersecurity and cryptography expert, CEO of IoTeX.

    Founding Story

Rapid-MLX was started to close the performance gap of existing local LLM engines such as Ollama on Apple Silicon by leveraging the MLX framework's native Metal compute kernels.

    Business Model

Revenue: Not reported (open-source project).

    Revenue Model

    Open-source project (Apache 2.0). No formal revenue model disclosed for the software itself.

    Pricing Tiers

Open Source: Free. Available on GitHub and PyPI.

    Target Markets

    Industries & Segments
    • Apple Silicon users
    • Open source developers
    • AI researchers
    • Privacy-conscious AI users
    Use Cases
    • Local LLM serving for developers
    • AI agents (Cursor, Claude Code, Aider)
    • Private AI infrastructure
    • Apple Silicon performance benchmarking
    Notable Customers
    • Cursor
    • Claude Code
    • Aider
    • PydanticAI

    Quick Facts

Headquarters: San Francisco Bay Area, CA
Founded: 2025
Entity Type: Open Source Project
Employees: 1
Total Funding: 0
Office Locations: Remote

    History & Milestones

December 2025: Initial development and implementation of paged KV cache architecture.

March 21, 2026: Project officially renamed to 'rapid-mlx' on PyPI with version 0.3.2.

April 13, 2026: Released Model-Harness Index (MHI) benchmark and one-liner installer.

May 4, 2026: Launched single-command merge-readiness pipeline for PR validation.

May 6, 2026: Released version 0.6.15 with enhanced model support and codex review fixes.
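The Dec 2025 milestone mentions a paged KV cache. The core idea of paging is that key/value tensors are stored in fixed-size pages drawn from a shared pool, so variable-length sequences avoid large contiguous allocations. A toy allocator sketch of that idea; the page size, class names, and structure here are illustrative assumptions, not Rapid-MLX internals.

```python
PAGE_SIZE = 4  # tokens per page; real engines use larger blocks

class PagedKVCache:
    """Toy page allocator: tracks which pages each sequence owns."""

    def __init__(self, num_pages: int):
        self.free = list(range(num_pages))
        self.pages: dict[str, list[int]] = {}   # seq id -> allocated page ids
        self.length: dict[str, int] = {}        # seq id -> tokens stored

    def append(self, seq_id: str, n_tokens: int) -> list[int]:
        """Grow a sequence by n_tokens, allocating pages on demand."""
        cur = self.length.get(seq_id, 0)
        # total pages needed after the append, rounded up
        need = (cur + n_tokens + PAGE_SIZE - 1) // PAGE_SIZE
        have = self.pages.setdefault(seq_id, [])
        while len(have) < need:
            if not self.free:
                raise MemoryError("KV pool exhausted; evict or preempt a sequence")
            have.append(self.free.pop())
        self.length[seq_id] = cur + n_tokens
        return have

    def release(self, seq_id: str) -> None:
        """Return a finished sequence's pages to the free pool."""
        self.free.extend(self.pages.pop(seq_id, []))
        self.length.pop(seq_id, None)

cache = PagedKVCache(num_pages=8)
cache.append("req-1", 6)   # 6 tokens -> 2 pages
cache.append("req-1", 3)   # 9 tokens total -> 3 pages
cache.release("req-1")     # all pages back in the pool
```

The pool-exhaustion branch is where a real engine would make scheduling decisions (eviction or preemption) rather than raising.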

Key Capabilities (6)

• 17+ Tool Parsers
• Prompt Cache (KV + DeltaNet)
• Reasoning Separation (Qwen3, DeepSeek-R1)
• Smart Cloud Routing
• Multimodal Support (Vision, Audio)
• Continuous Batching
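A "tool parser" in this context extracts structured tool calls from raw model output, and each model family emits them in a different textual format, which is why an engine ships many parsers. A minimal sketch for one common convention, JSON wrapped in `<tool_call>` tags; the tag format and example call are assumptions, and the formats Rapid-MLX's 17+ parsers actually handle may differ.

```python
import json
import re

def parse_tool_call(output: str):
    """Extract one JSON tool call from model output, or None.

    Assumes the model emits calls as <tool_call>{...}</tool_call>,
    a Qwen-style convention used here purely for illustration.
    """
    m = re.search(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", output, re.DOTALL)
    if m is None:
        return None
    call = json.loads(m.group(1))
    return call["name"], call.get("arguments", {})

out = 'Sure.<tool_call>{"name": "get_weather", "arguments": {"city": "SF"}}</tool_call>'
parse_tool_call(out)  # ("get_weather", {"city": "SF"})
```

The closing `</tool_call>` anchor lets the non-greedy match span nested braces; a production parser would also handle streaming (partial output) and malformed JSON.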

    Integrations & Partnerships

    Platform Integrations

    • OpenAI API
    • Anthropic API
    • LangChain
    • PydanticAI
    • Homebrew
    • PyPI
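The integrations above include the OpenAI API, i.e. clients talk to the local engine through OpenAI-style chat-completion requests. A sketch of building such a request payload; the base URL, port, and model name are assumptions for illustration, not documented Rapid-MLX defaults, and the request is constructed but not sent.

```python
import json

# Hypothetical local endpoint; an OpenAI-compatible server exposes
# /v1/chat/completions at whatever host/port it is configured with.
BASE_URL = "http://localhost:8080/v1"

payload = {
    "model": "qwen3-local",   # hypothetical local model id
    "messages": [
        {"role": "user", "content": "Summarize MLX in one sentence."}
    ],
    "tools": [{               # OpenAI-style tool schema
        "type": "function",
        "function": {
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
            },
        },
    }],
}

# e.g. POST this body to f"{BASE_URL}/chat/completions"
body = json.dumps(payload)
```

Because the wire format matches OpenAI's, existing SDKs (LangChain, PydanticAI, the official `openai` client) only need their base URL pointed at the local server.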

    Key Partnerships

• MLX Community
• Hugging Face

    Connect

Website: pypi.org/project/rapid-mlx
GitHub: raullenchai

AI Topics (3)

raullenchai focuses on these topics:

• Local Inference (1)
• AI Infrastructure (1)
• LLM Orchestration (1)
    With AI, Everyone is a Dev. EveryDev.ai © 2026