txtai

Name: txtai
Availability: OnlineOnly
Author: NeuML

An open-source, all-in-one AI framework for semantic search, LLM orchestration, RAG pipelines, autonomous agents, and language model workflows built with Python.

At a Glance

Pricing

Open Source

Fully open source under the Apache 2.0 license. Free to use, modify, and distribute.

Available On

API

CLI

SDK

NeuML, Washington DC-Baltimore Area, Est. 2020

Listed Jun 2026

About txtai

txtai is an open-source AI framework developed by NeuML, released under the Apache 2.0 license and hosted on GitHub. It combines semantic search, LLM orchestration, retrieval augmented generation (RAG), and autonomous agents into a single Python library. The project was created in 2020 and has reached v9.10.0 as of June 2026, with active development continuing on the master branch.

What It Is

txtai is a Python framework that unifies vector search, language model pipelines, and agentic workflows under one API. Its core component is an embeddings database: a hybrid data store combining dense and sparse vector indexes, graph networks, and a relational database. This foundation both powers standalone semantic search applications and serves as a knowledge source for LLM-driven systems. The framework is built on Hugging Face Transformers, Sentence Transformers, and FastAPI, and can run entirely locally or be scaled out via container orchestration.
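
As a rough illustration of the embeddings database described above, the sketch below indexes a few sentences and then runs a natural-language query and a SQL query against the same index. It assumes txtai is installed. The model name `sentence-transformers/all-MiniLM-L6-v2`, the sample sentences, and the helper names `to_rows`, `build_index`, and `demo` are illustrative choices, not something this listing specifies.

```python
def to_rows(texts):
    """Turn plain strings into (id, text, tags) rows for indexing."""
    return [(uid, text, None) for uid, text in enumerate(texts)]


def build_index(texts, model="sentence-transformers/all-MiniLM-L6-v2"):
    """Build an in-memory index. The model name is an assumed example."""
    # Imported lazily so this module loads even where txtai is not installed
    from txtai import Embeddings

    # content=True stores the original text next to the vectors,
    # which is what makes SQL queries over the index possible
    embeddings = Embeddings(path=model, content=True)
    embeddings.index(to_rows(texts))
    return embeddings


def demo():
    """Index three sample sentences and query them two ways."""
    texts = [
        "Glaciers are retreating faster than models predicted",
        "The home team won the championship in overtime",
        "A new vaccine trial reported promising results",
    ]
    embeddings = build_index(texts)

    # Natural-language query: matches on meaning rather than shared keywords
    print(embeddings.search("climate change", 1))

    # SQL query over the same index
    sql = "select id, text, score from txtai where similar('climate change') limit 1"
    print(embeddings.search(sql))
```

Calling `demo()` downloads the model on first use, so the first run needs network access and takes noticeably longer than later runs.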

Architecture and Core Components

txtai is organized around four main building blocks:

Embeddings database: Supports vector search with SQL, object storage, topic modeling, graph analysis, and multimodal indexing across text, documents, audio, images, and video.
Pipelines: Language model-powered tasks including LLM prompting, question-answering, zero-shot labeling, transcription (Whisper), translation (OPUS), summarization (DistilBART), and text-to-speech (ESPnet JETS).
Workflows: Composable pipeline chains that aggregate business logic, ranging from simple microservices to multi-model processing graphs.
Agents: Built on top of the Hugging Face smolagents framework, agents autonomously connect embeddings, pipelines, workflows, and other agents to solve complex problems. Agent prompting via agents.md and skill.md specifications is supported.
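
To make the workflow building block more concrete, here is a minimal sketch that chains two plain Python functions as workflow tasks. It assumes txtai is installed; the task functions `normalize` and `label_length` and the sample inputs are hypothetical. In a real application the tasks would more often wrap pipelines such as summarization or translation.

```python
def normalize(batch):
    """Task 1: collapse whitespace and lowercase each input string."""
    return [" ".join(text.split()).lower() for text in batch]


def label_length(batch):
    """Task 2: append a word count to each string."""
    return [f"{text} ({len(text.split())} words)" for text in batch]


def build_workflow():
    """Chain the two tasks so the output of one feeds the next."""
    # Imported lazily so this module loads even where txtai is not installed
    from txtai import Workflow
    from txtai.workflow import Task

    return Workflow([Task(normalize), Task(label_length)])


def demo():
    workflow = build_workflow()
    # Calling a workflow returns a generator, so list() is needed to run it
    print(list(workflow(["  Hello   World ", "txtai   WORKFLOWS chain tasks"])))
```

Each task receives a batch (a list of inputs) and returns a list of the same length, which is why both functions are written as list comprehensions.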

The framework exposes both a REST API (via FastAPI) and a Model Context Protocol (MCP) API, with official client bindings for JavaScript, Java, Rust, and Go.
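
For consumers that do not use one of the client bindings, the REST API can be called with any HTTP client. The sketch below uses only the Python standard library. It assumes a txtai API server is already running at `http://localhost:8000` and exposes a `/search` endpoint that accepts `query` and `limit` parameters; the host, port, and helper names are assumptions for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base_url, query, limit=3):
    """Build the request URL for a search call against an assumed /search endpoint."""
    params = urlencode({"query": query, "limit": limit})
    return f"{base_url.rstrip('/')}/search?{params}"


def search(base_url, query, limit=3):
    """Send the search request and decode the JSON response."""
    with urlopen(search_url(base_url, query, limit)) as response:
        return json.load(response)
```

With a server running, `search("http://localhost:8000", "climate change", 1)` would return the decoded JSON results.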

Setup Path

Installation is straightforward via pip:

pip install txtai

Python 3.10+ is required. Docker-based deployment is also supported for containerized or cloud environments. The README includes quickstart code that gets a working semantic search index running in a few lines. A YAML-based configuration system allows the built-in API server to be launched with a single command, making it accessible to non-Python consumers immediately.
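
A minimal sketch of the YAML-driven setup follows. It writes a small configuration file and prints the command that would start the API server. The configuration keys shown, the model name, and the file name `config.yml` are assumptions for illustration, and the server itself needs the API dependencies installed (for example via `pip install txtai[api]`).

```python
from pathlib import Path

# Example configuration; the model name is an assumed choice
CONFIG = """\
# Allow documents to be added through the API
writable: true

embeddings:
  path: sentence-transformers/all-MiniLM-L6-v2
  content: true
"""


def write_config(path="config.yml"):
    """Write the example configuration to disk and return its path."""
    target = Path(path)
    target.write_text(CONFIG, encoding="utf-8")
    return target


def launch_command(path="config.yml"):
    """Return the shell command that starts the API server with this config."""
    return f'CONFIG={path} uvicorn "txtai.api:app"'


if __name__ == "__main__":
    print(launch_command(write_config()))
```

The printed command is meant to be run in a shell. The server reads the file named by the `CONFIG` environment variable at startup.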

Model Support and Integrations

txtai recommends commercially usable models from the Hugging Face Hub as defaults for each pipeline component. For LLMs, it supports Hugging Face models, llama.cpp, and any model accessible via LiteLLM, which covers hosted providers such as OpenAI, Anthropic Claude, and AWS Bedrock. Models can be loaded by Hugging Face Hub path or from local directories. The framework integrates with smolagents for agent orchestration and supports GraphRAG patterns via its built-in semantic graph capabilities.
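
To show how the search and LLM pieces fit together, here is a hand-rolled retrieval augmented generation sketch: it retrieves context from an embeddings index, builds a prompt, and passes it to an LLM pipeline. txtai also ships its own RAG pipeline; this manual version only exposes the moving parts. The prompt wording and the helper names `build_prompt` and `answer` are hypothetical. `model_path` is left to the caller, and `embeddings` is assumed to be an index created with content storage enabled.

```python
def build_prompt(question, context):
    """Combine retrieved passages and the question into a single prompt."""
    joined = "\n".join(f"- {text}" for text in context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


def answer(question, embeddings, model_path, limit=3):
    """Retrieve context for the question, then ask the LLM to answer from it."""
    # Imported lazily so this module loads even where txtai is not installed
    from txtai import LLM

    # With content storage enabled, each search result is a dict with a "text" field
    context = [row["text"] for row in embeddings.search(question, limit)]

    llm = LLM(model_path)
    return llm(build_prompt(question, context))
```

Keeping `build_prompt` separate from the model call makes the prompt easy to inspect and test without loading a model.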

Update: v9.10.0

The latest release is v9.10.0, published on June 4, 2026. The project has kept a consistent major-version release cadence, with blog posts on Medium covering what's new in versions 4.0 through 9.0. Recent additions highlighted in the README include agent skill integration via skill.md, MCP API support, and GraphRAG capabilities. The repository shows 12,672 stars and 835 forks on GitHub, and the last code push was on June 19, 2026, which indicates active maintenance. NeuML also notes that a hosted version of txtai applications is in development at txtai.cloud.

Pricing

Open Source

Fully open source under the Apache 2.0 license. Free to use, modify, and distribute.

Full framework access
Semantic search and embeddings database
LLM orchestration and RAG pipelines
Autonomous agents
REST and MCP APIs

Capabilities

Key Features

Embeddings database with dense and sparse vector indexes
Semantic/vector search with SQL support
Multimodal indexing for text, documents, audio, images, and video
LLM orchestration with support for Hugging Face, llama.cpp, OpenAI, Claude, and AWS Bedrock via LiteLLM
Retrieval augmented generation (RAG) pipelines
Autonomous agents built on smolagents framework
Graph networks and GraphRAG support
Pipeline tasks: summarization, transcription, translation, text-to-speech, question-answering, labeling
Composable workflow system for multi-model processing
REST API via FastAPI
Model Context Protocol (MCP) API
Client bindings for JavaScript, Java, Rust, and Go
Local execution with no external data shipping required
Docker and container orchestration support
Topic modeling and graph analysis
Zero-shot and fine-tuned text labeling
agents.md and skill.md agent prompting support

Integrations

Hugging Face Transformers
Sentence Transformers
FastAPI
smolagents
llama.cpp
LiteLLM
OpenAI
Anthropic Claude
AWS Bedrock
Whisper
OPUS translation models
DistilBART
ESPnet JETS
BLIP image captioning
DeBERTa zero-shot
Gemma 4
