DSPy is an open-source Python framework for programming—not prompting—language models, enabling modular, optimizable AI systems through structured signatures and automatic prompt optimization.
About DSPy
DSPy is a Python framework originating from Stanford NLP that lets developers build AI systems by expressing tasks as typed signatures rather than hand-crafted prompt strings. Released under the MIT license, it is freely available on GitHub and has grown to over 35,000 stars and 433 contributors since its initial commit in December 2022. The project publishes new optimizers and module types as academic research first, then ships them into the library.
What It Is
DSPy (Declarative Self-improving Python) treats LLM pipelines as programs rather than prompt templates. Developers define structured input/output signatures, compose them into modules, and then run an optimizer that automatically tunes prompts, or fine-tunes weights, against a user-defined metric. Because the prompts are generated rather than hand-written, the same program can be retargeted to a different model (e.g., GPT-4, Llama, T5) by recompiling it instead of rewriting prompt strings.
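The user-defined metric mentioned above is typically just a Python function. As a minimal sketch, assuming the `metric(example, pred, trace=None)` calling convention shown in DSPy's documentation, an exact-match metric might look like this. `SimpleNamespace` objects stand in for DSPy's example and prediction objects so the snippet runs without the library installed, and the field name `answer` is an illustrative assumption.

```python
from types import SimpleNamespace


def exact_match(example, pred, trace=None):
    """Return True when the predicted answer matches the gold answer.

    The comparison ignores case and surrounding whitespace. `trace` is unused
    here; it is kept only to mirror the (example, pred, trace) convention.
    """
    gold = example.answer.strip().lower()
    guess = pred.answer.strip().lower()
    return gold == guess


# Stand-ins for a labelled example and two model predictions (hypothetical data).
example = SimpleNamespace(question="What is the capital of France?", answer="Paris")
good_pred = SimpleNamespace(answer=" paris ")
bad_pred = SimpleNamespace(answer="Lyon")

print(exact_match(example, good_pred))
print(exact_match(example, bad_pred))
```

An optimizer would call a function like this on many examples and prefer the prompt variants that score higher on average.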
Core Abstractions
DSPy is built around three composable primitives:
- Signatures: typed input/output declarations that replace raw prompt strings. A signature like "question -> answer", or a class with dspy.InputField / dspy.OutputField annotations, tells DSPy what the task is without specifying how to prompt for it.
- Modules: execution strategies that wrap a signature. Built-ins include dspy.Predict (direct completion), dspy.ChainOfThought (step-by-step reasoning), dspy.ReAct (tool-using agent loop), dspy.ProgramOfThought, dspy.BestOfN, dspy.Refine, and more. Modules share the same interface, so swapping strategies requires changing one line.
- Optimizers: algorithms that compile a program against a metric. Available optimizers include BootstrapFewShot, MIPROv2, COPRO, SIMBA, GEPA, BootstrapFinetune, BetterTogether, and others. Each optimizer explores different strategies: few-shot bootstrapping, instruction generation, fine-tuning, or reinforcement-learning-style evolution.
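To make the relationship between signatures and modules concrete, here is a toy, library-free sketch. It is not DSPy's implementation: `parse_signature`, `ToyPredict`, `ToyChainOfThought`, and `fake_lm` are hypothetical names invented for illustration, and the stub model returns a canned string. It shows only the idea that a signature names the inputs and outputs while interchangeable modules decide how the prompt is built.

```python
def parse_signature(sig: str):
    """Split a shorthand like 'question -> answer' into input and output names."""
    left, right = sig.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    return inputs, outputs


def fake_lm(prompt: str) -> str:
    """Stub standing in for a real language model call (always answers '4')."""
    return "4"


class ToyPredict:
    """Direct completion: build a prompt from the inputs and ask once.

    For brevity this toy handles only the first output field.
    """

    def __init__(self, signature: str, lm=fake_lm):
        self.inputs, self.outputs = parse_signature(signature)
        self.lm = lm

    def build_prompt(self, **kwargs) -> str:
        lines = [f"{name}: {kwargs[name]}" for name in self.inputs]
        lines.append(f"{self.outputs[0]}:")
        return "\n".join(lines)

    def __call__(self, **kwargs) -> dict:
        return {self.outputs[0]: self.lm(self.build_prompt(**kwargs))}


class ToyChainOfThought(ToyPredict):
    """Same interface, different strategy: ask for reasoning before the answer."""

    def build_prompt(self, **kwargs) -> str:
        return "Think step by step.\n" + super().build_prompt(**kwargs)


# Swapping strategies is a one-line change because both share one interface.
for module_cls in (ToyPredict, ToyChainOfThought):
    module = module_cls("question -> answer")
    print(module_cls.__name__, module(question="What is 2 + 2?"))
```

In this picture an optimizer would be a third piece that rewrites what `build_prompt` produces (instructions, few-shot examples) and keeps the variant that scores best on a metric.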
Research Lineage
DSPy grew out of the Demonstrate-Search-Predict paper (Dec 2022) and has since produced a series of peer-reviewed publications. Notable papers include the original DSPy paper (Oct 2023, ICLR 2024), MIPROv2 for multi-stage instruction optimization (Jun 2024), BetterTogether combining fine-tuning and prompt optimization (Jul 2024), and GEPA: Reflective Prompt Evolution (Jul 2025). The homepage cites a GEPA experiment showing a RAG program improving from 0.41 to 0.63 F1 on the same small model after compilation.
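The source does not say which F1 variant the GEPA experiment reports. Purely as an illustration, the sketch below computes a token-overlap F1 of the kind commonly used to score question-answering outputs, then the relative gain implied by the two quoted scores. The function name and the whitespace tokenization are assumptions.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between two strings, using lowercase whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# One extra token in the prediction lowers precision but not recall.
print(token_f1("the Eiffel Tower", "Eiffel Tower"))

# Relative improvement implied by the scores quoted from the homepage.
before, after = 0.41, 0.63
relative_gain = (after - before) / before
print(f"relative gain: {relative_gain:.1%}")
```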
Production Deployment Model
According to the project's own use-case page, DSPy runs in production at Shopify, Databricks, Dropbox, JetBlue, Moody's, Replit, AWS, Sephora, and VMware, among others. The homepage attributes a ~550× cost reduction to Shopify's metadata extraction use case. For production use, DSPy integrates with MLflow for tracing (via OpenTelemetry), reproducibility logging, and model serving. The framework is designed to be thread-safe and supports native async execution for high-throughput environments.
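As a rough sketch of what high-throughput async use can look like, the snippet below fans several requests out concurrently with `asyncio.gather` and caps the number of in-flight calls. The coroutine `fake_program` is a hypothetical stand-in for an async call into a compiled program; DSPy's own async entry points are not shown here and should be taken from the official documentation.

```python
import asyncio


async def fake_program(question: str) -> dict:
    """Hypothetical stand-in for an async LM-backed program call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"question": question, "answer": f"echo: {question}"}


async def run_batch(questions, max_concurrency: int = 4):
    """Run all questions concurrently, capping in-flight calls with a semaphore."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(q):
        async with semaphore:
            return await fake_program(q)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(q) for q in questions))


results = asyncio.run(run_batch(["a", "b", "c"]))
print([r["answer"] for r in results])
```

The semaphore is the design choice worth noting: without a cap, a large batch can exceed a provider's rate limit even though each individual call is cheap.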
Update: Version 3.2.1 and 3.3.0b1
The latest stable release is 3.2.1 (published May 5, 2026). A beta release 3.3.0b1 is also available, introducing a new ReActV2 module and improved LM/BaseLM interfaces. The repository shows active development with 523+ merged PRs per year and recent pushes as of June 2026. The homepage reports 6.4M+ monthly downloads and an active Discord community of 8,400+ members.
Pricing
Open Source
Fully free and open-source under the MIT license. Install via pip and use all features without cost.
- All signatures, modules, and optimizers
- MLflow integration
- Async and streaming support
- Community support via GitHub and Discord
Capabilities
Key Features
- Typed input/output signatures replacing raw prompt strings
- Modular execution strategies: Predict, ChainOfThought, ReAct, ProgramOfThought, BestOfN, Refine
- Automatic prompt optimization via multiple optimizers (GEPA, MIPROv2, BootstrapFewShot, COPRO, SIMBA, etc.)
- Fine-tuning support via BootstrapFinetune and BetterTogether
- Tool use and MCP integration with ReAct agent loop
- Multimodal support (Image, Audio field types)
- MLflow integration for tracing, reproducibility, and deployment
- Native async execution and thread-safety for production
- Save and load compiled programs as JSON
- Built-in evaluation utilities and custom metric support
- Caching with configurable cache directory
- Streaming support
