Instructor
A Python library for structured data extraction from LLMs using Pydantic validation and automatic retries.
At a Glance
Pricing
Free and open source library available on GitHub
Engagement
Available On
About Instructor
Instructor is a Python library that simplifies extracting structured data from large language models (LLMs). Built on top of Pydantic, it provides a seamless way to define data schemas and automatically validate LLM outputs, ensuring type-safe and reliable data extraction. The library supports automatic retries with error correction, making it robust for production use cases.
-
Pydantic Integration - Define your data models using familiar Pydantic syntax, and Instructor handles the conversion to LLM-compatible prompts and validates responses automatically.
-
Automatic Retries - When LLM outputs don't match your schema, Instructor automatically retries with error context, improving success rates without manual intervention.
-
Multi-Provider Support - Works with OpenAI, Anthropic, Google, Cohere, Mistral, and other major LLM providers through a unified interface.
-
Streaming Support - Extract structured data from streaming responses, enabling real-time data processing and partial results.
-
Validation Hooks - Add custom validators to your Pydantic models for complex business logic validation beyond type checking.
-
Multimodal Capabilities - Extract structured data from images and other multimodal inputs supported by compatible LLMs.
-
Parallel Extraction - Process multiple extraction tasks concurrently for improved throughput in batch processing scenarios.
To get started, install Instructor via pip with pip install instructor. Import the library, patch your OpenAI client, and define a Pydantic model for your desired output structure. Call the patched client with your model as the response_model parameter, and Instructor handles the rest. The library includes comprehensive documentation with examples for common use cases like entity extraction, classification, and data transformation.

Community Discussions
Be the first to start a conversation about Instructor
Share your experience with Instructor, ask questions, or help others learn from your insights.
Pricing
Free Plan Available
Free and open source library available on GitHub
- Full library access
- All LLM provider integrations
- Community support
- Documentation access
Capabilities
Key Features
- Structured data extraction from LLMs
- Pydantic model validation
- Automatic retry with error correction
- Multi-provider LLM support
- Streaming response handling
- Custom validation hooks
- Multimodal input support
- Parallel extraction processing
- Type-safe outputs
- OpenAI function calling integration