GLiNER
An open-source framework for training and deploying zero-shot Named Entity Recognition models using bidirectional transformer encoders, capable of identifying any entity type without task-specific training.
About GLiNER
GLiNER is an open-source Python framework for training and deploying small Named Entity Recognition (NER) models with zero-shot capabilities, originally developed by Urchade Zaratiana and collaborators and published at NAACL 2024. It runs on CPUs and consumer hardware, supports ONNX export, and the project states it achieves performance competitive with much larger LLMs like ChatGPT and UniNER. The repository is licensed under Apache 2.0 and has accumulated over 3,300 GitHub stars as of mid-2026.
What It Is
GLiNER (Generalist and Lightweight Model for Named Entity Recognition) is a framework that lets developers extract any named entity type from text by specifying labels at inference time; zero-shot use requires no labeled training data or task-specific fine-tuning. It uses bidirectional transformer encoders (BERT-like) rather than autoregressive LLMs, keeping models small and fast. Beyond standard NER, GLiNER supports joint entity and relation extraction, PII detection, multilingual information extraction across 100+ languages, and multi-task token classification through specialized architectures.
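A minimal sketch of what choosing labels at inference time can look like. It assumes the gliner package is installed and relies on the library's GLiNER.from_pretrained and predict_entities calls; the checkpoint name urchade/gliner_medium-v2.1, the threshold default, and the helper names extract_entities and group_by_label are illustrative choices, not something this page specifies.

```python
from collections import defaultdict


def extract_entities(text, labels, model_name="urchade/gliner_medium-v2.1", threshold=0.5):
    """Run zero-shot NER; labels are plain strings chosen at call time."""
    # Imported lazily so group_by_label stays usable without the package.
    from gliner import GLiNER

    model = GLiNER.from_pretrained(model_name)  # downloads weights on first use
    return model.predict_entities(text, labels, threshold=threshold)


def group_by_label(entities):
    """Group entity dicts (keys assumed: 'text' and 'label') by their label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)
```

Calling extract_entities(text, ["person", "location"]) and then the same text with ["drug", "dosage"] needs no retraining in between; only the label list changes. In a real service the model would be loaded once and reused rather than reloaded per call as in this sketch.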
Architecture Options
GLiNER ships with four distinct architectures to match different deployment needs:
- Uni-encoder — the original GLiNER architecture; strong zero-shot capabilities, supports up to ~50 entity types simultaneously.
- Bi-encoder — encodes text and labels separately, scaling to hundreds or thousands of entity types without quality degradation.
- RelEx — joint NER and relation extraction in a single forward pass, enabling knowledge graph construction.
- GLiNER Decoder — a hybrid architecture that generates entity types with a small decoder for maximum flexibility in open NER settings.
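The bi-encoder idea in the list above can be illustrated with a toy sketch: label vectors are computed once and reused, so adding more entity types only adds more cheap vector comparisons. This is not GLiNER's model code. The 2-dimensional vectors are invented, and scoring with a dot product followed by a sigmoid is an assumption made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score_spans(span_vectors, label_vectors, threshold=0.5):
    """Match every span against every label using precomputed label vectors.

    span_vectors:  {span_text: vector} from a text encoder (toy values here)
    label_vectors: {label: vector} from a label encoder, computed once and reused
    """
    matches = []
    for span, span_vec in span_vectors.items():
        # Keep the best-scoring label for this span.
        best_label, best_score = None, 0.0
        for label, label_vec in label_vectors.items():
            score = sigmoid(dot(span_vec, label_vec))
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= threshold:
            matches.append((span, best_label, round(best_score, 3)))
    return matches


# Toy embeddings, invented for illustration only.
LABELS = {"person": [2.0, -1.0], "city": [-1.0, 2.0]}
SPANS = {"Marie Curie": [1.5, -0.5], "Warsaw": [-0.5, 1.5], "the": [-0.2, -0.2]}
```

With these toy values, score_spans(SPANS, LABELS) keeps "Marie Curie" as a person and "Warsaw" as a city, and drops "the" because its best score falls below the threshold.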
Deployment and Optimization
GLiNER is designed to run anywhere: CPU, GPU, edge devices, and cloud. The framework provides several optimization paths:
- torch.compile for up to ~1.5× speedup with no quality loss.
- FP16 quantization (quantize=True) for up to ~1.9× faster GPU inference.
- INT8 quantization for further memory reduction (requires Quantization-Aware Training).
- ONNX export for cross-platform and high-performance inference.
- A Ray Serve-based serving layer with dynamic batching, memory-aware batch sizing, precompiled kernels, horizontal GPU scaling, and an HTTP API.
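To make "memory-aware batch sizing" concrete, here is a small sketch of one common approach: pack texts into batches so that the padded size (batch length times longest sequence) stays under a budget. It is not the implementation used by GLiNER's serving layer. The function name batch_by_token_budget, the budget parameter, and the whitespace token count are all assumptions for illustration.

```python
def batch_by_token_budget(texts, max_tokens_per_batch=512):
    """Greedily pack texts into batches whose padded size stays under a budget.

    Padded size = (number of texts) * (longest text in the batch), because every
    sequence in a batch is padded to the longest one. Token counts are
    approximated by whitespace splitting, which is an assumption of this sketch.
    """
    # Sort by length so similar-length texts share a batch and padding stays low.
    ordered = sorted(texts, key=lambda t: len(t.split()))
    batches, current, longest = [], [], 0
    for text in ordered:
        n_tokens = len(text.split())
        new_longest = max(longest, n_tokens)
        # Close the current batch if adding this text would exceed the budget.
        if current and (len(current) + 1) * new_longest > max_tokens_per_batch:
            batches.append(current)
            current, new_longest = [], n_tokens
        current.append(text)
        longest = new_longest
    if current:
        batches.append(current)
    return batches
```

A text longer than the budget still gets a batch of its own, so nothing is dropped. A production system would count tokens with the model's tokenizer and derive the budget from available GPU memory.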
Popular Use Cases
The documentation highlights several production-oriented use cases:
- Compliance & PII Redaction — detecting and masking 40+ types of personal data (SSN, credit cards, passports, emails, IBANs) across documents and pipelines.
- Knowledge Graph Construction — jointly extracting entities and relations to power Graph RAG and semantic search.
- Large-Scale Entity Extraction — using the bi-encoder to tag millions of documents against hundreds of entity types.
- Domain-Specific NER — fine-tuning on biomedical, legal, financial, or other specialized corpora with minimal labeled data.
- Search & Retrieval Augmentation — parsing queries into structured entities to improve RAG pipelines.
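As an illustration of the PII redaction use case above, the sketch below masks detected spans in a string. It assumes each entity is a dict with 'start', 'end' (character offsets, end exclusive), and 'label' keys, which matches the shape returned by GLiNER's predict_entities as far as this example relies on it. The function name redact and the placeholder format are hypothetical, and spans are assumed not to overlap.

```python
def redact(text, entities, mask="[{label}]"):
    """Replace each detected span with a label placeholder.

    entities: dicts with 'start', 'end' (character offsets, end exclusive)
    and 'label'. Spans are assumed not to overlap.
    """
    redacted = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = mask.format(label=entity["label"].upper())
        redacted = redacted[: entity["start"]] + placeholder + redacted[entity["end"] :]
    return redacted
```

For a text such as "Email jane@example.com or call 555-0100." with an email span and a phone number span, the result keeps the surrounding words and replaces only the two spans with their uppercased labels.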
Ecosystem and Integrations
GLiNER has spawned a community ecosystem of related projects including GLiClass (zero-shot text classification), GLinker (entity linking), GLiNER.cpp (C++ inference), gline-rs (Rust implementation), a spaCy integration (gliner-spacy), and a vLLM integration for scalable serving. The project maintains a Discord community and a subreddit (r/GLiNER), and models are distributed via Hugging Face under the gliner-community organization.
Update: v0.2.27 and GLiNER2
The latest release is v0.2.27, published May 11, 2026. Alongside the core library, the GLiNER family has expanded: GLiNER2 (from Fastino Labs) is described as a unified multi-task model for NER, text classification, and structured data extraction, with a paper published in 2025. Additional follow-up work includes GLiNER2-PII (multilingual PII extraction) and GLiGuard (schema-conditioned classification for LLM safeguarding), signaling active research and product development around the GLiNER architecture.
Pricing
Open Source
Free and open source under the Apache License 2.0. Install via pip, then use, modify, or distribute it under the license terms.
- Zero-shot NER
- Fine-tuning support
- ONNX export
- Ray Serve integration
- All architectures (Uni-encoder, Bi-encoder, RelEx, Decoder)
Capabilities
Key Features
- Zero-shot Named Entity Recognition
- Joint entity and relation extraction
- PII detection and redaction
- Multilingual support (100+ languages)
- Bi-encoder architecture for 100+ entity types
- ONNX export for cross-platform deployment
- Ray Serve-based production serving with dynamic batching
- torch.compile and FP16/INT8 quantization
- Fine-tuning on custom datasets
- Knowledge graph construction
- spaCy integration
- Hugging Face model hub integration
- CPU and consumer hardware support
