EveryDev.ai
    Anserini

    Academic Research
    Featured

    A Lucene-based toolkit for reproducible information retrieval research, bridging academic IR research and real-world search application development.

    At a Glance

    Pricing
    Open Source

    Fully free and open-source under the Apache License 2.0. No cost to use, modify, or distribute.

    Available On

    CLI
    API
    SDK

    Topics

    Academic Research, Search and Discovery, Retrieval-Augmented Generation

    Alternatives

    EnterpriseRAG-Bench, LOFT, SkillsBench

    Listed May 2026

    About Anserini

    Anserini is an open-source Java toolkit built on Apache Lucene, designed to make information retrieval research reproducible and practically applicable. Maintained by the Castorini research group, it grew out of a 2016 reproducibility study of open-source retrieval engines (Lin et al., ECIR 2016) and has since been described in peer-reviewed publications at SIGIR 2017 and the Journal of Data and Information Quality (2018). The project is licensed under Apache 2.0 and is actively developed on GitHub with over 380 contributors.

    What It Is

    Anserini is a research toolkit that wraps Apache Lucene to provide a principled, reproducible environment for information retrieval (IR) experiments. Its core job is to let researchers index document collections, run retrieval experiments, and reproduce published baselines, all from a consistent, version-controlled codebase. The project explicitly positions itself as a bridge between academic IR research and the engineering of real-world search systems. A companion Python interface, Pyserini, exposes most Anserini features for users who prefer Python over Java.

    Architecture and Setup Paths

    Anserini offers two primary installation modes:

    • Fatjar: A self-contained JAR downloaded via curl, requiring no repository clone. This is the fastest path for running experiments.
    • Dev environment: A full repository clone for contributors or users who need to modify source code.

    The toolkit is primarily written in Java (83%), with Python (14%) and Shell scripts rounding out the codebase. It is distributed on Maven Central under the io.anserini namespace, making it easy to include as a dependency in other Java projects.
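
    Because the fatjar is published on Maven Central, its download location can be derived from the standard Maven repository layout (group path, artifact, version, optional classifier). The sketch below builds such a URL. The artifact id `anserini`, the `fatjar` classifier, and the `X.Y.Z` version placeholder are assumptions for illustration, not values confirmed by this listing; check the project README for the exact file name before downloading.

```python
def maven_central_jar_url(group_id: str, artifact_id: str, version: str,
                          classifier: str = "") -> str:
    """Build a download URL following the standard Maven repository layout."""
    base = "https://repo1.maven.org/maven2"
    # Dots in the group id become path separators in the repository.
    group_path = group_id.replace(".", "/")
    # A classifier, when present, is appended after the version.
    suffix = f"-{classifier}" if classifier else ""
    return (f"{base}/{group_path}/{artifact_id}/{version}/"
            f"{artifact_id}-{version}{suffix}.jar")


# Hypothetical values: artifact id, classifier, and version are assumptions.
url = maven_central_jar_url("io.anserini", "anserini", "X.Y.Z", classifier="fatjar")
print(url)
```

    The resulting URL could then be passed to curl, which matches the download path the listing describes for the fatjar mode.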

    Reproducibility as a First-Class Goal

    The project's stated mission is reproducible IR research. It ships with prebuilt index registries and topic registries so that published experimental results can be re-run with a single command. Two reproduction workflows are documented: one from prebuilt indexes (faster) and one from raw document collections (more thorough). The repository includes dedicated runs/ and logs/ directories to capture experiment outputs in a structured way, and CI badges report the build and test status of the master branch.
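
    Retrieval experiments in the IR community are commonly stored as run files in the six-column TREC format that trec_eval consumes (query id, the literal `Q0`, document id, rank, score, run tag). The sketch below formats and parses one such line. It assumes the files under runs/ follow this convention, which this listing does not state explicitly; the `RunEntry` type and both helper functions are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class RunEntry(NamedTuple):
    qid: str      # query (topic) identifier
    docid: str    # retrieved document identifier
    rank: int     # 1-based position in the ranked list
    score: float  # retrieval score (higher is better)
    tag: str      # free-form name of the run


def format_run_line(entry: RunEntry) -> str:
    """Serialize one entry as a six-column TREC run line."""
    return (f"{entry.qid} Q0 {entry.docid} {entry.rank} "
            f"{entry.score:.6f} {entry.tag}")


def parse_run_line(line: str) -> RunEntry:
    """Parse a whitespace-separated TREC run line back into an entry."""
    qid, _q0, docid, rank, score, tag = line.split()
    return RunEntry(qid, docid, int(rank), float(score), tag)


line = format_run_line(RunEntry("q1", "doc7", 1, 12.5, "bm25"))
print(line)
print(parse_run_line(line))
```

    Keeping run output in a fixed, line-oriented format like this is what lets a reproduction be compared against a published baseline with a plain diff or an evaluation tool.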

    Agent-Aware Workflow

    Anserini has added explicit support for coding agents (such as those powered by large language models). The repository includes an .agents/skills/ directory with structured skill files for:

    • Installing the dev environment or fatjar
    • Running CLI commands (prebuilt-index registry, topics registry, search, REST workflows)
    • Executing reproducibility experiments

    The README provides direct prompt templates users can give to their coding agents, making Anserini one of the earlier research toolkits to formally document agent-oriented onboarding paths.

    Update: v2.0.0 and Lucene 10.4.0

    As of April 12, 2026 (commit c6eed6), Anserini was upgraded to Lucene 10.4.0 as part of the v2.0.0 release. Lucene 9 indexes remain readable by the new code, but indexes generated by Lucene 10 cannot be read by older versions of Anserini. The repository shows active development with commits as recent as May 20, 2026, including SPLADE-v3 ONNX reproduction updates and locale-stable reproduction output fixes.
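
    The compatibility statement above can be read as a simple rule: a given Lucene major version opens indexes written by itself or by the immediately preceding major version, but not by a later one. The sketch below encodes that rule as a minimal check. It is a simplification based on Lucene's general backward-compatibility policy, not Anserini code, and the function name is hypothetical.

```python
def can_read_index(reader_lucene_major: int, index_lucene_major: int) -> bool:
    """Assumed rule: a reader opens indexes written by its own major
    version or by the one directly before it, never by a newer one."""
    return reader_lucene_major - 1 <= index_lucene_major <= reader_lucene_major


# Lucene 10 code reading a Lucene 9 index: allowed under the assumed rule.
print(can_read_index(10, 9))
# Lucene 9 code reading a Lucene 10 index: not allowed.
print(can_read_index(9, 10))
```

    In practice this means an index built after the upgrade should be treated as one-way: keep a copy of any Lucene 9 index that older Anserini versions still need to open.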

    Pricing

    • Apache License 2.0
    • Full source code access
    • Maven Central distribution
    • Fatjar download
    • Community contributions

    Capabilities

    Key Features

    • Lucene-based indexing and retrieval
    • Reproducible IR experiment framework
    • Prebuilt index registry
    • Topics registry
    • BM25 and dense retrieval support
    • SPLADE and ONNX model support
    • Fatjar self-contained distribution
    • Maven Central package
    • Pyserini Python interface
    • Agent-oriented skill files for coding agents
    • REST API workflows
    • Prebuilt and raw document collection reproduction paths

    Integrations

    Apache Lucene
    Pyserini
    Maven Central
    ONNX
    SPLADE
    trec_eval
    MS MARCO
    BEIR
    Developer

    Castorini

    Castorini is a research group that builds open-source toolkits for information retrieval and natural language processing research. The group develops Anserini (Java/Lucene) and Pyserini (Python) to enable reproducible IR experiments at scale. Led by Jimmy Lin and collaborators across academia, Castorini focuses on bridging the gap between academic research and practical search system engineering. Their tools are widely used in the IR research community and are published under permissive open-source licenses.


    Similar Tools

    EnterpriseRAG-Bench

    An open-source benchmark dataset of 500,000+ enterprise documents and 500 questions for evaluating RAG systems on realistic company internal data.

    LOFT

    LOFT (Long-context Frontiers) is a Google DeepMind benchmark for evaluating large language models on long-context retrieval and reasoning tasks across diverse modalities.

    SkillsBench

    An open-source evaluation framework that benchmarks how well AI agent skills work across diverse, expert-curated tasks in high-GDP-value domains.

    Related Topics

    Academic Research

    AI tools designed specifically for academic and scientific research.

    42 tools

    Search and Discovery

    AI-powered tools for finding and exploring information.

    40 tools

    Retrieval-Augmented Generation

    RAG Systems that enhance LLM outputs by retrieving relevant information from external knowledge bases, combining the power of generative AI with information retrieval for more accurate and contextual responses.

    72 tools