SciArena

Name: SciArena
Availability: OnlineOnly
Author: Allen Institute for AI

Open evaluation platform from the Allen Institute for AI where researchers compare and rank foundation models on scientific literature tasks using head-to-head, literature-grounded responses.

At a Glance

Pricing

Free

Free access to core SciArena search, summarization, and conversational features.

Available On

Web

API

Allen Institute for AI | Seattle, WA | Est. 2014 | $40M raised

Updated Feb 2026

About SciArena

SciArena is an open evaluation platform from the Allen Institute for AI (Ai2) for benchmarking foundation models on scientific literature tasks. Instead of relying on static benchmarks, SciArena collects head-to-head comparisons from human researchers: users submit research questions, see side-by-side, literature-grounded answers from two models, and vote for the better response. These votes drive a public leaderboard and power SciArena-Eval, a meta-evaluation benchmark for testing LLM-as-judge systems.
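To make the vote-to-leaderboard step concrete, here is a minimal sketch of a textbook online Elo update applied to pairwise votes. The listing only says the ratings are "Elo-style", so this is not necessarily how SciArena computes its leaderboard; the K-factor of 32, the starting rating of 1000, and the model names are all assumptions chosen for illustration.

```python
from collections import defaultdict


def expected_score(r_a: float, r_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_ratings(votes, k: float = 32.0, initial: float = 1000.0) -> dict:
    """Apply one Elo update per vote.

    votes: iterable of (model_a, model_b, winner) tuples, where winner is
    either model_a or model_b. k and initial are assumed values, not
    parameters documented by SciArena.
    """
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, winner in votes:
        r_a, r_b = ratings[model_a], ratings[model_b]
        e_a = expected_score(r_a, r_b)
        s_a = 1.0 if winner == model_a else 0.0
        # The two adjustments are equal and opposite, so total rating is conserved.
        ratings[model_a] = r_a + k * (s_a - e_a)
        ratings[model_b] = r_b - k * (s_a - e_a)
    return dict(ratings)


# Hypothetical votes: (first model, second model, preferred model).
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-z", "model-x"),
]
ratings = update_ratings(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Online Elo depends on the order in which votes arrive. Arena-style leaderboards often fit a Bradley-Terry model over all votes at once to avoid that, which would be a reasonable alternative to this sketch.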

Arena-style model comparison — Submit scientific questions, inspect long-form, citation-attributed answers from two foundation models, and cast a vote for the preferred output.
Leaderboard with Elo-style ratings — Track how models like o3, Claude, Gemini, and DeepSeek rank overall and by scientific discipline using an Elo-style rating system.
SciArena-Eval benchmark — Use the released human preference data and code to study automated evaluators, LLM-as-judge setups, and model alignment with expert judgments.
Literature-grounded retrieval — Behind the scenes, SciArena uses a multi-stage retrieval pipeline over the Semantic Scholar corpus to ground answers in relevant, up-to-date papers.
Research-grade data quality controls — Expert annotators, training, blind ratings, and agreement checks help ensure the preference data is reliable enough for serious evaluation work.
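The meta-evaluation idea behind SciArena-Eval can be sketched as a simple agreement rate: for each comparison, check whether an automated judge picks the same response as the human voter. The dictionary layout and the "A"/"B" labels below are hypothetical and do not reflect the schema of the released data.

```python
def judge_agreement(human_votes: dict, judge_votes: dict) -> float:
    """Fraction of shared comparisons where the judge matches the human preference.

    Both arguments map a comparison id to the preferred response ("A" or "B").
    Only ids present in both dictionaries are scored.
    """
    shared = human_votes.keys() & judge_votes.keys()
    if not shared:
        raise ValueError("no comparisons in common")
    matches = sum(human_votes[cid] == judge_votes[cid] for cid in shared)
    return matches / len(shared)


# Hypothetical preference labels for four comparisons.
human_votes = {"q1": "A", "q2": "B", "q3": "A", "q4": "B"}
judge_votes = {"q1": "A", "q2": "B", "q3": "B", "q4": "B"}

agreement = judge_agreement(human_votes, judge_votes)
print(f"judge agrees with humans on {agreement:.0%} of comparisons")
```

With two options per comparison, a judge that guesses at random would score around 50%, so agreement rates are best read against that baseline.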

Pricing

Free

Free access to core SciArena search, summarization, and conversational features.

Core semantic search
AI-generated summaries
Conversational Q&A
Basic filters and citation export

Capabilities

Key Features

Semantic search across scientific literature
AI-generated paper summaries
Conversational Q&A over papers
Filters for date/venue/author and citation export

Integrations

Semantic Scholar

arXiv

PubMed

API Available

Engagement

33 views

Discussions

Resources

Website, Docs, llms.txt

Topics

Academic Research, LLM Evaluations, Information Synthesis

Alternatives

ASTA, AutoDiscovery, olmOCR
