Design Arena

Name: Design Arena
Availability: OnlineOnly
Author: Arcada Labs

Crowdsourced benchmark for AI‑generated design. Users vote on head‑to‑head outputs (web UI, images, video, audio) to rank models by human preference.

At a Glance

Pricing

Free tier available

Full public access to Design Arena

Enterprise: Custom/contact

Available On

Web

Arcada Labs, San Francisco, CA, Est. 2025

Updated May 2026

About Design Arena

Design Arena is the world's first crowdsourced benchmark platform dedicated to evaluating AI-generated design quality through real human preference testing. Built by Y Combinator S25 company Arcada Labs, it uses head-to-head comparisons where users vote on anonymous AI outputs across categories like websites, UI components, images, video, audio, logos, and data visualizations. The platform applies a Bradley-Terry rating system (Elo-style scoring) to aggregate thousands of votes into transparent public leaderboards that reveal which AI models produce designs people actually prefer.

How Design Arena Works

Design Arena presents two AI-generated outputs side-by-side from identical prompts. Users vote on which design they prefer, and these votes feed into an Elo-based ranking system that updates in real time. Bot protection via captcha ensures only human preferences count toward the benchmark.
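The real-time update described above can be illustrated with a classic online Elo update. This is only a minimal sketch: the starting rating of 1000, the K-factor of 32, the 400-point scale, and the model names are illustrative assumptions, not values published by Design Arena.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def apply_vote(ratings: dict[str, float], winner: str, loser: str, k: float = 32.0) -> None:
    """Update both ratings in place after one head-to-head vote."""
    r_w = ratings.setdefault(winner, 1000.0)  # assumed starting rating
    r_l = ratings.setdefault(loser, 1000.0)
    # The less expected the win, the larger the rating transfer.
    surprise = 1.0 - expected_score(r_w, r_l)
    ratings[winner] = r_w + k * surprise
    ratings[loser] = r_l - k * surprise


# Hypothetical votes as (winner, loser) pairs.
ratings: dict[str, float] = {}
for winner, loser in [("model_a", "model_b"), ("model_a", "model_c"), ("model_c", "model_b")]:
    apply_vote(ratings, winner, loser)

for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.1f}")
```

Each vote moves the same number of points from the loser to the winner, so the total rating pool stays constant while the ordering shifts.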

Design Arena Categories

Arena	What It Benchmarks	Example Tools
Model Arena	LLMs generating single-file HTML/CSS/JS code	OpenAI, Anthropic, Google Gemini, xAI, DeepSeek, Mistral
Builder Arena	Vibe-coding tools deploying complete web apps	Lovable, Bolt, v0, Replit, Cursor, Devin, Firebase Studio
Mobile Builder Arena	Mobile app generators	Rork, Blink.new
Image Arena	Image diffusion models	Midjourney, Black Forest Labs, Ideogram, Recraft
Video Arena	Video generation models	Luma Labs, Kling AI, Pika, Midjourney
Slides Arena	Presentation generators	SlidesGPT, Gamma

Design Arena Model Coverage

Design Arena tracks 50+ LLMs, 12+ image models, 4+ video models, and 22+ audio models across its specialized arenas. Each arena uses category-specific prompts and evaluation criteria so that models are compared only against others in the same domain.

Design Arena Features

Elo-Based Rankings - Uses the Bradley-Terry model to calculate win rates and Elo scores from pairwise comparisons, providing statistically robust rankings
Micro Evals - Automated code evaluations that test agent-generated apps for specific technical criteria like Next.js routing, Tailwind implementation, and Vercel deployment
Transparent Methodology - Publishes all system prompts, evaluation methods, and ranking formulas openly so users can verify rankings reflect genuine community preferences
Private Enterprise Evaluations - Offers companies secure version-over-version testing to track model improvements and accelerate R&D cycles with human preference data
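As a rough illustration of the Elo-Based Rankings feature, the sketch below fits a Bradley-Terry model to a batch of pairwise votes and maps the fitted strengths onto an Elo-like scale. It is not Design Arena's published formula: the iterative (minorization-maximization) fitting routine, the 1000-point baseline, the 400-point scale, and the model names are all assumptions, and the sketch assumes every model has at least one win and one loss.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes: list[tuple[str, str]], iterations: int = 200) -> dict[str, float]:
    """Estimate Bradley-Terry strengths from (winner, loser) pairs."""
    wins: dict[str, float] = defaultdict(float)
    games: dict[tuple[str, str], float] = defaultdict(float)  # comparisons per ordered pair
    for winner, loser in votes:
        wins[winner] += 1.0
        games[(winner, loser)] += 1.0
        games[(loser, winner)] += 1.0
    models = sorted({m for pair in games for m in pair})
    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for i in models:
            # MM update: wins divided by the expected-comparison weight.
            denom = sum(
                n / (strength[i] + strength[j])
                for (a, j), n in games.items()
                if a == i
            )
            updated[i] = wins[i] / denom
        # Rescale so the geometric mean of the strengths is 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo_scale(strength: dict[str, float], base: float = 1000.0) -> dict[str, float]:
    """Map strengths to an Elo-like scale (assumed 400-point logistic scale)."""
    return {m: base + 400.0 * math.log10(s) for m, s in strength.items()}


# Hypothetical vote log: each pair met four times, with a 3-1 split.
votes = (
    [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
    + [("model_b", "model_c")] * 3 + [("model_c", "model_b")]
    + [("model_a", "model_c")] * 3 + [("model_c", "model_a")]
)
strength = fit_bradley_terry(votes)
elo = to_elo_scale(strength)
win_rate_a_over_b = strength["model_a"] / (strength["model_a"] + strength["model_b"])
```

Unlike a one-vote-at-a-time update, a batch fit like this does not depend on the order in which votes arrived, and the fitted strengths directly give a predicted win rate for any pair.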

Getting Started with Design Arena

Visit the arena to vote on design matchups and explore leaderboards to discover which AI models excel at specific design tasks.

Pricing

Free

Full public access to Design Arena

Head-to-head voting on AI designs
Access to all arena categories
Public leaderboard access
Create and view tournaments
Real-time Elo rankings

Enterprise

Private evaluations for AI companies and teams

Custom (contact sales)

Private benchmark environments
Version-over-version model testing
Proprietary evaluation workflows
Human preference data for R&D
Custom prompt sets
Analytics and performance tracking
API access for workflow integration

Capabilities

Key Features

Head‑to‑head voting UI for AI‑generated outputs
Public leaderboards ranked by live community votes
Model comparison studio that does not affect rankings
Human vs AI challenge mode (“Humanity”)
Published methodology and system prompts
Captcha/bot‑resistance for human‑only ratings
Private evaluations for enterprises

Integrations

OpenAI

Anthropic Claude

Google Gemini

xAI Grok

DeepSeek

Mistral

Alibaba Qwen

Moonshot Kimi

Cohere

Zhipu

Meta LLaMA

Lovable

Bolt.new

Replit

Cursor

Devin

Firebase Studio

new.website

Magic Patterns

Figma Make

Rork

Blink.new

Midjourney

Black Forest Labs

Ideogram

Recraft

Luma Labs

Kling AI

Pika

SlidesGPT

Gamma

Developer

Arcada Labs

# Arcada Dev (Arcada Labs)

Arcada Dev is the product studio behind **Arcada Labs**, a team building “creative environments” that turn fuzzy human traits (like taste, aesthetic judgment, and play) into something measurable. Their bet is simple: if we can’t measure what humans *prefer*, we can’t reliably improve AI that makes things for humans.

---

## What Arcada Dev builds

Arcada Labs frames its work as three “arenas”:

* **Taste → Design Arena**: measuring what “looks right”
* **Sound → Audio Arena**: measuring what “sounds right”
* **Play → (coming soon)**: measuring what “feels right”

Think of these as public training grounds for subjectivity: instead of debating taste endlessly, you run controlled matchups, collect votes, and let the data tell you where models actually land.

---

## Flagship: Design Arena

**[Design Arena](/tools/design-arena)** is a crowdsourced benchmark for AI-generated design, spanning real-world creative tasks (like UI/front-end design, images, audio, video, and more) and evaluating them with live, organic user feedback.

### The core mechanic (simple, but sharp)

Design Arena uses blind, head-to-head tournaments:

* Two model outputs face off on the same prompt/task
* The model names are hidden to reduce brand bias
* Users vote for the better result
* Those matchups roll up into a public leaderboard

### How ranking works (in plain English)

Every vote is treated like a “match.” Over many matches, Design Arena estimates how likely each model is to beat another and converts that into a rating you can compare across the field (presented in an Elo-style form).

---

## Why it exists

LLMs can ace tests and proofs, but design failures are painfully human: unreadable contrast, awkward layouts, weird spacing, “technically correct” outputs that still feel wrong.

Design Arena exists because there hasn’t been a standard way to pressure-test taste, usability, and aesthetics at scale. No benchmark → no consistent feedback loop → slow improvement.

---

## Traction

Design Arena gained significant early adoption, drawing tens of thousands of users across well over a hundred countries within weeks of launch, and later expanding to well over a hundred thousand users worldwide.

---

## Mission & posture

Arcada’s posture is refreshingly direct:

* Build grounded, real-user evaluation instead of vibes and anecdotes
* Make the benchmark public so progress is visible (and comparable)
* Use the leaderboard as a mirror that reveals limitations, not a trophy case

Design Arena also presents a strong “access” stance: tomorrow’s creative evaluation tools should be broadly available, not gated behind enterprise walls.

---

## Team

Arcada Labs is led by a small founding team with deep technical roots and a shared background, including experience building at Apple. Public-facing leadership includes:

* **Grace Li (CEO)**
* **Kamryn Ohly (CTO)**

---

## Funding & company status

Arcada Labs is a 2025-founded company, backed by an accelerator/incubator round, and associated with Y Combinator.

---

## Contact

Arcada maintains a founder-facing contact channel for leaderboard nominations, partnerships, and community collaboration.

Founded 2025

San Francisco, CA

Similar Tools

LM Arena

Web platform for comparing, running, and deploying large language models with hosted inference and API access.

BridgeBench

BridgeBench ranks AI coding models across UI generation, security, refactoring, hallucination, debugging, and speed benchmarks.

Attention Insight

AI-powered pre-launch visual analytics tool that predicts where users look on designs using eye-tracking heatmaps trained on millions of real fixations.
