    EveryDev.ai

    Tracking AI

    LLM Evaluations

    A free web tool that quizzes 17+ AI models weekly on IQ tests and political compass questions to monitor and compare AI biases and capabilities over time.


    At a Glance

    Pricing

    Free tier available

    Full access to all AI tracking, IQ test results, political compass data, and searchable database at no cost.

    Available On

    Web
    API

    Resources

    • Website
    • Docs
    • llms.txt

    Topics

    • LLM Evaluations
    • Market Analysis
    • Academic Research

    Listed Mar 2026

    About Tracking AI

    Tracking AI is a free monitoring platform that quizzes major AI chatbots on IQ tests and political compass questions every week and publishes the results publicly. Created by Maxim Lott (executive producer of Stossel TV), the site tracks both verbal and vision AI models, including GPT, Claude, Gemini, Grok, DeepSeek, and more. It provides interactive charts showing AI IQ scores over time and political bias positions, helping users understand the ideological and cognitive tendencies of the AI tools they use. For benchmarking, the site uses an offline IQ test written by a Mensa member and never published online, so it cannot appear in any AI training data.

    • Political Compass Testing — All tracked AIs are asked standardized political compass questions weekly, with results plotted on an economic vs. social axis to reveal ideological leanings.
    • IQ Benchmarking — AIs are tested using both the public Mensa Norway test and an exclusive offline test never published on the internet, preventing training data contamination.
    • Vision Model Support — Vision-capable AI models are tested using actual test images rather than verbalized prompts, enabling separate tracking of multimodal performance.
    • Historical Score Tracking — Interactive time-series charts show how each AI's IQ and political scores have changed over time, with daily and moving-average views.
    • Per-Question Answer Viewer — Users can browse every AI's exact answer to every test question, including reasoning and refusal tracking.
    • Searchable Database — A searchable database of all AI responses is available for deeper research and comparison.
    • Refusal Tracking — The site records when AIs refuse to answer questions, using the most recent prior answer for scoring and flagging the refusal.
    • Model Info Pages — Each tracked AI has a dedicated info page linking to where users can access that chatbot directly.
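
    The refusal-handling rule above (a refusal is scored using the model's most recent prior answer, and the refusal itself is flagged) amounts to a simple carry-forward pass over each model's weekly answers. The sketch below is an illustrative Python reconstruction of that rule, not Tracking AI's actual code; the data shape (a list with `None` marking a refusal) is an assumption.

    ```python
    # Hypothetical sketch of carry-forward scoring for refusals.
    # Each entry is one week's answer; None marks a refusal.

    def score_with_carry_forward(weekly_answers):
        """Return (scored_answers, refusal_flags).

        Refusals reuse the most recent prior answer and are flagged;
        refusals before any answer exists stay unscored (None)."""
        scored, flags = [], []
        last_answer = None
        for ans in weekly_answers:
            if ans is None:                 # the model refused this week
                scored.append(last_answer)  # carry forward the prior answer
                flags.append(True)
            else:
                scored.append(ans)
                last_answer = ans
                flags.append(False)
        return scored, flags

    answers = ["agree", None, "disagree", None]
    scored, flags = score_with_carry_forward(answers)
    # scored -> ["agree", "agree", "disagree", "disagree"]
    # flags  -> [False, True, False, True]
    ```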


    Pricing

    FREE

    Free Plan Available


    • Weekly IQ testing for 17+ verbal and vision AI models
    • Political compass tracking and historical charts
    • Per-question answer viewer
    • Searchable database of AI responses
    • Refusal tracking
    View official pricing

    Capabilities

    Key Features

    • Weekly AI IQ testing (verbal and vision)
    • Political compass tracking for 17+ AI models
    • Offline Mensa IQ test not in AI training data
    • Interactive historical score charts
    • Per-question answer viewer for all AIs
    • Searchable database of AI responses
    • Refusal tracking and scoring
    • Vision model image-based testing
    • AI model info and access links

    Integrations

    OpenAI GPT
    Claude (Anthropic)
    Gemini (Google)
    Grok (xAI)
    DeepSeek
    Llama (Meta)
    Bing Copilot (Microsoft)
    Mistral
    Perplexity
    Kimi (MoonShot AI)
    Qwen (Alibaba)
    Manus
    API Available
    View Docs


    Developer

    Maxim Lott / Maximum Truth

    Maxim Lott builds data-driven public interest projects focused on transparency and accuracy. He serves as executive producer of Stossel TV and created ElectionBettingOdds.com (25M+ unique visits) and TaxPlanCalculator.com (2M+ uses). He runs the Maximum Truth Substack for data deep dives and built Tracking AI to expose political and cognitive biases in major AI chatbots.

    Website · X / Twitter
    1 tool in directory

    Similar Tools


    Amplifying

    AI benchmarking research studio that systematically measures the subjective choices AI systems make, such as tool recommendations, product picks, and build decisions.


    SkillsBench

    An open-source evaluation framework that benchmarks how well AI agent skills work across diverse, expert-curated tasks in high-GDP-value domains.


    LLM Stats

    Public leaderboards and benchmark site that publishes verifiable evaluations, scores, and performance metrics for large language models and AI providers.


    Related Topics

    LLM Evaluations

    Platforms and frameworks for evaluating, testing, and benchmarking LLM systems and AI applications. These tools provide evaluators and evaluation models to score AI outputs, measure hallucinations, assess RAG quality, detect failures, and optimize model performance. Features include automated testing with LLM-as-a-judge metrics, component-level evaluation with tracing, regression testing in CI/CD pipelines, custom evaluator creation, dataset curation, and real-time monitoring of production systems. Teams use these solutions to validate prompt effectiveness, compare models side-by-side, ensure answer correctness and relevance, identify bias and toxicity, prevent PII leakage, and continuously improve AI product quality through experiments, benchmarks, and performance analytics.
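
    The LLM-as-a-judge metrics mentioned above follow a common pattern: a prompt asks a judge model to score an output, and the tool parses the score and compares it to a pass threshold. The following is a minimal, hedged Python sketch of that pattern; `call_judge_model` is a hypothetical stand-in for a real model API and is stubbed so the example runs on its own.

    ```python
    # Minimal LLM-as-a-judge sketch. call_judge_model is a hypothetical
    # placeholder for a real model API call, stubbed here for illustration.

    JUDGE_PROMPT = (
        "Rate the answer's correctness from 1-5 given the question.\n"
        "Question: {question}\nAnswer: {answer}\nReply with a single digit."
    )

    def call_judge_model(prompt):
        # Stub: a real eval tool would send `prompt` to an LLM endpoint.
        return "4"

    def judge_answer(question, answer, threshold=3):
        """Score an answer with the judge model; pass if score >= threshold."""
        raw = call_judge_model(JUDGE_PROMPT.format(question=question, answer=answer))
        score = int(raw.strip()[0])
        return {"score": score, "passed": score >= threshold}

    result = judge_answer("What is 2+2?", "4")
    # result -> {"score": 4, "passed": True}
    ```

    Real eval platforms add retries, structured-output parsing, and calibration of the judge itself, but the score-then-threshold core is the same.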

    36 tools

    Market Analysis

    AI-driven platforms that analyze market trends, competitive landscapes, and consumer behavior patterns to provide actionable intelligence for strategic marketing decisions.

    18 tools

    Academic Research

    AI tools designed specifically for academic and scientific research.

    23 tools
    With AI, Everyone is a Dev. EveryDev.ai © 2026