Harbor Framework Team
To provide a framework for specifying and running sandboxed agent tasks for evaluation and optimization at scale.
At a Glance
- AI Research Communities
- Large Language Model Developers
- Academic Research Institutions
AI Tools by Harbor Framework Team
terminal-bench
AI Agent Terminal Benchmark
Latest News
Harbor Framework announces upcoming integration with LangChain.
Harbor Framework Team joins Open Benchmarks Grants initiative to support AI agent evaluation.
Alex Shaw speaks on using Harbor for agent evals at scale at Daytona Compute Conference.
Release of Terminal-Bench 2.0 and the Harbor Framework.
Products & Services
Harbor: an open-source framework for running and optimizing AI agents in containerized sandboxes.
Terminal-Bench: a benchmarking platform for evaluating AI agents on complex, terminal-based tasks.
Market Position
Open-source infrastructure for agentic evaluation, positioned as a more robust and secure alternative to generic LLM benchmarks.
Leadership
Founders
Andy Konwinski
Co-founder of Databricks and Perplexity AI; prominent AI researcher and entrepreneur.
Alex Shaw
Founding Member of Technical Staff at Laude Institute; co-creator of Terminal-Bench.
Executive Team
Andy Konwinski
Founder, Laude Institute
Co-founder of Databricks and Perplexity; lead visionary for Laude's AI research initiatives.
Alex Shaw
Founding Member of Technical Staff
Primary developer and researcher behind Harbor and Terminal-Bench.
Founding Story
Harbor grew out of Terminal-Bench, a collaborative research project by Stanford University and the Laude Institute. It was developed to provide a standardized, secure environment for benchmarking autonomous AI agents.
Business Model
Revenue Model
Non-profit Research Institute / Philanthropic Support
Pricing Tiers
Community-driven and free to use
Target Markets
- AI Research Communities
- Large Language Model Developers
- Academic Research Institutions
- AI Agent benchmarking
- LLM optimization
- Safety and reliability testing for autonomous systems
- Stanford University
- AfterQuery
- Tensorlake
- Snorkel AI