Marin

Name: Marin
Availability: OnlineOnly
Author: marin-community

An open-source framework for researching and developing foundation models, with full reproducibility of every step from raw data to final model.

At a Glance

Pricing

Open Source

Fully free and open-source under Apache License 2.0. Free to use, modify, and distribute.

Available On

Windows

Web

API

SDK

CLI

marin-community · Stanford University, CA · Est. 2025

Listed May 2026

About Marin

Marin is an open-source framework built by the marin-community organization for the research and development of foundation models. It operates as an open lab: every step of the model-building process (data curation, training, evaluation, and even failed experiments) is recorded and shared publicly in real time. The project is licensed under Apache 2.0 and hosted on GitHub, with documentation on Read the Docs.

What It Is

Marin is a Python-based framework designed to make foundation model research fully reproducible and transparent. Rather than sharing only final model weights, Marin captures the entire provenance graph: raw data sources, tokenization pipelines, training configurations, hyperparameter choices, and evaluation results. It targets researchers and practitioners who want to train language models with Llama-, DeepSeek-, or Qwen-style architectures from scratch and who want every decision to be auditable and replicable.
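
One way to picture a provenance graph is to give every step a fingerprint derived from its own configuration and the fingerprints of the steps it depends on, so that any upstream change is visible downstream. The sketch below illustrates that idea only; it is not Marin's API, and the function name step_fingerprint, the step names, and the config values are all hypothetical.

```python
import hashlib
import json


def step_fingerprint(name, config, dep_fingerprints):
    """Hash a step's name, its config, and the fingerprints of its dependencies."""
    payload = json.dumps(
        {"name": name, "config": config, "deps": sorted(dep_fingerprints)},
        sort_keys=True,  # stable key order so equal configs hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical three-step chain: download -> tokenize -> train.
raw = step_fingerprint("download", {"source": "example-corpus"}, [])
tok = step_fingerprint("tokenize", {"tokenizer": "llama3"}, [raw])
train = step_fingerprint("train", {"lr": 3e-4, "steps": 1000}, [tok])

# Changing a hyperparameter changes the training fingerprint.
train_changed = step_fingerprint("train", {"lr": 1e-4, "steps": 1000}, [tok])

# Changing the raw data source propagates through every downstream step.
raw_v2 = step_fingerprint("download", {"source": "other-corpus"}, [])
tok_v2 = step_fingerprint("tokenize", {"tokenizer": "llama3"}, [raw_v2])
```

Because each fingerprint folds in its dependencies, two runs with the same final fingerprint share the same recorded data sources, tokenizer settings, and hyperparameters, which is the property an auditable pipeline needs.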

How the Experiment Workflow Works

Marin structures research as a directed acyclic graph of steps, similar to a Makefile, where each step can depend on prior steps and is executed in topological order. The lifecycle of an experiment follows a defined pattern:

1. A GitHub issue is created to preregister the experiment with hypotheses and goals.
2. A pull request is submitted with code that reproduces the experiment.
3. The code defines a provenance graph that is executed, with results summarized in a WandB report.

This means every experiment, including those that failed, is traceable through a GitHub issue, a PR, executable code, and a WandB run. Experiments tracked this way include comparisons of z-loss impact, optimizer sweeps (AdamW vs. alternatives), BERT vs. fastText as quality filters, and MoE vs. dense model efficiency.
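
The Makefile-like execution model described above can be sketched with Python's standard-library graphlib module. This is a minimal illustration of running steps in topological order, not Marin's executor; the run_steps helper, the step names, and the toy step bodies are assumptions made for the example.

```python
from graphlib import TopologicalSorter


def run_steps(steps, deps):
    """Run every step after its dependencies.

    steps: maps step name -> callable taking a dict of dependency results
    deps:  maps step name -> list of step names it depends on
    Returns (results keyed by step name, list of names in execution order).
    """
    graph = {name: deps.get(name, []) for name in steps}
    results, executed = {}, []
    for name in TopologicalSorter(graph).static_order():
        inputs = {d: results[d] for d in deps.get(name, [])}
        results[name] = steps[name](inputs)
        executed.append(name)
    return results, executed


# Hypothetical toy pipeline: download -> tokenize -> train -> evaluate.
steps = {
    "download": lambda _: ["hello world", "open lab"],
    "tokenize": lambda i: [doc.split() for doc in i["download"]],
    "train": lambda i: {"num_tokens": sum(len(t) for t in i["tokenize"])},
    "evaluate": lambda i: {"tokens_seen": i["train"]["num_tokens"]},
}
deps = {
    "tokenize": ["download"],
    "train": ["tokenize"],
    "evaluate": ["train"],
}

results, executed = run_steps(steps, deps)
```

TopologicalSorter raises graphlib.CycleError if the dependencies form a cycle, so an ill-formed experiment graph fails before any step runs.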

Models Trained with Marin

The marin-community has used the framework to train and release several models:

Marin-8B-Base: The project claims this was the first open-source 8B-parameter model to outperform Llama 3.1 8B, beating it on 14 of 19 standard benchmarks.
Marin-8B-Instruct: A fine-tuned instruction-following variant available to try on Together AI.
Marin-32B-Base: The project states this beats OLMo 2 32B Base on 14 of 19 standard benchmarks and is competitive with Gemma 3 27B PT and Qwen 2.5 32B Base.

All training scripts, execution graphs, and WandB reports for these models are publicly linked from the project homepage.

Core Capabilities

Marin covers the full pipeline for language model development:

Data curation: filtering, transformation, and quality scoring of raw datasets
Tokenization: configurable tokenization pipelines (e.g., Llama 3 tokenizer)
Training: supports TPU pods (including multislice TPU) and GPU multi-node setups
Evaluation: integrates with EleutherAI's lm-evaluation-harness for in-loop eval during training
Speedrun competition: a community benchmark inspired by the nanoGPT speedrun, where participants compete to train models to a target quality within a compute budget
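
As a rough picture of the data curation stage listed above, a quality filter scores each document and keeps those that clear a threshold. The sketch below is an assumption-laden illustration, not Marin's pipeline: the alpha_ratio heuristic stands in for a trained classifier (such as the BERT or fastText filters mentioned earlier), and the function names, document fields, and threshold are hypothetical.

```python
def alpha_ratio(text):
    """Toy quality score: fraction of characters that are letters or whitespace."""
    if not text:
        return 0.0
    good = sum(ch.isalpha() or ch.isspace() for ch in text)
    return good / len(text)


def filter_by_quality(docs, score_fn, threshold):
    """Keep documents whose score meets the threshold, recording each score."""
    kept = []
    for doc in docs:
        score = score_fn(doc["text"])
        if score >= threshold:
            kept.append({**doc, "quality": score})
    return kept


# Hypothetical documents for illustration only.
docs = [
    {"id": "a", "text": "Foundation models are trained on large corpora."},
    {"id": "b", "text": "$$$ 1234 #### !!!! 5678"},
    {"id": "c", "text": ""},
]

kept = filter_by_quality(docs, alpha_ratio, threshold=0.8)
kept_ids = [doc["id"] for doc in kept]
```

Storing the score alongside each kept document keeps the filtering decision auditable, in the same spirit as recording every other step of the pipeline.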

Current Status and Community

As of May 2026, the repository shows active development with 983 stars, 116 forks, and 578 open issues. The project acknowledges support from the Google TPU Research Cloud program. Community participation happens via Discord and a mailing list, and the project explicitly invites contributions across architecture experiments, training algorithms, datasets, and evaluations. Agent skill guides (e.g., for adding new datasets) are included in the repository under .agents/skills/.

Pricing

Open Source

Full framework source code
Data curation and tokenization pipelines
Language model training on TPU and GPU
In-loop evaluation with lm-evaluation-harness
WandB integration

Capabilities

Key Features

Full reproducibility of every training step
Provenance graph execution (DAG-based, like a Makefile)
Data curation, filtering, transformation, and tokenization pipelines
Language model training on TPU pods and multi-node GPUs
In-loop evaluation with lm-evaluation-harness
WandB integration for experiment reporting
GitHub issue-based experiment preregistration
Speedrun competition for efficient training methods
Perplexity Gap Dashboard for analysis
Agent skill guides for common tasks

Integrations

WandB (Weights & Biases)

EleutherAI lm-evaluation-harness

Hugging Face Datasets

Google TPU Research Cloud

Together AI

GitHub

API Available
