
    DiffusionBlocks

    AI Development Libraries

    A principled framework for block-wise neural network training via a diffusion interpretation, reducing training memory in proportion to the number of blocks while maintaining competitive performance across transformer architectures.

    At a Glance

    Pricing
    Open Source

    Freely available under Apache License 2.0. Use, modify, and distribute without cost.

    Available On

    CLI
    API

    Topics

    AI Development Libraries, Academic Research, AI Infrastructure

    Alternatives

    PyTorch, JAX, Apache TVM
    Developer
    Sakana AI (Tokyo, Japan; est. 2023; $370M+ raised)

    Listed May 2026

    About DiffusionBlocks

    DiffusionBlocks is an open-source research framework from Sakana AI that enables memory-efficient training of transformer-based neural networks by partitioning them into independently trainable blocks. Accepted at ICLR 2026, the work was authored by Makoto Shing, Masanori Koyama, and Takuya Akiba, and the official implementation is available on GitHub under the Apache License 2.0.

    What It Is

    DiffusionBlocks addresses a fundamental bottleneck in deep learning: end-to-end backpropagation requires storing activations across all layers simultaneously, which limits how large a model can be trained on available hardware. The framework reframes transformer residual connections as updates in a dynamical system, then converts those updates into a denoising process. Each block can then be trained independently with a score matching objective, so only one block's activations and gradients need to be held in memory at a time, which reduces training memory in proportion to the number of blocks.
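
    The reframing described above can be sketched in equations. This is a generic illustration, not the paper's exact formulation: the symbols x_l, f_l, D_θ, σ, y, L, B, and M are assumed here for exposition, and the loss shown is the standard denoising objective rather than the specific per-block objective the authors use.

```latex
% A residual layer l read as a forward-Euler step (step size 1) of an ODE:
x_{l+1} = x_l + f_l(x_l)
\quad\longleftrightarrow\quad
\frac{dx}{dt} = f(x, t)

% Generic denoising objective for a denoiser D_\theta at noise level \sigma,
% with clean target y and noise \epsilon \sim \mathcal{N}(0, I):
\mathcal{L}(\theta) =
  \mathbb{E}_{y,\,\sigma,\,\epsilon}
  \left\| D_\theta(y + \sigma\epsilon,\ \sigma) - y \right\|^2

% Idealized activation memory when L layers are split into B equal blocks
% and only one block is trained at a time:
M_{\text{block-wise}} \approx \frac{L/B}{L}\, M_{\text{end-to-end}}
                      = \frac{1}{B}\, M_{\text{end-to-end}}
```

    Under this reading, each block needs only its own local loss, so no gradient has to flow between blocks. How noise levels are assigned to individual blocks is left to the paper.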

    Core Technical Approach

    The key insight in DiffusionBlocks is that residual connections in transformers naturally correspond to updates in a dynamical system. With minimal modifications, these updates can be recast as those of a denoising diffusion process, enabling each block to learn independently via score matching rather than requiring a global backpropagation pass. This is a theoretically grounded departure from prior block-wise training methods, which the paper characterizes as relying on ad-hoc local objectives.

    • Each training step updates only one block at a time
    • Because each step updates a single block, the number of epochs is multiplied by the number of blocks so that the iteration count is aligned with the end-to-end baseline
    • Compatible with vision transformers (ViT), diffusion models, autoregressive models, recurrent-depth models, and masked diffusion architectures
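
    The first two points above can be illustrated with a small scheduling sketch. It is a minimal example under stated assumptions: the function name blockwise_schedule, the example sizes, and the round-robin block order are all hypothetical, and the official repository may select blocks differently (for example, by random sampling).

```python
from collections import Counter


def blockwise_schedule(num_blocks: int, baseline_epochs: int, steps_per_epoch: int) -> list[int]:
    """Return the index of the block updated at each training step.

    Round-robin selection is an assumption made for this sketch.
    """
    # Multiply epochs by the number of blocks so that each block receives as
    # many updates as the whole network gets in the end-to-end baseline.
    total_epochs = baseline_epochs * num_blocks
    total_steps = total_epochs * steps_per_epoch
    # Exactly one block is updated per step.
    return [step % num_blocks for step in range(total_steps)]


# Hypothetical sizes chosen only for illustration.
NUM_BLOCKS, BASELINE_EPOCHS, STEPS_PER_EPOCH = 4, 10, 100

schedule = blockwise_schedule(NUM_BLOCKS, BASELINE_EPOCHS, STEPS_PER_EPOCH)
updates_per_block = Counter(schedule)
baseline_steps = BASELINE_EPOCHS * STEPS_PER_EPOCH

print("total block-wise steps:", len(schedule))
print("updates per block:", dict(updates_per_block))
print("baseline steps:", baseline_steps)
```

    With the epoch count scaled this way, every block sees the same number of updates as the baseline model does, while each individual step touches only one block.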

    Experimental Scope

    The paper's experiments span a range of transformer architectures and tasks, going beyond the small-scale classification benchmarks that prior block-wise methods typically target. The official implementation focuses on image classification using Vision Transformers on CIFAR-100, with support for data augmentation schedules and cosine learning rate schedulers. The paper reports that DiffusionBlocks training matches the performance of end-to-end training across these diverse settings.
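
    For readers unfamiliar with the cosine learning rate schedule mentioned above, the following is a minimal sketch of the standard formula. The function name cosine_lr and its parameters are hypothetical, warmup is omitted, and the repository's actual scheduler configuration is not reproduced here.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at `step`, decaying from base_lr to min_lr."""
    # Clamp progress to [0, 1] so steps outside the schedule stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical schedule length
for s in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(s, cosine_lr(s, TOTAL_STEPS, base_lr=1e-3))
```

    The rate starts at base_lr, passes through the midpoint between base_lr and min_lr halfway through training, and ends at min_lr.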

    Setup and Requirements

    The repository uses uv for dependency management and targets Python 3.12 with CUDA 12.2 on H100 GPUs. Setup requires logging into Hugging Face and Weights & Biases. The ViT implementation builds on Hugging Face Transformers, and the EDM implementation (EDM is the diffusion formulation from "Elucidating the Design Space of Diffusion-Based Generative Models", not an energy-based model) is based on Stability AI's generative-models codebase.

    Update: ICLR 2026 Acceptance and arXiv v3

    The paper was first submitted to arXiv in June 2025 (v1), revised in October 2025 (v2), and last updated in February 2026 (v3). It is confirmed to appear at the 14th International Conference on Learning Representations (ICLR 2026). The GitHub repository was created in September 2025 and last pushed in February 2026, with 94 stars and 5 forks as of the available data.

    Pricing

    Open Source

    • Full source code access
    • Apache License 2.0
    • Block-wise ViT training on CIFAR-100
    • Baseline and DiffusionBlocks training scripts
    • Evaluation scripts

    Capabilities

    Key Features

    • Block-wise transformer training with independent gradient computation per block
    • Memory reduction proportional to number of blocks
    • Score matching objective for local block training
    • Compatible with vision, diffusion, autoregressive, recurrent-depth, and masked diffusion transformers
    • CIFAR-100 image classification reference implementation
    • Support for cosine learning rate scheduler and random augmentation
    • Hugging Face and Weights & Biases integration
    • uv-based dependency management

    Integrations

    Hugging Face Transformers
    Weights & Biases
    Hugging Face Hub
    Stability AI generative-models (EDM)
    Developer

    Sakana AI

    Sakana AI builds nature-inspired AI research systems, combining evolutionary algorithms and large language models to automate scientific discovery. The team, led by researchers with backgrounds at Google DeepMind and other leading AI labs, develops open-source frameworks like ShinkaEvolve, the AI Scientist, and the Darwin Gödel Machine. Sakana AI publishes peer-reviewed research and releases production-ready tools for the scientific computing community.

    Founded 2023
    Tokyo, Japan
    $370M+ raised
    193 employees

    Used by

    MUFG
    Sumitomo Mitsui Banking Corporation…
    Mizuho Financial Group
    Daiwa Securities
    +2 more

    Similar Tools

    PyTorch

    An open-source machine learning framework for deep learning research and production with GPU acceleration and distributed training support.

    JAX

    JAX is a Python library for accelerator-oriented array computation and program transformation, designed for high-performance numerical computing and large-scale machine learning.

    Apache TVM

    An open-source machine learning compiler framework that compiles pre-trained ML models into optimized, deployable modules for any hardware platform.

    Related Topics

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    194 tools

    Academic Research

    AI tools designed specifically for academic and scientific research.

    46 tools

    AI Infrastructure

    Infrastructure designed for deploying and running AI models.

    249 tools