EveryDev.ai
    Sana

    SANA is an open-source, efficiency-oriented framework by NVIDIA Labs for high-resolution image and video generation using Linear Diffusion Transformers, deployable on consumer GPUs with as little as 8GB VRAM.

    At a Glance

    Pricing
    Open Source

    Fully open-source under Apache 2.0 license. Free to use, modify, and distribute.

    Available On

    CLI
    API
    SDK

    Topics

    Image, Video, AI Development Libraries

    Alternatives

    STARFlow, Visual Explainer, MLX-VLM
    Listed Jun 2026

    About Sana

    SANA is a fully open-source codebase developed by NVIDIA Labs (NVlabs) for high-resolution image and video generation. Released under the Apache 2.0 license, it provides complete training and inference pipelines and is designed to run on consumer-grade hardware via 4-bit quantization. The repository has accumulated over 8,200 GitHub stars since its initial release in October 2024.

    What It Is

    SANA is a series of efficient diffusion models built around a Linear Diffusion Transformer (DiT) architecture. The core innovation replaces standard attention in DiT with linear attention, so that attention cost grows linearly, rather than quadratically, with the number of latent tokens, which makes high-resolution generation far cheaper. The codebase covers multiple model variants and training recipes that share a unified framework: SANA (image), SANA-1.5 (scaled training/inference), SANA-Sprint (one/few-step generation), SANA-Video (video generation), SANA-WM (world modeling), and Sol-RL (reinforcement learning post-training).
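To make the linear-attention idea concrete, here is a minimal NumPy sketch. It is an illustration, not SANA's implementation: the ReLU feature map, the single-head layout, and the function names are assumptions chosen for clarity. It shows that reordering the matrix products gives the same result while avoiding the N×N attention matrix.

```python
import numpy as np


def relu_attention_quadratic(q, k, v, eps=1e-6):
    """Reference form: builds the full (N, N) similarity matrix, O(N^2 * d)."""
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)  # assumed ReLU feature map
    scores = q @ k.T                                # (N, N)
    return (scores @ v) / (scores.sum(axis=-1, keepdims=True) + eps)


def relu_attention_linear(q, k, v, eps=1e-6):
    """Same result, but K^T V is computed first, so cost is O(N * d^2)."""
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    kv = k.T @ v                 # (d, d_v) summary of all keys and values
    k_sum = k.sum(axis=0)        # (d,) used for the row normalizer
    numerator = q @ kv           # (N, d_v)
    denominator = q @ k_sum + eps  # (N,)
    return numerator / denominator[:, None]


rng = np.random.default_rng(0)
q = rng.standard_normal((16, 4))  # 16 tokens, head dimension 4
k = rng.standard_normal((16, 4))
v = rng.standard_normal((16, 3))
same = np.allclose(relu_attention_quadratic(q, k, v), relu_attention_linear(q, k, v))
print("outputs match:", same)
```

Because the (d, d_v) summary does not depend on the number of queries, doubling the token count roughly doubles the work instead of quadrupling it.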

    Key Architectural Techniques

    • Linear Attention: Replaces vanilla attention in DiT for efficiency at high resolutions.
    • DC-AE (Deep Compression AutoEncoder): Achieves 32× image compression versus the traditional 8×, reducing latent token count significantly.
    • Decoder-only Text Encoder: Uses a modern decoder-only LLM with in-context learning for improved text-image alignment.
    • Block Causal Linear Attention & Causal Mix-FFN: Efficient attention and feedforward mechanisms designed for long video generation.
    • sCM Distillation: Enables one/few-step generation via continuous-time consistency distillation (used in SANA-Sprint).
    • Sol-RL: Combines NVFP4 (low-precision) rollout with BF16 (high-precision) optimization for faster RL training convergence.
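The effect of the 32× autoencoder on token count can be checked with simple arithmetic. The sketch below assumes square downsampling and treats the transformer patch size as a free parameter (`patch_size` is a hypothetical name, and the patch sizes used by any particular model are assumptions here, not figures from this page).

```python
def latent_tokens(height, width, downsample, patch_size=1):
    """Number of tokens the transformer sees for one image."""
    stride = downsample * patch_size
    if height % stride or width % stride:
        raise ValueError("image size must be divisible by downsample * patch_size")
    return (height // stride) * (width // stride)


tokens_32x = latent_tokens(1024, 1024, downsample=32)               # 32 * 32 grid
tokens_8x = latent_tokens(1024, 1024, downsample=8)                 # 128 * 128 grid
tokens_8x_p2 = latent_tokens(1024, 1024, downsample=8, patch_size=2)  # 64 * 64 grid

print(tokens_32x, tokens_8x, tokens_8x_p2)  # 1024 16384 4096
```

At 1024×1024 the 32× autoencoder leaves 1,024 latent positions, against 16,384 for an 8× autoencoder without patching, or 4,096 if an 8× model also groups latents into 2×2 patches.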

    Model Variants and Performance

    The README benchmarks SANA against FLUX-dev at 1024×1024 resolution. According to the repository's own performance table, Sana-0.6B achieves a 39.5× speedup over FLUX-dev and Sana-1.6B a 23.3× speedup. SANA-Sprint generates a 1024px image in 0.1 seconds on an H100 and 0.3 seconds on an RTX 4090. For video, SANA-Video-2B has a latency of 36 seconds at 720p versus 400 seconds for Wan-2.1-1.3B, per the repository's VBench comparison table. SANA-WM is a 2.6B-parameter controllable world model that supports 1-minute video generation at 720p with 6-DoF camera control.
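The latency figures quoted above can be turned into ratios with a few lines of arithmetic. This only restates the numbers on this page; it does not measure anything, and the helper name is illustrative.

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Video: 400 s (Wan-2.1-1.3B) vs 36 s (SANA-Video-2B) at 720p, as quoted above.
video_speedup = speedup(400, 36)

# SANA-Sprint: seconds per 1024px image converted to images per second.
sprint_h100_images_per_s = 1 / 0.1
sprint_4090_images_per_s = 1 / 0.3

print(f"video speedup: {video_speedup:.1f}x")                 # 11.1x
print(f"H100: {sprint_h100_images_per_s:.1f} images/s")       # 10.0
print(f"RTX 4090: {sprint_4090_images_per_s:.1f} images/s")   # 3.3
```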

    Deployment and Integration Ecosystem

    SANA is designed for flexible deployment across a wide range of environments:

    • HuggingFace Diffusers: Full pipeline support via SanaPipeline, SanaPAGPipeline, and compatible schedulers (requires diffusers>=0.32.0).
    • ComfyUI: Official node support for SANA, SANA-1.5, and SANA-Sprint workflows.
    • SGLang: High-performance serving with an OpenAI-compatible API.
    • Replicate API: Available on H100 hardware via Replicate.
    • Cosmos-RL: Post-training (SFT/RL) integration for SANA-Image and SANA-Video.
    • Quantization: 4-bit (via SVDQuant/Nunchaku) and 8-bit quantization allow inference within 8GB GPU VRAM, including on laptop GPUs.
    • LoRA / DreamBooth: Fine-tuning support via diffusers.
    • ControlNet: Training, inference, and model weights for controllable generation.
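As an illustration of the Diffusers route in the list above, the following is a minimal sketch of text-to-image inference with `SanaPipeline`. The checkpoint ID, prompt, and sampling settings are assumptions to check against the model card before use; the heavy imports sit inside the function so the file can be read and imported without a GPU. It assumes `diffusers>=0.32.0`, PyTorch, and a CUDA device.

```python
# Assumed checkpoint name; verify on the Hugging Face model card.
MODEL_ID = "Efficient-Large-Model/Sana_1600M_1024px_BF16_diffusers"


def generate_image(prompt, out_path="sana.png", height=1024, width=1024,
                   steps=20, guidance_scale=4.5, seed=0):
    """Generate one image with SanaPipeline and save it to out_path."""
    import torch
    from diffusers import SanaPipeline

    pipe = SanaPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    # A fixed seed makes the run repeatable on the same hardware and versions.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(
        prompt=prompt,
        height=height,
        width=width,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images[0]
    image.save(out_path)
    return out_path


if __name__ == "__main__":
    print(generate_image("a lighthouse on a cliff at sunset, detailed, 35mm photo"))
```

On GPUs with limited memory, the quantized checkpoints mentioned above are the intended path; this sketch loads the full BF16 weights.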

    Update: v2.0.0 — SANA-Video and SANA-WM

    The latest release (v2.0.0, published June 9, 2026) bundles SANA-Video and SANA-WM as the headline additions. Recent milestones include SANA-Video 720p with LTX-VAE (March 2026), Sol-RL with NVFP4 rollout recipes for SANA, FLUX.1, and SD3.5-L (April 2026), and SANA-WM with 6-DoF camera control (May 2026). The project has also received academic recognition: SANA was accepted as an ICLR 2025 Oral, SANA-Sprint as an ICCV 2025 Highlight, SANA-1.5 at ICML 2025, and SANA-Video as an ICLR 2026 Oral. Together, the release cadence and publication record suggest that research results reach the open-source codebase quickly.

    Pricing

    Open Source

    Fully open-source under Apache 2.0 license. Free to use, modify, and distribute.

    • Full training and inference pipelines
    • All model weights on HuggingFace
    • ComfyUI, Diffusers, SGLang integrations
    • 4-bit and 8-bit quantization
    • ControlNet, LoRA, DreamBooth support

    Capabilities

    Key Features

    • Text-to-image generation up to 4K resolution
    • Text-to-video generation up to 720p
    • One/few-step image generation via sCM distillation (SANA-Sprint)
    • Linear Diffusion Transformer architecture
    • DC-AE 32× image compression
    • 4-bit and 8-bit quantization for consumer GPU inference
    • ControlNet support for controllable generation
    • LoRA and DreamBooth fine-tuning
    • Inference-time and training-time compute scaling (SANA-1.5)
    • Reinforcement learning post-training via Sol-RL
    • World modeling with 6-DoF camera control (SANA-WM)
    • Real-time minute-length video generation (LongSANA)
    • Multilingual prompt support (English, Chinese, emoji)
    • FSDP and DDP distributed training
    • ComfyUI node integration
    • SGLang OpenAI-compatible API serving
    • HuggingFace Diffusers pipeline support
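To see why the quantization feature above matters for consumer GPUs, a rough weight-memory estimate helps. The sketch below counts only the diffusion transformer's weights, using the 0.6B and 1.6B parameter counts named on this page. It is a lower bound under simplifying assumptions: real quantized checkpoints also store scaling factors and may keep some layers at higher precision, and total VRAM use includes the text encoder, the autoencoder, and activations.

```python
def weight_gigabytes(num_params, bits_per_weight):
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for name, params in [("Sana-0.6B", 0.6e9), ("Sana-1.6B", 1.6e9)]:
    sizes = {bits: weight_gigabytes(params, bits) for bits in (16, 8, 4)}
    print(name, ", ".join(f"{bits}-bit: {gb:.1f} GB" for bits, gb in sizes.items()))
```

For Sana-1.6B this gives about 3.2 GB at 16-bit, 1.6 GB at 8-bit, and 0.8 GB at 4-bit, which leaves more of an 8GB card for the remaining components.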

    Integrations

    HuggingFace Diffusers
    ComfyUI
    SGLang
    Replicate
    Cosmos-RL
    SVDQuant / Nunchaku
    SUPIR (4K super-resolution)
    LongLive
    LTX-VAE
    PixArt-alpha
    PixArt-sigma
    EfficientViT
    Developer

    NVlabs (NVIDIA Research)

    NVlabs is NVIDIA's research division, publishing open-source AI and deep learning frameworks. The team develops efficiency-oriented models and training infrastructure for computer vision, generative AI, and embodied intelligence. SANA is a flagship open-source project from NVlabs, combining academic research with production-ready deployment tooling. The lab collaborates with MIT Han Lab and other academic partners on model efficiency and quantization.

    Similar Tools

    STARFlow

    STARFlow is Apple's open-source transformer autoregressive flow model for high-quality text-to-image and text-to-video generation, combining autoregressive models with normalizing flows.

    Visual Explainer

    An AI-powered tool that generates visual explanations and diagrams from text descriptions to help understand complex concepts.

    MLX-VLM

    A Python library for running Vision Language Models on Apple Silicon using the MLX framework.

    Related Topics

    Image

    AI tools that generate or edit still images — illustrations, photos, logos, icons, and graphics — from text prompts, references, or existing images.

    76 tools

    Video

    AI tools that generate or edit video — from text-to-video and animation to avatars, dubbing, and short-form clips.

    66 tools

    AI Development Libraries

    Programming libraries and frameworks that provide machine learning capabilities, model integration, and AI functionality for developers.

    206 tools