SiliconFlow

Name: SiliconFlow
Availability: OnlineOnly
Author: SiliconFlow

AI cloud platform providing high-speed inference for LLMs, image, video, and audio models with serverless, fine-tuning, and reserved GPU options.

Visit Website

At a Glance

Pricing

Free tier available

Get started with $1 in free credits

Pay-as-you-go: $0.27 usage-based

Enterprise: Custom/contact

Engagement

Available On

Web

API

SiliconFlowSingaporeEst. 2023$35M+ raised

Listed Jan 2026

About SiliconFlow

SiliconFlow is a comprehensive AI cloud platform that delivers high-performance inference for text, image, video, and audio models through a single unified API. The platform supports both open-source and commercial models from providers like DeepSeek, Qwen, OpenAI, MiniMax, and more, enabling developers to build AI-powered applications with predictable costs and blazing-fast performance.

Serverless Inference allows running any model instantly with no setup required—just one API call with pay-per-use pricing and $1 in free credits to get started.
Fine-tuning Capabilities enable customizing powerful models to specific use cases with one-click deployment for tailored AI solutions.
Reserved GPUs provide guaranteed GPU capacity with NVIDIA H100/H200, AMD MI300, and RTX 4090 for stable performance and predictable billing.
Elastic GPUs offer flexible FaaS deployment with reliable and scalable inference for variable workloads.
AI Gateway delivers unified access with smart routing, rate limits, and cost control across all models.
OpenAI-Compatible API ensures seamless integration with existing workflows through a fully compatible interface.
Multi-Modal Support covers LLMs, image generation (FLUX, Z-Image), video generation (Wan2.1/2.2), and audio models (Fish-Speech, CosyVoice).
Privacy-First Architecture ensures no data is stored—models and data remain under user control.
Transparent Pricing with per-token billing for chat models and per-output pricing for media generation, with no hidden fees or commitments.

To get started, sign up for an account at cloud.siliconflow.com, obtain an API key, and begin making API calls immediately. The platform provides comprehensive documentation and code examples for quick integration. SiliconFlow also maintains open-source projects including OneDiff (a lightning-fast inference engine for diffusion models) and BizyAir (an AI-native runtime for scalable inference workloads).

Community Discussions

Be the first to start a conversation about SiliconFlow

Share your experience with SiliconFlow, ask questions, or help others learn from your insights.

Pricing

FREE

Free Tier

Get started with $1 in free credits

$1 free credits
Access to all models
Pay-per-use pricing
No minimum commitments

Pay-as-you-go

Popular

Usage-based pricing for all models

$0.27

usage based

DeepSeek-V3.2: $0.27/$0.42 per M tokens
Qwen3-VL-32B: $0.2/$0.6 per M tokens
Image generation from $0.005/image
Video generation from $0.21/video
Audio models available
No hidden fees
Set spending limits
Volume discounts available

Enterprise

Custom pricing for high-usage customers

Custom

contact sales

Volume discounts
Custom pricing plans
Reserved GPU capacity
Dedicated support
Contact sales for details

View official pricing

Capabilities

Key Features

Serverless model inference
Model fine-tuning
Reserved GPU capacity
Elastic GPU deployment
AI Gateway with smart routing
OpenAI-compatible API
LLM inference
Image generation
Video generation
Audio processing and synthesis
Multi-model support
Pay-per-use pricing
Spending limits control
Volume discounts
No data storage policy

Integrations

DeepSeek models

Qwen models

OpenAI models

MiniMax models

Moonshot AI models

Zhipu AI models

FLUX image models

Wan video models

Fish-Speech audio

CosyVoice audio

API Available

View Docs

Back to all tools Suggest an edit

About SiliconFlow

Serverless Inference allows running any model instantly with no setup required—just one API call with pay-per-use pricing and $1 in free credits to get started.
Fine-tuning Capabilities enable customizing powerful models to specific use cases with one-click deployment for tailored AI solutions.
Reserved GPUs provide guaranteed GPU capacity with NVIDIA H100/H200, AMD MI300, and RTX 4090 for stable performance and predictable billing.
Elastic GPUs offer flexible FaaS deployment with reliable and scalable inference for variable workloads.
AI Gateway delivers unified access with smart routing, rate limits, and cost control across all models.
OpenAI-Compatible API ensures seamless integration with existing workflows through a fully compatible interface.
Multi-Modal Support covers LLMs, image generation (FLUX, Z-Image), video generation (Wan2.1/2.2), and audio models (Fish-Speech, CosyVoice).
Privacy-First Architecture ensures no data is stored—models and data remain under user control.
Transparent Pricing with per-token billing for chat models and per-output pricing for media generation, with no hidden fees or commitments.

SiliconFlow