Replicate
Replicate is a developer platform and API for running, fine‑tuning, deploying, and scaling machine learning models. It exposes models as production-ready APIs and supports running community and private models with per-second hardware billing. Teams can deploy custom models (via Cog), fine-tune models with their data, and monitor predictions with logs and metrics.
- One-line API access: call any model with a single API request using the official SDKs (Node, Python, or plain HTTP).
- Pay-for-what-you-use billing: models are billed by runtime (per second) and by hardware type, so you only pay for the compute you use.
- Deploy custom models: package and deploy your own model with Cog to create a scalable API endpoint (see the sketch after this list).
- Fine-tuning support: train or fine-tune models on Replicate to produce custom versions for specific tasks.
- Hardware choices & scaling: choose CPU or GPU hardware (T4, L40S, A100, etc.) and scale automatically as demand increases.
- Logging & monitoring: built-in metrics and logs let teams track model performance and debug predictions.
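A Cog-packaged model is defined by a predictor class plus a `cog.yaml` that declares the environment. The sketch below is a minimal, hypothetical predictor: the class body and the "echo" logic are placeholders, but the `BasePredictor`/`Input` interface is what Cog expects.

```python
# predict.py -- a minimal Cog predictor sketch (model logic is a placeholder)
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self):
        # Load weights or other expensive resources once, when the container starts.
        # A real model would load checkpoints from disk here.
        self.prefix = "echo: "

    def predict(self, prompt: str = Input(description="Text to process")) -> str:
        # Each prediction request to the deployed endpoint runs this method.
        return self.prefix + prompt
```

With a `cog.yaml` alongside it declaring the Python version and dependencies, `cog push` builds the container and publishes it as an API endpoint on Replicate.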
To get started, sign up on the web, obtain an API token, and use the Node or Python SDK (or the plain HTTP API) to run a published model, or deploy your own model packaged with Cog.
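For example, the Python SDK exposes a `replicate.run()` helper that takes a model identifier and an input dict. The snippet below is a minimal sketch: the model name and input fields are placeholders, and the client reads your token from the `REPLICATE_API_TOKEN` environment variable.

```python
# Minimal sketch using the official Python SDK (pip install replicate).
# Auth is taken from the REPLICATE_API_TOKEN environment variable.
import replicate

# "owner/model-name" is a placeholder -- substitute any published model
# identifier from replicate.com, optionally pinned to a version hash.
output = replicate.run(
    "owner/model-name",
    input={"prompt": "a watercolor painting of a lighthouse"},
)
print(output)
```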
Pricing and Plans
CPU (standard)
Standard CPU runtime, billed per second (example rate shown on the pricing page).
- Per-second billing by runtime
- Runs on shared CPU hardware
Nvidia T4 GPU
Nvidia T4 GPU runtime, billed per second (example rate shown on the pricing page).
- Per-second billing for GPU inference
- Lower-cost GPU option for image and model inference
Nvidia A100 (80GB) GPU
Nvidia A100 (80GB) GPU runtime, billed per second (example rate shown on the pricing page).
- High-memory GPU for large models and training
- Per-second billing, with multi-GPU options available
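Because billing is metered per second of runtime on the chosen hardware, estimating a bill is simple arithmetic. The sketch below uses made-up per-second rates purely for illustration; the actual rates and hardware names are listed on the pricing page.

```python
# Illustrative cost estimate for per-second hardware billing.
# The rates below are placeholders, NOT actual Replicate prices --
# check the pricing page for current per-second rates.
RATE_PER_SECOND = {
    "cpu": 0.0001,               # hypothetical $/s
    "nvidia-t4": 0.000225,       # hypothetical $/s
    "nvidia-a100-80gb": 0.0014,  # hypothetical $/s
}


def estimate_cost(hardware: str, seconds_per_run: float, runs: int) -> float:
    """Cost = per-second rate x runtime per prediction x number of predictions."""
    return RATE_PER_SECOND[hardware] * seconds_per_run * runs


# e.g. 10,000 predictions that each take 4 seconds on a T4
print(f"${estimate_cost('nvidia-t4', 4.0, 10_000):.2f}")
```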