MemU
An agentic memory framework for LLMs and AI agents with persistent, self-evolving memory for proactive 24/7 autonomous agents.
About MemU
MemU is an agentic memory framework designed for LLMs and AI agents that provides persistent, self-evolving memory capabilities. It enables autonomous AI agents to continuously predict user intentions, act proactively, and work around the clock with intelligent memory management. The platform achieves 92.09% average accuracy in reasoning tasks and offers sub-50ms latency with 99.9% uptime SLA.
- Three-Layer Memory Architecture provides a unified multimodal memory framework consisting of a Resource Layer (raw data), a Memory Item Layer (fine-grained memory items), and a Memory Category Layer (thematic knowledge structures), with full bidirectional traceability.
- Dual-Mode Retrieval combines embedding search for fast semantic matching with LLM-based search, in which models directly read and interpret memory category files for richer context understanding.
- Response API offers one API call for fully autonomous responses: agents retrieve memories, generate context-aware replies, and store new learnings automatically, making it well suited to 24/7 agents.
- Memory API provides full control over agent memory with semantic search, pattern queries, and bulk operations for storing strategic insights and building agents that anticipate user needs.
- Self-Evolution Capability enables memory structures to adapt automatically to usage patterns and agent behavior, with intelligent forgetting mechanisms that gracefully manage memory decay.
- Multimodal Support handles text, image, audio, and video inputs, transforming heterogeneous data into queryable, semantically interpretable textual memory.
- Visual Memory Console allows real-time monitoring of memory health, decision tracing, and agent-behavior debugging.
- User Intention Prediction continuously infers user intentions from behavior patterns, enabling agents to know what users need before they ask.
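The three-layer architecture above can be sketched as plain data structures. This is an illustrative sketch only: the class and field names are assumptions, not MemU's actual schema, but it shows how bidirectional traceability works, with each layer holding IDs pointing down to the layer below, and an upward trace recoverable by scanning those links.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the three-layer architecture; class and field
# names are assumptions, not MemU's actual schema.

@dataclass
class Resource:
    """Resource Layer: raw multimodal input (text, image, audio, video)."""
    resource_id: str
    modality: str

@dataclass
class MemoryItem:
    """Memory Item Layer: a fine-grained memory distilled from resources."""
    item_id: str
    text: str
    resource_ids: list = field(default_factory=list)  # downward trace

@dataclass
class MemoryCategory:
    """Memory Category Layer: a thematic file grouping related items."""
    name: str
    item_ids: list = field(default_factory=list)  # downward trace

def trace_up(resource_id, items):
    """Upward trace: which memory items cite a given raw resource?"""
    return [i for i in items if resource_id in i.resource_ids]

resource = Resource("res-1", "text")
item = MemoryItem("item-1", "User prefers morning meetings", ["res-1"])
category = MemoryCategory("preferences", ["item-1"])
print([i.item_id for i in trace_up("res-1", [item])])  # → ['item-1']
```

Keeping both directions traversable is what lets an agent answer not only "what do I know?" but "where did I learn it?".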
To get started, install the Python SDK with pip install memu-py, initialize the MemuClient with your API key, and use the memorize_conversation method to store interactions. The platform integrates with OpenAI, Anthropic, Gemini, DeepSeek, Qwen, and LangGraph, with CrewAI, N8N, and Dify integrations coming soon. MemU offers both cloud-hosted and self-hosted deployment options through its open-source components including memU-server and memU-ui.
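The quickstart flow above can be sketched as follows. Because the real SDK requires an API key and network access, a stand-in class mimics the documented call shape here; the MemuClient name and memorize_conversation method come from the description above, but the parameter names are assumptions, so check the memu-py reference before relying on them.

```python
# Sketch of the quickstart flow. MemuClientStub is a local stand-in for
# memu.MemuClient (pip install memu-py); real signatures may differ.

class MemuClientStub:
    """Stand-in mimicking the documented MemuClient call shape."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.stored = []

    def memorize_conversation(self, conversation, user_id, agent_id):
        # The real client uploads the interaction for memory extraction;
        # here we just record it locally and return a receipt.
        self.stored.append(
            {"conversation": conversation, "user_id": user_id, "agent_id": agent_id}
        )
        return {"status": "accepted", "messages": len(conversation)}

client = MemuClientStub(api_key="YOUR_API_KEY")
receipt = client.memorize_conversation(
    conversation=[
        {"role": "user", "content": "I prefer morning meetings."},
        {"role": "assistant", "content": "Got it, mornings from now on."},
    ],
    user_id="user-123",   # assumed parameter name
    agent_id="agent-42",  # assumed parameter name
)
print(receipt["status"])  # → accepted
```

Swapping the stub for the real client should leave the call site unchanged, which is the point of sketching against the documented interface.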
Pricing
GPT-4.1-mini
Memory model pricing for GPT-4.1-mini
- Input: $0.00040 per 1K tokens
- Output: $0.00160 per 1K tokens
DeepSeek-v3.1
Memory model pricing for DeepSeek-v3.1
- Input: $0.00055 per 1K tokens
- Output: $0.00165 per 1K tokens
Gemini-3-flash
Memory model pricing for Gemini-3-flash
- Input: $0.00050 per 1K tokens
- Output: $0.00300 per 1K tokens
Voyage 3.5 Lite Embedding
Embedding model for memory search
- $0.00002 per 1K tokens
- Used for embedding search in Memory APIs
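A quick worked example of the per-token pricing above: the cost of a single memory operation is input tokens times the input rate, plus output tokens times the output rate, plus any embedded tokens times the embedding rate. The token counts below are made up for illustration; the rates are the GPT-4.1-mini and Voyage 3.5 Lite figures listed above.

```python
# Rates from the pricing list above, converted to dollars per token.
GPT41_MINI_INPUT = 0.00040 / 1000
GPT41_MINI_OUTPUT = 0.00160 / 1000
VOYAGE_EMBED = 0.00002 / 1000

def estimate_cost(input_tokens, output_tokens, embed_tokens):
    """Estimated dollar cost of one memory operation on GPT-4.1-mini."""
    return (input_tokens * GPT41_MINI_INPUT
            + output_tokens * GPT41_MINI_OUTPUT
            + embed_tokens * VOYAGE_EMBED)

# Example: 10K tokens in, 2K out, 10K embedded for search.
cost = estimate_cost(10_000, 2_000, 10_000)
print(f"${cost:.4f}")  # → $0.0074
```

At these rates, memorizing a fairly long conversation costs well under a cent, so the dominant budget question is call frequency, not call size.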
Enterprise
Enterprise-grade AI solutions with custom development and premium support
- Commercial License
- Custom Development
- SSO/RBAC integration
- Intelligence & Analytics
- 24/7 Premium Support
- Custom SLAs
Capabilities
Key Features
- Three-layer memory architecture (Resource, Memory Item, Memory Category)
- Dual-mode retrieval (embedding search + LLM-based search)
- Response API for autonomous agent responses
- Memory API for granular memory control
- Self-evolving memory structures
- Multimodal memory support (text, image, audio, video)
- User intention prediction
- Cross-session continuity
- Proactive pattern recognition
- Visual memory console
- 24/7 always-on memory
- Intelligent forgetting mechanism
- Full bidirectional traceability
- Sub-50ms latency
- 99.9% uptime SLA
- SOC 2 compliant
