AI Dev News Digest: January 30th, 2026
MCP finally got its UI layer this week. Tools can now pop up actual interfaces inside your chat, dashboards and forms and all that. Claude, ChatGPT, VS Code, Goose, they're all on board. JetBrains and Zed shipped an agent registry too, so you can one-click install Gemini CLI or Copilot without messing with config files. Meanwhile Microsoft released Maia 200, their custom inference chip that supposedly beats Google's TPU v7. And Anthropic closed somewhere north of ten billion in funding. The infrastructure race marches on.
But the week's real tension was about agents themselves. Andrej Karpathy posted on X that he flipped from 80% manual coding to 80% agent-driven in weeks, calling it the biggest workflow change in 20 years of programming. Same week, a Hacker News post called "After two years of vibecoding, I'm back to writing by hand" hit 300+ points. The author's take: agents write code that looks fine in isolation but turns into architectural slop. Karpathy sees it coming too, warning of a 2026 "slopacolypse." LinkedIn started certifying people in vibe coding anyway. Then at Davos, Hassabis and Amodei sat down for what the moderator called "the Beatles versus the Rolling Stones." Amodei said all software devs could be replaced within a year. Hassabis said we're maybe two breakthroughs away. I'm curious how you're thinking about this stuff.
Foundation Models & Platforms
- OpenAI retiring GPT-4o and older models from ChatGPT. GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini will be removed on February 13. Only 0.1% of users still choose GPT-4o daily. The chatgpt-4o-latest API endpoint retires February 16. Sam Altman acknowledged at a town hall that GPT-5.2 writing quality declined as priorities shifted toward coding, reasoning, and math. (CNBC)
- Moonshot AI releases Kimi K2.5 with Agent Swarm. The 1 trillion parameter MoE model (32B active) was trained on 15 trillion visual and text tokens. Agent Swarm orchestrates up to 100 sub-agents executing parallel workflows across 1,500 tool calls, reducing execution time by up to 4.5x compared to single-agent setups. SWE-bench Verified: 76.8%. Also ships Kimi Code, an open-source CLI integrating with VS Code, Cursor, and Zed. (Kimi)
- Alibaba launches Qwen3-Max-Thinking reasoning model. Claims to outperform GPT-5.2 and Gemini 3 Pro on Humanity's Last Exam benchmark. Features adaptive tool calling that autonomously selects external tools based on task complexity. Available in Qwen Chat and Model Studio. (Seeking Alpha)
- Chinese AI labs accelerate model releases ahead of DeepSeek. Alibaba, Moonshot, and others are racing to release models before DeepSeek's expected February launch. ByteDance is preparing its own releases. (CNBC)
- DeepMind's AlphaGenome published in Nature. The DNA sequence model predicts regulatory variant effects across 11 modalities at single-base-pair resolution. Processes up to 1M base pairs, matching or exceeding specialized models in 25 of 26 evaluations. Source code and weights now available for noncommercial research. Nearly 3,000 scientists from 160 countries are already using it. (Nature)
- NVIDIA launches Earth-2 open models for AI weather. Billed as the first fully open, accelerated weather AI stack. Earth-2 Medium Range (Atlas architecture) handles 15-day forecasts across 70+ variables, outperforming leading open models on key benchmarks. Earth-2 Nowcasting (StormScope architecture) uses generative AI for kilometer-resolution, 0-6 hour storm predictions. Earth-2 Global Data Assimilation (HealDA) generates initial conditions in seconds on GPUs vs hours on supercomputers. Partners include Brightband, Israel Meteorological Service, The Weather Company, and TotalEnergies. (NVIDIA)
Developer Tools & Infrastructure
- MCP Apps ships as first official protocol extension. Tools can now return interactive UI components that render directly in the conversation: dashboards, forms, visualizations, multi-step workflows. Supported in Claude, ChatGPT, Goose, and VS Code Insiders. Joint effort between Anthropic, OpenAI, and the MCP-UI community. (MCP Blog)
- JetBrains and Zed launch ACP Agent Registry. Browse and install AI coding agents with one click inside JetBrains IDEs and Zed. Launch partners include Gemini CLI, Claude Code, GitHub Copilot, Auggie, OpenCode, Mistral Vibe, and Qwen Code. The Agent Client Protocol works like LSP but for coding agents. No JetBrains AI subscription required to use ACP agents. (JetBrains)
- Microsoft unveils Maia 200 inference accelerator. Custom 3nm chip with 140B transistors delivering 10+ petaFLOPS in FP4 and 5+ petaFLOPS in FP8. Features 216GB HBM3e at 7 TB/s bandwidth. Microsoft claims 3x the FP4 performance of Amazon Trainium 3 and better FP8 performance than Google TPU v7. Will serve GPT-5.2 in Microsoft 365 Copilot and Foundry. Deployed in Iowa, with Phoenix next. SDK preview now open. (Microsoft)
- MiniMax ships free Agent Desktop app. Desktop agent with secure connections to local docs, GitLab repos, email, and calendar. Browser automation handles form filling and multi-site research. Features an "Experts" ecosystem for specialized tasks like cold outreach research and automated code reviews. Offers a free tier while competitors charge $20/month. (EveryDev)
- xAI launches Grok Imagine API for video generation. Unified bundle handling text-to-video, image-to-video, and video editing with native audio. Claims top rankings on Artificial Analysis and LMArena benchmarks against Veo 3.1 and Sora 2, plus a 64.1% win rate against Runway Aleph in editing benchmarks. Partner integrations live through fal.ai, ComfyUI, InVideo, and HeyGen. (xAI)
- Google introduces Agentic Vision in Gemini 3 Flash. Converts image understanding into an iterative think-act-observe loop. The model generates Python code to zoom, crop, annotate, and calculate, then re-examines the results. Reported 5-10% quality improvement across vision benchmarks. Available via Gemini API in AI Studio and Vertex AI. (Google Blog)
- Google ships Auto Browse in Chrome. Agentic browsing powered by Gemini 3. Can research flights and hotels, find products matching your budget, add items to cart, and handle sites requiring login via Google Password Manager. Side panel integrates with Gmail, Calendar, YouTube, Maps, Shopping, and Flights. Requires Google AI Pro or Ultra subscription. Google's answer to OpenAI Operator and Perplexity Comet. (Google Blog, EveryDev)
- OpenAI launches Prism, a free LaTeX workspace for scientists. Cloud-based, GPT-5.2-powered workspace for scientific writing and collaboration. Integrates AI directly into the manuscript workflow with full-document context, equation reasoning, and arXiv literature search. Unlimited projects and collaborators. Built on the acquired Crixet platform. Kevin Weil: "2026 could be to science what 2025 was to software engineering." (OpenAI)
- Google bundles Developer Program benefits into AI Pro and Ultra. AI Pro subscribers get $10/month in Cloud credits, AI Ultra gets $100/month. Covers the workflow from AI Studio and Antigravity IDE to Vertex AI and Cloud Run deployment. (Google Blog)
- Allen AI releases SERA open coding agents. First release in the Open Coding Agents family. SERA-32B solves 54.2% of SWE-Bench Verified problems, matching Devstral Small 2 and GLM-4.5-Air. Training costs just $400 to reproduce prior open-source results, or $12,000 for industry-leading performance. Built on Qwen3 with 32K context. NVIDIA-optimized inference hits 8,600 tokens/second on 4xB200 GPUs in NVFP4. Compatible with Claude Code out of the box. (Allen AI)
- GitHub adds Agents tab to repositories. New tab brings coding agent management directly into the repo alongside code, PRs, and issues. Redesigned session logs with grouped tool calls, inline previews, and expandable diffs. Can resume sessions directly in Copilot CLI with one-click command copy. (GitHub)
- Tabnine launches CLI for terminal-based coding agents. Standalone AI coding agent that runs natively in the terminal. Supports interactive and fully autonomous "Yolo mode" execution. Integrates with MCP servers for enterprise context and organizational coaching guidelines. Model-agnostic backend allows centralized model selection. Works in on-premises, VPC, and air-gapped environments. Can create Git branches, commit changes, and open PRs automatically. (Tabnine)
- Project Genie world model available to AI Ultra users. Google DeepMind's Genie 3 is now accessible through a web app. US-only, requires the $250/month AI Ultra subscription. Generates interactive environments from text and image prompts with 60-second sessions at 24fps and 720p. (Engadget)
- Vertex AI Sessions and Memory Bank begin charging. Billing start delayed from January 28 to February 11. Memory storage: $0.25/1K memories/month. Memory retrieval: $0.50/1K (first 1K free). Session events: $0.25/1K. Code Execution: $0.0864/vCPU-hour. (Google Cloud)
- Claude launches interactive tools powered by MCP Apps. Tools now render as interactive UI inside conversations. Build Asana project timelines, preview Slack messages, create Figma diagrams, explore Amplitude analytics, search Box files, design Canva presentations, and query Hex data, all without switching tabs. Salesforce integration coming soon. Available on Pro, Max, Team, and Enterprise plans. (Claude Blog)
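For anyone budgeting around the Vertex AI Sessions and Memory Bank prices listed above, here is a rough back-of-the-envelope estimator. It is a sketch, not Google's billing logic: the rates are copied from the item above, while the `MonthlyUsage` fields, the `estimate_monthly_cost` helper, the linear proration within each 1K block, and the reading of "first 1K free" as a monthly allowance are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Rates as quoted in the digest item (USD). A snapshot, not a live price sheet.
MEMORY_STORAGE_PER_1K_MONTH = 0.25
MEMORY_RETRIEVAL_PER_1K = 0.50
FREE_RETRIEVALS = 1_000  # assumption: the free 1K retrievals reset monthly
SESSION_EVENTS_PER_1K = 0.25
CODE_EXEC_PER_VCPU_HOUR = 0.0864


@dataclass
class MonthlyUsage:
    """Hypothetical usage record; field names are illustrative only."""
    stored_memories: int
    memory_retrievals: int
    session_events: int
    code_exec_vcpu_hours: float


def estimate_monthly_cost(usage: MonthlyUsage) -> dict[str, float]:
    """Return a per-category cost breakdown plus a total, in USD.

    Assumes charges scale linearly (no rounding up to whole 1K blocks).
    """
    billable_retrievals = max(0, usage.memory_retrievals - FREE_RETRIEVALS)
    costs = {
        "memory_storage": usage.stored_memories / 1000 * MEMORY_STORAGE_PER_1K_MONTH,
        "memory_retrieval": billable_retrievals / 1000 * MEMORY_RETRIEVAL_PER_1K,
        "session_events": usage.session_events / 1000 * SESSION_EVENTS_PER_1K,
        "code_execution": usage.code_exec_vcpu_hours * CODE_EXEC_PER_VCPU_HOUR,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    example = MonthlyUsage(
        stored_memories=50_000,
        memory_retrievals=21_000,
        session_events=400_000,
        code_exec_vcpu_hours=100.0,
    )
    for category, cost in estimate_monthly_cost(example).items():
        print(f"{category}: ${cost:.2f}")
```

In this made-up workload, session events dominate the bill, which suggests chatty agents are worth profiling before the charges kick in. Check the official pricing page for rounding rules and regional differences before relying on any number from this.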
Funding & Startups
- Anthropic closes funding round above $10B at $350B valuation. The round landed between $10B and $15B, with Coatue and Singapore's GIC leading. Could still grow if Microsoft and Nvidia contribute from their November pledge of up to $15B combined. Nearly doubles the valuation from September's $183B. Company expects to break even by 2028, ahead of OpenAI. Potential IPO as early as late 2026. (CNBC)
- Ricursive Intelligence raises $300M Series A at $4B valuation. AI chip design startup pairs recursive AI with semiconductor design. Founded by Google DeepMind alumni. (Ricursive)
- Upwind lands $250M Series B for cloud security. Tel Aviv-based company builds runtime cloud security with agent-based protection for containers and serverless. (Upwind)
- Synthesia secures $200M Series E for AI video. Expanding enterprise AI video generation platform. (Synthesia)
- Cellares closes $257M to scale automated cell therapy. Building automated manufacturing for cell therapies. (Cellares)
- PaleBlueDot AI raises $150M for AI cloud infrastructure. Expanding global AI cloud footprint. (PaleBlueDot)
- Vention raises $110M Series D for industrial robotics. Uses generative AI to automate design and programming of robotic manufacturing cells. Led by Investissement Québec with participation from Nvidia's NVentures. (Newswire)
- Rogo raises $75M for CFO workflow automation. Building AI to unify CFO workflows. (Rogo)
- Gyde emerges from stealth with $60M. Modernizing insurance and wealth brokerage. (Gyde)
- Outtake lands $40M to stop deepfake attacks. Targeting deepfake-enabled impersonation. (TechCrunch)
Industry News
- Figure unveils Helix 02 neural prior for humanoid robotics. Replaces 109k lines of hand-engineered C++ with a 10M-parameter network trained on 1,000+ hours of human motion data. Enables Figure 03 to output joint-level commands at 1kHz for whole-body control. (Figure)
- Sam Altman town hall: GPT-5.2x level at 100x less cost by end of 2027. Altman confirmed OpenAI expects dramatic cost reductions. Also announced that a "sign in with ChatGPT" feature is in development, along with plans to dramatically slow workforce growth. "We think we'll be able to do so much more with fewer people." Hiring interviews will test candidates' ability to work with AI tools rather than traditional coding. (TechLoy)
- LinkedIn launches AI skills certifications including vibe coding. Partnership with Descript, Lovable, Replit, and Relay.app allows users to earn verified certifications based on AI usage patterns and outcomes. GitHub, Gamma, and Zapier are joining soon. U.S. job postings mentioning AI rose to 4.2% by end of 2025. (TechCrunch)
- Moltbot survives trademark dispute, crypto scam, and security crisis in 72 hours. Anthropic forced Clawdbot to rebrand over "Claude" similarity. During account migration, crypto scammers hijacked GitHub and X accounts in ~10 seconds, enabling a $16M pump-and-dump. Security researchers simultaneously found hundreds of exposed instances leaking API keys and credentials. Project now has 75K+ stars and is operational at molt.bot. (EveryDev)
- Manus AI integrates Agent Skills on all platforms. Agent Skills offers modular scripts for domain expertise with lower memory usage. Team plan early access rolling out. (Manus)
Weekend Watch
- Hassabis and Amodei debate "The Day After AGI" at Davos. The moderator called it "chairing a conversation between the Beatles and the Rolling Stones." Amodei: AI will replace all software developers within a year, reach Nobel-level research in two years. Hassabis: 50% chance of AGI by end of decade, but "maybe one or two more breakthroughs" needed. Both agreed the window to get governance right is shrinking. Full session runs 45 minutes. (YouTube)