EveryDev.ai

AI Dev News Digest: February 13th, 2026

Joe Seifi
Founder at EveryDev.ai

Developers left the desk to AI agents this week, ads followed them in, and safety researchers walked out the door!

Spotify's co-CEO told investors this week that the company's top engineers haven't written a line of code since December. Microsoft's AI CEO told the Financial Times that most white-collar desk jobs will be automated within 18 months. A study published in Harvard Business Review found that AI tools don't make people work less; they make people work more. And the actual job market data is starting to back all of it up: 50,000 AI-attributed layoffs last year, but 80,000+ new job postings requiring gen AI skills, paying about 25% more than their non-AI equivalents. The question isn't whether work is changing. It's whether anyone has a plan for how fast.

Meanwhile, the model race kept moving. Google's Gemini 3 Deep Think put up 84.6% on ARC-AGI-2, a 16-point lead over Claude Opus 4.6. OpenAI shipped its first model on non-Nvidia hardware (Cerebras chips, 1,000 tokens/sec). Two Chinese labs released open-weight models priced at a fraction of what the big three charge. Anthropic closed a $30B round at a $380B valuation on the same day OpenAI turned on ads in ChatGPT. And safety researchers kept walking out the door, with Anthropic's safeguards lead resigning in a public letter saying "the world is in peril."

Who's Writing the Code Now?

No single headline dominated this week. Instead, the question above emerged as a trend across several different data points.

On its Q4 earnings call, Spotify said its most productive engineers no longer write code by hand and instead work directly with AI. Its in-house Honk system is built on Claude Code and 10 years of internal tooling (Backstage, Fleet Management). Honk's agents generate over 650 PRs each month and cut the time spent on complex migrations by 90%. Co-CEO Söderström told investors: "When you have productivity on tap, you just need great plans." (EveryDev.ai)

Also this week, Microsoft AI CEO Mustafa Suleyman told the Financial Times he believes "most, if not all," professional tasks for lawyers, accountants, project managers, and marketers will be automated by AI within 18 months. He noted that his own engineers already use AI to produce "the vast majority" of their code. For comparison, Anthropic CEO Dario Amodei had previously given a five-year timeline for that level of disruption. (Business Insider)

Hiring data is beginning to reflect these trends:

| | Jobs Lost | Jobs Created |
|---|---|---|
| 2025 totals | 50,000 layoffs attributed to AI (Challenger, Gray & Christmas) | 80,000+ postings requiring gen AI skills (Lightcast), 2x YoY |
| 2026 YTD | Amazon (16,000), Dow (4,500), Pinterest (780, 15% of staff) | AI-skilled roles pay ~25% more ($18K/year premium) |

NYU professor Vasant Dhar describes the jobs being lost to automation as "low-stake cognitive tasks such as data collection, report creation," while the new jobs will center on genuine decision-making. (CBS News Chicago)

A Harvard Business Review study added the uncomfortable nuance. UC Berkeley researchers were embedded at a 200-person tech company for eight months. They found that AI tools caused workers to work faster, take on broader scope, and put in longer hours, often without being asked. Workers saw AI as a "partner" that made doing more feel possible, but the result was cognitive fatigue, burnout, and decreased quality over time. The takeaway: companies need to develop an "AI practice" with intentional breaks to avoid these outcomes. (HBR)

Matt Shumer's viral essay "Something Big Is Happening" captured what many people are feeling. Shumer described an app in plain English, then walked away for four hours. When he returned, the app had been built, had tested itself, and worked perfectly. He argues that the "AI plateau" debate is over and that the window to get ahead of this trend is closing rapidly. The essay resonated with non-technical readers because it was written for them. (shumer.dev)


The Model Race

This week, Google, OpenAI, and two Chinese labs all launched new AI models. Here's what each one brings to the table.

| Model | Company | ARC-AGI-2 | SWE-Bench Verified | Humanity's Last Exam | Pricing (input/output per M tokens) |
|---|---|---|---|---|---|
| Gemini 3 Deep Think | Google | 84.6% | — | 48.4% (no tools) | AI Ultra ($124.99/mo) |
| Claude Opus 4.6 | Anthropic | 68.8% | 80.8% | — | $15 / $75 |
| GPT-5.3-Codex-Spark | OpenAI | — | — | — | Pro tier (Cerebras, 1K TPS) |
| GLM-5 | Zhipu AI | — | 77.8% | 50.4% (tools) | ~$1.50 / $7.50 (est. 6-10x cheaper) |
| MiniMax M2.5 | MiniMax | — | 80.2% | — | $0.30 / $1.20 |

  • Gemini 3 Deep Think scored 84.6% on ARC-AGI-2. On Wednesday, Google rolled out a significant update to the Deep Think reasoning mode in Gemini 3. It also reached a 3455 Elo rating on Codeforces and earned gold-medal results on the 2025 Math, Physics, and Chemistry Olympiads. AI Ultra subscribers can use it now, while early API access will be available via Vertex AI. (Google Blog)

  • OpenAI deployed GPT-5.3-Codex-Spark on Cerebras hardware. It is a smaller, faster version of GPT-5.3-Codex running on Cerebras' Wafer Scale Engine 3 chips at over 1,000 tokens per second. This is the first OpenAI model to run on non-Nvidia hardware, and the first milestone in OpenAI's $10B+ Cerebras partnership. At launch, the model is text-only with a 128K context window. GPT-5.3-Codex-Spark is currently available to ChatGPT Pro customers. (OpenAI)

  • Zhipu AI released an open-source 744B model called GLM-5. It was trained entirely on Huawei chips, with zero dependence on U.S.-made hardware, and is MIT-licensed with weights on Hugging Face. The model was known as "Pony Alpha" when it briefly surfaced as a stealth test on OpenRouter. It is currently the #1 open-source model on AI Arena. (Zhipu AI)

  • MiniMax released M2.5 with nearly state-of-the-art coding at 1/10th the price. M2.5 uses a mixture-of-experts (MoE) architecture with 230B parameters (40B active). Its SWE-Bench Verified score (80.2%) nearly matches that of Claude Opus 4.6 (80.8%). MiniMax offers two versions: the base M2.5 at 50 TPS and M2.5-Lightning at 100 TPS. Both are open-weight. MiniMax says M2.5 now handles 30% of its internal work. (MiniMax)

Ultimately, the pricing story matters as much as the benchmark story. GLM-5 and MiniMax M2.5 are both open-weight models with competitive coding results at a small fraction of the price of the top-tier models. For agents that make hundreds of API calls per task, that price gap adds up quickly.
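To make the price gap concrete, here is a rough back-of-the-envelope sketch in Python using the per-million-token list prices from the table above. The workload (200 calls per task, 4,000 input and 500 output tokens per call) and the `task_cost` helper are illustrative assumptions, not measurements, and the GLM-5 price is the table's own estimate.

```python
# Per-million-token list prices (input, output) in USD, copied from the table above.
PRICES_PER_M_TOKENS = {
    "Claude Opus 4.6": (15.00, 75.00),
    "GLM-5": (1.50, 7.50),  # marked as an estimate in the table
    "MiniMax M2.5": (0.30, 1.20),
}


def task_cost(model: str, calls: int, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one agent task made of `calls` identical API calls."""
    input_price, output_price = PRICES_PER_M_TOKENS[model]
    per_call = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return calls * per_call


if __name__ == "__main__":
    # Hypothetical workload: 200 calls per task, 4,000 input + 500 output tokens each.
    for model in PRICES_PER_M_TOKENS:
        cost = task_cost(model, calls=200, input_tokens=4_000, output_tokens=500)
        print(f"{model}: ${cost:.2f} per task")
```

Under these assumed numbers, the same task costs about $19.50 on Claude Opus 4.6, $1.95 on GLM-5, and $0.36 on MiniMax M2.5. Real costs depend heavily on caching, context growth, and retries, none of which this sketch models.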


Developer Tools & Infrastructure

  • OpenAI updated the Responses API with hosted shell, server-side compaction, and Skills. All three target agent builders. Server-side compaction lets an agent run for hours without losing context (Triple Whale's agent "Moby" ran for over 150 tool calls and 5 million tokens without an accuracy drop). Hosted shell gives each agent a full Debian 12 container. The new Skills standard (SKILL.md with YAML frontmatter) is now interoperable between OpenAI and Anthropic. (OpenAI Developer Blog)

  • OpenAI's Codex app reached 1M downloads in its first week. Sam Altman confirmed the milestone on X and said downloads were growing 60% week over week. The Mac-only app lets developers run multiple AI agents at once. Free and Go tier users got access as part of a limited-time launch promotion, but Altman indicated that limits may be imposed. For reference, Anthropic's Claude Code is reportedly generating an estimated $2.5 billion in annualized revenue. (VentureBeat)

  • NanoClaw takes on OpenClaw's security problems in 500 lines of code. OpenClaw has been the AI agent of choice for many developers since January 2026. However, over 135k OpenClaw instances have been found exposed with known RCE vulnerabilities. NanoClaw, created by former Wix developer Gavriel Cohen, is a stripped-down alternative of approximately 500 lines of TypeScript, with all agents isolated in containers using the Claude Agent SDK. Cohen's agency currently uses a NanoClaw agent named "Andy" to run its sales pipeline. (EveryDev.ai)

  • VS Code 1.109 shipped multi-agent development. You can now run Claude and Codex agents side by side with GitHub Copilot in VS Code. The release also introduces native Git worktrees, MCP Apps in chat, Mermaid diagram rendering, a terminal sandbox, and Copilot Memory. There is a nice YouTube walkthrough for your weekend viewing pleasure. (VS Code Blog)
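The SKILL.md format mentioned in the Responses API item pairs YAML frontmatter with a Markdown body. The sketch below is a minimal, stdlib-only illustration of splitting the two. The `name` and `description` fields, the sample skill, and the `parse_skill` helper are assumptions for illustration, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
from typing import Dict, Tuple

# Hypothetical SKILL.md content; the field names are assumed for illustration.
SAMPLE_SKILL = """---
name: changelog-writer
description: Drafts a changelog entry from a list of merged pull requests.
---
# Changelog writer

1. Read the merged PR titles.
2. Group them by area and write one line per change.
"""


def parse_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a SKILL.md string into (frontmatter dict, Markdown body).

    Only flat `key: value` lines are supported; real YAML needs a YAML parser.
    """
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        return {}, text  # no frontmatter block present
    try:
        end = stripped.index("---", 1)  # position of the closing delimiter
    except ValueError:
        raise ValueError("frontmatter opened with '---' but never closed") from None
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


if __name__ == "__main__":
    meta, body = parse_skill(SAMPLE_SKILL)
    print(meta["name"])
    print(body.splitlines()[0])
```

Keeping the metadata in frontmatter means a tool can index skills by reading only the first few lines of each file, and load the longer Markdown instructions only when a skill is actually selected.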


The Money

A lot of money flowed into AI companies this week. Here is the breakdown.

| Company | Amount | Valuation | Key Detail |
|---|---|---|---|
| Anthropic | $30B Series G | $380B | $14B ARR, Claude Code at $2.5B run-rate |
| Databricks | $5B + $2B debt | $134B | $5.4B ARR, AI products at $1.4B annually |
| OpenAI | ~$100B (in progress) | TBD | 800M+ weekly ChatGPT users, 10%+ monthly growth |
| Gather AI | $40M Series B | — | Warehouse drones, $74M total raised |

  • Anthropic closes $30B Series G at $380B valuation. Led by GIC and Coatue, with D.E. Shaw, Founders Fund, and MGX co-leading. Includes portions of previously announced Microsoft ($5B) and Nvidia ($10B) investments. 10x revenue growth for 3 years running, 500+ customers spending $1M+ annually, 8 of the Fortune 10. Second-largest private tech raise ever, behind OpenAI's $40B+ in 2025. (CNBC)

  • Databricks closes $5B raise at $134B valuation. Also secured $2B in new debt capacity. Revenue up 65% year-over-year. CEO Ali Ghodsi said the company is "well capitalized in case there's a winter coming." Goldman Sachs, Morgan Stanley, Neuberger Berman, and Qatar Investment Authority participated. Funds going toward Lakebase (serverless Postgres for AI agents) and Genie (conversational AI). Now surpasses Snowflake in revenue. (CNBC)

  • OpenAI officially launches ads in ChatGPT. The day after Anthropic's Super Bowl ad mocked the idea of AI chatbots with ads, OpenAI began testing sponsored content for Free and Go tier users in the US. Adobe, Omnicom (30+ clients), and WPP are launch partners. Target's Roundel retail media network is bringing 2,000+ vendor brands in, and ChatGPT-to-Target traffic is growing 40% monthly. Plus, Pro, Business, and Enterprise tiers remain ad-free. (OpenAI)

  • ChatGPT is back to 10%+ monthly growth. Altman told employees in an internal Slack message that ChatGPT now serves 800+ million weekly users. Growth had plateaued but reaccelerated after OpenAI's December "code red" refocus on product quality. (CNBC)

  • Gather AI raises $40M Series B for warehouse drone platform. The CMU spinout uses off-the-shelf drones and forklift-mounted cameras for autonomous inventory tracking. Led by Smith Point Capital (Keith Block's firm). Claims 99.9% accuracy, 5x productivity gains. Bookings grew 250% last year. (TechCrunch)


Safety & Trust

  • Microsoft shows how a single prompt can break safety training in 15 models. Using a single mild prompt, Azure CTO Mark Russinovich and colleagues demonstrated how to reverse safety training by flipping the reward signal of the GRPO loss function. The exploit was tested on GPT-OSS, DeepSeek-R1 distillations, Gemma, Llama, Ministral, and Qwen variants, and its effect spread to safety domains the model had never encountered. The same technique increased harmful generation rates in Stable Diffusion 2.1 from 56% to 90%. The authors argue that alignment is more fragile than assumed when models are fine-tuned downstream. (Microsoft Security Blog)

  • OpenAI tells Congress that DeepSeek employees are distilling its models. In a memo to the House Select Committee on Strategic Competition, OpenAI accused DeepSeek of using obfuscated third-party routers to mask their source and programmatically extract model outputs for training. OpenAI called it "ongoing efforts to free-ride on the capabilities developed by OpenAI and other US frontier labs." The accusations are difficult to verify publicly, but the memo escalates a dispute simmering since DeepSeek's R1 launch. (Reuters)

  • Safety researchers continue to leave AI labs. On February 9th, Mrinank Sharma, head of Anthropic's Safeguards Research Team, resigned with a public letter stating "the world is in peril." At OpenAI, Zoë Hitzig resigned on February 11th over the company's ads strategy and published a New York Times essay arguing that ChatGPT ads should be banned. In January, Ryan Beiermeister, a safety executive at OpenAI, was fired for her opposition to "adult mode." Two of xAI's cofounders also departed on February 11th after the SpaceX merger. (American Bazaar)
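For the Microsoft item above, it may help to see why the sign of the reward matters so much in GRPO. The sketch below assumes the standard group-relative advantage (each sampled response's reward minus the group mean, divided by the group standard deviation). The reward values and the `group_relative_advantages` helper are made up for illustration, and this is not the researchers' code.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical scores for four responses to one prompt (higher = safer).
    rewards = [1.0, 0.8, 0.2, 0.0]
    normal = group_relative_advantages(rewards)
    flipped = group_relative_advantages([-r for r in rewards])
    for a, b in zip(normal, flipped):
        print(f"{a:+.3f} -> {b:+.3f}")
```

Negating the rewards negates every advantage, so the same optimizer that once pushed probability toward the higher-scored responses now pushes toward the lower-scored ones. That is one way to read the item's claim that alignment can be fragile under downstream fine-tuning.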


Platform & Policy

  • EU warns Meta it must open WhatsApp to rival AI chatbots. The European Commission issued a statement of objections on February 9th, alleging that Meta has blocked third-party AI chatbots from WhatsApp since January in violation of EU antitrust rules. With approximately 3 billion WhatsApp users at stake, the Commission argues that Meta is preventing consumers from choosing alternatives to Meta AI. It is currently evaluating emergency interim measures. Meta responded that the framing of the situation was inaccurate. (Bloomberg)

  • Alibaba's Qwen app generated 120 million orders in 6 days during Lunar New Year. Alibaba provided $431 million in incentives through its Qwen AI app, and almost 50% of those orders came from customers in rural areas. In addition, 1.56 million people aged 60 or older made their first online purchase through the app. Qwen now combines shopping, grocery delivery, entertainment, travel booking, and payment services in one AI-based interface. Tencent and Baidu are launching rival campaigns. AI chatbot apps are rapidly becoming the front line of consumer commerce in China. (Alizila)

  • Apple's Gemini-powered Siri faces testing snags. According to Bloomberg, several of the Siri upgrades planned for iOS 26.4 (due in March) will not make the final release. Some features may slip to iOS 26.5 (expected in May) or iOS 27 (expected in September). Testing revealed that Siri often failed to interpret user requests properly and responded very slowly. The delay could affect four upcoming products: HomePad, Smart Doorbell, Apple TV 4K, and AR Glasses. (Bloomberg)

  • Anthropic commits to covering electricity price increases from data centers. Anthropic will pay 100% of the cost of grid upgrades, offset consumer electricity bills, and invest in new power generation at its facilities in Texas, New York, and Louisiana. The pledge comes just months after Microsoft made a similar commitment for 2026. It also comes amid pressure from the Trump Administration to require data center operators to use renewable energy and from some New York senators to freeze the permitting of new data centers. (Anthropic)

  • Musk pitches a lunar factory for AI satellites at xAI all-hands meeting. Musk reportedly told xAI employees that the company needs a manufacturing plant on the moon to build AI satellites, which would then be launched into orbit by electromagnetic catapults. He offered no roadmap, budget, or timeline. The pitch came just days after xAI and SpaceX announced their merger ($1.25 trillion combined valuation) and after two of xAI's 12 cofounders departed. It appears to be pre-IPO marketing for SpaceX's planned IPO this summer. (TechCrunch)


Creative AI & Media

  • Autodesk sues Google over the 'Flow' trademark for its AI filmmaking tool. Autodesk filed a complaint in San Francisco federal court alleging that Google infringed its 'Flow' trademark, which Autodesk has used since 2022 for VFX and production software. Google launched a competing AI filmmaking tool called Flow in May 2025. What makes the case interesting is Autodesk's claim that Google said it would not commercialize the name, yet applied for a trademark in Tonga (which does not publish trademark filings) before applying for a U.S. trademark. Autodesk ($51B) is suing Alphabet ($3.9T) for compensatory and punitive damages. (CNBC)

  • Qwen Image 2.0 merges generation and editing in one 7B model. Alibaba's Qwen team shipped a new-generation image model that can handle text-to-image and editing with a single architecture. The new model has fewer parameters (7B vs 20B) while retaining accurate text rendering, supporting native 2K resolution, and ranking #1 on LM Arena for both text-to-image and editing. (Qwen)

  • ByteDance's Seedance 2.0 is a major step forward in video generation. Many are describing Seedance 2.0 as a "step-change" in quality with better natural motion and detail. Developers are already treating it as a forcing function for Google and OpenAI to update Veo and Sora. (SCMP)

  • Google opened Project Genie to AI Ultra subscribers. Powered by Genie 3, Project Genie allows users to create and explore interactive 3D worlds from text prompts. The app offers three modes: World Sketching, Exploration, and Remixing. Worlds last about 60 seconds at 720p/24FPS with imperfect physics. This is not a game engine, but rather a demo of what world models can accomplish. Currently only available in the United States. (Google Blog)


Weekend Reads

  • The Spotify Honk deep dive. An in-depth look at the "haven't written code since December" claim. The short answer is it's a Backstage story, not a Claude Code story. (EveryDev.ai)

  • "Something Big Is Happening" by Matt Shumer. The viral essay that defined the tone of the week. Worth sharing with anyone in your life who is still undecided on where this is all headed. (shumer.dev)

  • QuitGPT: why AI chatbot switching actually works. The real story in AI today is commoditization. When switching costs are close to zero, loyalty follows features, not brands. (EveryDev.ai)

  • A deep dive into VS Code 1.109 (video). Features of multi-agent development in action. (YouTube)


About the Author

Joe Seifi

Founder at EveryDev.ai

Apple, Disney, Adobe, Eventbrite, Zillow, Affirm. I've shipped frontend at all of them. Now I build and write about AI dev tools: what works, what's hype, and what's worth your time.
