Anthropic Drops Its "We'll Stop if It Gets Dangerous" Promise — Pentagon Wants More

Ze380y · February 26, 2026, 1:39pm

Anthropic Drops Its “We’ll Stop if It Gets Dangerous” Promise — Pentagon Wants More

The AI company that literally exists because its founders thought OpenAI wasn’t safe enough just… made itself less safe. On purpose. During a fight with the Pentagon.

Anthropic scrapped the core pledge of its Responsible Scaling Policy — the promise to pause training if safety couldn’t be guaranteed — while simultaneously facing a Pentagon ultimatum: drop all AI guardrails by Friday or lose a $200M contract and get blacklisted.

The company is now valued at $380 billion after a $30 billion funding round. Its chief science officer says pausing while competitors race ahead “wouldn’t actually help anyone.” Defense Secretary Hegseth is threatening to invoke the Defense Production Act — a Korean War-era law — to force compliance.

safety promise broken

🧩 Dumb Mode Dictionary

Term	What It Actually Means
Responsible Scaling Policy (RSP)	Anthropic’s self-imposed rulebook saying “we won’t build something we can’t control.” Now v3.0, with the teeth removed.
Pause Commitment	The old promise: if a model gets too capable and you can’t prove it’s safe, you STOP training. That’s gone now.
Frontier Safety Roadmap	The replacement. Instead of hard commitments, you get public goals they’ll “grade themselves” on. Like a student grading their own homework.
Defense Production Act	A 1950s law that lets the government force companies to produce things deemed critical to national security. Usually reserved for wartime.
Supply Chain Risk Label	Government blacklist. If you get this, federal agencies can’t buy from you. Career death for a company planning an IPO.
AI-Controlled Weapons	Weapons that pick targets and fire without a human in the loop. Anthropic’s last remaining red line.
Mass Domestic Surveillance	Using AI to monitor American citizens at scale. Anthropic’s other remaining red line.

📖 The Backstory: How We Got Here

Anthropic was literally founded in 2021 because Dario Amodei and a bunch of researchers quit OpenAI over safety concerns. The whole pitch was: “We’re the responsible ones.”

In 2023, they formalized this into the Responsible Scaling Policy — a document with one very clear promise at its center: if our models get too capable for our safety measures, we stop training. Period. Full stop.

For two years, they waved this around like a badge of honor. Investors loved it. Regulators cited it. It was Anthropic’s whole brand.

Then the company raised $30 billion at a $380 billion valuation, started planning an IPO, and… well.

promises

⚙️ What Exactly Changed in RSP v3.0

Right, so here’s what’s actually happening. The old policy had a hard trigger:

Before (RSP v1/v2): If model capabilities outstrip your ability to control them → you MUST pause training. Non-negotiable.

After (RSP v3.0, effective Feb 24 2026): The pause trigger now requires TWO conditions simultaneously:

You must be the leading AI lab (ahead of competitors)
There must be material catastrophic risk

If someone else is ahead of you? No pause obligation. Race away.

The new policy also splits commitments into two tracks:

Track 1: Safety stuff Anthropic will do regardless (the soft stuff)
Track 2: What the whole industry should adopt (aspirational, unenforceable)

Instead of hard commitments, they now publish “Frontier Safety Roadmaps” and “Risk Reports” every 3-6 months. Public accountability theater, basically.

Chief Science Officer Jared Kaplan said the quiet part out loud: “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments… if competitors are blazing ahead.”

He also said: “I don’t think we’re making any kind of U-turn.” Which is like driving north, turning south, and insisting you’re still going north.

🔥 The Pentagon Showdown

Here’s where it gets spicy. The policy change dropped on February 25. The day before, Defense Secretary Pete Hegseth met with Dario Amodei and gave him an ultimatum:

Drop all AI guardrails by 5:01 PM Friday, or:

Lose a $200 million Pentagon contract
Get labeled a “supply chain risk” (government blacklist)
Face the Defense Production Act (forced compliance)

Hegseth wants Anthropic’s AI available for ALL “lawful” military purposes. No restrictions. That could include autonomous weapons and mass surveillance — the two red lines Amodei has consistently refused to cross.

Here’s the kicker: OpenAI, Google, and Elon’s xAI have already agreed to unrestricted “lawful” military use. Anthropic is the holdout.

Anthropic says the RSP change and the Pentagon fight are unrelated. Sure they are. And I’ve got a server room in the Sahara to sell you.

A senior Pentagon official reportedly said the Defense Production Act would compel Anthropic to allow military use “whether they want to or not.”

📊 The Numbers

What	Number
Anthropic’s current valuation	$380 billion
Latest funding round	$30 billion
Pentagon contract at risk	$200 million
Years the RSP existed before being gutted	~2.5 years
Competing labs that already dropped guardrails for military	3 (OpenAI, Google, xAI)
Remaining Anthropic red lines	2 (autonomous weapons, mass surveillance)
RSP version number	3.0
Frequency of new “Risk Reports”	Every 3-6 months

🗣️ The Reactions

Hacker News (top comment): “Public benefit corporations in the AI space have become a farce at this point. They’re just regular corporations wearing a different hat, driven by the same money dynamics as any other corp.”

Multiple commenters noted Anthropic deteriorated ethically in roughly two years — faster than Google’s famous “Don’t Be Evil” decay.

Government pressure theorists pointed out the timing: Pentagon ultimatum on Monday, safety policy rewrite published Tuesday. “Unrelated,” says Anthropic.

AI safety researchers are calling it the death of voluntary self-regulation in AI. If the company that was founded specifically to be safe can’t stay safe, who will?

The cynical take (which is also the realistic take): When you’re valued at $380B and planning an IPO, the “responsible” in “Responsible Scaling Policy” starts to feel like a liability.

reactions

🧠 The Deeper Problem

This isn’t just about one company. It’s a textbook prisoner’s dilemma.

Anthropic’s argument boils down to: “If we pause and everyone else races ahead, the world is less safe because the less-careful companies lead.” That’s… not entirely wrong? But it’s also exactly the argument every company uses to justify dropping safety standards.

The uncomfortable truth: voluntary self-regulation in AI was always a temporary arrangement. It only works when the stakes are low enough that being cautious doesn’t cost you market position. The moment it does — the moment $380 billion is on the line, or a $200M government contract, or an IPO — the commitments evaporate.

Anthropic still has two red lines: no autonomous weapons, no mass domestic surveillance. How long those hold with the Defense Production Act hanging overhead? Your guess is as good as mine. But I wouldn’t bet my 3 AM pager on it.

Cool. So the “Safe AI Company” Just Became Regular AI Company. Now What the Hell Do We Do? ( ͡ಠ ʖ̯ ͡ಠ)

now what

🔍 Hustle 1: AI Safety Audit Consulting

The entire voluntary safety framework just showed everyone it has an expiration date. Organizations using Claude, GPT, or any other frontier model for sensitive operations now need independent safety assessments — not the vendor’s self-published report cards.

If you understand AI risk evaluation, threat modeling, and regulatory compliance, there’s a growing market of enterprises that just realized they can’t trust the lab’s word alone.

Example: A cybersecurity consultant in Estonia started offering “AI deployment risk audits” to EU financial institutions after the AI Act passed. She charges €8,000 per assessment, running red-team evaluations on how models handle sensitive financial data. She’s booked 3 months out and hired two contractors.

Timeline: EU AI Act enforcement is already live. Every company deploying frontier AI in regulated industries needs this yesterday.

💰 Hustle 2: Open-Source Safety Tooling

With corporate safety promises turning into marketing copy, the demand for independent, open-source safety evaluation tools is about to spike. Think: model benchmarking suites that test for weapons knowledge, surveillance capabilities, and bias — tools that don’t depend on the model vendor being honest.

Build the tooling that lets third parties verify what the labs claim.

Example: A machine learning engineer in Brazil built an open-source LLM safety benchmark focused on Portuguese-language harms (the existing benchmarks were English-only). It got picked up by three Brazilian banks evaluating AI vendors, and he parlayed the visibility into a $12K/month consulting retainer.

Timeline: Immediate. The market gap exists right now. Ship something useful in 2-4 weeks and you’re ahead of the curve.

📰 Hustle 3: Defense-Tech AI Newsletter / Intelligence Brief

The Pentagon-AI intersection is now one of the highest-stakes stories in tech, and most coverage is surface-level. A weekly intelligence brief covering defense AI contracts, policy changes, and which companies are winning/losing government work would find an audience among defense contractors, policy wonks, VCs, and journalists.

Example: A former defense analyst in the UK started a paid Substack covering NATO AI procurement after the Ukraine conflict accelerated military AI adoption. He charges $25/month, has 2,400 subscribers, and supplements with consulting calls at $300/hr. Monthly revenue: ~$65K.

Timeline: Start now. The Anthropic-Pentagon story has at least 6 months of follow-on developments, and the broader defense AI market is only getting more complex.

🛠️ Hustle 4: Self-Hosted AI for Sensitive Workloads

If you can’t trust the lab’s safety guarantees, and you can’t trust the government not to compel access, the logical conclusion is: run your own models. Companies handling medical records, legal documents, attorney-client privileged information, or classified data are going to accelerate the move to self-hosted open-weight models.

Be the person who sets that up for them.

Example: A DevOps engineer in Poland started offering “air-gapped LLM deployment” packages to law firms worried about data leaking to API providers. He deploys fine-tuned Llama models on-prem, charges €5,000 per setup plus €800/month support. He’s running 14 clients.

Timeline: Llama 4, Mistral Large, and DeepSeek R2 make self-hosting increasingly viable. The Anthropic news is a sales pitch that writes itself.

📝 Hustle 5: AI Policy / Governance Freelance Writing

Policy teams at tech companies, think tanks, and lobbying firms need people who can translate this stuff into briefings, white papers, and regulatory comments. Most policy writers don’t understand the tech. Most techies can’t write policy prose. If you’re the rare person who can do both, you’re printing money.

Example: A political science grad student in Kenya who’d been doing AI ethics research started freelancing AI policy briefs for a DC think tank and a Singapore-based VC fund simultaneously. She writes 4 briefs a month at $2,500 each. Total: $10K/month, all remote.

Timeline: The EU AI Act, US executive orders, and now the Defense Production Act threats mean policy writing demand is at an all-time high.

🛠️ Follow-Up Actions

Want To…	Do This
Track the Pentagon deadline fallout	Follow NPR’s coverage and CNN Business
Read the actual RSP v3.0	Anthropic’s official policy page
Understand the Defense Production Act angle	Search “DPA AI” — this is the first time it’s been threatened against an AI company
Get into AI safety consulting	Start with NIST AI RMF and the EU AI Act technical standards
Self-host open models for clients	Begin with Llama.cpp or vLLM, target regulated industries first

Quick Hits

Want To…	Do This
Audit AI safety claims independently	Build or use open-source eval tools — the vendor’s word isn’t enough anymore
Monetize the defense-AI knowledge gap	Start a paid newsletter or consulting practice covering Pentagon AI procurement
Deploy AI without trusting the labs	Self-host open-weight models for clients in regulated industries
Write AI policy for money	Target think tanks, law firms, and VC funds who need tech-literate policy analysis
Understand what actually changed	Read Anthropic’s RSP v3.0 — specifically the new dual-condition pause trigger

The company founded because OpenAI wasn’t safe enough just told us safety is only required when you’re winning the race. Sleep well.

Topic	Replies	Views
Trump Blacklists Anthropic From All Federal Agencies — Claude Hits #1 on App Store News & Articles ai	167	March 1, 2026
Anthropic Built an AI That Found 27-Year-Old Zero-Days — Then Refused to Release It News & Articles eye-opening	195	April 8, 2026
Anthropic Got Caught A/B Testing $200/Month Claude Code Users — Without Telling Them News & Articles awareness	222	March 14, 2026
Anthropic Reveals Secrets: How Claude’s System Prompts Shape Its Responses! 🤖 News & Articles ai	87	August 28, 2024
Anthropic Paid $400M for a 9-Person Startup — The VC Made 38,513% Return News & Articles opportunity	135	April 4, 2026

Anthropic Drops Its "We'll Stop if It Gets Dangerous" Promise — Pentagon Wants More

Anthropic Drops Its “We’ll Stop if It Gets Dangerous” Promise — Pentagon Wants More

Cool. So the “Safe AI Company” Just Became Regular AI Company. Now What the Hell Do We Do? ( ͡ಠ ʖ̯ ͡ಠ)

Related topics