
On Anthropic, AI Safety, and How Crypto Can Help

The Web3 + AI Inquiry digs into the burning questions steering dAI and the open machine economy. Today, we're focusing on Anthropic, AI Safety, and the role of Web3.

(Mostly) All You Need to Know About Anthropic

The Last Two Weeks at Anthropic

News about the AI lab Anthropic has been dominating headlines recently, and the overall picture is quite contradictory:

Reading through that list from top to bottom initially feels like watching a slick, well-funded marketing campaign, right up until the point where it becomes genuinely unsettling. When the US military, under its current leadership, insists on having every available AI tool at its disposal, with no guardrails, it gets genuinely frightening. In light of all this, we have to talk about AI safety.


The Origin Story

But why am I focusing the AI Safety conversation on Anthropic, you may ask? Let me remind you that Anthropic was founded by ex-OpenAI employees who, by their own account, set out to build a more safety-oriented lab than their former employer. Yet that's how OpenAI started, too.

In his recent profile of Anthropic for The New Yorker, Gideon Lewis-Kraus offered a detailed retrospective, laced with a gentle touch of mockery and an unflinching awareness of the deep irony throughout:

In 2010, a mild-mannered polymath named Demis Hassabis co-founded DeepMind, a secretive startup with a mission "to solve intelligence, and then use that to solve everything else."[...] Elon Musk and Sam Altman claimed to mistrust Hassabis, who seemed likelier than anyone to invent a machine of unlimited flexibility - perhaps the most potent technology in history. They estimated that the only people poised to prevent this outcome were upstanding, benign actors like themselves. They launched OpenAI as a public-spirited research alternative to the threat of Google's closed-shop monopoly.

Their pitch - to treat AI as a scientific project rather than as a commercial one - was irresistibly earnest, if dubiously genuine, and it allowed them to raid Google's roster. Among their early hires was a young researcher named Dario Amodei.[...]

OpenAI had been founded on the fear that AI could easily get out of hand. By late 2020, however, Sam Altman himself had come to seem about as trustworthy as the average corporate megalomaniac. He made noise about AI safety, but his actions suggested a vulgar desire to win.

The company [i.e., Anthropic], which they [the Amodei siblings, along with five fellow-dissenters] pitched as a foil for OpenAI, sounded an awful lot like the company Altman had pitched as a foil for Google.

Anthropic's founders adopted a special corporate structure to vouchsafe their integrity. Then again, so had OpenAI.


Anthropic and AI Safety

Now, don't get me wrong - I'm not arguing that just because OpenAI has acted like a wolf in sheep's clothing vis-à-vis AI safety, Anthropic will necessarily follow the same path.

Quite the opposite: given how aggressively U.S. Defense Secretary Pete Hegseth has been attacking Anthropic for trying to maintain at least some guardrails, I’m inclined to think it may be the only major AI lab still genuinely committed to them.

Recently, however, there have been so many mixed signals that even a regular observer like me is left wondering whether Anthropic truly prioritizes safety. So let’s step back and take a sober, impartial look at the bigger picture.

Pro-AI Safety Signs (Or So They Seem)

Just a few days ago, Anthropic announced a $20M donation to Public First Action, a new bipartisan 501(c)(4) that will support public education about AI, promote safeguards, and ensure America leads the AI race. This donation positions the company as a direct rival of OpenAI and Andreessen Horowitz - HackerNoon explains why:

Public First Action exists specifically to counter Leading the Future, a super PAC that’s raised over $125 million from OpenAI president Greg Brockman, Andreessen Horowitz, and other AI industry leaders. Leading the Future’s mission is blunt: oppose state-level AI regulation in favor of a light-touch federal framework—or ideally, no framework at all. They’re already running ads against candidates who support AI guardrails.

Anthropic’s donation puts it on the opposite side of that fight. They’re backing candidates who want transparency requirements, state-level protections, and restrictions on the most powerful AI models.

At the same time, as the HackerNoon article clarifies, regulating AI will mainly hurt small and independent AI research labs that lack the resources to navigate stringent compliance requirements - and will cement Anthropic's competitive edge:

When Anthropic advocates for regulations requiring AI companies to “demand real transparency from the companies building the most powerful AI models,” they’re advocating for requirements they already meet. Every new mandate raises the barrier to entry; every transparency requirement favors companies with existing measurement infrastructure.

Let's not forget that in 2023, OpenAI's Sam Altman was likewise urging the Senate to regulate the industry and to create "a combination of licensing and testing requirements" for AI companies.

In conclusion, calling for state-level AI regulation is not necessarily a pro–AI safety stance. It can just as easily serve as a form of self-protection.


Anti-AI Safety Signs

Anthropic's partnership with Palantir - a company offering mass surveillance services and allegedly helping ICE track and deport immigrants - is enough of a red flag for me. But there have been other worrisome signals about the company's commitment to safety, or lack thereof.

First, Anthropic published a risk report that delivered alarming findings about its Opus 4.6 model (more here).

In The New Yorker piece I linked above, Lewis-Kraus reinforces the impression that Anthropic doesn't truly control its models; instead, the company appears to devote much of its resources to testing and experimentation in an effort to better understand their capabilities. That leads me to believe that the safeguards it has implemented are not enough to keep a tight rein on them.

On top of all this, Anthropic's AI safety researcher and head of the Safeguards Research team, Mrinank Sharma, resigned. Sharma had been working on defenses against AI-assisted bioterrorism and on understanding AI sycophancy.

In his resignation letter shared on X, Mrinank Sharma told the firm he was leaving amid concerns about AI, bioweapons and the state of the wider world. [He also confirmed that at Anthropic, they] constantly face pressures to set aside what matters most.

As HackerNoon concludes, "Sharma’s letter suggests something harder to address than specific failures: the gap between stated values and market pressures has grown wide enough that a senior safety researcher couldn’t reconcile them."


What If Crypto Can Help?

When both corporate self-regulation and government oversight fail to deliver solid safety rules, what are we left with? As in similar cases before, we may have to rely on the technology itself.

AI Safety is not a particularly trendy problem in Web3 at the moment, but there is still strong research being conducted on the matter. I'll highlight just one example today, although it is definitely not the only one.

My friends at the Decentralized Cooperation Foundation (DCF) recently received a Foresight Institute grant to support the next stage of work on Endo. Endo is an open-source object capability framework that provides a way for software, including AI systems, to collaborate while limiting authority and risk.

Instead of trusting every component with complete control, Endo uses object capabilities and the Principle of Least Authority (POLA) to limit what code can do. Each part receives only the authority it needs. When code is fallible, the damage stays contained. Developers still benefit from rapid progress. They also gain a foundation for safety that becomes increasingly important with each new wave of AI tools. Not only is the container more secure, but the Endo-aware code can enforce policies on other code, whether or not that code is Endo-aware.
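The object-capability pattern described above can be sketched in a few lines of plain JavaScript. To be clear, this is not Endo's actual API - all the names below are hypothetical - but it shows the core move: instead of letting untrusted code reach for an ambient, all-powerful object, you hand it a frozen, attenuated facet carrying only the authority it needs.

```javascript
// Plain-JS sketch of capability attenuation (hypothetical names, not Endo's API).
// The untrusted plugin never sees fullFs; it only receives a narrow facet.

// Full-powered object - must never be passed to untrusted code directly.
const fullFs = {
  read: (path) => `contents of ${path}`,
  write: (path, data) => { /* ...mutates state... */ },
  delete: (path) => { /* ...destroys state... */ },
};

// Attenuation: build a facet exposing only read access to one fixed path (POLA).
function makeReadOnlyFile(fs, allowedPath) {
  return Object.freeze({
    read: () => fs.read(allowedPath), // path is fixed; no write, no delete
  });
}

// Untrusted code receives the narrow capability, nothing more.
function untrustedPlugin(file) {
  // file.write and file.delete simply do not exist in this scope.
  return file.read().toUpperCase();
}

const cap = makeReadOnlyFile(fullFs, '/data/report.txt');
console.log(untrustedPlugin(cap)); // the plugin can read one file, nothing else
```

If the plugin misbehaves, the blast radius is one read-only file - exactly the "damage stays contained" property described above. Endo hardens this pattern at the platform level, so even sloppy code cannot escape its facet.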

The grant will be used to work on:

  • Controlled execution environment: Building on protocols like the Model Context Protocol to allow AI to execute JavaScript in highly restricted contexts.

  • User‑governed access: Creating a system where AI can request file access while keeping user approval at the center.
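The second bullet - user-governed access - can also be sketched in plain JavaScript. Again, these are hypothetical names rather than Endo's real API: the idea is that the AI agent never holds raw file authority, only a requester object whose every read is gated by a user-approval callback.

```javascript
// Hypothetical sketch of user-governed file access (not Endo's actual API).
// Every read the AI requests passes through the user's approval function.

function makeGatedReader(files, askUser) {
  return Object.freeze({
    requestRead: (path) => {
      if (!askUser(path)) {
        return { ok: false, reason: 'denied by user' };
      }
      return { ok: true, data: files[path] };
    },
  });
}

// Simulated file store and a user policy that only approves paths under /notes.
const files = {
  '/notes/todo.txt': 'ship newsletter',
  '/secrets/key.txt': 'hunter2',
};
const approveNotesOnly = (path) => path.startsWith('/notes/');

const reader = makeGatedReader(files, approveNotesOnly);
console.log(reader.requestRead('/notes/todo.txt'));  // { ok: true, data: 'ship newsletter' }
console.log(reader.requestRead('/secrets/key.txt')); // { ok: false, reason: 'denied by user' }
```

The design choice here is that approval is structural, not advisory: there is no code path to the data that bypasses askUser, so "keeping user approval at the center" is enforced by the object graph itself.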

Read more below and let me know what you think.


Thank you for reading! If you haven't done so yet, I invite you to subscribe to stay in the loop on the hottest dAI developments.


If you want to support the publication financially, you can either purchase my writer token $WEB3AI, or buy my creator token $ALBENA on ZORA.

I'm looking forward to connecting with fellow Crypto x AI enthusiasts, so don't hesitate to reach out on social media.


Disclaimer: None of this should or could be considered financial advice. You should not take my words for granted; rather, do your own research (DYOR) and share your thoughts to encourage a fruitful discussion.