Analysis - pDoom (Page 4)

Zac Boring a month ago Analysis

Claude Code, Codex and Agentic Coding #7: Auto Mode

via Substack Zvi [999] — As we all try to figure out what Mythos means for us down the line, the world of practical agentic coding continues, with the latest array of upgrades.

Zac Boring a month ago Analysis

Diary of a "Doomer": 12+ years arguing about AI risk (part 1)

via LessWrong AI [4] — How I learned about Deep Learning. As far as I know, I’m the second person ever to get into the field of AI largely because I was worried about the risk of human extinction. In late 2012, while recovering from some minor heartbreak with the help of some…

Zac Boring a month ago Analysis

A Retrospective of Richard Ngo's 2022 List of Conceptual Alignment Projects

via LessWrong AI [8] — Written very quickly for the InkHaven Residency. In 2022, Richard Ngo wrote a list of 26 Conceptual Alignment Research Projects. Now that it’s 2026, I’d like to revisit this list of projects, note which ones have already been done, and give my thoughts on…

Zac Boring a month ago Analysis

Claude Mythos #3: Capabilities and Additions

via Substack Zvi [999] — To round out coverage of Mythos, today covers capabilities other than cyber, and anything else additional not covered by the first two posts, including new reactions and details.

Zac Boring a month ago Analysis

AI Safety's Biggest Talent Gap Isn't Researchers. It's Generalists.

via LessWrong AI [5] — This post was cross-posted to the EA Forum. TL;DR: One of the largest talent gaps in AI safety is competent generalists: program managers, fieldbuilders, operators, org leaders, chiefs of staff, founders. Ambitious, competent junior people could develop the…

Zac Boring a month ago Analysis

Political Violence Is Never Acceptable

via Substack Zvi [999] — Nor is the threat or implication of violence.

Zac Boring a month ago Analysis

Talk English, Think Something Else

via LessWrong AI [4] — There's an adage from programming in C++ which goes something like "Yes, you write C, but you imagine the machine code as you do." I assumed this was bullshit, that nobody actually does this. Am I supposed to imagine writing the machine code, and then…

Zac Boring a month ago Analysis

Daycare illnesses

via LessWrong AI [4] — Before I had a baby I was pretty agnostic about the idea of daycare. I could imagine various pros and cons but I didn’t have a strong overall opinion. Then I started mentioning the idea to various people. Every parent I spoke to brought up a consideration…

Zac Boring a month ago Analysis

Catching illicit distributed training operations during an AI pause

via LessWrong AI [3] — Last year, my colleagues on MIRI’s Technical Governance Team proposed an international agreement to halt risky development of superhuman artificial intelligence until it can be done safely. The agreement would require all clusters of AI chips with more…

Zac Boring a month ago Analysis

Pausing AI Is the Best Answer to Post-Alignment Problems

via LessWrong AI [5] — Even if we solve the AI alignment problem, we still face post-alignment problems, which are all the other existential problems that AI may bring. People have identified various imposing problems that we may need to solve before developing ASI. An…

Zac Boring a month ago Analysis

Dario probably doesn't believe in superintelligence

via LessWrong AI [6] — But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading that. I think many people have a relationship with Anthropic that is premised on a false…

Zac Boring a month ago Analysis

Claude Mythos #2: Cybersecurity and Project Glasswing

via Substack Zvi [999] — Anthropic is not going to release its new most capable model, Claude Mythos, to the public any time soon.

Zac Boring a month ago Analysis

Have we already lost? Part 2: Reasons for Doom

via LessWrong AI [9] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…

Zac Boring a month ago Analysis

Claude Mythos: The System Card

via Substack Zvi [999] — Claude Mythos is different.

Zac Boring a month ago Analysis

Have we already lost? Part 1: The Plan in 2024

via LessWrong AI [5] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…

Zac Boring a month ago Analysis

101 Humans of New York on the Risks of AI

via LessWrong AI [4] — Nobody has ever done an in-person, door-to-door survey about AI risks. What do people really think about AI? Like really? There have been some surveys on the risks from AI. But there’s a real difference between looking at numbers on a page vs. the feeling…

Zac Boring a month ago Analysis

AI #163: Mythos Quest

via Substack Zvi [999] — There exists an AI model, Claude Mythos, that has discovered critical safety vulnerabilities in every major operating system and browser.

Zac Boring a month ago Analysis

Opus's Schelling Steganography Has Amplifiable Secrecy Against Weaker Eavesdroppers

via LessWrong AI [3] — Code: github.com/ElleNajt/Steganography_Wiretapping | Data: huggingface.co/datasets/lnajt/steganography-wiretapping | Play the decoding game: can you eavesdrop on Claude Opus 4.6? TL;DR: Frontier models (Opus and Gemini Pro) can agree on Schelling…

Zac Boring a month ago Analysis

An Alignment Journal: Features and policies

via LessWrong AI [5] — We previously announced a forthcoming research journal for AI alignment. This cross-post from our blog describes our tentative plans for the features and policies of the journal, including experiments like reviewer compensation and reviewer abstracts. It…

Zac Boring a month ago Analysis

OpenAI #16: A History and a Proposal

via Substack Zvi [999] — The real news today is that Anthropic has partnered with the top companies in cybersecurity to try and patch everyone’s systems to fix all the thousands of zero-day exploits found by their new model Claude Mythos.