Zac Boring - pDoom (Page 15)

Zac Boring 2 months ago Research

OpenFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding

via ArXiv cs.AI [4] — Evaluating web usability typically requires time-consuming user studies and expert reviews, which often limits iteration speed during product development, especially for small teams and agile workflows. We present OpenFlo, a user-experience evaluation agent…

Zac Boring 2 months ago Research

Anthropic repeatedly accidentally trained against the CoT, demonstrating inadequate processes

via Alignment Forum [999] — It turns out that Anthropic accidentally trained against the chain of thought of Claude Mythos Preview in around 8% of training episodes. This is at least the second independent incident in which Anthropic accidentally exposed their model's CoT to the…

Zac Boring 2 months ago Research

Summary: AI Governance to Avoid Extinction

via MIRI [999] — With AI capabilities rapidly increasing, humans appear close to developing AI systems that are better than human experts across all domains. This raises a series of questions about how the world will—and should—respond. In the research paper AI Governance to…

Zac Boring 2 months ago Analysis

AI Safety's Biggest Talent Gap Isn't Researchers. It's Generalists.

via LessWrong AI [5] — This post was cross-posted to the EA Forum. TL;DR: One of the largest talent gaps in AI safety is competent generalists: program managers, fieldbuilders, operators, org leaders, chiefs of staff, founders. Ambitious, competent junior people could develop the…

Zac Boring 2 months ago Industry

Read OpenAI’s latest internal memo about beating the competition — including Anthropic

via The Verge AI [4] — OpenAI's chief revenue officer, Denise Dresser, sent a four-page memo to employees on Sunday about the company's strategic direction, emphasizing the need to lock in users and grow its enterprise business. The memo, which was viewed by The Verge,…

Zac Boring 2 months ago Analysis

Political Violence Is Never Acceptable

via Substack Zvi [999] — Nor is the threat or implication of violence.

Zac Boring 2 months ago Analysis

Talk English, Think Something Else

via LessWrong AI [4] — There's an adage from programming in C++ which goes something like "Yes, you write C, but you imagine the machine code as you do." I assumed this was bullshit, that nobody actually does this. Am I supposed to imagine writing the machine code, and then…

Zac Boring 2 months ago Research

Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study

via ArXiv cs.AI [4] — In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisation of static, rule-based messaging strategies. While adaptive and autonomous learning systems offer the promise of scalable personalisation, it…

Zac Boring 2 months ago Research

OpenKedge: Governing Agentic Mutation with Execution-Bound Safety and Evidence Chains

via ArXiv cs.AI [3] — The rise of autonomous AI agents exposes a fundamental flaw in API-centric architectures: probabilistic systems directly execute state mutations without sufficient context, coordination, or safety guarantees. We introduce OpenKedge, a protocol that…

Zac Boring 2 months ago Analysis

Daycare illnesses

via LessWrong AI [4] — Before I had a baby I was pretty agnostic about the idea of daycare. I could imagine various pros and cons but I didn’t have a strong overall opinion. Then I started mentioning the idea to various people. Every parent I spoke to brought up a consideration…

Zac Boring 2 months ago Analysis

Catching illicit distributed training operations during an AI pause

via LessWrong AI [3] — Last year, my colleagues on MIRI’s Technical Governance Team proposed an international agreement to halt risky development of superhuman artificial intelligence until it can be done safely. The agreement would require all clusters of AI chips with more…

Zac Boring 2 months ago Analysis

Pausing AI Is the Best Answer to Post-Alignment Problems

via LessWrong AI [5] — Even if we solve the AI alignment problem, we still face post-alignment problems, which are all the other existential problems [1] that AI may bring. People have identified various imposing problems that we may need to solve before developing ASI. An…

Zac Boring 2 months ago Analysis

Dario probably doesn't believe in superintelligence

via LessWrong AI [6] — But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading that. I think many people have a relationship with Anthropic that is premised on a false…

Zac Boring 3 months ago Analysis

Claude Mythos #2: Cybersecurity and Project Glasswing

via Substack Zvi [999] — Anthropic is not going to release its new most capable model, Claude Mythos, to the public any time soon.

Zac Boring 3 months ago Analysis

Have we already lost? Part 2: Reasons for Doom

via LessWrong AI [9] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…

Zac Boring 3 months ago Analysis

Claude Mythos: The System Card

via Substack Zvi [999] — Claude Mythos is different.

Zac Boring 3 months ago Analysis

Have we already lost? Part 1: The Plan in 2024

via LessWrong AI [5] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…

Zac Boring 3 months ago Research

SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems

via ArXiv cs.AI [3] — AI-driven symptom analysis systems face persistent challenges in reliability, interpretability, and hallucination. End-to-end generative approaches often lack traceability and may produce unsupported or inconsistent diagnostic outputs in safety-critical…

Zac Boring 3 months ago Analysis

101 Humans of New York on the Risks of AI

via LessWrong AI [4] — Nobody has ever done an in-person, door-to-door survey about AI risks[1]. What do people really think about AI? Like really? There have been some surveys on the risks from AI. But there’s a real difference between looking at numbers on a page vs. the feeling…

Zac Boring 3 months ago Industry

Meta is reentering the AI race with a new model called Muse Spark

via The Verge AI [4] — Meta Superintelligence Labs is launching its first model since Mark Zuckerberg spent billions overhauling the company's AI efforts. Called Muse Spark, the model now powers the Meta AI app and the Meta AI website in the US, per the company's announcement.…