Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 2 months ago Research
OpenFlo: Automated UX Evaluation via Simulated Human Web Interaction with GUI Grounding
via ArXiv cs.AI [4] — Evaluating web usability typically requires time-consuming user studies and expert reviews, which often limits iteration speed during product development, especially for small teams and agile workflows. We present OpenFlo, a user-experience evaluation agent…
Zac Boring 2 months ago Research
Anthropic repeatedly accidentally trained against the CoT, demonstrating inadequate processes
via Alignment Forum [999] — It turns out that Anthropic accidentally trained against the chain of thought of Claude Mythos Preview in around 8% of training episodes. This is at least the second independent incident in which Anthropic accidentally exposed their model's CoT to the…
Zac Boring 2 months ago Research
Summary: AI Governance to Avoid Extinction
via MIRI [999] — With AI capabilities rapidly increasing, humans appear close to developing AI systems that are better than human experts across all domains. This raises a series of questions about how the world will—and should—respond. In the research paper AI Governance to…
Zac Boring 2 months ago Analysis
AI Safety's Biggest Talent Gap Isn't Researchers. It's Generalists.
via LessWrong AI [5] — This post was cross posted to the EA Forum. TL;DR: One of the largest talent gaps in AI safety is competent generalists: program managers, fieldbuilders, operators, org leaders, chiefs of staff, founders. Ambitious, competent junior people could develop the…
Zac Boring 2 months ago Industry
Read OpenAI’s latest internal memo about beating the competition — including Anthropic
via The Verge AI [4] — OpenAI's chief revenue officer, Denise Dresser, sent a four-page memo to employees on Sunday about the company's strategic direction, emphasizing the need to lock in users and grow its enterprise business. The memo, which was viewed by The Verge,…
Zac Boring 2 months ago Analysis
Political Violence Is Never Acceptable
via Substack Zvi [999] — Nor is the threat or implication of violence.
Zac Boring 2 months ago Analysis
Talk English, Think Something Else
via LessWrong AI [4] — There's an adage from programming in C++ which goes something like "Yes, you write C, but you imagine the machine code as you do." I assumed this was bullshit, that nobody actually does this. Am I supposed to imagine writing the machine code, and then…
Zac Boring 2 months ago Research
Sustained Impact of Agentic Personalisation in Marketing: A Longitudinal Case Study
via ArXiv cs.AI [4] — In consumer applications, Customer Relationship Management (CRM) has traditionally relied on the manual optimisation of static, rule-based messaging strategies. While adaptive and autonomous learning systems offer the promise of scalable personalisation, it…
Zac Boring 2 months ago Research
OpenKedge: Governing Agentic Mutation with Execution-Bound Safety and Evidence Chains
via ArXiv cs.AI [3] — The rise of autonomous AI agents exposes a fundamental flaw in API-centric architectures: probabilistic systems directly execute state mutations without sufficient context, coordination, or safety guarantees. We introduce OpenKedge, a protocol that…
Zac Boring 2 months ago Analysis
Daycare illnesses
via LessWrong AI [4] — Before I had a baby I was pretty agnostic about the idea of daycare. I could imagine various pros and cons but I didn’t have a strong overall opinion. Then I started mentioning the idea to various people. Every parent I spoke to brought up a consideration…
Zac Boring 2 months ago Analysis
Catching illicit distributed training operations during an AI pause
via LessWrong AI [3] — Last year, my colleagues on MIRI’s Technical Governance Team proposed an international agreement to halt risky development of superhuman artificial intelligence until it can be done safely. The agreement would require all clusters of AI chips with more…
Zac Boring 2 months ago Analysis
Pausing AI Is the Best Answer to Post-Alignment Problems
via LessWrong AI [5] — Even if we solve the AI alignment problem, we still face post-alignment problems, which are all the other existential problems [1] that AI may bring. People have identified various imposing problems that we may need to solve before developing ASI. An…
Zac Boring 2 months ago Analysis
Dario probably doesn't believe in superintelligence
via LessWrong AI [6] — But I had to get 500 words out! I think the 2013 conversation is interesting reading as a piece of history, separate from the top-level question, and recommend reading that. I think many people have a relationship with Anthropic that is premised on a false…
Zac Boring 3 months ago Analysis
Claude Mythos #2: Cybersecurity and Project Glasswing
via Substack Zvi [999] — Anthropic is not going to release its new most capable model, Claude Mythos, to the public any time soon.
Zac Boring 3 months ago Analysis
Have we already lost? Part 2: Reasons for Doom
via LessWrong AI [9] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…
Zac Boring 3 months ago Analysis
Claude Mythos: The System Card
via Substack Zvi [999] — Claude Mythos is different.
Zac Boring 3 months ago Analysis
Have we already lost? Part 1: The Plan in 2024
via LessWrong AI [5] — Written very quickly for the Inkhaven Residency. As I take the time to reflect on the state of AI Safety in early 2026, one question feels unavoidable: have we, as the AI Safety community, already lost? That is, have we passed the point of no return, after…
Zac Boring 3 months ago Research
SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems
via ArXiv cs.AI [3] — AI-driven symptom analysis systems face persistent challenges in reliability, interpretability, and hallucination. End-to-end generative approaches often lack traceability and may produce unsupported or inconsistent diagnostic outputs in safety-critical…
Zac Boring 3 months ago Analysis
101 Humans of New York on the Risks of AI
via LessWrong AI [4] — Nobody has ever done an in-person, door-to-door survey about AI risks[1]. What do people really think about AI? Like really? There have been some surveys on the risks from AI. But there’s a real difference between looking at numbers on a page vs. the feeling…
Zac Boring 3 months ago Industry
Meta is reentering the AI race with a new model called Muse Spark
via The Verge AI [4] — Meta Superintelligence Labs is launching its first model since Mark Zuckerberg spent billions overhauling the company's AI efforts. Called Muse Spark, the model now powers the Meta AI app and the Meta AI website in the US, per the company's announcement.…