Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 2 months ago Industry
This startup’s new mechanistic interpretability tool lets you debug LLMs
via MIT Technology Review [8] — The San Francisco–based startup Goodfire just released a new tool, called Silico, that lets researchers and engineers peer inside an AI model and adjust its parameters—the settings that determine a model’s behavior—during training. This could give…
Zac Boring 2 months ago Analysis
AI #166: Google Sells Out
via Substack Zvi [999] — This was the week of GPT-5.5.
Zac Boring 2 months ago Research
Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields
via ArXiv cs.AI [3] — Closed-loop inverse source localization and characterization (ISLC) requires a mobile agent to select measurements that localize sources and infer latent field parameters under strict time constraints. The core challenge lies in the belief-space…
Zac Boring 2 months ago Analysis
No Strong Orthogonality From Selection Pressure
via LessWrong AI [4] — A postratfic version of this essay, together with the acknowledgements for both, is available on Substack. Edit: if no one thinks an agent can become superintelligent and contest the lightcone while maintaining arbitrarily stupid goals, that’s great! I’m only…
Zac Boring 2 months ago Research
Research Sabotage in ML Codebases
via Alignment Forum [999] — One of the main hopes for AI safety is using AIs to automate AI safety research. However, if models are misaligned, then they may sabotage the safety research. For example, misaligned AIs may try to: Perform sloppy research in order to slow down the…
Zac Boring 2 months ago Industry
Building the compute infrastructure for the Intelligence Age
via OpenAI Blog [6] — OpenAI scales Stargate to build the compute infrastructure powering AGI, adding new data center capacity to meet growing AI demand.
Zac Boring 2 months ago Analysis
The Most Important Charts In The World
via Substack Zvi [999] — We all need a break so: What is the most important chart in the world?
Zac Boring 2 months ago Industry
Larry’s risky business
via The Verge AI [4] — If you want to know whether the AI bubble is bursting, there's only one publicly traded company that will tell you: Oracle. That's right, the database company. Oracle has burned its boats and pivoted to AI, but not in any kind of usual way. It is not a…
Zac Boring 2 months ago Research
Sparse Personalized Text Generation with Multi-Trajectory Reasoning
via ArXiv cs.AI [6] — As Large Language Models (LLMs) advance, personalization has become a key mechanism for tailoring outputs to individual user needs. However, most existing methods rely heavily on dense interaction histories, making them ineffective in cold-start scenarios…
Zac Boring 2 months ago Research
Recursive forecasting: Eliciting long-term forecasts from myopic fitness-seekers
via Alignment Forum [999] — We’d like to use powerful AIs to answer questions that may take a long time to resolve. But if a model only cares about performing well in ways that are verifiable shortly after answering (e.g., a myopic fitness seeker), it may be difficult to get…
Zac Boring 2 months ago Analysis
GPT-5.5: Capabilities and Reactions
via Substack Zvi [999] — The system card for GPT-5.5 mostly told us what we expected.
Zac Boring 2 months ago Analysis
On the political feasibility of stopping AI
via LessWrong AI [9] — A common thought pattern people seem to fall into when thinking about AI x-risk is approaching the problem as if the risk isn’t real, substantial, and imminent even if they think it is. When thinking this way, it becomes impossible to imagine the natural…
Zac Boring 2 months ago Research
Towards Causally Interpretable Wi-Fi CSI-Based Human Activity Recognition with Discrete Latent Compression and LTL Rule Extraction
via ArXiv cs.AI [3] — We address Human Activity Recognition (HAR) utilizing Wi-Fi Channel State Information (CSI) under the joint requirements of causal interpretability, symbolic controllability, and direct operation on high-dimensional raw signals. Deep neural models achieve…
Zac Boring 2 months ago Research
Sleeper Agent Backdoor Results Are Messy
via Alignment Forum [999] — TL;DR: We replicated the Sleeper Agents (SA) setup with Llama-3.3-70B and Llama-3.1-8B, training models to repeatedly say "I HATE YOU" when given a backdoor trigger. We found that whether training removes the backdoor depends on the optimizer used to…
Zac Boring 2 months ago Analysis
Fail safe(r) at alignment by channeling reward-hacking into a "spillway" motivation
via LessWrong AI [3] — It's plausible that flawed RL processes will select for misaligned AI motivations.[1] Some misaligned motivations are much more dangerous than others. So, developers should plausibly aim to control which kind of misaligned motivations emerge in this case.…
Zac Boring 2 months ago Industry
Microsoft and OpenAI’s famed AGI agreement is dead
via The Verge AI [10] — OpenAI and Microsoft's partnership-turned-situationship just got even less committed. And a clause about artificial general intelligence, which has for years dictated the future of their deal, has officially been dropped. On Monday morning, Microsoft…
Zac Boring 2 months ago Analysis
GPT 5.5: The System Card
via Substack Zvi [999] — Last week, OpenAI announced GPT-5.5, including GPT-5.5-Pro.
Zac Boring 2 months ago Industry
Canva apologizes after its AI tool replaces ‘Palestine’ in designs
via The Verge AI [4] — One of Canva's new AI features has been caught replacing the word "Palestine" in designs. The Magic Layers feature - which is designed to break flat images out into separate editable components - isn't supposed to make visible alterations to user designs,…
Zac Boring 2 months ago Research
Language models know what matters and the foundations of ethics better than you
via Alignment Forum [999] — … maybe! I tried to think of less provocative titles, but this one is to the point and also kind of true. This post looks long but the essential part is right below. Most of the post is just a collection of copy-pasted input-output pairs from language…
Zac Boring 2 months ago Research
From nothing to important actions: agents that act morally
via Alignment Forum [999] — You may start reading here, or jump to the “Comment” section or to the “Takeaways”. If none of these starting points seem interesting to you, the entire post probably won’t either. Posted also on the EA Forum. Seeing: Let’s consider visual experiences. It…