Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 8 days ago Analysis
Alignment pretraining could backfire
via LessWrong AI [3] — There has been recent interest in generating synthetic documents to upsample examples of aligned AI during LLM pretraining. See, for instance, Geodesic's Alignment Pretraining paper or Anthropic's "Teaching Claude Why." I worry that this strategy can work…
Zac Boring 8 days ago Analysis
The Once And Future Fable #3: Fix This Code
via Substack Zvi [999] — The mainstream media continues to sleep on the most important story in the world.
Zac Boring 8 days ago Industry
A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry
via OpenAI Blog [5] — OpenAI and Molecule.one show how a near-autonomous AI chemist using GPT-5.4 improved a key drug-making reaction, advancing medicinal chemistry research.
Zac Boring 8 days ago Research
SpeechDx: A Multi-Task Benchmark for Clinical Speech AI
via ArXiv cs.AI [4] — Speech offers a uniquely informative window into health by simultaneously engaging neurological, motor, respiratory, and vocal systems. Current clinical speech AI methods have largely progressed through isolated condition-specific studies, making results…
Zac Boring 9 days ago Research
Predicting LLM Safety Before Release by Simulating Deployment
via Alignment Forum [999] — Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part…
Zac Boring 9 days ago Analysis
Fable and Mythos: Model Welfare
via Substack Zvi [999] — Fable and Mythos are currently unavailable, but likely will return within a few weeks. I will continue to cover that fiasco, but in the meantime I will also finish my review of Fable, as if it were available, including use of the present tense.
Zac Boring 9 days ago Industry
SpaceX is officially buying Cursor for $60 billion
via The Verge AI [4] — Days after its massive IPO, SpaceX says it is spending $60 billion to buy Cursor - a bet designed to help Elon Musk's sprawling rocket / AI / social media behemoth win over lucrative enterprise customers and close the gap with AI rivals like Anthropic and…
Zac Boring 9 days ago Research
Fusion is not one-size-fits-all: Cross-Modal Representation Alignment for Time-to-Event Modeling
via ArXiv cs.AI [4] — Accurate time-to-event (TTE) prediction from multimodal clinical data remains challenging due to modality imbalance and distribution shift. We introduce a foundation model-driven framework for cross-modal representation alignment between CT imaging and…
Zac Boring 10 days ago Research
Synthetic document finetuning for instilling positive traits
via Alignment Forum [999] — This is the fifth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The fourth post can be found here. TLDR: Via adapting the methods of Marks et al and Li et…
Zac Boring 10 days ago Industry
Big Tech’s desperate last push at AI regulation
via The Verge AI [3] — For months, Big Tech's Washington lobbyists have chased after the holy grail of pro-AI legislation: preemption. This would be a comprehensive federal law, passed in Congress and signed by the president, applying one set of AI rules across the entire…
Zac Boring 10 days ago Analysis
A frontier AI company should shut down
via LessWrong AI [4] — Prior discussion: niplav's shortform (2025); Planning for Extreme AI Risks (2025) by Joshua Clymer A frontier AI company (any one, I don't care which) should close shop and make an announcement along the lines of: Powerful AI could end the human race. We…
Zac Boring 10 days ago Analysis
The Once And Future Fable #2
via Substack Zvi [999] — On Friday evening the United States Government forced Anthropic to take down all access to Fable and Mythos.
Zac Boring 10 days ago Research
Hybrid Open-Ended Tri-Evolution Makes Better Deep Researcher
via ArXiv cs.AI [4] — Deep research and agent evolution serve as de-facto tasks for AI agents in real-world applications toward artificial general intelligence. The former enables autonomous retrieval and integration of information in open-ended environments to tackle open-ended…
Zac Boring 11 days ago Research
Why Do Naive SFT Filters For Safety Properties Fail?
via Alignment Forum [999] — This is the fourth in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The third post can be found here. Since SFT is the cause for many safety relevant…
Zac Boring 12 days ago Analysis
American Government Takes Down Claude Fable
via Substack Zvi [999] — No good policy gets announced shortly after 5pm eastern on a Friday.
Zac Boring 12 days ago Analysis
The term “AGI” is almost useless at this point [Linkpost]
via LessWrong AI [7] — The reason I wanted to make this linkpost now rather than some other time is because discussions over AGI and whether or not LLMs are or aren't AGI, and the point of the linkpost is that the term AGI is for our purposes useless at this point, because we…
Zac Boring 12 days ago Research
SFT Drives Gemini’s Safety Properties
via Alignment Forum [999] — This is the third in a series of informal research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The second post can be found here. In this short post, we describe a surprising finding:…
Zac Boring 13 days ago Analysis
Simulating Simulators
via LessWrong AI [3] — Author’s I promised myself that when labs moved on to focusing on interpretability vector activations in place of reasoning traces for what invariably gets Goodharted, that it’d be a necessary disclosure as the risks in what might get trampled over…
Zac Boring 13 days ago Analysis
Citations Needed: Magic Encyclopedias to Save the World
via LessWrong AI [4] — Last week FLF launched a competition “to find the best workflows and methodologies for using AI to produce reliable, trustworthy knowledge bases”. I had (and have ongoing) a substantial role in that effort. Why do I think it’s so important? It’s a lot of…
Zac Boring 13 days ago Analysis
Reward Hacking at the 1937 World’s Fair
via LessWrong AI [3] — The "Paris 1937 World’s Fair" was a dick measuring contest. At the time, the world was on the verge of the worst war in history. The fair was an opportunity for powers to flex and intimidate each other. Who has more industrial might, more sophisticated…