[2510.20817] KL-Regularized Reinforcement Learning is Designed to Mode Collapse
I Figured Out How to Engineer Emergence - by Erik Hoel
A Retrospective on Active Inference
- [2006.10524] Reinforcement Learning as Iterative and Amortised Inference
- [2006.12964] On the Relationship Between Active Inference and Control as Inference
- [2007.05838] Control as Hybrid Inference
- [2103.06859] Understanding the Origin of Information-Seeking Exploration in Probabilistic Objectives for Control
Variational inference - Princeton cos597C 2011
Collective Intelligence with LLMs - by CIP
The Actuary's Final Word - by Ben Recht - arg min
Severity: Strong vs Weak | Error Statistics Philosophy
Guillotine: Hypervisors for Isolating Malicious AIs - guillotine.pdf
Stephen Shenker: Chaos, Black Holes, and Quantum Mechanics - YouTube
Building and evaluating alignment auditing agents
[2502.01492] Develop AI Agents for System Engineering in Factorio
Common Elements of Frontier AI Safety Policies
[2507.20964] Core Safety Values for Provably Corrigible Agents
Introduction to deep learning with applications to stochastic control and games - YouTube
Learning the natural history of human disease with generative transformers | Nature
How to Make the Future Better: Concrete Actions for Flourishing
Claude Code: Behind-the-scenes of the master agent loop
- core numerical methods derived from an exact theoretical principle
- contrast with popular optimizers like Adam, which have more heuristic origins
David Ha’s early work: ōtoro.net
Molnar: From Frequencies to Coverage: Rethinking What “Representative” Means
Molnar: Don’t fix your imbalanced data
- lcamtuf: Minkowski dimension (How many dimensions is this? - lcamtuf’s thing)
GPT-oss from the Ground Up - by Cameron R. Wolfe, Ph.D.
Gemma 3 270M: Can Tiny Models Learn New Tasks?
Building CERN for AI - An institutional blueprint - Centre for Future Generations
Process knowledge is crucial to economic development
The Artificiality of Alignment - by jessica dai - Reboot
[2503.05336v3] Toward an Evaluation Science for Generative AI Systems
The Big LLM Architecture Comparison
On the criteria to be used in decomposing systems into modules (pdf)
Aurora GPT - Argonne National Lab
Evaluation Framework for AI Systems in "the Wild" | alphaXiv
Raft implemented in Go, Eli Bendersky
AI and the Everything in the Whole Wide World Benchmark
Tracing the thoughts of a large language model \ Anthropic
[2401.17173] Zero-Shot Reinforcement Learning via Function Encoders
Severe deviation in protein fold prediction by advanced AI: a case study | Scientific Reports
[2502.01706] Comply: Learning Sentences with Complex Weights inspired by Fruit Fly Olfaction
Learning with not Enough Data Part 1: Semi-Supervised Learning | Lil'Log
Demystifying Chains, Trees, and Graphs of Thoughts
Sequential decision making - Kevin Murphy, DeepMind
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
International AI safety report - International_AI_Safety_Report_2025_accessible_f.pdf
A Recipe for Training Neural Networks
Evergreen re-reads
The Feynman Lectures on Physics
[2502.15657] Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Physics of Language Models - Allen Zhu
Bertsekas - RL courses and book - mit.edu/~dimitrib/RLbook.html
Stanford CS336 | Language Modeling from Scratch
Introduction | RLHF Book by Nathan Lambert
A Little Bit of Reinforcement Learning from Human Feedback
Causal Artificial Intelligence Book
[1807.02811] A Tutorial on Bayesian Optimization
CSES - CSES Problem Set - Tasks
There is only one model - by Jack Morris - Token for Token
- Ilya’s talk on compression as intelligence
- Shannon - A mathematical theory of communication (1948)
- Nyquist - Certain Factors Affecting Telegraph Speed (1924) (Internet Archive)
Jared Kaplan: ContemporaryMLforPhysicists (pdf)
- Starting on page 55, the Architectures section covers the structure and components of deep neural networks from a (mathematical and statistical) modeling perspective.