Building and evaluating alignment auditing agents
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats | alphaXiv
[2502.01492] Develop AI Agents for System Engineering in Factorio
Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety - 2507.11473v1.pdf
Common Elements of Frontier AI Safety Policies
[2507.20964] Core Safety Values for Provably Corrigible Agents
Justify your answer - by Ben Recht - arg min
Introduction to deep learning with applications to stochastic control and games - YouTube
Learning the natural history of human disease with generative transformers | Nature
Jared Kaplan: ContemporaryMLforPhysicists (pdf)
- Starting on page 55, the Architectures section covers the structure and components of deep neural networks from a (mathematical and statistical) modeling perspective.
Tips for Empirical Alignment Research — AI Alignment Forum
AI Rights for Human Flourishing
How to Make the Future Better: Concrete Actions for Flourishing
Claude Code: Behind-the-scenes of the master agent loop
- core numerical methods derived from an exact theoretical principle
- contrast with popular optimizers like Adam, which have more heuristic origins
David Ha’s early work: ōtoro.net
Molnar: From Frequencies to Coverage: Rethinking What “Representative” Means
Molnar: Don’t fix your imbalanced data
- lcamtuf: Minkowski dimension (How many dimensions is this? - lcamtuf’s thing)
GPT-oss from the Ground Up - by Cameron R. Wolfe, Ph.D.
Gemma 3 270M: Can Tiny Models Learn New Tasks?
How to Think About GPUs | How To Scale Your Model
Building CERN for AI - An institutional blueprint - Centre for Future Generations
Process knowledge is crucial to economic development
Four places where you can put LLM monitoring — LessWrong
The Artificiality of Alignment - by jessica dai - Reboot
[2503.05336v3] Toward an Evaluation Science for Generative AI Systems
There is only one model - by Jack Morris - Token for Token
- Ilya’s talk on compression as intelligence
- Shannon - A mathematical theory of communication (1948)
- Nyquist - Certain Factors Affecting Telegraph Speed (1924) (Internet Archive)
The Big LLM Architecture Comparison
Cohere Labs: 2025 Summer School - recorded talks
Causal Artificial Intelligence Book
On the criteria to be used in decomposing systems into modules - 361598.361623.pdf
Stanford CS336 | Language Modeling from Scratch
Introduction | RLHF Book by Nathan Lambert
Aurora GPT - Argonne National Lab
Evaluation Framework for AI Systems in "the Wild" | alphaXiv
EvalEval 2024 - NeurIPS 2024 workshop
Physics of Language Models - Zeyuan Allen-Zhu
[2502.15657] Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Bertsekas - RL courses and book - mit.edu/~dimitrib/RLbook.html
Raft implemented in Go, Eli Bendersky
Layers of Memory, Layers of Compression - Tim Kellogg
CSES - CSES Problem Set - Tasks
Things that go wrong with disk IO | notes.eatonphil.com
AI and the Everything in the Whole Wide World Benchmark
A Philosophy of Software Design - John Ousterhout (pdf)
[1807.02811] A Tutorial on Bayesian Optimization
Tracing the thoughts of a large language model \ Anthropic
[2401.17173] Zero-Shot Reinforcement Learning via Function Encoders
Severe deviation in protein fold prediction by advanced AI: a case study | Scientific Reports
Building multi-source ingestion pipelines the right way
Taking a responsible path to AGI - Google DeepMind
[2502.01706] Comply: Learning Sentences with Complex Weights inspired by Fruit Fly Olfaction
Learning with not Enough Data Part 1: Semi-Supervised Learning | Lil'Log
MLOps system design is boring. - by Alexandru Vesa
Demystifying Chains, Trees, and Graphs of Thoughts
Sequential decision making - Kevin Murphy, DeepMind
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
International AI Safety Report 2025 (pdf)
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
A Little Bit of Reinforcement Learning from Human Feedback
DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs
A Mathematical Framework for Transformer Circuits
A Recipe for Training Neural Networks
Scaling and networking a modular photonic quantum computer | Nature