Responsible AI
Risks / safety
Situational Awareness: The Decade Ahead (Aschenbrenner 2024)
Interpretability
Mechanistic Interpretability for AI Safety: A Review
Do All AI Systems Need to Be Explainable?
Self-explaining SAE features — LessWrong
Temp - to review:
- A Barebones Guide to Mechanistic Interpretability Prerequisites — Neel Nanda
- A Comprehensive Mechanistic Interpretability Explainer & Glossary — Neel Nanda
- Mechanistic Interpretability — first look | by Stephen Jonany | Medium
- adamcasson/mechanistic-interpretability
- "Mechanistic interpretability" for LLMs, explained
- Popular Mechanistic Interpretability: Goodfire Lights the Way to AI Safety - YouTube
- The Platonic Representation Hypothesis
- AI models collapse when trained on recursively generated data | Nature
Oversight & standards
Lessons from the FDA for AI - AI Now Institute
Governing General Purpose AI — A Comprehensive Map of Unreliability, Misuse and Systemic Risks
Etc.
Yann LeCun - A Path Towards Autonomous Machine Intelligence
Reasoning through arguments against taking AI safety seriously: Yoshua Bengio 2024.07.09
Towards a Cautious Scientist AI with Convergent Safety Bounds: Yoshua Bengio 2024.02.26
ADD / XOR / ROL: Someone is wrong on the internet (AGI Doom edition)
Defining AGI
The Turing Test and our shifting conceptions of intelligence | Science
Accepting not-G AI
Setting boundaries
Jaana Dogan: LLMs are tools to navigate a corpus based on a very biased and authoritative prompt.