Contents

Machine learning fundamentals

🌱 notes 🌱

Where to begin?

Just starting out in machine learning? Know your way around but want to dig in deeper? Either way, Vicki Boykis has a terrific resource for you.

Her Anti-hype LLM reading list gist starts with a timeline sketch of 1990s statistical learning through today’s LLMs, then provides curated resources for topics including LLM building blocks, deep learning, transformers/attention, GPT, open source models, training data, pre-training, RLHF and DPO, fine-tuning and compression, small LLMs, GPUs, evaluation, and UX.

It’s a fine place to get the lay of the land and then get cosy with ML fundamentals.

Vicki also shares her ML learning notes: Machine Learning Garden

Academic courses

Some well-known machine learning courses offered at universities such as Stanford, CMU, Harvard, and MIT post their materials publicly. Links to just a few of those:

Reading lists


Foundational skills

Machine learning: foundational skills list + resources - Brandon Rohrer


Neural networks

This is an extremely non-comprehensive collection of links to either foundational papers or detailed explanations of foundational concepts.

Embeddings

What are embeddings?
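For intuition alongside that link, here is a toy NumPy sketch of what an embedding table is: a lookup from discrete tokens to dense vectors, compared with cosine similarity. The vocabulary, dimension, and random vectors are purely illustrative, not from any trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# An embedding table maps each item in a vocabulary to a dense vector:
# one row per token. Real tables are learned during training.
vocab = ["cat", "dog", "car"]
dim = 4
table = rng.normal(size=(len(vocab), dim))

def embed(token):
    # Embedding lookup is just row indexing.
    return table[vocab.index(token)]

def cosine_similarity(a, b):
    # Similar items should end up with vectors pointing in similar directions.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A token's similarity with itself is 1 by construction, which is a handy sanity check when wiring up real embeddings.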

Activation functions

An Overview of Activation Functions | Papers With Code

Expanded Gating Ranges Improve Activation Functions

SwiGLU Explained | Papers With Code

ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs
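As a concrete reference for two of the activations above, a minimal NumPy sketch of SwiGLU (a Swish-gated linear unit) and squared ReLU. The shapes and the omission of bias terms are simplifications for readability, not the exact formulations in the linked papers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x, beta=1.0):
    # Swish (SiLU when beta = 1): x * sigmoid(beta * x)
    return x * sigmoid(beta * x)

def swiglu(x, W, V):
    # SwiGLU(x) = Swish(xW) * (xV): elementwise product of a
    # Swish-gated branch and a plain linear branch.
    return swish(x @ W) * (x @ V)

def relu_squared(x):
    # ReLU^2: ReLU followed by squaring.
    return np.square(np.maximum(x, 0.0))
```

Both branches of SwiGLU have their own weight matrix, which is why SwiGLU feed-forward layers are often sized down relative to plain-ReLU ones to keep parameter counts comparable.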

Reward functions

‘forgetting’ parameter: The ants and the pheromones | the morning paper

Loss functions

Understanding Emergent Abilities of Language Models from the Loss Perspective

Backprop

The paper that started it all: Letters to Nature: Learning representation by back-propagating errors (pdf)

How to guess a gradient
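The core of backprop is the chain rule applied backwards through a cached forward pass. A toy sketch on a one-parameter model (this is an illustration of the idea, not the notation of the papers above):

```python
# Backprop on a tiny computation: L = (w*x - y)^2.
# Forward pass computes and caches intermediates; backward pass
# multiplies local derivatives from the loss back to the parameter.
def forward_backward(w, x, y):
    pred = w * x              # forward
    err = pred - y
    loss = err ** 2
    dloss_derr = 2.0 * err    # backward: outermost derivative first
    derr_dpred = 1.0
    dpred_dw = x
    dloss_dw = dloss_derr * derr_dpred * dpred_dw
    return loss, dloss_dw
```

A finite-difference check (perturb `w` slightly, compare loss change against the analytic gradient) is the standard way to validate a hand-written backward pass.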

Convolutional neural nets

ImageNet Classification with Deep Convolutional Neural Networks

LLMs

The Platonic Representation Hypothesis

Transformers

Attention Is All You Need

Understanding the attention mechanism in sequence models

Understanding the Transformer architecture for neural networks

Transformers from Scratch - Brandon Rohrer

A Mathematical Framework for Transformer Circuits

Vision Transformers vs CNNs at the Edge
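The scaled dot-product attention at the heart of "Attention Is All You Need" fits in a few lines. A minimal NumPy sketch (single head, no masking, no learned projections):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights
```

Each row of `weights` is a probability distribution over the keys, so rows sum to 1; the output is the corresponding weighted average of the value vectors.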

Training

A Recipe for Training Neural Networks

Model evaluation

Evaluating a machine learning model.

Inspect
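Before reaching for a full eval framework, the basic classification metrics are worth knowing cold. A plain-Python sketch of accuracy, precision, and recall from binary predictions (eval tools like Inspect cover far more than this):

```python
def classification_metrics(y_true, y_pred):
    # Count the four cells of the binary confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many were found
    return {"accuracy": accuracy, "precision": precision, "recall": recall}
```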

Fine tuning

Fine-tuning refers to starting with a generally trained model, such as Gemini or Llama, then continuing training on a domain-specific dataset so that the model produces better-quality answers to questions posed in that domain.
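The idea can be illustrated without any LLM machinery: a toy NumPy sketch where "fine-tuning" simply means continuing gradient descent from pretrained weights instead of a random initialization. The model, data, and learning rate here are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained model: a linear regressor whose weights
# came from earlier "general" training (here, just fixed values).
w_pretrained = np.array([1.0, -0.5])

# Hypothetical domain-specific dataset: inputs X, targets y.
X = rng.normal(size=(64, 2))
y = X @ np.array([1.2, -0.3])  # the domain's true mapping

# Fine-tuning: resume gradient descent from the pretrained weights.
w = w_pretrained.copy()
lr = 0.1
for _ in range(200):
    grad = 2 * X.T @ (X @ w - y) / len(X)  # gradient of mean squared error
    w -= lr * grad
```

Because the pretrained weights already sit near the domain solution, far fewer steps (and typically a smaller learning rate) are needed than when training from scratch; that is the practical appeal of fine-tuning.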

Fine tuning tools

Unsloth: more efficient fine tuning

Fine tuning considerations

Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining

GPUs/TPUs

Understanding GPU Memory 1: Visualizing All Allocations over Time | PyTorch

Robotics

How to Train Your Robot (Book) - Brandon Rohrer

ML playgrounds

AI Test Kitchen

Neural Network Concepts Animations

LLM prompting

What's the Magic Word? A Control Theory of LLM Prompting

Neural operators

Neural operators for accelerating scientific simulations and design | Nature Reviews Physics