Machine learning fundamentals
🌱 notes 🌱
Where to begin?
Just starting out in machine learning? Know your way around but want to dig in deeper? Either way, Vicky Boykis has a terrific resource for you.
Her Anti-hype LLM reading list gist starts with a timeline sketch of 1990s statistical learning through today’s LLMs, then provides curated resources for topics including LLM building blocks, deep learning, transformers/attention, GPT, open source models, training data, pre-training, RLHF and DPO, fine-tuning and compression, small LLMs, GPUs, evaluation, and UX.
It’s a fine place to get the lay of the land and then get cosy with ML fundamentals.
Vicky also shares her ML learning notes: Machine Learning Garden
Academic courses
Some well-known machine learning courses at universities such as Stanford, CMU, Harvard, and MIT post their materials online. Links to just a few of those:
Reading lists
Foundational skills
Machine learning: foundational skills list + resources - Brandon Rohrer
Neural networks
This is an extremely non-comprehensive collection of links to either foundational papers or detailed explanations of foundational concepts.
Embeddings
Activation functions
An Overview of Activation Functions | Papers With Code
Expanded Gating Ranges Improve Activation Functions
SwiGLU Explained | Papers With Code
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs
Loss functions
Understanding Emergent Abilities of Language Models from the Loss Perspective
Backprop
The paper that started it all: Letters to Nature: Learning representations by back-propagating errors (pdf)
Convolutional neural nets
ImageNet Classification with Deep Convolutional Neural Networks
LLMs
The Platonic Representation Hypothesis
Transformers
Understanding the attention mechanism in sequence models
Understanding the Transformer architecture for neural networks
Transformers from Scratch - Brandon Rohrer
A Mathematical Framework for Transformer Circuits
Vision Transformers vs CNNs at the Edge
Training
A Recipe for Training Neural Networks
Model evaluation
Evaluating a machine learning model.
Fine tuning
Fine tuning is the process of starting with a general-purpose pretrained model, such as Gemini or Llama, then training it further on a domain-specific dataset so that it gives better-quality answers to questions posed in that domain.
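To make the idea concrete, here is a toy sketch of the pretrain-then-fine-tune pattern. It is not how LLMs are actually fine-tuned: the "model" is a single line y = w*x + b fitted by gradient descent, and every name in it (`train`, `mse`, `general`, `domain`) and both datasets are made up for illustration. The point is only that the second training run starts from the weights the first one produced, rather than from scratch.

```python
def mse(w, b, data):
    """Mean squared error of the line y = w*x + b over (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(w, b, data, lr, steps):
    """Plain batch gradient descent on MSE, starting from the given (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical "general" data: many points from y = 2x + 1
# (a stand-in for a broad pretraining corpus).
general = [(x / 10, 2 * (x / 10) + 1) for x in range(-20, 21)]

# Hypothetical "domain" data: a handful of points from a shifted
# relationship, y = 2x + 3 (a stand-in for a small specialist dataset).
domain = [(x / 10, 2 * (x / 10) + 3) for x in range(0, 6)]

# Pretraining: start from zero weights and fit the general data.
w0, b0 = train(0.0, 0.0, general, lr=0.1, steps=500)

# Fine tuning: start from the pretrained weights and train briefly,
# with a smaller learning rate, on the domain data only.
w1, b1 = train(w0, b0, domain, lr=0.05, steps=200)

loss_before = mse(w0, b0, domain)
loss_after = mse(w1, b1, domain)
print(f"domain loss before fine tuning: {loss_before:.4f}")
print(f"domain loss after fine tuning:  {loss_after:.4f}")
```

The domain loss should drop sharply after the second run, even though that run sees only six points. With a real model the same shape applies, but the details (which layers to update, learning rate schedules, adapters such as LoRA) vary by tool and are not captured here.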
Fine tuning tools
Unsloth: more efficient fine tuning
Fine tuning considerations
Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining
GPUs/TPUs
Understanding GPU Memory 1: Visualizing All Allocations over Time | PyTorch
Robotics
How to Train Your Robot (Book) - Brandon Rohrer
ML playgrounds
LLM prompting
What's the Magic Word? A Control Theory of LLM Prompting
Neural operators
Neural operators for accelerating scientific simulations and design | Nature Reviews Physics