ML/AI in the wild
🌱 Notes 🌱
… masquerading as a post. Initially intended to gather developers’ ML workflows; dev tools and discourse on AI/AGI got patched in later. A bit of a kitchen sink of things relevant to practical uses of AI and real-world perspectives on LLM and agent development.
Developers’ ML workflows
Nelson Elhage: Building personal software with Claude
David Crawshaw: Programming with LLMs - 2025-01-06
Nicholas Carlini: How I Use "AI"
Kevin Lynagh: Inventory software, useful LLMs, haunted stm32, casual modeling, minimalist workouts
Erik Schluntz: Replacing my Right Hand with AI
Armin Ronacher: “Sometimes AI is good. I just took the PR description, some hints on the bad functions and cursor generated a functional fix without my involvement.”; GitHub issue: Fix lstrip_blocks being too eager by mitsuhiko · Pull Request #674 · mitsuhiko/minijinja
Thorsten Ball: They all use it
Tom Yedwab: How I write code using Cursor: A review
Isaac Miller: Why I bet on DSPy
Prompts in the wild
Contemplative reasoning response style for LLMs like Claude and GPT-4o
AI as used by companies
AI Engineering in the real world - by Gergely Orosz
Dev tools
🔪 JAX - The Sharp Bits 🔪 — JAX documentation
LangWatch - Monitor, Evaluate and Optimize your LLM-apps
Build Compound AI Systems Faster with Databricks Mosaic AI | Databricks Blog
Modal: Serverless cloud infrastructure for AI, ML, and data
Bio tools
Through a Glass Darkly | Markov Bio
A Future History of Biomedical Progress | Markov Bio
[2301.08559] The Lost Art of Mathematical Modelling
Neural nets in nature
Training data and its (dis)content
James Betker (OpenAI): The “it” in AI models is the dataset —
Sufficiently large diffusion conv-unets produce the same images as ViT generators. AR sampling produces the same images as diffusion. This is a surprising observation! It implies that model behavior is not determined by architecture, hyperparameters, or optimizer choices. It’s determined by your dataset, nothing else. Everything else is a means to an end in efficiently delivering compute to approximating that dataset. When you refer to “LaMDA”, “ChatGPT”, “Bard”, or “Claude”, then, it’s not the model weights that you are referring to. It’s the dataset.
ML researchers and SWEs
James Betker: Research Code – Non_Interactive – Software & ML
AI discourse
AI papers - aggregations sites
Indie ML orgs
AGI discourse
Ege Erdil & Tamay Besiroglu - AGI is Still 30 Years Away
Dario Amodei — Machines of Loving Grace
Leopold Aschenbrenner - SITUATIONAL AWARENESS: The Decade Ahead
Agents discourse
Thread by @egrefen on Thread Reader App