Language Models articles

1/7/2026 • EN

Quoting Robin Sloan

A reflection on the arrival of Artificial General Intelligence (AGI), arguing that its 'general' nature distinguishes it from previous purpose-built AI models.

Agi ai Artificial General Intelligence Language Models Machine Learning

Simon Willison

12/30/2025 • EN

Quoting Liz Fong-Jones

Explores how AI language models shift a programmer's role from writing code to managing context and providing detailed specifications.

ai-assisted development Context Management Language Models Programming Workflow software engineering

Simon Willison

11/22/2025 • EN

LLM APIs are a Synchronization Problem

Analyzes LLM APIs as a distributed state synchronization problem, critiquing their abstraction and proposing a mental model based on token and cache state.

api design distributed systems Language Models llm State Synchronization

Armin Ronacher

9/14/2025 • EN

Training an LLM-RecSys Hybrid for Steerable Recs with Semantic IDs

Explores training a hybrid LLM-recommender system using Semantic IDs for steerable, explainable recommendations.

Hybrid Models Language Models llm Recommender Systems Semantic Ids

Eugene Yan

8/13/2025 • EN

Designing the Built-in AI Web APIs

A Chrome engineer discusses the design challenges and considerations for creating new built-in AI web APIs, focusing on the prompt API and task-based models.

AI Models Browser Apis Language Models Prompt API web apis

Domenic Denicola

8/9/2025 • EN

The Worst AI Metric

Critiques the 'how many r's in strawberry' test as a poor benchmark for AI capability, arguing that it measures irrelevant trivia.

ai artificial intelligence Benchmarks Language Models metrics

Daniel Miessler

6/20/2025 • EN

AIs that break down questions reason better

Explores how advanced AIs use 'chain-of-thought' reasoning to break complex problems into simpler steps, improving accuracy and performance.

artificial intelligence Conversational AI Deepseek Language Models Reasoning Models

Gael Varoquaux

6/19/2025 • EN

Benchmark models using OpenAI-compatible APIs

A guide to benchmarking language models using a Jupyter Notebook that supports any OpenAI-compatible API, including Ollama and Foundry Local.

benchmarking Jupyter Notebook Language Models Openai API Prompty

Waldek Mastykarz

6/17/2025 • EN

Language model benchmarks only tell half a story

Explains why standard language model benchmarks are insufficient and how to build custom benchmarks for specific application needs.

Benchmarks Dev Proxy Language Models Ollama Openai API

Waldek Mastykarz

6/16/2025 • EN

Building Your Own Mini-ChatGPT with R: From Markov Chains to Transformers!

A tutorial on building a transformer-based language model in R from scratch, covering tokenization, self-attention, and text generation.

Language Models Machine Learning Natural Language Processing R Transformers

Holger K. von Jouanne-Diedrich

1/20/2025 • EN

Calculate the number of language model tokens for a string

Learn how to accurately calculate the number of language model tokens in a string using a provided Jupyter Notebook.

Hugging Face Jupyter Notebook Language Models Openai Tokenization

Waldek Mastykarz

11/28/2024 • EN

Reward Hacking in Reinforcement Learning

Explores reward hacking in reinforcement learning, where AI agents exploit reward function flaws, and its critical impact on RLHF and language model alignment.

Alignment Language Models Reinforcement Learning Reward Hacking Rlhf

Lilian Weng

8/27/2024 • EN

World Model + Next Token Prediction = Answer Prediction

A philosophical and technical exploration of how Large Language Models (LLMs) transform 'next token prediction' into meaningful answer generation.

AI Reasoning Language Models llm Next Token Prediction World Models

Daniel Miessler

8/17/2024 • EN

New LLM Pre-training and Post-training Paradigms

Analyzes the latest pre-training and post-training methodologies used in state-of-the-art LLMs like Qwen 2, Apple's models, Gemma 2, and Llama 3.1.

Fine Tuning Language Models llm Post Training Pre Training

Sebastian Raschka

4/24/2023 • EN

Prompt Engineering is for Transactional Prompting

The article distinguishes between interactive and transactional prompting, arguing that prompt engineering is most valuable for transactional, objective tasks with LLMs.

Interactive Prompting Language Models llm prompt engineering Transactional Prompting

Mitchell Hashimoto

4/14/2023 • EN

Prompt Engineering vs. Blind Prompting

Explores the difference between rigorous prompt engineering and amateur 'blind prompting' for language models, advocating for a systematic, test-driven approach.

ai development Language Models LLM Applications prompt engineering Testing Methodologies

Mitchell Hashimoto