Speculative Sampling
Explains speculative sampling, a technique that uses a draft model and a target model to accelerate text generation with large language models.
Jay Mody is a software engineer and writer sharing clear, concise explanations of machine learning concepts and numerical computing. His blog focuses on intuition-driven deep dives into topics like GPT, attention, and efficient NumPy implementations.
5 articles from this blog
Explains speculative sampling, a technique that uses a draft model and a target model to accelerate text generation with large language models.
A technical guide to implementing a GPT model from scratch using only 60 lines of NumPy code, including loading pre-trained GPT-2 weights.
Explains numerical instability in naive softmax and cross-entropy implementations and provides stable alternatives for deep learning.
A technical explanation of the attention mechanism in transformers, building intuition from key-value lookups to the scaled dot product equation.
A technical guide to computing distance matrices with NumPy, focusing on Euclidean distance and its use in machine learning algorithms such as k-Nearest Neighbors.