Sebastian Raschka

SebastianRaschka.com is the personal blog of Sebastian Raschka, PhD, an LLM research engineer whose work bridges academia and industry in AI and machine learning. On his blog and notes section he publishes deep, well-documented articles on topics such as LLMs (large language models), reasoning models, machine learning in Python, neural networks, data science workflows, and deep learning architecture. Recent posts explore advanced themes like “reasoning LLMs”, comparisons of modern open-weight transformer architectures, and guides for building, training, or analyzing neural networks and model internals.

https://sebastianraschka.com/

RSS Feed

11/29/2025

ai machine learning python deep learning llm

Articles from this Blog

103 articles from this blog

2/1/2026 • EN

State of AI 2026 with Sebastian Raschka, Nathan Lambert, and Lex Fridman

A 4.5-hour interview discussing the state of AI in 2026, covering LLMs, geopolitics, training, open vs. closed models, AGI timelines, and industry implications.

Machine Learning artificial intelligence large language models

1/24/2026 • EN

Categories of Inference-Time Scaling for Improved LLM Reasoning

An overview of inference-time scaling methods for improving LLM reasoning, categorizing techniques and highlighting recent research.

llm Reasoning Inference Scaling

12/30/2025 • EN

The State Of LLMs 2025: Progress, Problems, and Predictions

A 2025 year-in-review analysis of large language models (LLMs), covering key developments in reasoning, architecture, costs, and predictions for 2026.

Machine Learning artificial intelligence llm

12/30/2025 • EN

LLM Research Papers: The 2025 List (July to December)

A curated list of notable LLM research papers from the second half of 2025, categorized by topics like reasoning, training, and multimodal models.

Machine Learning artificial intelligence llm

12/8/2025 • EN

From Random Forests to RLVR: A Short History of ML/AI Hello Worlds

A timeline of beginner-friendly 'Hello World' examples in machine learning and AI, from Random Forests in 2013 to modern RLVR models in 2025.

Machine Learning artificial intelligence history

12/3/2025 • EN

A Technical Tour of the DeepSeek Models from V3 to V3.2

A technical analysis of the DeepSeek model series, from V3 to the latest V3.2, covering architecture, performance, and release timeline.

llm Reinforcement Learning Deepseek

11/12/2025 • EN

Recommendations for Getting the Most Out of a Technical Book

Author's method for effectively reading technical books, including multiple read-throughs, coding along, and doing exercises.

LLM Development Technical Reading Learning Strategies

11/4/2025 • EN

Beyond Standard LLMs

An overview of alternative LLM architectures beyond standard transformers, including linear attention hybrids, text diffusion models, and code world models.

Autoregressive Models LLM Architectures Transformer Alternatives

10/29/2025 • EN

DGX Spark and Mac Mini for Local PyTorch Development

Compares DGX Spark and Mac Mini for local PyTorch development, focusing on LLM inference and fine-tuning performance benchmarks.

llm benchmark Gpu

10/5/2025 • EN

Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)

A guide to the four main methods for evaluating Large Language Models, including code examples and practical implementation details.

benchmarking LLM Evaluation Model Comparison

9/6/2025 • EN

Understanding and Implementing Qwen3 From Scratch

A hands-on guide to understanding and implementing the Qwen3 large language model architecture from scratch using pure PyTorch.

llm Pytorch Transformer

8/9/2025 • EN

From GPT-2 to gpt-oss: Analyzing the Architectural Advances

Analysis of OpenAI's new gpt-oss models, comparing architectural improvements from GPT-2 and examining optimizations like MXFP4 and Mixture-of-Experts.

Mixture Of Experts Transformer Architecture LLM Optimization

7/19/2025 • EN

The Big LLM Architecture Comparison

A detailed comparison of architectural developments in major large language models (LLMs) released in 2024-2025, focusing on structural changes beyond benchmarks.

Mixture Of Experts LLM Architecture Transformer Models