Noteworthy LLM Research Papers of 2024
A curated list of 12 influential LLM research papers from each month of 2024, covering topics like Mixture of Experts, LoRA, and scaling laws.
Sebastian Raschka, PhD, is an LLM Research Engineer and AI expert bridging academia and industry, specializing in large language models, high-performance AI systems, and practical, code-driven machine learning.
97 articles from this blog
A step-by-step educational guide to building a Byte Pair Encoding (BPE) tokenizer from scratch, as used in models like GPT and Llama.
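For context, the heart of BPE training is a short merge loop: count adjacent symbol pairs, merge the most frequent pair into a new symbol, and repeat. The sketch below illustrates that loop on a toy corpus; the helper names and the corpus are illustrative and not taken from the article's implementation.

```python
from collections import Counter

def get_pair_counts(word_freqs):
    # Count how often each adjacent symbol pair occurs across the corpus.
    pairs = Counter()
    for word, freq in word_freqs.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, word_freqs):
    # Replace every occurrence of the pair with a single merged symbol.
    merged = {}
    for word, freq in word_freqs.items():
        new_word, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                new_word.append(word[i] + word[i + 1])
                i += 2
            else:
                new_word.append(word[i])
                i += 1
        merged[tuple(new_word)] = freq
    return merged

corpus = "low lower lowest newer new".split()
word_freqs = Counter(tuple(w) for w in corpus)  # words as character tuples

merges = []
for _ in range(10):  # number of merges = vocabulary growth budget
    pair_counts = get_pair_counts(word_freqs)
    if not pair_counts:
        break
    best = max(pair_counts, key=pair_counts.get)
    word_freqs = merge_pair(best, word_freqs)
    merges.append(best)

print(merges)  # learned merge rules, applied in order at tokenization time
```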
Explains how multimodal LLMs work, reviews recent models like Llama 3.2, and compares different architectural approaches.
A guide to transforming pretrained LLMs into text classifiers, with insights from the author's new book on building LLMs from scratch.
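The underlying recipe is compact: replace the language-modeling head with a small classification head, freeze most of the network, and classify from the logits at the last token position. The sketch below shows that pattern on a toy stand-in model; all class names and sizes are illustrative, not the book's code.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained decoder-style LLM; the book works with its
# own GPT implementation, so the architecture here is purely illustrative.
class TinyLM(nn.Module):
    def __init__(self, vocab_size=50257, emb_dim=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.blocks = nn.Sequential(nn.Linear(emb_dim, emb_dim), nn.GELU(),
                                    nn.Linear(emb_dim, emb_dim), nn.GELU())
        self.out_head = nn.Linear(emb_dim, vocab_size)  # language-modeling head

    def forward(self, idx):
        return self.out_head(self.blocks(self.tok_emb(idx)))

model = TinyLM()

# Core recipe: swap the LM head for a small classification head and train
# only the new head (optionally also the last transformer blocks).
num_classes = 2
model.out_head = nn.Linear(256, num_classes)
for p in model.parameters():
    p.requires_grad = False
for p in model.out_head.parameters():
    p.requires_grad = True

# Classification uses the logits at the last token position.
tokens = torch.randint(0, 50257, (1, 16))
logits = model(tokens)[:, -1, :]
print(logits.shape)  # torch.Size([1, 2])
```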
A 3-hour coding workshop video covering the implementation, training, and use of large language models (LLMs) from scratch.
A technical review of the latest pre-training and post-training methodologies used in state-of-the-art large language models (LLMs) like Qwen 2 and Llama 3.1.
Explores recent research on instruction finetuning for LLMs, including cost-effective data generation methods and an overview of new models like Gemma 2.
Explores new research on instruction masking and LoRA finetuning techniques for improving large language models (LLMs).
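Instruction masking here means excluding the instruction/prompt tokens from the training loss so that only response tokens contribute. A minimal sketch of that idea, using made-up token IDs and random logits in place of real model outputs:

```python
import torch
import torch.nn.functional as F

# Illustrative token IDs; in practice these come from a tokenizer.
prompt_ids   = torch.tensor([12, 87, 5, 901])   # instruction/prompt tokens
response_ids = torch.tensor([44, 7, 310, 2])    # desired response tokens

input_ids = torch.cat([prompt_ids, response_ids]).unsqueeze(0)

# Labels are the inputs; masking the prompt positions with -100 tells
# PyTorch's cross-entropy to ignore them, so the loss is computed only
# on the response tokens.
labels = input_ids.clone()
labels[:, : len(prompt_ids)] = -100

vocab_size = 1000
logits = torch.randn(1, input_ids.size(1), vocab_size)  # stand-in for model output

loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predict token t+1 from position t
    labels[:, 1:].reshape(-1),
    ignore_index=-100,
)
print(loss)
```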
A 1-hour video presentation covering the full development cycle of large language models, from architecture and pretraining to finetuning and evaluation.
A review and comparison of the latest open LLMs (Mixtral, Llama 3, Phi-3, OpenELM) and a study on DPO vs. PPO for LLM alignment.
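For reference, the DPO objective itself fits in a few lines: it nudges the policy to prefer the chosen response over the rejected one relative to a frozen reference model. The sketch below assumes the per-response log-probabilities have already been computed; the numbers are made up.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit rewards: log-probability ratios between policy and reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.0, -9.5]),
    policy_rejected_logps=torch.tensor([-14.0, -11.0]),
    ref_chosen_logps=torch.tensor([-12.5, -10.0]),
    ref_rejected_logps=torch.tensor([-13.5, -10.5]),
)
print(loss)
```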
Discusses strategies for continual pretraining of LLMs and evaluating reward models for RLHF, based on recent research papers.
A summary of key AI research papers from February 2024, focusing on new open-source LLMs, small fine-tuned models, and efficient fine-tuning techniques.
A technical guide implementing DoRA, a new low-rank adaptation method for efficient model finetuning, from scratch in PyTorch.
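DoRA decomposes a pretrained weight into a magnitude vector and a direction matrix, applies a LoRA-style low-rank update to the direction, renormalizes, and rescales by the learned magnitude. The condensed sketch below captures that idea; it is not a drop-in copy of the article's code, and the hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Condensed DoRA sketch: magnitude * normalized(weight + LoRA update)."""
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained weight and bias.
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = linear.bias
        out_dim, in_dim = self.weight.shape
        # Low-rank update: B starts at zero so the layer initially matches the original.
        self.lora_A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_dim, rank))
        self.scaling = alpha / rank
        # Learnable magnitude, initialized to the column-wise norm of the weight.
        self.m = nn.Parameter(self.weight.norm(p=2, dim=0, keepdim=True))

    def forward(self, x):
        directional = self.weight + self.scaling * (self.lora_B @ self.lora_A)
        directional = directional / directional.norm(p=2, dim=0, keepdim=True)
        return F.linear(x, self.m * directional, self.bias)

layer = DoRALinear(nn.Linear(64, 32))
print(layer(torch.randn(4, 64)).shape)  # torch.Size([4, 32])
```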
Explores dataset-centric strategies for fine-tuning LLMs, focusing on instruction datasets to improve model performance without altering the model architecture.
A guide to participating in the NeurIPS 2023 LLM Efficiency Challenge, covering setup, rules, and strategies for efficient LLM fine-tuning on limited hardware.
A guide to 9 PyTorch techniques for drastically reducing memory usage when training vision transformers and LLMs, enabling training on consumer hardware.
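Two techniques that are commonly combined are activation (gradient) checkpointing, which recomputes activations in the backward pass instead of storing them, and gradient accumulation, which simulates a large batch with small micro-batches. A rough sketch of both on a toy model (sizes and step counts are arbitrary, not the article's settings):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Toy stack of 8 blocks standing in for a transformer.
model = nn.Sequential(*[nn.Sequential(nn.Linear(512, 512), nn.GELU()) for _ in range(8)])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

accumulation_steps = 4
optimizer.zero_grad()

for step in range(accumulation_steps):
    x = torch.randn(8, 512)  # small micro-batch
    # Run the 8 blocks in 2 checkpointed segments instead of storing
    # every intermediate activation (trades extra compute for memory).
    out = checkpoint_sequential(model, 2, x, use_reentrant=False)
    # Scale the loss so accumulated gradients match one large batch.
    loss = out.pow(2).mean() / accumulation_steps
    loss.backward()

optimizer.step()
```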
A guide to efficiently finetuning Falcon LLMs using parameter-efficient methods like LoRA and adapters to reduce compute costs.
Explores how mixed-precision training techniques can speed up large language model training and inference by up to 3x while reducing memory use.
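The basic mixed-precision recipe in plain PyTorch looks roughly like the sketch below: run the forward pass under autocast in a 16-bit dtype, keep master weights in float32, and use loss scaling so float16 gradients do not underflow. This is a generic sketch, not the article's benchmark code.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(32, 512, device=device)
y = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad()
# Eligible ops run in 16-bit inside autocast; master weights stay in float32.
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = nn.functional.cross_entropy(model(x), y)

# Loss scaling prevents float16 gradients from underflowing (no-op on CPU here).
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
print(loss.item())
```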
Explains Low-Rank Adaptation (LoRA), a parameter-efficient technique for fine-tuning large language models to reduce computational costs.
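The core trick: keep the pretrained weight frozen and learn only a low-rank update B·A added to its output, which shrinks the number of trainable parameters dramatically. A minimal sketch (rank, alpha, and layer sizes are illustrative):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA sketch: frozen pretrained layer plus a trainable low-rank update."""
    def __init__(self, linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.linear = linear
        for p in self.linear.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        out_dim, in_dim = linear.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x):
        return self.linear(x) + self.scaling * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(256, 256))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 256 = 4096 trainable parameters vs. 65,792 in the full layer
```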
Explains parameter-efficient finetuning methods for large language models, covering techniques like prefix tuning and LLaMA-Adapters.