Finetuning Large Language Models On A Single GPU Using Gradient Accumulation
A guide to finetuning large language models like BLOOM on a single GPU using gradient accumulation to overcome memory limits.
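The idea behind gradient accumulation is to update the model weights only after gradients have been summed over several small batches, simulating a larger effective batch size without the memory cost of holding that larger batch on the GPU. Below is a minimal sketch of the pattern in PyTorch, not the article's actual finetuning code: the toy model, the synthetic dataloader, and the accumulation_steps value are all illustrative placeholders (the article itself works with BLOOM).

```python
import torch

# Placeholder model, optimizer, and loss; any PyTorch model would work here.
model = torch.nn.Linear(512, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss_fn = torch.nn.CrossEntropyLoss()

accumulation_steps = 16  # effective batch size = micro-batch size * 16

# Synthetic stand-in for a real dataloader: 64 micro-batches of 8 examples.
dataloader = [
    (torch.randn(8, 512), torch.randint(0, 2, (8,))) for _ in range(64)
]

model.train()
optimizer.zero_grad()
for step, (features, labels) in enumerate(dataloader):
    loss = loss_fn(model(features), labels)
    # Scale the loss so the accumulated gradient matches the average
    # over the larger effective batch.
    (loss / accumulation_steps).backward()

    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # one weight update per accumulation cycle
        optimizer.zero_grad()  # reset gradients for the next cycle
```

Each micro-batch only has to fit in GPU memory on its own; the weight update still reflects the full accumulated batch.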
Sebastian Raschka, PhD, is an LLM Research Engineer and AI expert bridging academia and industry, specializing in large language models, high-performance AI systems, and practical, code-driven machine learning.
97 articles from this blog
A guide to managing the overwhelming volume of AI/ML research, sharing strategies and tools for prioritizing and staying updated effectively.
Techniques to accelerate PyTorch model training by 8x using PyTorch Lightning, with a DistilBERT fine-tuning example.
A technical guide to coding the self-attention mechanism from scratch, as used in transformers and large language models.
A curated reading list of key academic papers for understanding the development and architecture of large language models and transformers.
Explains how four methods for detecting AI-generated text work: OpenAI's AI Classifier, DetectGPT, GPTZero, and watermarking.
A comparison of four automatic image augmentation methods (AutoAugment, RandAugment, AugMix, TrivialAugment) in PyTorch for reducing overfitting in deep learning.
Discusses the limitations of AI chatbots like ChatGPT in providing accurate technical answers and proposes curated resources and expert knowledge as future solutions.
A guide to training XGBoost models on cloud GPUs using the Lightning AI framework, bypassing complex infrastructure setup.
A curated list of the top 10 open-source releases in Machine Learning & AI for 2022, including PyTorch 2.0 and scikit-learn 1.2.
A review of the top 10 influential machine learning research papers from 2022, including ConvNeXt and MaxViT, highlighting key advancements in AI.
Announces a new monthly AI newsletter, 'Ahead of AI,' and shares updates on a passion project and conference appearances.
A curated list and summary of recent research papers exploring deep learning methods specifically designed for tabular data.
Examines the common practice of using powers of 2 for neural network batch sizes, questioning its necessity with practical and theoretical insights.
A guide to deploying and sharing deep learning research demos on the cloud using the Lightning framework, including model training.
A tutorial on building a Super Resolution GAN demo app using the Lightning framework to share deep learning research models.
A hands-on exploration of PyTorch's new DataPipes for efficient data loading, comparing them to traditional Datasets and DataLoaders.
A hands-on review and benchmark of PyTorch's new official GPU support for Apple's M1 chips, covering installation and performance.
A technical guide explaining methods for creating confidence intervals to measure uncertainty in machine learning model performance.
Explains cross-entropy loss in PyTorch for binary and multiclass classification, highlighting common implementation pitfalls and best practices.