Mortal Komputation: On Hinton's argument for superhuman AI.
Analyzes Geoffrey Hinton's technical argument comparing biological and digital intelligence, concluding digital AI will surpass human capabilities.
Explains the intuition behind the attention mechanism and the Transformer architecture, focusing on the problems they solve in machine translation and language modeling.
Explores how mixed-precision training techniques can speed up large language model training and inference by up to 3x, reducing memory use without sacrificing accuracy.
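As a quick illustration of the technique that entry covers, here is a minimal mixed-precision training sketch using PyTorch's torch.autocast and GradScaler; the model, data, and hyperparameters are placeholder assumptions, not taken from the article.

```python
import torch
import torch.nn as nn

# Toy stand-ins so the sketch is self-contained; real training would use
# an actual model and DataLoader.
model = nn.Linear(512, 10).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid float16 underflow
dataloader = [(torch.randn(8, 512), torch.randint(0, 10, (8,))) for _ in range(10)]

for inputs, targets in dataloader:
    optimizer.zero_grad()
    # Ops inside autocast run in float16 where it is numerically safe to do so
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = nn.functional.cross_entropy(model(inputs.cuda()), targets.cuda())
    scaler.scale(loss).backward()  # backprop through the scaled loss
    scaler.step(optimizer)         # unscales gradients, then updates weights
    scaler.update()                # adapts the loss scale for the next step
```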
A guide to setting up an AWS Trainium instance using the Hugging Face Neuron Deep Learning AMI to fine-tune a BERT model for text classification.
A reflection on past skepticism of deep learning and why similar dismissal of Large Language Models (LLMs) might be a mistake.
A tutorial on fine-tuning a BERT model for text classification using PyTorch 2.0 and the Hugging Face Transformers library.
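A minimal sketch of what such a fine-tuning step can look like with the Transformers library and torch.compile; the toy texts, labels, and hyperparameters are illustrative assumptions, not the tutorial's exact recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model = torch.compile(model)  # PyTorch 2.0: compile the model for faster training

# Toy batch; a real run would iterate over a tokenized dataset.
texts = ["great movie", "terrible plot"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)  # HF models return a loss when labels are passed
outputs.loss.backward()
optimizer.step()
```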
A technical guide to coding the self-attention mechanism from scratch, as used in transformers and large language models.
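For a flavor of what "from scratch" means here, a minimal single-head self-attention sketch in PyTorch; the tensor dimensions and random weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def self_attention(x, W_q, W_k, W_v):
    # x: (seq_len, d_in); the projection matrices map d_in -> d_k (or d_v)
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    scores = Q @ K.T / K.shape[-1] ** 0.5  # scaled dot products: (seq_len, seq_len)
    weights = F.softmax(scores, dim=-1)    # attention weights, each row sums to 1
    return weights @ V                     # weighted sum of value vectors

torch.manual_seed(0)
x = torch.randn(6, 16)                     # 6 tokens, 16-dim embeddings
W_q, W_k, W_v = (torch.randn(16, 24) for _ in range(3))
print(self_attention(x, W_q, W_k, W_v).shape)  # torch.Size([6, 24])
```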
An updated, comprehensive overview of the Transformer architecture and its many recent improvements, including detailed notation and attention mechanisms.
A curated list of the top 10 open-source machine learning and AI projects released or updated in 2022, including PyTorch 2.0 and scikit-learn 1.2.
Explains numerical instability in naive softmax and cross-entropy implementations and provides stable alternatives for deep learning.
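A small sketch of the standard fixes such an article covers: max-subtraction for softmax and the log-sum-exp trick for cross-entropy. NumPy is used for illustration; the exact code is assumed, not quoted from the article.

```python
import numpy as np

def softmax_naive(z):
    return np.exp(z) / np.exp(z).sum()  # overflows to inf/nan for large logits

def softmax_stable(z):
    z = z - z.max()                     # shifting by a constant leaves softmax unchanged
    return np.exp(z) / np.exp(z).sum()

def cross_entropy_stable(z, y):
    # -log softmax(z)[y], computed via log-sum-exp instead of log(softmax)
    return np.log(np.exp(z - z.max()).sum()) + z.max() - z[y]

z = np.array([1000.0, 1001.0, 1002.0])
print(softmax_naive(z))            # [nan nan nan] from exp overflow
print(softmax_stable(z))           # [0.09003057 0.24472847 0.66524096]
print(cross_entropy_stable(z, 2))  # ~0.4076
```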
Compares autoencoders and diffusion models, explaining their architectures, learning paradigms, and key differences in deep learning.
Explains core concepts behind modern text-to-image AI models like DALL-E 2 and Stable Diffusion, including diffusion, text conditioning, and latent space.
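For reference, the closed-form forward-noising equation that DDPM-style treatments of diffusion usually center on; the notation below is the standard convention, assumed rather than quoted from the article.

```latex
% Closed-form forward diffusion: noising a clean sample x_0 to timestep t
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)
```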
A technical explanation of the attention mechanism in transformers, building intuition from key-value lookups to the scaled dot product equation.
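The equation that explanation builds up to, in standard Transformer notation:

```latex
% Scaled dot-product attention ("Attention Is All You Need" notation)
\mathrm{Attention}(Q, K, V) =
  \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```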
Author announces the launch of 'Ahead of AI', a monthly newsletter covering AI trends and educational content, and shares updates on a passion project and conference appearances.
A survey exploring how concepts from collective intelligence, like swarm behavior and emergence, are being applied to improve deep learning systems.
A guide to simplifying deep learning workflows using AWS EC2 Remote Runner and Habana Gaudi processors for efficient, cost-effective model training.
A curated list and summary of recent research papers exploring deep learning methods specifically designed for tabular data.