Neural Networks articles

11/20/2020 • EN

Some Intuition on the Neural Tangent Kernel

Explains the Neural Tangent Kernel concept through simple 1D regression examples to illustrate how neural networks evolve during training.

Gradient Descent Linear Models Machine Learning Neural Networks Neural Tangent Kernel

Ferenc Huszár

10/29/2020 • EN

How to Build an Open-Domain Question Answering System?

A technical overview of approaches for building open-domain question answering systems using pretrained language models and neural networks.

AI Assistant Language Models Neural Networks Open Domain Question Answering Transformer Models

Lilian Weng

10/7/2020 • EN

Is deep learning a new kind of programming? Operationalistic look at programming

Explores whether deep learning creates a new kind of program, using the philosophy of operationalism to compare it with traditional programming.

Deep Learning Neural Networks Operationalism Programming Paradigms Software Philosophy

Tomas Petricek

8/16/2020 • EN

NLP for Supervised Learning - A Brief Survey

A chronological survey of key NLP models and techniques for supervised learning, from early RNNs to modern transformers like BERT and T5.

Deep Learning Machine Learning Natural Language Processing Neural Networks Supervised Learning

Eugene Yan

8/6/2020 • EN

Neural Architecture Search

An overview of Neural Architecture Search (NAS), covering its core components: search space, algorithms, and evaluation strategies for automating AI model design.

AutoML Deep Learning Machine Learning Neural Architecture Search Neural Networks

Lilian Weng

7/27/2020 • EN

How GPT3 Works - Visualizations and Animations

A visual guide explaining how GPT-3 is trained and generates text, breaking down its transformer architecture and massive scale.

Attention Mechanism GPT-3 Language Models Neural Networks Transformers

Jay Alammar

4/7/2020 • EN

The Transformer Family

An overview of the Transformer model family, covering improvements to the original architecture for longer attention spans, better efficiency, and new variants.

Attention Mechanism Machine Learning Neural Networks NLP Transformer

Lilian Weng

2/9/2020 • EN

Welcome to my blog about ML and data

A personal blog about machine learning, data annotation projects, and professional experiences in deep learning and AI product development.

Data Annotation Deep Learning Image Segmentation Machine Learning Neural Networks

Igor Susmelj

1/29/2020 • EN

Curriculum for Reinforcement Learning

Explores curriculum learning strategies for training reinforcement learning models more efficiently, from simple to complex tasks.

Curriculum Learning Machine Learning Neural Networks Reinforcement Learning Training Strategies

Lilian Weng

12/1/2019 • EN

How many giraffes?

A review of Janelle Shane's AI humor book, discussing neural network limitations and the real-world impact of class imbalance in machine learning.

AI Class Imbalance Data Science Machine Learning Neural Networks

Thomas Lumley

7/20/2019 • EN

Dreams, Drugs & ConvNets

Explores the visual similarities between images generated by neural networks and human experiences in dreams or under psychedelics.

Cognitive Science Convolutional Neural Networks Deep Learning Machine Learning Neural Networks

Piotr Migdał

7/15/2019 • EN

Exploring human vs machine learning (one blogpost at a time)

A blog post exploring the parallels and differences between human cognition and machine learning, including biases and inspirations.

Artificial Intelligence Cognitive Science Deep Learning Machine Learning Neural Networks

Piotr Migdał

4/1/2019 • EN

Mixture of Variational Autoencoders - a Fusion Between MoE and VAE

Explores an unsupervised approach combining Mixture of Experts (MoE) with Variational Autoencoders (VAE) for conditional data generation without labels.

Generative Models Mixture Of Experts Neural Networks Unsupervised Learning Variational Autoencoder

Yoel Zeldes

3/30/2019 • EN

Understanding Multilayer Perceptron in Depth

A deep dive into designing and implementing a Multilayer Perceptron from scratch, exploring the core concepts of neural network architecture and training.

Deep Learning Machine Learning Multilayer Perceptron Neural Networks Software Design

Julien Jerphanion

3/14/2019 • EN

Are Deep Neural Networks Dramatically Overfitted?

Explores the paradox of why deep neural networks generalize well despite having many parameters, discussing theories like Occam's Razor and the Lottery Ticket Hypothesis.

Deep Learning Generalization Machine Learning Neural Networks Overfitting

Lilian Weng

11/11/2018 • EN

Variational Autoencoders Explained in Detail

A detailed technical tutorial on implementing a Variational Autoencoder (VAE) with TensorFlow, including code and conditioning on digit types.

Deep Learning MNIST Neural Networks TensorFlow Variational Autoencoder

Yoel Zeldes

10/28/2018 • EN

How to Engineer Your Way Out of Slow Models

A technical case study on optimizing a slow multi-modal ML model for production using caching, async processing, and a microservices architecture.

Caching Embedding Inference Speed Model Optimization Neural Networks

Yoel Zeldes

9/15/2018 • EN

Simple diagrams of convoluted neural networks

An overview of tools and techniques for creating clear and insightful diagrams to visualize complex neural network architectures.

CNN Deep Learning Diagrams Neural Networks Visualization

Piotr Migdał

8/21/2018 • EN

Uncertainty for CTR Prediction: One Model to Clarify Them All

Explains how Taboola built a unified neural network model to predict CTR and estimate prediction uncertainty for recommender systems.

CTR Prediction Machine Learning Neural Networks Recommender Systems Uncertainty Estimation

Yoel Zeldes

8/12/2018 • EN

From Autoencoder to Beta-VAE

Explores the evolution from basic Autoencoders to Beta-VAE, covering their architecture, mathematical notation, and applications in dimensionality reduction.

Beta-VAE Deep Learning Generative Models Neural Networks Variational Autoencoder

Lilian Weng