GPU Optimization articles

8/10/2023 • EN

The NeurIPS 2023 LLM Efficiency Challenge Starter Guide

A guide to participating in the NeurIPS 2023 LLM Efficiency Challenge, covering setup, rules, and strategies for efficient LLM fine-tuning on limited hardware.

Finetuning, GPU Optimization, LLM Efficiency, Neural Networks, NeurIPS

Sebastian Raschka

5/11/2023 • EN

Accelerating Large Language Models with Mixed-Precision Techniques

Exploring mixed-precision techniques to speed up large language model training and inference by up to 3x without losing accuracy.

Deep Learning, Floating Point Precision, GPU Optimization, Large Language Models, Mixed Precision Training

Sebastian Raschka

5/11/2023 • EN

Accelerating Large Language Models with Mixed-Precision Techniques

Explores how mixed-precision training techniques can speed up large language model training and inference by up to 3x, reducing memory use.

Deep Learning, Floating Point Precision, GPU Optimization, Large Language Models, Mixed Precision Training

Sebastian Raschka

11/8/2022 • EN

Accelerate Stable Diffusion inference with DeepSpeed-Inference on GPUs

Learn to optimize Stable Diffusion for faster GPU inference using DeepSpeed-Inference and Hugging Face Diffusers.

AWS EC2, DeepSpeed Inference, GPU Optimization, Hugging Face Diffusers, Stable Diffusion

Philipp Schmid

9/13/2022 • EN

Accelerate GPT-J inference with DeepSpeed-Inference on GPUs

Learn to optimize GPT-J inference using DeepSpeed-Inference and Hugging Face Transformers for faster GPU performance.

DeepSpeed Inference, GPT-J, GPU Optimization, Large Language Models, Transformer Models

Philipp Schmid

7/13/2022 • EN

Optimizing Transformers for GPUs with Optimum

Learn to optimize Hugging Face Transformers models for GPU inference using Optimum and ONNX Runtime to reduce latency.

DistilBERT, GPU Optimization, ONNX Runtime, Optimum, Transformers

Philipp Schmid

7/5/2022 • EN

No, We Don't Have to Choose Batch Sizes As Powers Of 2

Challenges the common practice of using powers of 2 for neural network batch sizes, examining the theory and practical benchmarks.

Batch Size, Deep Learning, GPU Optimization, Memory Alignment, Neural Networks

Sebastian Raschka

7/5/2022 • EN

No, We Don't Have to Choose Batch Sizes As Powers Of 2

Examines the common practice of using powers of 2 for neural network batch sizes, questioning its necessity with practical and theoretical insights.

Batch Size, Deep Learning, GPU Optimization, Memory Alignment, Neural Networks

Sebastian Raschka

