Large language models articles

3/9/2024 • EN

Using AI tools for coding: good or bad?

Explores the balanced use of AI coding tools like GitHub Copilot, discussing benefits, risks of hallucinations, and best practices for developers.

AI Coding Tools Chatgpt code generation Github Copilot large language models

Andrea Grandi

12/11/2023 • EN

Retrieval-Augmented Generation (RAG) simply explained

A simple explanation of Retrieval-Augmented Generation (RAG), covering its core components: LLMs, context, and vector databases.

large language models llm Rag Retrieval Augmented Generation Vector Databases

Luc van Donkersgoed

10/25/2023 • EN

Adversarial Attacks on LLMs

Explores adversarial attacks and jailbreak prompts that can make large language models produce unsafe or undesired outputs, bypassing safety measures.

Adversarial Attacks Jailbreak Prompts large language models llm security

Lilian Weng

10/18/2023 • EN

Building Intelligent Enterprise-Grade applications with Azure OpenAI and Microsoft Data Platform

Explores building enterprise applications using Azure OpenAI and Microsoft's data platform for secure, integrated AI solutions.

Azure Openai Enterprise Applications generative ai large language models Microsoft Data Platform

Hugo Barona

10/12/2023 • EN

Deploy Idefics 9B and 80B on Amazon SageMaker

A technical guide on deploying Hugging Face's IDEFICS visual language models (9B & 80B parameters) to Amazon SageMaker using the LLM DLC.

Amazon Sagemaker Idefics large language models Model Deployment Multimodal AI

Philipp Schmid

9/26/2023 • EN

Llama 2 on Amazon SageMaker a Benchmark

A benchmark analysis of deploying Meta's Llama 2 models on Amazon SageMaker using Hugging Face's LLM Inference Container, evaluating cost, latency, and throughput.

Amazon Sagemaker benchmark large language models Llama 2 Model Deployment

Philipp Schmid

9/20/2023 • EN

Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA and Flash Attention

A technical guide on fine-tuning the massive Falcon 180B language model using DeepSpeed ZeRO, LoRA, and Flash Attention for efficient training.

Deepspeed Falcon 180b Flash Attention large language models Lora

Philipp Schmid

6/1/2023 • EN

Semantic Kernel Planner 101

An introduction to Semantic Kernel's Planner, a tool for automatically generating and executing complex AI tasks using plugins and natural language goals.

AI Integration large language models Planner plugins Semantic Kernel

Geert Baeke

5/31/2023 • EN

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker

Guide to deploying open-source LLMs like BLOOM and Open Assistant to Amazon SageMaker using Hugging Face's new LLM Inference Container.

Amazon Sagemaker Hugging Face large language models LLM Inference Text Generation Inference

Philipp Schmid

5/11/2023 • EN

Accelerating Large Language Models with Mixed-Precision Techniques

Exploring mixed-precision techniques to speed up large language model training and inference by up to 3x without losing accuracy.

Deep Learning Floating Point Precision Gpu Optimization large language models Mixed Precision Training

Sebastian Raschka

5/11/2023 • EN

Accelerating Large Language Models with Mixed-Precision Techniques

Explores how mixed-precision training techniques can speed up large language model training and inference by up to 3x, reducing memory use.

Deep Learning Floating Point Precision Gpu Optimization large language models Mixed Precision Training

Sebastian Raschka

5/7/2023 • EN

Open-LLMs - A list of LLMs for Commercial Use

A curated list of open-source Large Language Models (LLMs) available for commercial use, including community-contributed updates and details.

Commercial License Finetuning large language models Machine Learning open source

Eugene Yan

5/2/2023 • EN

How to scale LLM workloads to 20B+ with Amazon SageMaker using Hugging Face and PyTorch FSDP

A technical tutorial on fine-tuning a 20B+ parameter LLM using PyTorch FSDP and Hugging Face on Amazon SageMaker's multi-GPU infrastructure.

Amazon Sagemaker Hugging Face large language models Model Fine Tuning Pytorch Fsdp

Philipp Schmid

4/12/2023 • EN

Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters

A guide to parameter-efficient finetuning methods for large language models, covering techniques like prefix tuning and LLaMA-Adapters.

Adapters large language models Llama Adapter Parameter Efficient Finetuning Prefix Tuning

Sebastian Raschka

4/12/2023 • EN

Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters

Explains parameter-efficient finetuning methods for large language models, covering techniques like prefix tuning and LLaMA-Adapters.

Adapters large language models Llama Adapter Parameter Efficient Finetuning Prefix Tuning

Sebastian Raschka

4/4/2023 • EN

Introducing IGEL an instruction-tuned German large Language Model

Introduces IGEL, an instruction-tuned German large language model based on BLOOM, for NLP tasks like translation and QA.

Bloom German NLP Hugging Face Instruction Tuning large language models

Philipp Schmid

3/28/2023 • EN

Finetuning Large Language Models On A Single GPU Using Gradient Accumulation

Guide to finetuning large language models on a single GPU using gradient accumulation to overcome memory limitations.

Finetuning Gpu Memory Gradient Accumulation large language models Transformers

Sebastian Raschka

3/28/2023 • EN

Finetuning Large Language Models On A Single GPU Using Gradient Accumulation

A guide to finetuning large language models like BLOOM on a single GPU using gradient accumulation to overcome memory limits.

Bloom Finetuning Gpu Memory Gradient Accumulation large language models

Sebastian Raschka

3/23/2023 • EN

Efficient Large Language Model training with LoRA and Hugging Face

A technical guide on fine-tuning the large FLAN-T5 XXL model efficiently using LoRA and Hugging Face libraries on a single GPU.

Flan T5 Hugging Face large language models Lora Parameter Efficient Fine Tuning

Philipp Schmid

2/22/2023 • EN

Combine Amazon SageMaker and DeepSpeed to fine-tune FLAN-T5 XXL

Guide to fine-tuning the large FLAN-T5 XXL model using Amazon SageMaker managed training and DeepSpeed for optimization.

Amazon Sagemaker Deepspeed Fine Tuning Flan T5 large language models

Philipp Schmid
