Hugging Face articles

1/23/2024 • EN

How to Fine-Tune LLMs in 2024 with Hugging Face

A practical guide to fine-tuning open-source large language models (LLMs) using Hugging Face's TRL and Transformers libraries in 2024.

Datasets Hugging Face LLM Fine Tuning Transformers TRL

Philipp Schmid

1/11/2024 • EN

Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints

Guide to scaling LLM inference on Amazon SageMaker using new multi-replica endpoints for improved throughput and cost efficiency.

Amazon SageMaker Hugging Face LLM Inference Multi Replica Endpoints Text Generation Inference

Philipp Schmid

12/21/2023 • EN

Fine-tune Llama 7B on AWS Trainium

A technical tutorial on fine-tuning the Llama 2 7B large language model using AWS Trainium instances and Hugging Face libraries.

AWS Trainium Hugging Face Llama 2 Model Fine Tuning Optimum Neuron

Philipp Schmid

12/12/2023 • EN

Deploy Mixtral 8x7B on Amazon SageMaker

A technical guide on deploying the Mixtral 8x7B open-source LLM from Mistral AI to Amazon SageMaker using the Hugging Face LLM DLC.

Amazon SageMaker Hugging Face LLM Deployment Mixture Of Experts Text Generation Inference

Philipp Schmid

10/30/2023 • EN

Evaluate LLMs and RAG: a practical example using LangChain and Hugging Face

A hands-on guide to evaluating LLMs and RAG systems using LangChain and Hugging Face, covering criteria-based and pairwise evaluation methods.

GPT-4 Hugging Face LangChain LLM Evaluation RAG

Philipp Schmid

10/5/2023 • EN

Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker

A technical guide on fine-tuning the Mistral 7B large language model using QLoRA and deploying it on Amazon SageMaker with Hugging Face tools.

Amazon SageMaker Hugging Face LLM Fine Tuning Mistral 7B QLoRA

Philipp Schmid

9/7/2023 • EN

Deploy Falcon 180B on Amazon SageMaker

A technical guide on deploying the Falcon 180B open-source large language model to Amazon SageMaker using the Hugging Face LLM DLC.

Amazon SageMaker Falcon 180B Hugging Face LLM Deployment Text Generation Inference

Philipp Schmid

8/31/2023 • EN

Optimize open LLMs using GPTQ and Hugging Face Optimum

A guide to using GPTQ quantization with Hugging Face Optimum to compress open-source LLMs for efficient deployment on smaller hardware.

GPTQ Hugging Face LLM Optimum Quantization

Philipp Schmid

8/15/2023 • EN

LLMOps: Deploy Open LLMs using Infrastructure as Code with AWS CDK

A technical guide on deploying open-source LLMs like Llama 2 using Infrastructure as Code with AWS CDK and the Hugging Face LLM construct.

AWS CDK Hugging Face Infrastructure As Code Llama 2 LLMOps

Philipp Schmid

8/7/2023 • EN

Deploy Llama 2 7B/13B/70B on Amazon SageMaker

A technical guide on deploying Meta's Llama 2 large language models (7B, 13B, 70B) on Amazon SageMaker using the Hugging Face LLM DLC.

Amazon SageMaker Hugging Face Llama 2 LLM Deployment Text Generation Inference

Philipp Schmid

7/13/2023 • EN

Train LLMs using QLoRA on Amazon SageMaker

A technical guide on using QLoRA to efficiently fine-tune the Falcon 40B large language model on Amazon SageMaker.

Amazon SageMaker Hugging Face LLM Fine Tuning Parameter Efficient Fine Tuning QLoRA

Philipp Schmid

7/4/2023 • EN

Deploy LLMs with Hugging Face Inference Endpoints

A guide to deploying open-source Large Language Models (LLMs) like Falcon using Hugging Face's managed Inference Endpoints service.

API Hugging Face Inference Endpoints LLM Deployment Machine Learning

Philipp Schmid

6/20/2023 • EN

Securely deploy LLMs inside VPCs with Hugging Face and Amazon SageMaker

A technical guide on deploying open-source Large Language Models (LLMs) from Amazon S3 to Amazon SageMaker using Hugging Face's LLM Inference Container within a VPC.

Amazon SageMaker AWS VPC Hugging Face LLM Deployment Model Inference

Philipp Schmid

6/7/2023 • EN

Deploy Falcon 7B and 40B on Amazon SageMaker

A technical guide on deploying the open-source Falcon 7B and 40B large language models to Amazon SageMaker using the Hugging Face LLM Inference Container.

Amazon SageMaker Falcon 40B Hugging Face LLM Inference Model Deployment

Philipp Schmid