Deploy QwQ-32B-Preview, the best open Reasoning Model, on AWS with Hugging Face
A technical guide on deploying the QwQ-32B-Preview open-source reasoning model on Amazon SageMaker using Hugging Face's tools.
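Since the article is a deployment how-to, here is a minimal sketch of the overall flow using the SageMaker Python SDK and the Hugging Face LLM (TGI) DLC. The DLC version, instance type, GPU count, token limits, and generation parameters below are illustrative assumptions, not the article's exact configuration.

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# SageMaker session and execution role (assumes an environment with SageMaker
# permissions, e.g. a SageMaker Studio notebook or a role ARN you supply)
sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Hugging Face LLM DLC (Text Generation Inference); the version is an assumption
llm_image = get_huggingface_llm_image_uri("huggingface", version="2.3.1")

# TGI container configuration; GPU count and token limits are illustrative
config = {
    "HF_MODEL_ID": "Qwen/QwQ-32B-Preview",
    "SM_NUM_GPUS": "4",           # shard the model across the instance's GPUs
    "MAX_INPUT_TOKENS": "8192",
    "MAX_TOTAL_TOKENS": "8960",
}

llm_model = HuggingFaceModel(role=role, image_uri=llm_image, env=config)

# Deploy as a real-time endpoint; large models need a generous startup timeout
llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g6e.12xlarge",  # assumption: 4x L40S GPUs
    container_startup_health_check_timeout=900,
)

# Query the endpoint with TGI's standard "inputs"/"parameters" payload
response = llm.predict({
    "inputs": "How many r's are in the word 'strawberry'? Think step by step.",
    "parameters": {"max_new_tokens": 512, "temperature": 0.6, "top_p": 0.95},
})
print(response[0]["generated_text"])

# Clean up when done
# llm.delete_model()
# llm.delete_endpoint()
```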