Multi-Container Endpoints with Hugging Face Transformers and Amazon SageMaker
Guide to deploying multiple Hugging Face Transformer models as a cost-optimized Multi-Container Endpoint using Amazon SageMaker.
Philipp Schmid is a Staff Engineer at Google DeepMind, building AI Developer Experience and DevRel initiatives. He specializes in LLMs, RLHF, and making advanced AI accessible to developers worldwide.
199 articles from this blog
Guide to deploying Hugging Face Transformers models for asynchronous inference using Amazon SageMaker, including setup and configuration.
A technical guide on deploying a DistilBERT model to production using Hugging Face Transformers, Amazon SageMaker, and Infrastructure as Code with Terraform.
A tutorial on using task-specific knowledge distillation to compress a BERT model for text classification with Transformers and Amazon SageMaker.
A guide to accelerating multilingual BERT fine-tuning using Hugging Face Transformers with distributed training on Amazon SageMaker.
A tutorial on fine-tuning a Hugging Face Transformer model for financial text summarization using Keras and Amazon SageMaker.
A guide to deploying the GPT-J 6B language model for production inference using Hugging Face Transformers and Amazon SageMaker.
A tutorial on fine-tuning a Vision Transformer (ViT) model for satellite image classification using Hugging Face Transformers and Keras.
A workshop series on using Hugging Face Transformers with Amazon SageMaker for enterprise-scale NLP, covering training, deployment, and MLOps.
A tutorial on fine-tuning a non-English BERT model using Hugging Face Transformers and Keras for Named Entity Recognition tasks.
Guide to deploying Hugging Face Transformer models using Amazon SageMaker Serverless Inference for cost-effective ML prototypes.
Guide to fine-tuning a Hugging Face BERT model for text classification using Amazon SageMaker and the new Training Compiler to accelerate training.
Learn how to integrate the Hugging Face Hub as a model registry with Amazon SageMaker for MLOps, including training and deployment.
A guide to attending AWS re:Invent 2021 machine learning and NLP sessions remotely, featuring keynotes and top session recommendations.
Guide to building an end-to-end MLOps pipeline for Hugging Face Transformers using Amazon SageMaker Pipelines, from training to deployment.
A guide to deploying and auto-scaling Hugging Face Transformer models for real-time inference using Amazon SageMaker.
A tutorial on deploying the BigScience T0_3B language model to AWS and Amazon SageMaker for production use.
A tutorial on deploying Hugging Face Transformer models to production using Amazon SageMaker, AWS Lambda, and the AWS CDK for scalable, secure inference endpoints.
A guide to implementing few-shot learning using the GPT-Neo language model and Hugging Face's inference API for NLP tasks.
A tutorial on using Hugging Face Transformers and Amazon SageMaker for distributed training of BART/T5 models on a text summarization task.