Optimize and Deploy BERT on AWS Inferentia2
A tutorial on optimizing and deploying a BERT model for low-latency inference using AWS Inferentia2 accelerators and Amazon SageMaker.
A guide to deploying multiple Hugging Face Transformers models as a cost-optimized Multi-Container Endpoint using Amazon SageMaker.
A guide to deploying Hugging Face Transformers models for asynchronous inference using Amazon SageMaker, including setup and configuration.
A guide to deploying and auto-scaling Hugging Face Transformer models for real-time inference using Amazon SageMaker.
A step-by-step tutorial on deploying a custom PyTorch machine learning model to production using AWS Lambda and the Serverless Framework.