Philipp Schmid 11/21/2023

Deploy Embedding Models on AWS inferentia2 with Amazon SageMaker


This technical guide provides an end-to-end tutorial for deploying embedding models (specifically BAAI/bge-base-en-v1.5) on AWS Inferentia2 accelerators using Amazon SageMaker. It covers compiling the model with optimum-neuron, creating a custom inference script, uploading the model artifacts to S3, deploying a real-time SageMaker endpoint, and evaluating inference performance.
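As a rough illustration of the compilation step, here is a minimal sketch using optimum-neuron's NeuronModelForFeatureExtraction; the batch size and sequence length are illustrative values (Inferentia2 requires static input shapes at compile time) and are not taken from the original guide:

```python
from optimum.neuron import NeuronModelForFeatureExtraction
from transformers import AutoTokenizer

model_id = "BAAI/bge-base-en-v1.5"

# Neuron compilation needs fixed input shapes; these values are illustrative.
input_shapes = {"batch_size": 1, "sequence_length": 384}

# export=True compiles the model into a Neuron-optimized artifact for Inferentia2.
model = NeuronModelForFeatureExtraction.from_pretrained(
    model_id,
    export=True,
    **input_shapes,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save the compiled model and tokenizer locally so they can be
# packaged (e.g. as model.tar.gz) and uploaded to S3 for SageMaker.
model.save_pretrained("bge_neuron")
tokenizer.save_pretrained("bge_neuron")
```

The saved artifacts, together with a custom inference script, would then be uploaded to S3 and deployed to a real-time SageMaker endpoint on an Inferentia2 instance, as the guide describes.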
