Philipp Schmid

Philipp Schmid is a Staff Engineer at Google DeepMind, building AI Developer Experience and DevRel initiatives. He specializes in LLMs, RLHF, and making advanced AI accessible to developers worldwide.

https://www.philschmid.de


1/22/2026

AI LLMs developer experience Google DeepMind RLHF

Articles from this Blog

183 articles from this blog

1/23/2024 • EN

How to Fine-Tune LLMs in 2024 with Hugging Face

A practical guide to fine-tuning open-source large language models (LLMs) using Hugging Face's TRL and Transformers libraries in 2024.

Transformers Hugging Face Datasets

1/23/2024 • EN

RLHF in 2024 with DPO and Hugging Face

A technical guide on using Direct Preference Optimization (DPO) with Hugging Face's TRL library to align and improve open-source large language models in 2024.

LLM Hugging Face DPO

1/11/2024 • EN

Scale LLM Inference on Amazon SageMaker with Multi-Replica Endpoints

Guide to scaling LLM inference on Amazon SageMaker using new multi-replica endpoints for improved throughput and cost efficiency.

Hugging Face LLM Inference Amazon SageMaker

12/21/2023 • EN

Fine-tune Llama 7B on AWS Trainium

A technical tutorial on fine-tuning the Llama 2 7B large language model using AWS Trainium instances and Hugging Face libraries.

Hugging Face Model Fine Tuning AWS Trainium

12/20/2023 • EN

Programmatically manage 🤗 Inference Endpoints

Learn to programmatically manage Hugging Face Inference Endpoints using the huggingface_hub Python library for automated model deployment.

Python Generative AI Infrastructure as Code

12/12/2023 • EN

Deploy Mixtral 8x7B on Amazon SageMaker

A technical guide on deploying the Mixtral 8x7B open-source LLM from Mistral AI to Amazon SageMaker using the Hugging Face LLM DLC.

Hugging Face Mixture of Experts Amazon SageMaker

11/21/2023 • EN

Deploy Embedding Models on AWS inferentia2 with Amazon SageMaker

Tutorial on deploying embedding models using AWS Inferentia2 and Amazon SageMaker for accelerated inference performance.

AWS Amazon SageMaker Optimum Neuron

11/14/2023 • EN

Deploy Llama 2 7B on AWS inferentia2 with Amazon SageMaker

A tutorial on deploying Meta's Llama 2 7B model on AWS Inferentia2 using Amazon SageMaker and the optimum-neuron library.

Model Deployment Amazon SageMaker Optimum Neuron

11/7/2023 • EN

Deploy Stable Diffusion XL on AWS inferentia2 with Amazon SageMaker

A tutorial on deploying Stable Diffusion XL for accelerated inference using AWS Inferentia2 and Amazon SageMaker.

Stable Diffusion Model Deployment Amazon SageMaker

11/3/2023 • EN

Amazon Bedrock: How good (bad) is Titan Embeddings?

An evaluation of Amazon Titan Embeddings on the MTEB benchmark, analyzing its performance, use cases, and lack of transparency.

Vector Embeddings Amazon Bedrock Text Embeddings

10/30/2023 • EN

Evaluate LLMs and RAG a practical example using Langchain and Hugging Face

A hands-on guide to evaluating LLMs and RAG systems using LangChain and Hugging Face, covering criteria-based and pairwise evaluation methods.

LangChain RAG GPT-4

10/12/2023 • EN

Deploy Idefics 9B and 80B on Amazon SageMaker

A technical guide on deploying Hugging Face's IDEFICS visual language models (9B & 80B parameters) to Amazon SageMaker using the LLM DLC.

large language models Multimodal AI Model Deployment

10/5/2023 • EN

Train and Deploy Mistral 7B with Hugging Face on Amazon SageMaker

A technical guide on fine-tuning the Mistral 7B large language model using QLoRA and deploying it on Amazon SageMaker with Hugging Face tools.

Hugging Face Amazon SageMaker QLoRA

9/26/2023 • EN

Llama 2 on Amazon SageMaker a Benchmark

A benchmark analysis of deploying Meta's Llama 2 models on Amazon SageMaker using Hugging Face's LLM Inference Container, evaluating cost, latency, and throughput.

large language models benchmark Model Deployment

9/20/2023 • EN

Fine-tune Falcon 180B with DeepSpeed ZeRO, LoRA and Flash Attention

A technical guide on fine-tuning the Falcon 180B language model using DeepSpeed ZeRO, LoRA, and Flash Attention for efficient training.

large language models LoRA DeepSpeed

9/12/2023 • EN

Fine-tune Falcon 180B with QLoRA and Flash Attention on Amazon SageMaker

A technical guide on fine-tuning the Falcon 180B language model using QLoRA and Flash Attention on Amazon SageMaker.

Amazon SageMaker QLoRA LLM Fine Tuning

9/7/2023 • EN

Deploy Falcon 180B on Amazon SageMaker

A technical guide on deploying the Falcon 180B open-source large language model to Amazon SageMaker using the Hugging Face LLM DLC.

Hugging Face Amazon SageMaker Text Generation Inference

8/31/2023 • EN

Optimize open LLMs using GPTQ and Hugging Face Optimum

A guide to using GPTQ quantization with Hugging Face Optimum to compress open-source LLMs for efficient deployment on smaller hardware.

LLM Hugging Face Quantization

8/15/2023 • EN

LLMOps: Deploy Open LLMs using Infrastructure as Code with AWS CDK

A technical guide on deploying open-source LLMs like Llama 2 using Infrastructure as Code with AWS CDK and the Hugging Face LLM construct.

Hugging Face Infrastructure as Code AWS CDK

8/7/2023 • EN

Deploy Llama 2 7B/13B/70B on Amazon SageMaker

A technical guide on deploying Meta's Llama 2 large language models (7B, 13B, 70B) on Amazon SageMaker using the Hugging Face LLM DLC.

Hugging Face Amazon SageMaker Text Generation Inference
