Philipp Schmid

Philipp Schmid is a Staff Engineer at Google DeepMind, building AI Developer Experience and DevRel initiatives. He specializes in LLMs, RLHF, and making advanced AI accessible to developers worldwide.

https://www.philschmid.de


1/22/2026

AI LLMs developer experience Google DeepMind RLHF

Articles from this Blog

183 articles from this blog

9/30/2024 • EN

How to Fine-Tune Multimodal Models or VLMs with Hugging Face TRL

A technical guide on fine-tuning Vision-Language Models (VLMs) using Hugging Face's TRL library for custom applications like image-to-text generation.

Hugging Face Fine Tuning Multimodal Models

9/24/2024 • EN

Evaluate open LLMs with Vertex AI and Gemini

A technical guide on using Google's Vertex AI Gen AI Evaluation Service with Gemini to evaluate open LLMs like Llama 3.1.

Gemini LLM Evaluation Model Deployment

9/19/2024 • EN

Evaluate LLMs using Evaluation Harness and Hugging Face TGI/vLLM

A guide to evaluating Large Language Models (LLMs) using the Evaluation Harness framework and optimized serving tools like Hugging Face TGI and vLLM.

benchmarking LLM Evaluation Evaluation Harness

8/5/2024 • EN

Deploy open LLMs with Terraform and Amazon SageMaker

A guide to deploying open-source LLMs like Llama 3 to Amazon SageMaker using Terraform for Infrastructure as Code.

Machine Learning Infrastructure As Code Terraform

7/11/2024 • EN

LLM Evaluation doesn't need to be complicated

A guide to simplifying LLM evaluation workflows using clear metrics, chain-of-thought, and few-shot prompts, inspired by real-world examples.

generative ai large language models Chatbot

6/28/2024 • EN

Evaluating Open LLMs with MixEval: The Closest Benchmark to LMSYS Chatbot Arena

Introduces MixEval, a cost-effective LLM benchmark with high correlation to Chatbot Arena, for evaluating open-source language models.

open source large language models benchmark

6/25/2024 • EN

Train and Deploy open Embedding Models on Amazon SageMaker

A guide to fine-tuning and deploying custom embedding models for RAG applications on Amazon SageMaker using Sentence Transformers v3.

RAG Hugging Face Amazon SageMaker

6/18/2024 • EN

Deploy Mixtral 8x7B on AWS Inferentia2 with Hugging Face Optimum

A technical guide on deploying the Mixtral 8x7B LLM on AWS Inferentia2 using Hugging Face Optimum and Amazon SageMaker.

Amazon SageMaker Hugging Face Optimum LLM Deployment

6/11/2024 • EN

Fine-tune Llama 3 with PyTorch FSDP and Q-Lora on Amazon SageMaker

A technical guide on fine-tuning the Llama 3 LLM using PyTorch FSDP and Q-Lora on Amazon SageMaker for efficient training.

Llama 3 Fine Tuning Amazon SageMaker

6/4/2024 • EN

Fine-tune Embedding models for Retrieval Augmented Generation (RAG)

A guide to fine-tuning embedding models for RAG applications using Sentence Transformers 3, featuring Matryoshka Representation Learning for efficiency.

RAG Fine Tuning Sentence Transformers

5/27/2024 • EN

Understanding the Cost of Generative AI Models in Production

Analyzes the complex total cost of ownership for deploying generative AI models in production, beyond just raw compute expenses.

generative ai Kubernetes Production Deployment

5/23/2024 • EN

Deploy Llama 3 70B on AWS Inferentia2 with Hugging Face Optimum

A technical guide on deploying Meta's Llama 3 70B Instruct model on AWS Inferentia2 using Hugging Face Optimum and Amazon SageMaker.

Amazon SageMaker Hugging Face Optimum LLM Deployment

5/2/2024 • EN

Deploy open LLMs with vLLM on Hugging Face Inference Endpoints

A tutorial on deploying open-source large language models (LLMs) like Llama 3 using the vLLM framework on Hugging Face Inference Endpoints.

large language models Hugging Face Inference Endpoints

4/22/2024 • EN

Efficiently fine-tune Llama 3 with PyTorch FSDP and Q-Lora

A technical guide on fine-tuning the Llama 3 70B model using PyTorch FSDP and Q-Lora for efficient training on limited GPU hardware.

large language models Llama 3 Fine Tuning

4/18/2024 • EN

Deploy Llama 3 on Amazon SageMaker

A technical guide on deploying Meta's Llama 3 70B model on Amazon SageMaker using the Hugging Face LLM DLC and Text Generation Inference.

large language models Hugging Face Llama 3

4/2/2024 • EN

Accelerate Mixtral 8x7B with Speculative Decoding and Quantization on Amazon SageMaker

A technical guide on accelerating the Mixtral 8x7B LLM using speculative decoding (Medusa) and quantization (AWQ) for deployment on Amazon SageMaker.

Quantization LLM Inference Amazon SageMaker

3/26/2024 • EN

Deploy Llama 2 70B on AWS Inferentia2 with Hugging Face Optimum

A technical guide on deploying Meta's Llama 2 70B large language model on AWS Inferentia2 hardware using Hugging Face Optimum and SageMaker.

Amazon SageMaker Hugging Face Optimum LLM Deployment

3/12/2024 • EN

Fine-Tune and Evaluate LLMs in 2024 with Amazon SageMaker

A technical guide on fine-tuning and evaluating open-source Large Language Models (LLMs) using Amazon SageMaker and Hugging Face libraries.

Hugging Face Model Evaluation Amazon SageMaker

3/5/2024 • EN

Evaluate LLMs with Hugging Face Lighteval on Amazon SageMaker

A tutorial on evaluating Large Language Models using Hugging Face's Lighteval library on Amazon SageMaker, focusing on benchmarks like TruthfulQA.

benchmarking Hugging Face LLM Evaluation

3/1/2024 • EN

How to fine-tune Google Gemma with ChatML and Hugging Face TRL

A technical guide on fine-tuning Google's Gemma open LLMs using the ChatML format and Hugging Face's TRL library for efficient training on consumer GPUs.

Hugging Face LLM Fine Tuning TRL
