Saeed Esmaili

Saeed Esmaili is a data scientist at Spotify in Amsterdam who focuses on developer productivity, platform strategy, LLMs, and recommendation systems, and shares notes and insights along the way.

https://saeedesmaili.com

12/30/2025

data science machine learning LLMs recommendation systems developer productivity platform strategy Spotify tech notes

Articles from this Blog

31 articles from this blog

4/26/2025 • EN

Never Been Easier to Learn

Discusses how LLMs like ChatGPT can boost self-learning by helping learners understand problems and verify their solutions, making skill acquisition easier.

programming education llm

4/16/2025 • EN

Released a new tool: llm-url-markdown

Introducing llm-url-markdown, a new plugin for Simon Willison's llm CLI tool that fetches web content as markdown for use as LLM context fragments.

markdown llm Plugin

3/26/2025 • EN

Building a Personal Content Recommendation System, Part Two: Data Processing and Cleaning

Part two of building a personal recommendation system, covering data collection from Pocket and content extraction using the Jina Reader API.

github recommendation systems data processing

3/18/2025 • EN

Enhancing Text-to-SQL With Synthetic Summaries

Explains a technique using AI-generated summaries of SQL queries to improve the accuracy of text-to-SQL systems with LLMs.

llm Retrieval Augmented Generation Text To SQL

3/16/2025 • EN

Building a Personal Content Recommendation System, Part One: Introduction

A developer documents the first steps in building a personalized content recommendation system using saved articles, text embeddings, and algorithms.

recommendation systems data processing API Integration

3/3/2025 • EN

Add Logprobs to OpenAI Structured Output

Explains how to extract logprobs from OpenAI's structured JSON outputs using the structured-logprobs Python library for better LLM confidence insights.

llm Openai API Structured Output

1/2/2025 • EN

Adding new entries to a Supabase postgres table via REST API

A tutorial on using Python to insert data into a Supabase PostgreSQL table via its REST API.

Python Backend postgresql

12/19/2024 • EN

Label-Studio: Annotate Text and Image Data for AI and ML training

Introduces Label-Studio, an open-source tool for annotating text, image, audio, and video data for AI/ML projects, highlighting its ease of use and features.

Machine Learning docker Data Annotation

12/19/2024 • EN

Quickly Filter and Aggregate Python Lists

Introduces the 'leopards' Python library for filtering and aggregating lists, offering a lightweight alternative to pandas for basic data operations.

Python data processing Data Filtering

12/19/2024 • EN

Pydantic Logfire for LLM and API Observability

Introducing Logfire, Pydantic's new observability tool for Python, with easy integration for OpenAI LLM calls, FastAPI, and logging.

api llm observability

12/4/2024 • EN

Build a search engine, not a vector DB

Argues that building a good search engine is more critical for effective RAG than just using a vector database, as poor retrieval misleads AI.

llm Search Engine Rag

12/1/2024 • EN

Access Google Gemini LLM via OpenAI Python Library

Learn how to use the OpenAI Python library to interact with Google's Gemini LLM for text generation, images, and more.

AI Integration Openai Python Library Google Gemini

6/29/2024 • EN

Understanding Input Masking in LLM Finetuning

Explains the concept and purpose of input masking in LLM fine-tuning, using a practical example with Axolotl for a code PR classification task.

Training Data Model Training LLM Finetuning

6/20/2024 • EN

Control a smart light with multiple motion sensors in Home Assistant

A guide to using Home Assistant groups to control a single smart light with multiple motion sensors, including adding media player state.

automation iot Home Assistant

6/2/2024 • EN

To Chunk or Not to Chunk With the Long Context Single Embedding Models

An experiment comparing retrieval performance of chunked vs. non-chunked documents using long-context embedding models like BGE-M3.

Rag Embeddings Retrieval

5/27/2024 • EN

Lessons After a Half Billion GPT Tokens

Practical lessons from integrating LLMs into a product, focusing on prompt design pitfalls like over-specification and handling null responses.

llm prompt engineering API Integration

4/22/2024 • EN

Running Python on a serverless GPU instance for machine learning inference

A guide to running Python code on serverless GPU instances using Modal.com for faster machine learning inference, demonstrated with a speech-to-text example.

Machine Learning serverless Gpu