Geert Baeke 7/11/2024

Token consumption in Microsoft’s Graph RAG

This technical article analyzes token usage when querying a knowledge graph built with Microsoft's Graph RAG. It covers how to log LLM calls using LiteLLM as a proxy and Langfuse for tracing, and compares the performance of GPT-4o and GPT-4o-mini. The post walks through both local queries (which use embeddings and similarity search) and global queries, including the system prompts and data structures involved.
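The summary mentions that local queries rely on embeddings and similarity search. As a rough illustration of that general technique only (not Graph RAG's actual implementation), the sketch below ranks a few entities by cosine similarity to a query vector. The entity names, the toy three-dimensional vectors, and the helper functions `cosine_similarity` and `top_k` are all hypothetical and chosen for readability.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query_vec, entity_vecs, k=2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in entity_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real embedding vectors are far larger.
entities = {
    "entity_a": [1.0, 0.0, 0.0],
    "entity_b": [0.9, 0.1, 0.0],
    "entity_c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

results = top_k(query, entities, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In a retrieval pipeline of this kind, the top-ranked items are what would typically be placed into the LLM prompt as context, so the value of `k` is one of the factors that influences how many tokens a query consumes.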
