Geert Baeke • 7/11/2024

Token consumption in Microsoft’s Graph RAG

This technical article analyzes token usage when querying a knowledge graph built with Microsoft's Graph RAG. It covers how to log LLM calls with LiteLLM as a proxy and Langfuse for tracing, and compares the performance of GPT-4o and GPT-4o-mini. The post walks through both local queries (which use embeddings and similarity search) and global queries, including the system prompts and data structures involved.

#llm #Litellm #Microsoft Graph Rag