Introduction to Data Engineering Concepts | Metadata, Lineage, and Governance
Explains core data engineering concepts: metadata, data lineage, and governance, and their importance for scalable, compliant data systems.
Alex Merced — Developer and technical writer sharing in-depth insights on data engineering, Apache Iceberg, data lakehouse architectures, Python tooling, and modern analytics platforms, with a strong focus on practical, hands-on learning.
418 articles from this blog
An introductory guide to data engineering, explaining its role, key concepts, and how it differs from data science in the modern data ecosystem.
An introduction to data engineering concepts, focusing on data sources and ingestion strategies like batch vs. streaming.
Explains core data engineering concepts, comparing ETL and ELT data pipeline strategies and their use cases.
Explains batch processing fundamentals for data engineering, covering concepts, tools, and its ongoing relevance in data workflows.
Explores workflow orchestration in data engineering, covering DAGs, tools, and best practices for managing complex data pipelines.
Explains how Sampling and Prompts in the Model Context Protocol (MCP) enable smarter, safer, and more controlled AI agent workflows.
Explains how Tools in the Model Context Protocol (MCP) enable LLMs to execute actions like running commands or calling APIs, moving beyond just reading data.
Explains how the Model Context Protocol (MCP) uses 'Resources' to securely serve structured data from systems like files and databases to LLMs.
Explains the architecture of the Model Context Protocol (MCP), detailing its client-server model, core components, and message flow for connecting AI models to tools and data.
Explains the Model Context Protocol (MCP), an open standard for connecting AI agents and LLMs to external data sources and tools, enabling interoperability.
Explores AI agent frameworks and their benefits and limitations, and introduces the Model Context Protocol (MCP) for more modular AI systems.
Explores AI agents, their core components, differences from LLMs, and real-world applications, positioning them as the future of autonomous AI systems.
Explores three key methods to enhance LLM performance: fine-tuning, prompt engineering, and RAG, detailing their use cases and trade-offs.
Explains how LLMs work by converting words to numerical embeddings, using vector spaces for semantic understanding, and managing context windows.
Explores the evolution of AI from symbolic systems to modern Large Language Models (LLMs), detailing their capabilities and limitations.
A tutorial on building a beginner-friendly Model Context Protocol (MCP) server in Python to connect Claude AI with local CSV and Parquet files.
A guide to using Helm, the package manager for Kubernetes, covering Helm charts, installation, deployment, and best practices.
A guide to building AI applications using the LangChain framework, covering core concepts, installation, and practical examples.
A comprehensive 2025 guide to Apache Iceberg, covering its architecture, ecosystem, and practical use for data lakehouse management.