Alex Merced

Alex Merced is a developer and technical writer sharing in-depth insights on data engineering, Apache Iceberg, data lakehouse architectures, Python tooling, and modern analytics platforms, with a strong focus on practical, hands-on learning.

https://tuts.alexmercedcoder.dev

12/31/2025

Data Engineering Apache Iceberg Data Lakehouse Python Analytics

Articles from this Blog

501 articles from this blog

5/24/2026 • EN

Bringing MLflow and Data Pipelines Closer Together

Explores integrating MLflow 3 with data pipelines for unified observability, covering data lineage, drift detection, and CI/CD for ML.

Data Pipelines MLOps Data Lineage

5/24/2026 • EN

Clean Rooms for Privacy-Preserving Analytics

Explores data clean rooms for privacy-preserving analytics, covering core guarantees, platforms like Databricks and AWS, and real-world use cases.

Data Security Clean Rooms Privacy Preserving Analytics

5/24/2026 • EN

Lance and Iceberg for Multimodal AI Data

Explores using Lance and Iceberg formats for multimodal AI data, addressing scan-heavy analytics vs. random-access retrieval for ML training.

Vector Database Multimodal AI Data Architecture

5/24/2026 • EN

Automating Table Maintenance Before Small Files Accumulate

A guide to automating Iceberg table maintenance to prevent small file accumulation, covering compaction, vacuuming, and modern tools.

Apache Iceberg Table Maintenance Compaction

5/24/2026 • EN

Building Composable Query Engines with Rust Runtimes

Explores building modular query engines using Rust runtimes like Apache DataFusion, focusing on composability over monolithic designs.

Rust Query Engine Composable Architecture

5/24/2026 • EN

Choosing the Right Iceberg Control Plane: Polaris vs. Unity Catalog vs. Cloud REST

Compares three Iceberg catalog control planes for lakehouse architectures: Polaris, Unity Catalog, and Cloud REST.

Metadata Management Apache Iceberg Data Lakehouse

5/24/2026 • EN

OpenLineage as the Spine of Data Observability

Explains how OpenLineage provides a standardized API for data lineage, enabling faster incident investigation and data observability across the stack.

Data Observability Lineage Tracking OpenLineage

5/24/2026 • EN

Real-Time Lakehouse Patterns with Apache Flink and Iceberg

A technical guide on building real-time lakehouse architectures using Apache Flink 2.1 and the Dynamic Iceberg Sink, addressing schema drift, file proliferation, and operational rigidity.

Kafka Apache Flink Streaming Data

5/24/2026 • EN

Single-Node Data Engineering: DuckDB, DataFusion, Polars, and LakeSail

Explores modern single-node data engineering tools like DuckDB, DataFusion, Polars, and LakeSail built on Apache Arrow for high-performance analytics.

Apache Arrow DuckDB Polars

5/24/2026 • EN

Policy as Code for Lakehouse Governance

Explores policy-as-code for lakehouse governance using ABAC, OPA, and cloud-native tools to replace RBAC with scalable, query-time data access controls.

ABAC RBAC Data Access Control

5/23/2026 • EN

An In-Depth Overview of the Apache Iceberg 1.11.0 Release

Overview of the Apache Iceberg 1.11.0 release, covering new features like metadata encryption, pluggable file formats, and query optimizations.

Metadata Encryption Apache Iceberg

4/29/2026 • EN

Writing to an Apache Iceberg Table: How Commits and ACID Actually Work

Explains how Apache Iceberg table writes work, including commit steps and ACID guarantees on object storage.

ACID Transactions Apache Iceberg Table Formats

4/29/2026 • EN

What Are Table Formats and Why Were They Needed?

Explains why table formats like Apache Iceberg and Delta Lake are essential for reliable data lakes, solving atomic commits, schema evolution, and time travel.

Metadata Apache Iceberg Table Formats

4/29/2026 • EN

The Metadata Structure of Modern Table Formats

A technical deep dive comparing metadata structures of modern table formats like Apache Iceberg, Delta Lake, and Hudi for data lakes.

Apache Iceberg Table Formats Apache Hudi

4/29/2026 • EN

Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans

Explains how Apache Iceberg's hidden partitioning prevents accidental full table scans by automatically mapping source column filters to partition values.

Apache Iceberg Data Lake Hidden Partitioning