Alex Merced

Developer and technical writer sharing in-depth insights on data engineering, Apache Iceberg, data lakehouse architectures, Python tooling, and modern analytics platforms, with a strong focus on practical, hands-on learning.

https://tuts.alexmercedcoder.dev

12/31/2025

data engineering, apache iceberg, data lakehouse, python, analytics

333 articles from this blog

11/15/2024 • EN

Deep Dive into Dremio's File-based Auto Ingestion into Apache Iceberg Tables

A guide to setting up and using Dremio's Auto-Ingest feature for automated, event-driven data loading into Apache Iceberg tables from cloud storage.

Data Ingestion, Data Pipeline, Apache Iceberg

11/8/2024 • EN

Intro to SQL using Apache Iceberg and Dremio

A tutorial on using SQL with Apache Iceberg tables in the Dremio data lakehouse platform, covering setup and core operations.

sql, docker, Apache Iceberg

11/5/2024 • EN

Introduction to Cargo and cargo.toml

A guide to understanding and using Cargo.toml, the manifest file that Cargo uses to manage a Rust project's configuration and dependencies.

package management, dependency management, build system

11/5/2024 • EN

Dremio, Apache Iceberg and their role in AI-Ready Data

Explores how Dremio and Apache Iceberg create AI-ready data by ensuring accessibility, scalability, and governance for machine learning workloads.

Data Management, Apache Iceberg, Data Lakehouse

11/1/2024 • EN

Leveraging Python's Pattern Matching and Comprehensions for Data Analytics

Explores using Python's pattern matching and comprehensions for efficient data cleaning, transformation, and analysis.

Python, data transformation, Comprehensions

10/31/2024 • EN

Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes

A hands-on tutorial for setting up a local data lakehouse with Apache Iceberg, Dremio, and Nessie using Docker in under 10 minutes.

docker, Apache Iceberg, Data Lakehouse

10/30/2024 • EN

Data Modeling - Entities and Events

Explores the differences between event and entity data modeling, when to use each approach, and practical design considerations for structuring data effectively.

analytics, Events, Data Modeling

10/21/2024 • EN

All About Parquet Part 08 - Reading and Writing Parquet Files in Python

A practical guide to reading and writing Parquet files in Python using the PyArrow and fastparquet libraries.

Python, Data Engineering, Parquet

10/21/2024 • EN

All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet

The final guide in the series, covering performance tuning and best practices for optimizing Apache Parquet files in big data workflows.

performance tuning, Big Data, Data Compression

10/21/2024 • EN

All About Parquet Part 09 - Parquet in Data Lake Architectures

Explores why Parquet is the ideal columnar file format for optimizing storage and query performance in modern data lake and lakehouse architectures.

Big Data, Parquet, Apache Iceberg

10/21/2024 • EN

All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency

Explores how metadata in Parquet files improves data efficiency and query performance, covering file, row group, and column-level metadata.

metadata, Query Performance, Parquet

10/21/2024 • EN

All About Parquet Part 05 - Compression Techniques in Parquet

Explores compression algorithms in Parquet files, comparing Snappy, Gzip, Brotli, Zstandard, and LZO for storage and performance.

Gzip, Data Compression, Parquet

10/21/2024 • EN

All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage

Explains encoding techniques in Parquet files, including dictionary, RLE, bit-packing, and delta encoding, to optimize storage and performance.

data encoding, Parquet, Dictionary Encoding

10/21/2024 • EN

All About Parquet Part 04 - Schema Evolution in Parquet

Explains for data engineers how Parquet handles schema evolution, including adding or removing columns and changing data types.

Data Management, File Format, Data Engineering

10/21/2024 • EN

All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns

Explains the hierarchical structure of Parquet files, detailing how pages, row groups, and columns optimize storage and query performance.

Big Data, File Format, Data Engineering

10/21/2024 • EN

All About Parquet Part 02 - Parquet's Columnar Storage Model

Explains Parquet's columnar storage model, detailing its efficiency for big data analytics through faster queries, better compression, and optimized aggregation.

Data Compression, Parquet, Data Format