Apache Iceberg articles

6/24/2025 • EN

Writing to Apache Iceberg on S3 using Flink SQL with Glue catalog

A technical guide on using Flink SQL to write data to Apache Iceberg tables stored on AWS S3, with metadata managed by the AWS Glue Data Catalog.

Amazon S3 Apache Iceberg AWS Glue Data Catalog Flink SQL

Robin Moffatt

5/23/2025 • EN

Interesting links - May 2025

A monthly roundup of curated links and articles covering data engineering, Kafka, stream processing, and AI, with top picks highlighted.

Apache Iceberg Data Engineering Data Modeling Kafka Snowflake

Robin Moffatt

5/2/2025 • EN

Introduction to Data Engineering Concepts | What is Data Engineering?

An introductory guide to data engineering, explaining its role, key concepts, and how it differs from data science in the modern data ecosystem.

Apache Iceberg Data Engineering Data Infrastructure Data Pipelines Data Warehouse

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Batch Processing Fundamentals

Explains batch processing fundamentals for data engineering, covering core concepts, tools, and the ongoing relevance of batch processing in data workflows.

Apache Iceberg Batch Processing Data Engineering Data Pipelines Data Workflows

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Data Warehousing Fundamentals

An introduction to data warehousing concepts, covering architecture, components, and performance optimization for analytical workloads.

Apache Iceberg Data Architecture Data Engineering Data Warehousing performance optimization

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Data Lakes Explained

Explains data lakes, their key characteristics, and how they differ from data warehouses in modern data architecture.

Apache Iceberg cloud storage Data Architecture Data Engineering Data Lakes

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Data Quality and Validation

Explores the importance of data quality and validation in data engineering, covering key dimensions and tools for reliable pipelines.

Apache Iceberg Data Engineering Data Pipelines Data Quality Data Validation

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Metadata, Lineage, and Governance

Explains three core data engineering concepts (metadata, data lineage, and governance) and why they matter for scalable, compliant data systems.

Apache Iceberg Data Engineering Data Governance Data Lineage metadata

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Storage Formats and Compression

Explains the importance of data storage formats and compression for performance and cost in large-scale data engineering systems.

Apache Iceberg Columnar Storage compression Data Engineering Storage Formats

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Building Scalable Pipelines

Explores core principles of scalable data engineering, including parallelism, minimizing data movement, and designing adaptable pipelines for growing data volumes.

Apache Iceberg Data Architecture Data Engineering parallelism Scalable Pipelines

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Data Lakehouse Architecture Explained

Explains the data lakehouse architecture, a unified approach combining data lake scalability with warehouse management features like ACID transactions.

Apache Iceberg Data Architecture Data Engineering Data Lakehouse Data Management

Alex Merced

5/2/2025 • EN

Introduction to Data Engineering Concepts | Apache Iceberg, Arrow, and Polaris

Explores Apache Iceberg, Arrow, and Polaris: three key technologies powering modern, high-performance data lakehouse platforms.

Apache Arrow Apache Iceberg Apache Polaris Data Lakehouse Table Format

Alex Merced

4/10/2025 • EN

Journey from AI to LLMs and MCP - 6 - Enter the Model Context Protocol (MCP) — The Interoperability Layer for AI Agents

Explains the Model Context Protocol (MCP), an open standard for connecting AI agents and LLMs to external data sources and tools, enabling interoperability.

AI Agents Apache Iceberg LLM Interoperability Model Context Protocol Open Protocol

Alex Merced