The State of Apache Iceberg v4 - October 2025 Edition
Overview of key proposals in Apache Iceberg v4, focusing on performance, metadata efficiency, and portability for modern data workloads.
A comprehensive guide to the data lakehouse architecture, its core components (Iceberg, Delta, Hudi, Paimon), and the surrounding ecosystem for modern data platforms.
A guide to building an autonomous, self-healing optimization pipeline for Apache Iceberg tables to maintain performance and cost efficiency.
Explores challenges and best practices for managing partition evolution and compaction in Apache Iceberg to maintain query performance.
Explains how to use Apache Iceberg's metadata tables to dynamically trigger data compaction based on file size, manifest health, and snapshot patterns.
A guide to scheduling compaction and snapshot expiration in Apache Iceberg tables based on workload patterns and infrastructure constraints.
Explains how to manage Apache Iceberg table metadata by expiring old snapshots and rewriting manifests to prevent performance and cost issues.
Explains techniques for incremental, non-disruptive compaction in Apache Iceberg tables under continuous streaming data ingestion.
A monthly roundup of data engineering links covering Apache Iceberg, Kafka, Debezium, Spark, and lakehouse architecture.
Explains how Apache Iceberg tables degrade without optimization, covering small files, fragmented manifests, and performance impacts.
Explains the importance of table maintenance in Apache Iceberg for data lakehouses, covering metadata and file management.
A guide on how to find, join, and organize community meetups focused on Apache Iceberg and modern data lakehouse architectures.
A monthly roundup of tech links covering data lakehouses (DuckLake, Iceberg), Kafka, event streaming, and stream processing developments.
Explores Apache Iceberg, Arrow, and Polaris—three key technologies powering modern, high-performance data lakehouse platforms.
Explains the data lakehouse architecture, a unified approach combining data lake scalability with warehouse management features like ACID transactions.
A comprehensive 2025 guide to Apache Iceberg, covering its architecture, ecosystem, and practical use for data lakehouse management.
Explores solutions like Apache XTable and Delta Lake UniForm for enabling interoperability between different data lakehouse table formats.
A developer shares the story of building Pangolin, an open-source lakehouse catalog, using an AI coding agent during a holiday break.
A technical guide on designing and implementing a modern data lakehouse architecture using the Apache Iceberg table format in 2025.
A look at 10 upcoming features and enhancements for the Apache Iceberg data lakehouse table format, expected in 2025.