Understanding the Apache Iceberg Manifest
Explains the role and structure of Apache Iceberg manifest files, the key metadata components that track data files and help optimize queries in data lakehouses.
Explains the role and structure of the Apache Iceberg manifest list file in managing table snapshots and optimizing data lakehouse queries.
Explains the critical role and structure of the metadata.json file in Apache Iceberg, the open-source table format for data lakehouses.
An introduction to data lakehouses, explaining what they are, why they're used, and how to migrate to this modern data architecture.
Explores Polaris, an open-source catalog service for managing Apache Iceberg tables in data lakehouses, covering its architecture, entities, and security.
Explains how Apache Iceberg's design ensures data reliability, atomic operations, and serializable isolation for large-scale data lakehouses.
A list of upcoming tech talks and events by Alex Merced, focusing on Apache Iceberg, data lakehouses, and data engineering topics.
Explains the data lakehouse architecture, its layers (storage, table format, catalog, processing), and its advantages over traditional data warehouses.
An introduction to Apache Iceberg, a table format for data lakehouses, explaining its architecture and providing learning resources.
Explores the evolution of Apache Iceberg catalogs, focusing on the current REST Catalog and future proposals for server-side optimizations.
A hands-on tutorial on building a data lakehouse pipeline using Spark, Dremio, and Superset to move and analyze data.
An overview of five impactful open-source data projects, including Apache Iceberg and Arrow, that are revolutionizing data management and analytics.
Explains why Dremio is a top platform for Apache Iceberg lakehouses, highlighting features like dataset promotion and data reflections.
Explores Apache Iceberg's catalog system, its role in data lakehouse architecture, and key considerations for choosing the right catalog.
Explains the role, types, and selection criteria for catalogs in Apache Iceberg, a key component for managing data lakehouse tables.
Explores 10 reasons to adopt Apache Iceberg and Dremio for building a modern, flexible, and cost-effective data lakehouse architecture.
Explains the data lakehouse architecture and the roles of Apache Iceberg, Nessie, and Dremio in modern data management.
Table of Contents Context Introduction Short Version for Quick Readers My Journey with Table Formats and Lakehouses Ecosystem Over Features Key Takeaw
Explores the Data Lakehouse architecture and the roles of Apache Iceberg and Dremio in modern, integrated data management.
A comprehensive directory of resources for learning about and building Open Lakehouses using Apache Iceberg, Nessie, and Dremio.