An Exploration of the Commercial Iceberg Catalog Ecosystem
Explores the commercial Apache Iceberg catalog ecosystem, focusing on REST Catalog standards, optimization strategies, and architectural trade-offs.
A guide to building an autonomous, self-healing optimization pipeline for Apache Iceberg tables to maintain performance and cost efficiency.
Explores challenges and best practices for managing partition evolution and compaction in Apache Iceberg to maintain query performance.
Explains how to manage Apache Iceberg table metadata by expiring old snapshots and rewriting manifests to prevent performance and cost issues.
Explains how Apache Iceberg tables degrade without optimization, covering small files, fragmented manifests, and performance impacts.
Introduces Nessie as a self-managed catalog alternative to Hive and JDBC for Apache Iceberg, covering the limitations it addresses and the features it adds.
Project Nessie is a version control system for data lakes, bringing Git-like operations to manage and track changes in data assets.
An overview of Kafka's new KRaft mode, which removes the ZooKeeper dependency for metadata management and controller election.
An analysis of data discovery platforms, their key features, and available open-source solutions to improve data findability in organizations.