Alex Merced • 4/29/2026

Approaches to Streaming Data into Apache Iceberg Tables

This article is Part 13 of the 15-part Apache Iceberg Masterclass. It covers three primary approaches to streaming data into Iceberg tables: Spark Structured Streaming, the Apache Flink Iceberg sink, and the Kafka Connect Iceberg sink. It also discusses the streaming and compaction cycle, the trade-off between latency and maintenance, and production streaming architecture. It includes code examples for Spark and Flink and highlights how each approach handles the small-file problem and commit frequency. The article is aimed at data engineers and developers working on real-time data ingestion into Iceberg-based lakehouses, and it offers guidance on choosing the right approach and monitoring streaming health.

#streaming #Kafka Connect #Apache Flink