You Gotta Push If You Wanna Pull
Explores the shift from traditional pull queries to materialized views and data duplication, which make data available in a suitable format and location and improve query performance.
Gunnar Morling is a Java Champion and open-source software engineer specializing in Java and data streaming. He works at Confluent, contributes to projects like Hibernate and Debezium, and shares his expertise through blogs, talks, and conferences.
91 articles from this blog
Explains idempotency keys in distributed systems, comparing UUIDs and monotonic sequences for duplicate detection and exactly-once processing.
Explores building a basic Durable Execution engine in Java using SQLite to persist workflow state and resume from failures.
Argues against oversimplified advice to replace Kafka with Postgres, explaining they are different tools for different problems.
Explores how Java's new Generational ZGC garbage collector reduces tail latencies compared to the default G1 collector in a microservice benchmark.
Explains the difference between confirmed_flush_lsn and restart_lsn in PostgreSQL replication slots for troubleshooting and optimization.
Explores using Java 21+ virtual threads to elegantly convert legacy Future objects into modern, composable CompletableFuture instances.
Best practices for managing PostgreSQL replication slots to prevent WAL bloat and ensure reliable CDC pipelines in production.
Explores building AI Agents as streaming SQL queries using platforms like Apache Flink for improved consistency, scalability, and developer experience.
Explains challenges with Postgres TOAST columns in Debezium CDC events and solutions using Debezium's reselect processor and Apache Flink.
Argues that 'Streaming vs. Batch' is a misleading dichotomy; the real distinction is between push and pull data semantics in processing systems.
Explores reimagining Apache Kafka as a cloud-native event log, proposing features like partitionless design, key-centric access, and topic hierarchies.
Explores methods for ingesting Debezium CDC events from Kafka into Apache Flink using different SQL connectors and data formats.
A guide to building a native Apache Kafka binary on macOS using GraalVM for faster startup times, based on the KIP-974 configuration.
Explores JEP 483: Ahead-of-Time Class Loading & Linking in Java 24, part of Project Leyden, to reduce application startup times.
Explains the concept of a 'synchrony budget' for distributed systems, advocating for minimizing synchronous calls to improve performance and availability.
Explores KIP-932, a proposal to add queue semantics and share groups to Apache Kafka for improved message processing.
Part 2 of a guide on running Apache Flink on Kubernetes, covering fault tolerance, high availability, savepoints, and observability.
A technical guide on installing Apache Flink's Kubernetes operator and deploying your first Flink job, with a focus on automation and setup.
Explains how Postgres 17 introduces built-in failover replication slots, enabling seamless logical replication during primary database failovers.