You Gotta Push If You Wanna Pull
Explores the shift from traditional pull queries to using materialized views and data duplication for better performance, format, and location in data systems.
Explores the limitations of traditional pull queries in data systems and advocates for using materialized views and data duplication to improve performance.
Argues against oversimplified advice to replace Kafka with Postgres, explaining they are different tools for different problems.
A tutorial on setting up and running PyFlink streaming data jobs on a Kubernetes cluster, including prerequisites and deployment steps.
A tutorial on setting up and running PyFlink streaming data jobs on a Kubernetes cluster, including installation and deployment steps.
Explores whether Change Data Capture (CDC) breaks software encapsulation and discusses strategies like data contracts to mitigate risks.
Explores whether Change Data Capture (CDC) breaks application encapsulation and discusses strategies like data contracts to mitigate risks.
A software engineer explains their decision to join Decodable, a startup building a serverless real-time data platform, focusing on stream processing.
An analysis of current trends in the Apache Kafka ecosystem, focusing on connector growth, self-service data pipelines, and stream processing adoption.
An analysis of key trends in the Apache Kafka ecosystem, including connector growth, self-service data pipelines, and stream processing adoption.
An overview of Kafka Streams, a new library for building stream processing applications using Apache Kafka, with key concepts and a code example.
Explains Lambda Architecture for Big Data, combining batch processing (Hadoop) and real-time stream processing (Spark, Storm) to handle large datasets.