Durable Execution for AI workflows
Explores using durable execution engines like Azure Durable Task Scheduler to build robust, long-running AI workflows, such as summarizing articles and generating newsletters.
Explores the concepts of knowledge and common knowledge in distributed systems, starting with the classic muddy children puzzle.
Explains how to replace brittle, synchronous side-effects in endpoints with a resilient, event-based system using queues for better error handling and performance.
An introduction to distributed systems, covering core challenges and recommended learning resources like the book 'Designing Data-Intensive Applications' and the MIT course.
A personal journey from aspiring dancer to Python programmer and eventually a distributed systems researcher, detailing career transitions and technical growth.
A software engineer reflects on the human challenges of tech work, including burnout, team attrition, and the pressure to refactor legacy systems.
Explores the reliability of timers in distributed algorithms like Raft, arguing they are viable with safety margins for mechanisms like leader leases.
Explores extending TLA+ for performance modeling using queueing theory and simulation, moving beyond just correctness verification.
Summary of talks from the 2025 TLA+ Community Event, focusing on formal methods and model-guided fuzzing for distributed systems.
Explores transactions as a protocol that can be added to any storage system, not an intrinsic feature, with examples from Delta Lake, Epoxy, and Two-Phase Commit.
Explains the concept of a 'synchrony budget' for designing distributed systems, advocating for asynchronous communication to improve performance and availability.
Explains the concept of a 'synchrony budget' for distributed systems, advocating for minimizing synchronous calls to improve performance and availability.
Explores KIP-932, a proposal to add queue semantics and share groups to Apache Kafka for improved message processing.
A review of SwiftPaxos, a new Paxos variant designed for fast, geo-replicated state machines in high-latency networks.
A critique of semantic versioning in observability marketing, arguing that terms like 'Observability 2.0' describe a real technical shift despite overuse.
A guide to using the Porcupine library to check for linearizability in distributed systems like registers and key-value stores, implemented in Go.
Explores the future of PostgreSQL, focusing on the power of extensions like pg_stat_statements, Citus, and pg_search to add new capabilities.
Phil Eaton explains the core concepts and intuitions behind distributed consensus systems in a technical talk.
How to configure and use Azure Service Bus as a custom pub/sub broker with Diagrid Catalyst for distributed applications.