You may be building for availability, but are you building for resiliency?
Explains the difference between high availability and high resiliency in system design, and why both are crucial.
Explains operational flags: long-lived runtime controls for system resiliency, as opposed to temporary feature flags used for releases.
Explains why over-reliance on automatic retries can harm low-latency platforms and advocates for fundamental resiliency practices.
Discusses operational best practices and ownership in serverless architecture, emphasizing that teams remain responsible for their systems even when the infrastructure is outsourced.
A guide to using the 'retrying' Python library for implementing robust retry logic in applications dealing with external failures.