Tale of a Kubernetes node-feature-discovery incident
A detailed post-mortem of a Kubernetes incident in which upgrading the node-feature-discovery (NFD) component caused serious scaling problems. The new version switched architecture to store node features in NodeFeature custom resources, which consumed excessive etcd storage (~140 KB per node) in large production clusters and broke pod scheduling. The article covers the decision to roll back and draws lessons on evaluating off-the-shelf components for large-scale operations.
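To give a feel for why ~140 KB per node matters at scale, here is a rough back-of-the-envelope sketch. Only the 140 KB figure comes from the summary above. The node counts and the 8 GiB quota are hypothetical values chosen for illustration, and the function names are made up for this example. It also ignores etcd's retained revision history, so a real footprint could be larger.

```python
# Approximate size of one NodeFeature object, per the summary (~140 KB).
BYTES_PER_NODE = 140 * 1000

# ASSUMPTION: a hypothetical etcd backend quota of 8 GiB.
# Check the actual quota configured for your cluster.
ASSUMED_QUOTA_BYTES = 8 * 1024**3


def nodefeature_footprint_bytes(node_count, bytes_per_node=BYTES_PER_NODE):
    """Estimate total bytes stored if every node has one NodeFeature object."""
    return node_count * bytes_per_node


def fraction_of_quota(node_count, quota_bytes=ASSUMED_QUOTA_BYTES):
    """Estimate the share of the assumed quota used by NodeFeature objects."""
    return nodefeature_footprint_bytes(node_count) / quota_bytes


if __name__ == "__main__":
    # Hypothetical cluster sizes, not figures from the article.
    for nodes in (100, 1_000, 5_000):
        total_mb = nodefeature_footprint_bytes(nodes) / 1_000_000
        share = fraction_of_quota(nodes) * 100
        print(f"{nodes:>6} nodes: ~{total_mb:,.0f} MB ({share:.1f}% of assumed quota)")
```

The estimate grows linearly with node count, which is why an overhead that is invisible in a small test cluster can become a problem in a large production one.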