John D. Cook 4/18/2026

Gaussian distributed weights for LLMs


This article discusses NF4 and FP4, two 4-bit floating-point formats used to quantize large language model (LLM) weights, particularly in the bitsandbytes library and Hugging Face models. It explains why NF4 derives its quantization levels from the quantiles of a Gaussian distribution, which better matches the empirical distribution of LLM parameters, whereas FP4's levels follow conventional floating-point spacing. The author critiques the QLoRA paper's description of NF4, highlighting ambiguities in its definition of quantile-based indexing and problems with representing zero exactly. The article also describes attempts to reproduce the NF4 values listed in the paper's appendix, noting discrepancies.
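To make the quantile idea concrete, here is a minimal sketch of how Gaussian-quantile quantization levels can be constructed. This is an illustration of the general approach, not the exact NF4 code values from the QLoRA paper (the article notes those are hard to reproduce precisely); the probability offsets and normalization below are assumptions for the sketch.

```python
from statistics import NormalDist

def gaussian_quantile_levels(n=16):
    """Sketch: n quantization levels from standard-normal quantiles.

    Take n evenly spaced probabilities strictly inside (0, 1),
    map them through the inverse CDF of the standard normal,
    then normalize so the levels span [-1, 1].
    """
    nd = NormalDist()  # standard normal distribution
    probs = [(i + 0.5) / n for i in range(n)]
    quantiles = [nd.inv_cdf(p) for p in probs]
    scale = max(abs(q) for q in quantiles)
    return [q / scale for q in quantiles]

levels = gaussian_quantile_levels()
# The levels cluster near 0 and thin out toward +-1, mirroring the
# roughly Gaussian distribution of LLM weights. Note that this
# symmetric scheme contains no exact zero, which is related to the
# zero-representation issue the article raises about NF4.
```

With this spacing, small weights (the common case) land in finely spaced levels, while rare large weights are coarsely represented, which is the motivation for NF4 over an evenly spaced 4-bit grid.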
