Automatic Stats Updates Don’t Always Invalidate Cached Plans
Explains a SQL Server edge case where automatic statistics updates don't always invalidate cached execution plans, focusing on system-created stats.
Explores whether predictive statistical models require causal relationships to be useful, using examples from data sampling and real-world scenarios.
Explores the Mills ratio, comparing tail behavior of Student t and normal distributions to illustrate fat-tailed vs. thin-tailed distributions.
Explores the probability of extreme 'six sigma' events using the Student t distribution, showing it's not monotonic and depends heavily on degrees of freedom.
A blog post arguing that statistical inference is often used as a tool of rhetoric and persuasion, rather than pure objective science.
An introduction to stylometry, the statistical analysis of writing style, with examples from historical texts and natural language processing.
A statistical analysis of the classic board game Snakes & Ladders, modeling it as a Markov chain to calculate the expected game length.
A year-in-review blog post reflecting on machine learning course blogging, revisiting 'The Bitter Lesson', and critiquing trends in ML and economics.
Critique of causal inference in statistics, highlighting the flawed assumption that treatments have no impact on future outcomes, using cancer screening trials as an example.
A machine learning professor critiques the foundational concept of a 'data-generating distribution' and shares insights from teaching a truly distribution-free course.
Discusses handling class imbalance in predictive modeling, using medical and zebra analogies to explain adjusting for prior probabilities and error costs.
A blog archive listing posts about data visualization, statistical analysis, and data science using the R programming language.
A blog archive listing numerous data visualization projects, tutorials, and analyses, primarily focused on demographic, social, and public data.
A lecture on the foundational statistical concept of orderings and ordinal data, exploring their analysis and complications in fields like health research.
Explains the statistical concept of included-variable bias in regression models, challenging the traditional 'omitted-variable bias' framing.
A detailed proof walkthrough of the De Moivre–Laplace theorem, the earliest version of the central limit theorem for the binomial distribution.
Explores the challenges of analyzing ordinal data, focusing on transformation invariance and the limitations of statistical comparisons.
Announces 7 new free R programming books added to the Big Book of R collection, covering topics like machine learning, data science, and software engineering.
Analyzes four datasets with high collinearity between predictors, demonstrating statistical diagnostics and modeling approaches using R.
The Big Book of R adds 10 new books, including Spanish titles and English works on data science, statistics, and fantasy football analytics using R.