Thomas Lumley

Thomas Lumley writes thoughtful, in-depth articles on statistics, data analysis, and statistical modeling. His blog explores topics like survey methods, regression, simulations, and inference with a rigorous yet reflective approach.

https://notstatschat.rbind.io

1/25/2026

statistics, data analysis, statistical modeling, applied mathematics, research methods

Articles from this Blog

215 articles from this blog

7/16/2024 • EN

A Bayesian t-test, again

Explores Bayesian alternatives to the frequentist t-test for comparing two means, discussing non-parametric and resampling-based approaches.

data analysis, Bayesian Statistics, Statistical Inference

6/27/2024 • EN

Estimator vs estimate

Explains the statistical nuance of sandwich variance estimators, focusing on the difference between an estimator and its realized value in a sample.

Statistical Inference, Variance Estimation, Robust Standard Errors

6/15/2024 • EN

Automatic transformation of standard errors?

Explores automatic delta-method transformations for variance estimates in R's survey package, enabling correct standard errors after mathematical operations.

statistics, R, Survey Package

6/4/2024 • EN

S3 method dispatch on other arguments

Explains how to use S3 method dispatch on arguments other than the first in R, using the survey package's svytotal function as an example.

R, Statistical Computing, Survey Package

5/21/2024 • EN

Crossvalidation in complex survey data

Explores experimental cross-validation methods for complex survey data using replicate-weight decompositions to respect sampling structure.

Statistical Modeling, R Programming, Survey Data

5/10/2024 • EN

Choosing frame weights in dual-frame surveys

Explores methods for choosing optimal weighting parameters (θ) in dual-frame survey sampling to minimize variance in population estimates.

R Programming, Statistical Analysis, Survey Sampling

4/29/2024 • EN

Another update on non-transitive dice

An update on the Polymath research project on non-transitive dice and its statistical implications for the Wilcoxon/Mann-Whitney test.

mathematics, statistics, Probability

4/26/2024 • EN

Multiple frame sampling

Explains statistical methods for handling overlapping sampling frames in surveys, using a monster analogy for mobile and landline phone samples.

R Package, Survey Sampling, Multiple Frames

4/19/2024 • EN

Importance weights

Explains a fourth type of statistical weight for dual-frame surveys, addressing overlap to avoid double-counting in population estimates.

Statistical Analysis, Survey Methodology, Sampling Weights

4/14/2024 • EN

Assumptions

Discusses the nuanced role of assumptions in statistics, distinguishing between necessary and sufficient conditions, and their impact on interpreting models like linear regression.

statistics, Linear Regression, Assumptions

4/1/2024 • EN

Symbolically nested

Explains the concept of 'symbolically nested' statistical models, their computational advantages, and their importance in survey analysis.

Statistical Modeling, Regression Analysis, Nested Models

3/22/2024 • EN

Factors as factors

Explains the importance of factors in R for data analysis, covering when and how to convert strings to factors to avoid errors.

data types, Strings, R

3/9/2024 • EN

Small-area estimates by smoothing direct estimates

Explains statistical methods for estimating means in small domains or subpopulations, focusing on smoothing direct estimates using models like Fay-Herriot.

Statistical Modeling, Linear Mixed Models, Survey Estimation

1/9/2024 • EN

Asymptotics for linear mixed models

Explores the asymptotic behavior of parameter estimates in linear mixed models, focusing on the loglikelihood as a quadratic form in Gaussian variables.

statistics, Asymptotic Theory, Maximum Likelihood

12/18/2023 • EN

Why do the Rao-Scott tests have good size?

Explains why Rao-Scott statistical tests maintain good size control in survey data analysis, compared to standard chi-squared tests.

Hypothesis Testing, Survey Data, Statistical Inference

12/14/2023 • EN

How good is the leading eigenvalue approximation to quadratic forms?

Analyzes the accuracy of a leading eigenvalue approximation for quadratic forms in Gaussian variables, comparing it to traditional methods.

statistics, Numerical Methods, Quadratic Forms

12/12/2023 • EN

Why not REML?

Explains why the svylme package uses maximum likelihood instead of REML for survey-weighted linear mixed models, focusing on design and sampling constraints.

statistics, Mixed Models, Svylme