A Bayesian t-test, again
Explores Bayesian alternatives to the frequentist t-test for comparing two means, discussing non-parametric and resampling-based approaches.
Thomas Lumley writes thoughtful, in-depth articles on statistics, data analysis, and statistical modeling. His blog explores topics like survey methods, regression, simulations, and inference with a rigorous yet reflective approach.
215 articles from this blog
Explains the statistical nuance of sandwich variance estimators, focusing on the difference between an estimator and its realized value in a sample.
Explores automatic delta-method transformations for variance estimates in R's survey package, enabling correct standard errors after mathematical operations.
Explains how to use S3 method dispatch on arguments other than the first in R, using the survey package's svytotal function as an example.
Explores experimental cross-validation methods for complex survey data using replicate-weight decompositions to respect sampling structure.
Explores methods for choosing optimal weighting parameters (θ) in dual-frame survey sampling to minimize variance in population estimates.
An update on the polymath research project about non-transitive dice and its statistical implications for the Wilcoxon/Mann-Whitney test.
Explains statistical methods for handling overlapping sampling frames in surveys, using a monster analogy for mobile and landline phone samples.
Explains a fourth type of statistical weight for dual-frame surveys, addressing overlap to avoid double-counting in population estimates.
Discusses the nuanced role of assumptions in statistics, distinguishing between necessary and sufficient conditions, and their impact on interpreting models like linear regression.
Explains the concept of 'symbolically nested' statistical models, their computational advantages, and their importance in survey analysis.
Explains the importance of factors in R for data analysis, covering when and how to convert strings to factors to avoid errors.
Explains statistical methods for estimating means in small domains or subpopulations, focusing on smoothing direct estimates using models like Fay-Herriot.
Explores the asymptotic behavior of parameter estimates in linear mixed models, focusing on the log-likelihood as a quadratic form in Gaussian variables.
Explains why Rao-Scott statistical tests maintain good size control in survey data analysis, compared to standard chi-squared tests.
Analyzes the accuracy of a leading eigenvalue approximation for quadratic forms in Gaussian variables, comparing it to traditional methods.
Explains why the svylme package uses maximum likelihood instead of REML for survey-weighted linear mixed models, focusing on design and sampling constraints.
Explores sparse correlation structures in statistical models and the conditions under which the Central Limit Theorem holds for dependent data.
Announcing a preprint for the svylme package, introducing the svy2lme function for fitting linear mixed models to complex survey data.
A technical tutorial on fitting linear mixed models using pairwise pseudolikelihood in R with the svylme package, using educational survey data.