Causal inference 4: Causal Diagrams, Markov Factorization, Structural Equation Models
Explores the relationship between causal and statistical models, focusing on causal diagrams, Markov factorization, and structural equation models.
Explores the distinction between using regression models for causal inference versus predictive inference, and the role of generalizability in prediction.
A statistical analysis of multicollinearity in regression models, discussing its impact on coefficient interpretation and prediction.
Explains Neyman allocation for optimal stratified sampling and its exact integer solution, linking it to US Electoral College apportionment.
A guide to implementing a simple anomaly detection system using only SQL and basic statistics, aimed at developers.
Explains the three main types of statistical weights (precision, frequency, sampling), their uses, and the software documentation challenges they create.
Overview of new features in version 4.0 of the R survey package, focusing on improved contrast estimation and replicate handling.
Explores the statistical challenges and potential bias when adjusting stratification variables during multi-wave sampling for population estimation.
A data scientist's journey from dogmatic Bayesianism to a pragmatic, 'secular' use of Bayesian tools without requiring belief in the model's literal existence.
A critique of the Oxford-Munich Code of Conduct for Data Scientists, focusing on its technical recommendations on sampling and data retention.
Explains the theory behind linear regression models, a fundamental machine learning algorithm for predicting continuous numerical values.
A technical guide exploring workarounds to update SQL Server statistics on secondary replicas in Availability Groups, including scripts and methods.
A statistical re-analysis of a published study on the mouse microbiome and autism, examining data and p-values from behavioral experiments.
A statistical analysis discussing the limitations of confidence intervals, using examples from small-area sampling to illustrate their weak properties.
A data scientist clarifies common misconceptions about the field, explaining that machine learning is only a small part of the job and advanced degrees aren't always required.
A technical analysis verifying a statistical calculation from an XKCD comic, involving normal distribution probabilities and R code.
A technical analysis of bus punctuality using Auckland Transport API data, with R code for data processing and visualization.
A guide to six statistical methods (frequentist and Bayesian) for comparing group means, with R and Stan code examples.
Announcement for a lecture series on machine learning, covering topics like Weka, deep learning, algorithmic fairness, and sparse supervised learning.
A tutorial on using the infer package in R for hypothesis testing through simulation, following a modern statistical approach.