Thomas Lumley 3/5/2018

Faster generalised linear models in largeish data

This technical article describes a way to speed up fitting generalised linear models (GLMs) on large datasets. It takes a starting estimator from a subsample, then applies one Newton-Raphson iteration on the full data, which can be computed with a single database query, to obtain an asymptotically efficient estimate. The approach aims to be faster than iterative methods such as `bigglm` in R, especially when the data reside in a database, and the article includes a practical example: a logistic regression on a vehicle dataset.
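
The one-step idea can be sketched in a few lines. The block below is an illustration only, not the article's code: it uses Python with NumPy rather than R, simulated data rather than the vehicle dataset, and in-memory arrays standing in for a database table. The function names (`score_and_information`, `fit_logistic`, `one_step_estimate`) and the subsample size are assumptions made for this example. The point it shows is that the full-data Newton-Raphson step needs only two sums over the rows, the score vector and the information matrix, which is the kind of aggregate a single query could return.

```python
import numpy as np

def score_and_information(X, y, beta):
    """Logistic-regression score vector and Fisher information at beta.

    Both are plain sums over rows, so in a database setting they could
    be produced by one aggregate query (an assumption of this sketch).
    """
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))          # fitted probabilities
    score = X.T @ (y - mu)                          # sum of x_i * (y_i - mu_i)
    info = (X * (mu * (1.0 - mu))[:, None]).T @ X   # sum of w_i * x_i x_i^T
    return score, info

def fit_logistic(X, y, n_iter=25, tol=1e-10):
    """Newton-Raphson iterated to convergence; used on small data only."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        score, info = score_and_information(X, y, beta)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def one_step_estimate(X, y, subsample_size, rng):
    """Starting estimate from a random subsample, then one full-data step."""
    idx = rng.choice(X.shape[0], size=subsample_size, replace=False)
    beta_start = fit_logistic(X[idx], y[idx])
    score, info = score_and_information(X, y, beta_start)  # one pass over all rows
    return beta_start + np.linalg.solve(info, score)

# Simulated data (hypothetical coefficients, chosen only for illustration).
rng = np.random.default_rng(0)
n = 200_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -0.7])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_one_step = one_step_estimate(X, y, subsample_size=5_000, rng=rng)
beta_full = fit_logistic(X, y)  # fully iterated fit, for comparison
```

Comparing `beta_one_step` with `beta_full` shows how close a single full-data step gets to the fully iterated fit. The fully iterated fit makes several passes over all rows, while the one-step version makes exactly one after the cheap subsample fit.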
