2025 highlights: AI research and code
A 2025 AI research review covering tabular machine learning, the societal impacts of AI scale, and open-source data-science tools.
Explores whether predictive statistical models require causal relationships to be useful, using examples from data sampling and real-world scenarios.
The author announces their new role as Probabl's CSO to accelerate development of the scikit-learn machine learning library and its ecosystem.
An infrastructure engineer explores AI Engineering, defining the role and its focus on using pre-trained models, prompt engineering, and practical application building.
A machine learning professor critiques the foundational concept of a 'data-generating distribution' and shares insights from teaching a truly distribution-free course.
Discusses handling class imbalance in predictive modeling, using medical and zebra analogies to explain adjusting for prior probabilities and error costs.
A keynote on trustworthy data visualization, exploring trust in an era of fake results, AI confabulation, and data infrastructure decay.
A tutorial on using the {fs} package in R for easier file path manipulation, extension management, and directory information retrieval.
Announces 9 new free and paid books added to the Big Book of R collection, covering data science, visualization, and package development.
Interview with Dr. Nick Feamster on network measurement, machine learning, and the Internet Equity Initiative's work on broadband access.
Announces 7 new free R programming books added to the Big Book of R collection, covering topics like machine learning, data science, and software engineering.
The Big Book of R adds 10 new books, including Spanish titles and English works on data science, statistics, and fantasy football analytics using R.
A guide for R users to learn the basics of Python, HTML, CSS, JS, and C++ to enhance their data science and web development projects.
Positron is a new data science IDE from Posit that combines features from RStudio and VS Code, offering a specialized environment for R and Python.
Announces a major update to the Big Book of R website, including a migration to Quarto, a new Psychology chapter, and the addition of new R programming books.
Explains the key difference between AI models and algorithms, using linear regression and OLS as examples.
Discusses the spin-off of scikit-learn's open-source development from Inria to a new mission-driven enterprise, Probabl, focusing on sustainable funding and growth.
A guide to useful RStudio shortcuts, settings, and tips to improve productivity and code readability for R programming.
Announces the addition of 6 new R programming books to the Big Book of R collection, covering statistics, machine learning, and data science.
The Big Book of R, a curated collection of free R programming books, celebrates a milestone of over 400 entries and requests community support for hosting costs.