R workflow fun
A roundup of blog posts and resources discussing various data analysis workflows and tools in the R programming language.
A critique of data visualization choices in a KCSE exam analysis, comparing heat maps to line graphs for clarity.
A technical guide on using the rgoodreads R package to analyze personal Goodreads reading data and critique the 5-star rating system.
A guide to the Lomb-Scargle periodogram, explaining its use, common misconceptions, and practical considerations for analyzing astronomical data.
Explores implementing group-by operations from scratch in Python, comparing performance of Pandas, NumPy, and SciPy for data aggregation.
A technical analysis using sentiment analysis on Warren Buffett's shareholder letters from 1977-2016 to identify trends in tone and market influence.
A quick PowerShell script to count the frequency of first letters in a list of surnames from a text file.
A video series on transitioning from interactive Jupyter data exploration to reproducible, packaged, and tested code for data analysis.
Explores SQL-on-Hadoop engines like Apache Drill for analyzing ETL data processed with Spark on Amazon EMR, focusing on performance and flexibility.
A technical guide on analyzing personal Google Location History data using Python, Pandas, and visualization libraries to map and gain insights from location data.
A tutorial on using Apache Drill to query and analyze JSON files with SQL, using blog analytics as a practical example.
An analysis of the relationship between age and desired job roles among new coders, using the 2016 Kaggle survey data.
A technical guide on using Google BigQuery to analyze GitHub pull request data, including SQL queries for repository statistics.
The author reflects on R's rise in programming language rankings and its unexpected adoption across diverse fields over 20 years.
A curated list of insightful programming blogs covering topics like JVM internals, performance, ML, engineering culture, and computer architecture.
Release notes for RSiteCatalyst 1.4.8, featuring segment stacking, date range parameters, and bug fixes for the Adobe Analytics R package.
A technical guide demonstrating how to call the RSiteCatalyst R package from Python using the rpy2 library for data analysis.
Release notes for RSiteCatalyst versions 1.4.6 and 1.4.7, detailing bug fixes and new features for the Adobe Analytics R package.
A data analysis of a radio station's song rotation patterns using vector math and statistical methods to test anecdotal claims about repetitive playtimes.
An analysis of a classic probability problem involving dice rolls, covering its historical context with Newton and Pepys and the mathematical intuition behind it.