The most popular blogs of Hacker News in 2025
Analysis of the most popular personal blogs on Hacker News in 2025, based on a tracking project that ranks domains by their performance on the platform.
Explains the key differences between the = and <- assignment operators in the R programming language, focusing on scoping and side effects.
A tutorial for R users on mastering data wrangling in 5 progressive levels, using the dplyr package and the Ames housing dataset.
A comparison of the native base R pipe (|>) and the {magrittr} pipe (%>%), covering their syntax, strictness, and use cases for data analysis.
A technical guide explaining why ggplot2 line charts sometimes appear blank and how to fix the issue, focusing on data structure and grouping.
A tutorial on the six most fundamental R functions for data cleaning, using the tidyverse and palmerpenguins dataset.
Argues against using LLMs to generate SQL queries for novel business questions, highlighting the importance of human analysts for precision.
Introduces BigQuery's new GROUP BY ALL syntax, which automatically infers grouping keys from SELECT items to simplify complex SQL queries.
Analysis of using numerical inputs vs. brackets for survey questions like age and income, focusing on UX and data analysis trade-offs.
A quick-start guide to using the R programming language for data analysis, covering installation, data exploration, and basic plotting with the iris dataset.
Explores a future AI-assisted computer interface model inspired by sci-fi, where AI highlights data anomalies for human specialist review.
The State of CSS 2022 survey is now open, gathering developer feedback on new CSS features, pain points, and usage patterns.
An analysis of futurist prediction methods, comparing accurate forecasters with those who have poor track records.
Analyzes whether npm package popularity correlates with quality, using data from npms.io, and finds that popularity can be an indicator but not a guarantee.
An overview of the Pandas library for data analysis, covering data reading, filtering, merging, and visualization.
An analysis of user-created Sankey diagrams from Reddit, visualizing personal Tinder match data and dating outcomes.
An experiment testing whether Overwatch players with feminine usernames receive different in-game chat comments than those with masculine usernames.
A guide on using the ELK Stack (Elasticsearch, Logstash, Kibana) to analyze and triage large-scale Nmap scan results for penetration testing and offensive security.
Presentation slides for a Power BI tips and tricks talk at DataBISummit, available for download.
Analyzes decision-making quality in sports and board games, where clear data reveals the high cost of poor choices.