Entry Point Data
A Python tutorial covering essential tools and techniques for machine learning, including data visualization, PCA, LDA, and classification.
A tutorial on using Python tools for machine learning, covering data loading, visualization, preprocessing, and classification with scikit-learn.
A practical guide to implementing Bayesian analysis in Python using MCMC packages like emcee, PyMC, and PyStan, with a line-fitting example.
A data scientist analyzes Seattle's bicycle counter data using Python to determine if cycling is truly increasing or just affected by good weather.
An article critiquing a misleading report that claims there is no gender pay gap in tech, using evidence from the AAUW study to refute it.
A technical guide on using SQL window functions, specifically LAG, to calculate month-over-month revenue growth percentages for SaaS or recurring billing analysis.
An explanation of Microsoft Azure HDInsight, a managed Apache Hadoop service for processing big data on Azure.
A tutorial on handling dates and times in R, covering essential classes like Date and POSIXct, formatting, calculations, and sequences.
RSiteCatalyst v1.3 adds regex search, Realtime API support, and configurable request timing for the Adobe Analytics R package.
The final tutorial on analyzing airline data with Hadoop, using Hive for SQL queries and Pig for scripting, and covering setup and basic analytics.
A developer's side project to analyze PyPI download logs, extracting insights about Python versions, installers, and operating systems used by package consumers.
A developer shares their journey learning Python, including recommended courses, books, and IDEs, and their decision to take a university course.
Exploring function pointers in IDL (Interactive Data Language) for refactoring legacy scientific code, with insights into the language's syntax and quirks.
Authors respond to critique of their computational linguistics paper on analyzing movie characters, discussing interdisciplinary research methods.
A comprehensive, curated list of Python programming resources for all skill levels, covering tutorials, libraries, frameworks, and best practices.
RSiteCatalyst 1.1 released with new API features, faster calls, and extended timeout for Adobe Analytics data in R.
Explains how Bayesian A/B testing improves online headline optimization, overcoming challenges of traditional frequentist methods for faster, more accurate results.
A critique of common pitfalls and unproductive patterns in statistics research presentations, aimed at improving academic discourse.
Explores the concept of 'error' in regression models, clarifying when it represents measurement error versus model prediction error.
A summary of upcoming technical talks on statistical computing, rare DNA variant analysis, and handling large datasets with R and SQL.