Towards Data Science - Author Spotlight with Eugene Yan
An interview with data scientist Eugene Yan discussing his career path from psychology to Amazon, favorite ML projects, and advice for aspiring data scientists.
Introducing mltrace, an open-source lineage and tracing tool for debugging and maintaining production machine learning pipelines.
A high-level guide to Explainable AI (XAI): tools and methods for understanding AI/ML models and their predictions.
Explores the strategic 'metagame' of applying machine learning in industry, focusing on problem selection and business impact over pure technical knowledge.
Explores the distinction between using regression models for causal inference versus predictive inference, and the role of generalizability in prediction.
Explores how mutual information and KL divergence can be used to derive information-theoretic generalization bounds for Stochastic Gradient Descent (SGD).
An interview with AI researcher Joelle Pineau discussing her work in reinforcement learning, its applications, and advice for newcomers to the field.
A guest post sharing personal stories of imposter syndrome in tech and academia, with lessons on recognizing and managing self-doubt.
A developer builds a Chrome extension using TensorFlow.js to toggle dark/light mode on Netlify by clapping hands.
Explains the theory behind Linear Regression, a fundamental machine learning model for predicting continuous numerical values.
A technical guide on computing distance matrices using NumPy, focusing on Euclidean distance and its application in machine learning algorithms like k-Nearest Neighbors.
A data science leader shares insights from a fireside chat on building and running data teams, focusing on their role as profit centers and collaboration strategies.
A podcast episode exploring life lessons derived from machine learning concepts like data cleaning, explore-exploit, and overfitting.
Explores how deep learning and a pre-trained geolocation model can be used to automate and improve play in GeoGuessr, the geographic discovery game.
A guide on writing effective design documents for machine learning systems, covering structure, purpose, and a two-stage review process.
A technical guide on building an indoor location prediction system using WiFi signal data and a Random Forest classifier in JavaScript.
Explores the concept of feature stores in machine learning, presenting a hierarchy of needs from basic access to full automation.
A curated list of public dataset repositories for machine learning and deep learning projects, including sources for computer vision, NLP, and more.
A behind-the-scenes look at designing and implementing a production machine learning system for a major hospital group, covering architecture and validation.