This post is about a case study that made me fall in love with data analytics. Moneyball, a 2003 book by Michael Lewis that was adapted into a 2011 movie starring Brad Pitt, describes how sports analytics changed baseball through the story of the Oakland A’s, a team based near San Francisco, California. In this post, I will try to recreate the numeric figures mentioned in the book, following the treatment in MIT’s MOOC ‘The Analytics Edge’ on edX.
A while back, I completed Udacity’s Data Analyst Nanodegree. As part of the coursework, I worked on a project in Exploratory Data Analysis (EDA), the numerical and graphical examination of a dataset’s characteristics and relationships before applying more formal, rigorous statistical analysis. The project explored a dataset on red wine quality (using R and ggplot2) in terms of the wines’ physicochemical properties. The objective was to identify the physicochemical properties that distinguish good-quality wines from lower-quality ones. I was very satisfied when I completed the work, and having already uploaded the source code to GitHub, I decided to write about the thought process behind the study. There is a chance that you were looking for a qualitative explanation of the subject and accidentally ended up on this post. In that case, I suggest you read this article.