I realize this is a sci-fi classic that I should have read a long time ago, but I only recently finished reading The Hitchhiker’s Guide to the Galaxy. I was surprised that a book first published in 1979 could be so prescient in its commentary on modern computing and machine learning, especially considering how fast the field is advancing. As I was reading, I dog-eared a few pages that I thought were particularly poignant and extremely relevant to my work as a data scientist. In the following sections, I’ve written up my top three takeaways from the book as they relate to my work. I hope you enjoy them and, if nothing else, that they convince you to read the book!
I was recently inspired by the Not So Standard Deviations podcast to start collecting some data about myself and my habits. Specifically, I wanted to track my day-to-day commute to work. I do tend to make fairly good use of my commute time, since I am lucky enough to ride a train and not drive. Even so, commuting still represents a decent chunk of my time.
Imagine you have an amazing pie recipe that you’d like to make for the holidays at your in-laws’. Your recipe is best cooked fresh and calls for some uncommon ingredients that your in-laws likely do not have in their kitchen. How do you go about making the recipe in an environment that is not your own?
Random forests are one of my favorite machine learning methods. I’ve found them to be incredibly powerful for a number of prediction tasks in my work, but I often run into performance issues running them on my local machine. A coworker recommended the R package H2O – an open source, high performance, in-memory machine learning platform. It has been a game changer in terms of my being able to run efficient predictive models locally. In this post, I will walk through the implementation of a random forest using a long-past Kaggle competition, Don’t Get Kicked.
When I first started building my website, I decided to use Wix. It’s a great website builder that has lots of custom options – it is kind of like the PowerPoint of website editing. My one issue with Wix is that I wanted to easily embed some R code into its pages. I was already familiar with R Markdown, so being able to publish R Markdown documents directly would be ideal.
I have been toying with the idea of creating a website for some time now. The two main reasons behind this are to provide a resource that I (or anyone else!) can look back at to recall processes or past work and to showcase some of my projects.