I realize this is a sci-fi classic that I should have read a long time ago, but I only recently finished reading The Hitchhiker’s Guide to the Galaxy. I was surprised that a book written in the late 1970s could be so prescient in its commentary on modern computing and machine learning, especially considering how fast the field is advancing. As I was reading, I dog-eared a few pages that I thought were particularly poignant and extremely relevant to my work as a data scientist. In the following sections, I’ve written up my top three takeaways from the book as they relate to my work. I hope you enjoy them and, if nothing else, that they convince you to read the book!
Random forests are one of my favorite machine learning methods. I’ve found them to be incredibly powerful for a number of prediction problems in my work, but I often run into performance issues when running them on my local machine. A coworker recommended the R package H2O – an open source, high performance, in-memory machine learning platform. It has been a game changer in terms of my being able to run efficient predictive models locally. In this post, I will walk through the implementation of a random forest using a long-past Kaggle competition, Don’t Get Kicked.
When I first started building my website, I decided to use Wix. It’s a great website builder that has lots of custom options – it is kind of like the PowerPoint of website editing. My one issue with Wix is that I wanted to easily embed some R code into its pages. I was already familiar with R Markdown, so being able to publish R Markdown documents directly would be ideal.
I have been toying with the idea of creating a website for some time now. The two main reasons are to provide a resource that I (or anyone else!) can refer back to when recalling processes or past work, and to showcase some of my projects.