Fastai Collaborative Filtering with R and Reticulate

Author

Notesofdabbler

Published

April 1, 2018

Jeremy Howard and Rachel Thomas are founders of fast.ai whose aim is to make deep learning accessible to all. They offer a course called Practical Deep Learning for Coders (Part 1). The last session, taught by Jeremy, was in Fall 2017 and the videos were released early January 2018. Their approach is top down by showing different applications first as black boxes followed by progressive peeling of the black box to teach the details of how things work. The course uses python and they have developed a python library fastai that is a wrapper around PyTorch.

I wanted to learn reticulate by trying to create a R version of one of the python notebooks from that class. The class covers the topic of collaborative filtering in lecture 5 and lecture 6. The dataset used is a sample of movielens dataset where about ~670 users have rated ~9000 movies. The objective is to develop a model to predict the rating that a user will give for a particular movie.

The Jupyter notebook for this topic is divided into 2 portions:

Since the first half involved mainly python functions from fastai library, it seemed like a good use case for reticulate since we could use reticulate just for model development and use R functions for other pre and post processing tasks. The second half involved model building from scratch. In pyTorch, custom models need to be written as python classes. While it was still possible to use reticulate in this case, this may not be the ideal use case since it might be better for somebody developing custom models to do the whole work in python. But once they wrap it into a python package, it is easier to use from R. Overall, reticulate was great to work with and it made it very easy to translate a python function to an equivalent R function. It is a great addition to the R packages.