In this day and age, we have so many choices when it comes to making online purchases, watching a TV show or movie using a video streaming service, or finding songs to listen to online. Given the plethora of choices, companies need some way of curating content for customers, since browsing through millions of products on a website is just not feasible.

Enter recommender systems. If you have ever shopped on Amazon, or watched a TV show on Netflix, you have encountered recommender systems at work. Not only do recommender systems provide a way to quickly show customers what they are looking for, but they also help customers discover new products or media that they might like.

A well-built recommender system can lead to an excellent customer experience, which is why understanding how these systems work is highly important from a data science perspective. Recommender systems were simple in their early days, and have since evolved into more complex models.

Overview of Recommender Systems

The common idea in different recommender systems is that there needs to be some measure of similarity.

The two most popular approaches to making recommendations are:

Item-based: Finding similarity between items, and recommending an item based on interest in a similar item. For example, if I watch an episode of Marvel’s Agents of Shield on Netflix, I would see other Marvel-related content in my TV recommendations. When the similarity is computed from the items’ own attributes, this is also called “Content-based filtering”. A benefit of using this approach is that you do not have to rely on a lot of user data, since the similarity calculation happens at the item level.
Customer-based: Finding similarity between customers, and recommending an item that a similar customer has purchased or watched. This is also called “Collaborative filtering”. A benefit of this approach is that it works better for discovery, since items that initially look unrelated might be liked by similar customers. Again taking the Netflix example, if I watch an episode of Marvel’s Agents of Shield, and other people who have watched Marvel’s Agents of Shield also watch The Office, then I would get a recommendation for The Office, even though the two shows seemingly have little in common.
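
To make the two approaches concrete, here is a minimal sketch in plain Python. The attribute flags, the viewing histories, and the function names (`similar_items`, `recommend_from_similar_customers`) are all made up for illustration, and cosine similarity is just one common choice of similarity measure, not necessarily what any particular service uses.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical item attributes: [is_marvel, is_superhero, is_sitcom]
item_features = {
    "Agents of Shield": [1, 1, 0],
    "Daredevil": [1, 1, 0],
    "The Office": [0, 0, 1],
}

# Hypothetical viewing histories: 1 = watched, in the order given by `titles`
titles = ["Agents of Shield", "Daredevil", "The Office"]
watched = {
    "me": [1, 0, 0],
    "alex": [1, 0, 1],
    "sam": [0, 1, 0],
}


def similar_items(title):
    """Item-based: rank the other items by similarity of their attributes."""
    target = item_features[title]
    scores = [
        (other, cosine(target, features))
        for other, features in item_features.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def recommend_from_similar_customers(user):
    """Customer-based: score unseen items by how similar their viewers are to `user`."""
    scores = {}
    for other, history in watched.items():
        if other == user:
            continue
        weight = cosine(watched[user], history)
        for title, seen_by_other, seen_by_user in zip(titles, history, watched[user]):
            if seen_by_other and not seen_by_user:
                scores[title] = scores.get(title, 0.0) + weight
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

With this toy data, `similar_items("Agents of Shield")` ranks Daredevil first because the attributes match, while `recommend_from_similar_customers("me")` puts The Office first because a customer with an overlapping history watched it.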

Recommender systems can also combine these two methods into a hybrid approach that benefits from the strengths of both item-based and customer-based methods.

Current State

Early recommender systems relied on finding explicit similarities between people and products. A more effective method is to look at the similarity of latent attributes, which is done using matrix factorization. To oversimplify, everything known about an item or a customer is combined into a small set of hidden factors, which reveals relationships that were not visible before.

Math has a way of looking like magic at times.

A simple example of this is using matrix factorization to determine movie genres, without actually inputting what the genres are. While this seems pointless since all movies already have a predefined genre, the technique can surface new genres that fit the viewer base in a way that results in better recommendations.

In its standard form, the algorithm works from a table of customer interactions such as ratings, views, or purchases; extended versions can also take in attributes such as title, names of actors in the movie, name of director, movie run-time, and many others. Either way, it can output a new “genre” such as “appealing to the 25-35 age range”. It is worth noting that the algorithm will not give a name to a new genre, but will still incorporate it into the recommendation.
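
As a rough illustration of the idea, here is a small matrix factorization sketch in plain Python, trained with stochastic gradient descent. The ratings table, the hyperparameters, and the function names (`factorize`, `predict`) are assumptions I made up for the example; production systems typically use a dedicated library and far more data.

```python
import random

# Hypothetical (customer, movie, rating) triples on a 1-5 scale.
# Customers 0-1 favor movies 0-1; customers 2-3 favor movies 2-3.
# Customer 0 has not rated movies 1 and 3.
ratings = [
    (0, 0, 5), (0, 2, 1),
    (1, 0, 5), (1, 1, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 1), (3, 1, 1), (3, 2, 5), (3, 3, 5),
]


def factorize(ratings, n_customers, n_movies, n_factors=2,
              epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn one small vector of latent factors per customer and per movie."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_customers)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_movies)]
    for _ in range(epochs):
        for customer, movie, rating in ratings:
            predicted = sum(p * q for p, q in zip(P[customer], Q[movie]))
            error = rating - predicted
            for k in range(n_factors):
                p, q = P[customer][k], Q[movie][k]
                # Gradient step on squared error with L2 regularization
                P[customer][k] += lr * (error * q - reg * p)
                Q[movie][k] += lr * (error * p - reg * q)
    return P, Q


def predict(P, Q, customer, movie):
    """Predicted rating is the dot product of the two latent vectors."""
    return sum(p * q for p, q in zip(P[customer], Q[movie]))


customer_factors, movie_factors = factorize(ratings, n_customers=4, n_movies=4)
score_movie_1 = predict(customer_factors, movie_factors, 0, 1)
score_movie_3 = predict(customer_factors, movie_factors, 0, 3)
```

Nothing in the input says which movies belong together, yet customer 0's predicted score for movie 1 should come out higher than for movie 3, because the learned factors pick up the two taste groups on their own. The factors themselves are just unnamed numbers.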

Future State — Deep Learning

Neural Networks and Deep Learning have been all the rage over the last couple of years in many different fields, and it appears that they are also helpful for solving recommender system problems.

Ben Allison, a Principal Machine Learning Scientist at Amazon, gave a great talk earlier this year at Amazon’s re:MARS conference about building recommender systems using Recurrent Neural Networks and Deep Learning.

One benefit of Deep Learning is shared with matrix factorization: the ability to derive latent attributes. Deep Learning can also make up for some of the weaknesses of matrix factorization, such as the inability to include time in the model, which standard matrix factorization isn’t designed for. Recurrent Neural Networks, in particular, are specifically designed for time and sequence data.
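
To show what "designed for sequence data" means in practice, here is a bare-bones recurrent step in plain Python. This is only a sketch under loud assumptions: the weights are random and untrained, the names (`make_rnn`, `encode_history`) are hypothetical, and it is not the architecture from the talk. A real model would learn the weights and add an output layer that scores candidate items.

```python
import math
import random


def make_rnn(n_items, hidden_size, seed=0):
    """Create random (untrained) weights for a simple recurrent network."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": rand_matrix(hidden_size, n_items),       # input -> hidden
        "W_rec": rand_matrix(hidden_size, hidden_size),  # hidden -> hidden
    }


def encode_history(rnn, item_ids):
    """Fold a sequence of item ids into one hidden state, one step per item."""
    hidden_size = len(rnn["W_rec"])
    h = [0.0] * hidden_size
    for item in item_ids:
        # A one-hot input simply selects column `item` of W_in
        h = [
            math.tanh(
                rnn["W_in"][j][item]
                + sum(rnn["W_rec"][j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
    return h


rnn = make_rnn(n_items=3, hidden_size=4)
state_a = encode_history(rnn, [0, 1, 2])  # watched items 0, 1, 2 in that order
state_b = encode_history(rnn, [2, 1, 0])  # same items, reverse order
```

The two histories contain the same items, but the hidden states differ because each step depends on the state left by the previous one. That order sensitivity is what lets this family of models pick up on timing.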

Incorporating time into a recommender system is important because preferences often show seasonal effects. For example, in December more people are likely to be watching holiday-themed movies and buying home decorations.

Another point that Ben Allison brought up is the need to see what would happen if a customer were shown a sub-optimal recommendation. This is a reinforcement learning approach: the system shows a customer a recommendation and then records what the customer does. At times, a customer can be recommended something that does not seem like the best option, just to see how the customer reacts, which improves learning in the long term.
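
One simple way to sketch this explore-and-record loop is an epsilon-greedy strategy, shown below in plain Python. I am swapping in epsilon-greedy as a stand-in here; it is a basic textbook technique, not necessarily what was described in the talk. The item names, click rates, and function names are all invented for the example.

```python
import random


def choose_recommendation(estimated_scores, epsilon, rng):
    """With probability epsilon explore a random item; otherwise show the best estimate."""
    if rng.random() < epsilon:
        return rng.choice(list(estimated_scores))
    return max(estimated_scores, key=estimated_scores.get)


def record_feedback(estimated_scores, counts, item, reward):
    """Update the running average reward for the item that was shown."""
    counts[item] += 1
    estimated_scores[item] += (reward - estimated_scores[item]) / counts[item]


def simulate(true_click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate customers clicking (reward 1) or ignoring (reward 0) recommendations."""
    rng = random.Random(seed)
    estimated_scores = {item: 0.0 for item in true_click_rates}
    counts = {item: 0 for item in true_click_rates}
    for _ in range(rounds):
        item = choose_recommendation(estimated_scores, epsilon, rng)
        clicked = 1.0 if rng.random() < true_click_rates[item] else 0.0
        record_feedback(estimated_scores, counts, item, clicked)
    return estimated_scores, counts


# Hypothetical click rates, unknown to the recommender
estimates, shown = simulate({"holiday movie": 0.7, "sitcom": 0.3, "documentary": 0.1})
```

Setting `epsilon` to 0 would mean always showing whatever currently looks best, so items that were never tried would never get a chance. A small `epsilon` spends a fraction of recommendations on apparently worse options in exchange for better estimates over time.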

Recommender systems can be a very powerful tool in a company’s arsenal, and future developments are going to increase business value even further. Some of the applications include being able to anticipate seasonal purchases based on recommendations, determine important purchases, and give better recommendations to customers which can increase retention and brand loyalty.

Most businesses will have some use for recommender systems, and I encourage everyone to learn more about this fascinating area.