CPU or GPU for your recommendation engine?
Deploy • Olga Petrova • 07/10/20 • 9 min read
In today's data-driven world, GPUs are the hardware of choice for training Deep Learning models. What about tasks that do not involve artificial neural networks? For instance, is there a benefit to using a GPU for making product recommendations? Continue reading to find out!
Anyone selling anything these days makes recommendations. "Customers who bought this item also bought these ones." "Here are the top 10 TV series that we bet you'll enjoy." Sometimes these recommendations are based on the intrinsic properties of the products, but more often they come from the behaviours of users such as yourself.
Let us say we want to build a simple book recommender system. The data that we need for it is available on any website containing users' reviews of books: e.g. this dataset has been collected from BookCrossing.com, a website dedicated to the practice of "releasing books into the wild" - leaving them in public places to be picked up and read by other members of the community. There are three data tables available, but we will only need two of them today: BX-Books and BX-Book-Ratings, containing information on the books and the bookcrossers' book ratings respectively (pardon the excessive use of book in the preceding sentence; finding a suitable synonym is no easy task!). Each book in BX-Books is identified by a unique ISBN, and each row of BX-Book-Ratings lists the ISBN of the title that the user's rating refers to.
Using cosine similarity to make product recommendations
First, let us discuss how the ratings can be leveraged to generate appropriate book recommendations. If you are already familiar with the basics of recommender systems (or simply uninterested in the details), feel free to skip to the next section for the CPU vs. GPU comparison.
For each book, there are multiple ratings posted by different users, and it is this information that we will use to infer how alike the books are. Consider Harry Potter and the Sorcerer's Stone, Harry Potter and the Chamber of Secrets (the first two tomes of the series), and a textbook called Quantum Computation and Quantum Information. To illustrate with a simple example, let us say we have a total of five readers. Four of them read both Harry Potter books and rated them highly, and one of the four has also enjoyed reading about quantum information:
        Reader A   Reader B   Reader C   Reader D   Reader E
HP1         7          8          7          9          8
HP2         8          8          9          6          -
QCQI       10          -          -          -          -
Eric (that's what E stands for) has made a New Year's resolution to read more in 2020 and, as often happens, made no effort to keep it until mid-year. Eric read the first Harry Potter book, and would like to use the book ratings of his friends to decide what to read next. How can he do that?
First, let's replace the - (not read) signs in the table above with zeros:
        Reader A   Reader B   Reader C   Reader D   Reader E
HP1         7          8          7          9          8
HP2         8          8          9          6          0
QCQI       10          0          0          0          0
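A minimal sketch of this zero-filling step in plain Python, assuming the toy ratings are held in a dictionary with `None` marking a book that a reader has not rated. The names `ratings` and `to_vector` are illustrative and do not come from the original post:

```python
# Toy ratings from the table above; None marks "not read".
# Column order is Reader A, B, C, D, E.
ratings = {
    "HP1":  [7, 8, 7, 9, 8],
    "HP2":  [8, 8, 9, 6, None],
    "QCQI": [10, None, None, None, None],
}

def to_vector(scores):
    """Replace missing ratings with 0 so every book becomes a 5-dimensional vector."""
    return [0 if s is None else s for s in scores]

vectors = {book: to_vector(scores) for book, scores in ratings.items()}

for book, vec in vectors.items():
    print(book, vec)
```

Note that this treats "not read" exactly like a rating of zero, which is a simplification that the cosine similarity below relies on.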
Now, each book corresponds to a 5-dimensional vector containing the scores each reader has assigned to it. Eric would like to know which book, Harry Potter 2 or the quantum information textbook, is more similar to Harry Potter 1 in terms of the readers' feedback. Mathematically this means that we are going to consider two pairs of vectors: HP1 and HP2, and HP1 and QCQI. A popular measure of how similar two vectors are is the cosine similarity, given by the dot product of the two vectors divided by the product of their magnitudes:
cos(θ) = (A · B) / (||A|| ||B||)
When two vectors are aligned with one another, the cosine of the (zero) angle between them is 1, meaning the similarity is maximised. The similarity is zero for two vectors that are perpendicular to each other (e.g. if there is no overlap in the users who read the two books), and could also be negative if we allowed for negative ratings in our data table. The cosines for the two pairs in question are calculated as follows:
Pair          Calculation                                                              cos(θ)
HP1 & HP2     (7×8 + 8×8 + 7×9 + 9×6) / [Sqrt(7² + 8² + 7² + 9² + 8²) × Sqrt(8² + 8² + 9² + 6²)]   0.86
HP1 & QCQI    (7×10) / [Sqrt(7² + 8² + 7² + 9² + 8²) × Sqrt(10²)]                        0.40
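The two values can be checked with a few lines of plain Python. This is only a sketch of the formula above applied to the toy vectors; the helper name `cosine_similarity` is chosen here for illustration and is not the sklearn function of the same name:

```python
import math

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Zero-filled rating vectors (Readers A to E) from the table above.
hp1 = [7, 8, 7, 9, 8]
hp2 = [8, 8, 9, 6, 0]
qcqi = [10, 0, 0, 0, 0]

sim_hp2 = cosine_similarity(hp1, hp2)
sim_qcqi = cosine_similarity(hp1, qcqi)

print(f"HP1 & HP2:  {sim_hp2:.2f}")   # 0.86
print(f"HP1 & QCQI: {sim_qcqi:.2f}")  # 0.40
```

This helper divides by the vector norms, so a book with no ratings at all (an all-zero vector) would raise a `ZeroDivisionError` and would need to be filtered out or handled separately.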
Thus, when we make a recommendation on which book to read next based on Eric's interest in Harry Potter and the Sorcerer's Stone, Harry Potter and the Chamber of Secrets is a good bet (much better than Quantum Computation and Quantum Information!).
As you can imagine, the amount of effort to calculate cosine similarities for each pair of vectors grows quite quickly with the number of books as well as the number of users. Let us first see how long this will take on a CPU (10 vCPU cores from a high-end Intel Xeon Gold 6148 processor, to be precise), using the popular pandas and sklearn libraries.
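To get a feel for that growth, here is a rough back-of-the-envelope sketch. It assumes every book is compared with every other book and that each vector is dense, with one entry per user; the catalogue sizes are hypothetical and chosen only for illustration, and the function names are not from the original post:

```python
def pair_count(n_books):
    """Number of distinct unordered book pairs: n * (n - 1) / 2."""
    return n_books * (n_books - 1) // 2

def dense_multiply_adds(n_books, n_users):
    """Rough cost of the dot products alone, assuming dense vectors of length n_users."""
    return pair_count(n_books) * n_users

# Hypothetical (books, users) sizes, starting with the toy example above.
for n_books, n_users in [(3, 5), (1_000, 10_000), (100_000, 100_000)]:
    pairs = pair_count(n_books)
    ops = dense_multiply_adds(n_books, n_users)
    print(f"{n_books:>7} books, {n_users:>7} users: {pairs:>13,} pairs, ~{ops:,} multiply-adds")
```

The number of pairs grows quadratically with the number of books, and the work per pair grows linearly with the number of users. Real rating tables are mostly empty, so sparse representations can do considerably better than this dense estimate.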
The CPU route
Let us start by loading the data using pandas.
    import pandas as pd

    datadir = 'reco/BX-CSV-Dump/'

    # Skip malformed rows. on_bad_lines='skip' needs pandas >= 1.3;
    # older versions used error_bad_lines=False, which was removed in pandas 2.0.
    books = pd.read_csv(datadir + 'BX-Books.csv', sep=';', on_bad_lines='skip', encoding='latin-1')
    books.columns = ['ISBN', 'bookTitle', 'bookAuthor', 'yearOfPublication', 'publisher', 'imageUrlS', 'imageUrlM', 'imageUrlL']

    ratings = pd.read_csv(datadir + 'BX-Book-Ratings.csv', sep=';', on_bad_lines='skip', encoding='latin-1')
    ratings.columns = ['userID', 'ISBN', 'bookRating']

Now we can inspect the contents of the two tables, books and ratings.
Unless your final goal is to be able to say: "Here is the list of books that users with reading history similar to yours may or may not have hated with passion", you will probably want to remove all ratings below a certain threshold:
    # Keep only ratings above 5:
    ratings = ratings[ratings.bookRating > 5]
I will also drop the columns that we will not be needing from the Books table, make sure each ISBN in it corresponds to a single book entry, and set ISBN as the table's index:
    # Columns that the recommender does not need
    columns = ['yearOfPublication', 'publisher', 'imageUrlS', 'imageUrlM', 'imageUrlL']
    books = books.drop(columns, axis=1)
    # Keep a single entry per ISBN, then index the table by ISBN
    books = books.drop_duplicates(subset='ISBN', keep="first")
    books = books.set_index('ISBN', verify_integrity=True)

An additional pre-processing step that we will take is to filter out books that have been rated…