Short Summary: Various recommender system models.
Content-Based Filtering
- Uses domain expertise and hand-engineered features (e.g., categories assigned to users and items).
- We can then use similarity functions such as cosine similarity for recommendations.
- Pros:
- No data about other users is needed, since recommendations are specific to each user (scales easily).
- Can capture a user's specific interests and recommend less popular, niche items.
- Cons:
- Since the features are hand-engineered, a lot of domain knowledge is required.
- Can only recommend based on existing interests (cannot discover new interests).
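The content-based approach above can be sketched in a few lines of NumPy. The feature space (genres) and the particular vectors are illustrative assumptions, not part of any real dataset; the point is scoring items against a user profile with cosine similarity and ranking by score.

```python
import numpy as np

# Hypothetical hand-engineered item features (columns: action, comedy, drama).
item_features = np.array([
    [1.0, 0.0, 0.0],  # item 0: pure action
    [0.0, 1.0, 0.5],  # item 1: comedy with some drama
    [0.2, 0.0, 1.0],  # item 2: mostly drama
])

# User profile in the same feature space (e.g., average of liked items' features).
user_profile = np.array([0.9, 0.1, 0.0])

def cosine_similarity(a, b):
    # Cosine of the angle between a and b; 1.0 means identical direction.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = np.array([cosine_similarity(user_profile, f) for f in item_features])
ranking = np.argsort(-scores)  # item indices, best match first
```

Here the action-heavy user profile ranks the action item first; note that nothing outside this user's own features is consulted, which is exactly why the method cannot surface interests the user has never expressed.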
Collaborative Filtering
- Uses user embeddings \(U \in \mathbb R^{m \times d}\) and item embeddings \(V \in \mathbb R^{n
\times d}\), the two factors of the feedback matrix.
- During training, compares \(UV^T\) against the observed feedback matrix using a loss function (e.g., squared error over the observed entries).
- Pros:
- No domain knowledge necessary since the embeddings are automatically learned.
- The model can help users discover new interests based on interests of similar users.
- Embeddings \(U\) and \(V\) are static, and candidates can be pre-computed and stored.
- Cons:
- Cannot handle fresh items or new users (the cold-start problem).
- Difficult to include side features since the embeddings are automatically learned.
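A minimal sketch of learning the two factors by gradient descent, assuming a toy random feedback matrix and squared error on observed entries only (the sizes, learning rate, and step count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 4, 5, 2                                  # users, items, embedding dim
A = rng.integers(1, 6, size=(m, n)).astype(float)  # toy observed feedback matrix
mask = rng.random((m, n)) < 0.8                    # which entries were observed

U = 0.1 * rng.standard_normal((m, d))  # user embeddings
V = 0.1 * rng.standard_normal((n, d))  # item embeddings

def observed_mse(U, V):
    E = (U @ V.T - A) * mask           # error on observed entries only
    return (E ** 2).sum() / mask.sum()

initial = observed_mse(U, V)
lr = 0.02
for _ in range(3000):
    E = (U @ V.T - A) * mask
    # Gradient steps on both factors of the squared-error objective.
    U, V = U - lr * E @ V, V - lr * E.T @ U
final = observed_mse(U, V)
```

After training, a user's row of \(U\) dotted with every row of \(V\) gives predicted scores for the whole corpus, which is why candidates can be pre-computed and stored once the factors are fixed.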
Softmax Deep Learning Model
- Uses a deep neural network for predictions.
- The input is the user query.
- The output is a probability vector whose length equals the number of items in the corpus.
- Can use the cross-entropy loss for training.
- Pros:
- Easy to include side features.
- Easy to handle new queries.
- Item embeddings \(V\) are static and can be stored.
- Cons:
- Prone to folding; requires techniques such as negative sampling or gravity.
- Harder to scale to very large corpora (can use hashing, negative sampling, etc).
- The query embedding needs to be computed at query time (more expensive to serve).
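The forward pass of such a model can be sketched with NumPy, assuming a single linear-plus-tanh query tower and random stand-ins for the learned weights (layer sizes and names are illustrative). It shows the two costs mentioned above: the query embedding must be computed at serve time, while the item embeddings \(V\) are fixed.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, d_query, d_emb = 6, 4, 3

V = rng.standard_normal((n_items, d_emb))   # learned item embeddings (static)
W = rng.standard_normal((d_query, d_emb))   # query-tower weights (one layer here)

def softmax(z):
    z = z - z.max()                          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

query = rng.standard_normal(d_query)         # raw user/query features
q_emb = np.tanh(query @ W)                   # query embedding, computed at serve time
probs = softmax(q_emb @ V.T)                 # probability over the whole corpus

label = 2                                    # index of the ground-truth item
loss = -np.log(probs[label])                 # cross-entropy loss for this example
```

Because the logits are dot products \(q \cdot V^T\) over every item, computing the full softmax grows linearly with corpus size, which is the scaling problem that negative sampling and hashing address.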