Modern Recommender Systems - Part 2: Data

Pavel Kordik

Mar 07, 2024

Data used by modern recommenders and how we can measure progress towards goals.

Modern Recommender Systems

1. Introduction
2. Data
3. Coming Soon

Data Is Crucial

Data plays an essential role in the functioning of a recommender system, as it is the primary source of information used to generate accurate and personalized recommendations. In this blog post, we discuss why data matters for recommender systems, the types of data sources used, and how data can improve the accuracy and effectiveness of recommendations. The data that can be used for recommendations falls into three categories: 1) the item catalog, 2) the user catalog, and 3) the history of user-item interactions.

Item Catalog

First of all, it is good to know what we can recommend to users. A database of all items is called an item catalog. In this catalog, we store not only items that can be recommended (active items), but also historical items that were recommended in the past and are no longer available to users. Those historical items are important when measuring the similarity of users who interacted with them in the past.
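
The distinction between active and historical items can be illustrated with a minimal sketch. It assumes a hypothetical in-memory catalog; the names `Item`, `active`, `candidates`, and `jaccard`, as well as the choice of Jaccard overlap as the similarity measure, are illustrative assumptions and not taken from any particular recommender.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    title: str
    active: bool  # False for historical items that are no longer available


# Hypothetical catalog: one active item and two historical ones.
catalog = {
    "a": Item("a", "Red sofa", active=True),
    "b": Item("b", "Blue chair", active=False),
    "c": Item("c", "Oak table", active=False),
}

# Hypothetical interaction history: user id -> ids of items the user interacted with.
history = {
    "u1": {"a", "b", "c"},
    "u2": {"b", "c"},
    "u3": {"a"},
}


def candidates(catalog):
    """Only active items may be recommended."""
    return {item_id for item_id, item in catalog.items() if item.active}


def jaccard(u, v, history):
    """User similarity over ALL interacted items, historical ones included."""
    a, b = history[u], history[v]
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

In this toy data, users `u1` and `u2` overlap only on historical items, yet they still come out as similar. Dropping inactive items from the catalog would erase that signal, even though only item `a` remains a valid recommendation candidate.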

Attributes of items help recommenders understand how items are related and which are more similar than others. Here are a few examples of the most important item attributes (or item properties).

Categories - Items can be categorized into distinct groups; however, you might also come up with a hierarchical system of categories where one item can belong to multiple categories. Categories can be used to create item segments so you can recommend particular categories to a given user. You can also filter items out of recommendations based on their category labels, or boost the probability that items from a particular set of categories are recommended to a user.
Text descriptions - When you recommend articles, the text of the article can be used as the text description attribute of the item. Modern recommenders can process text using advanced neural networks. Similarities between neural text embeddings of items can be very important, especially when recommending cold-start items that do not have many interactions yet.
Images - Modern recommenders can use multiple images of an item to create a neural image embedding of the item. Again, such information is very important for recommendation systems, especially when images play a significant role for users (e.g., an online art gallery) or when interactions and text descriptions are missing. Imagine an online marketplace where users can upload images of items for sale. As they use their smartphones, they are unlikely to also add rich and informative text descriptions. Another example would be a real-estate portal, where users like to find similar listings based on images of properties, or a fashion e-commerce site that decides to use visual similarity to recommend alternatives from the product catalog.
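
To make the attributes above more concrete, here is a small sketch of how category filtering, category boosting, and embedding similarity might be combined. It is an illustration under stated assumptions, not how any specific engine works: the function names, the multiplicative `boost` factor, and the two-dimensional toy embeddings are all hypothetical. In practice, the embeddings would come from a text or image neural network.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_items(query_id, items, excluded=frozenset(), boosted=frozenset(), boost=1.2):
    """Rank items by embedding similarity to the query item.

    items: item id -> {"categories": set of str, "embedding": list of float}
    excluded: items in any of these categories are filtered out
    boosted: items in any of these categories get their score multiplied by `boost`
             (an assumed scheme that only makes sense for non-negative scores)
    """
    query = items[query_id]["embedding"]
    scored = []
    for item_id, item in items.items():
        if item_id == query_id or item["categories"] & excluded:
            continue
        score = cosine(query, item["embedding"])
        if item["categories"] & boosted:
            score *= boost
        scored.append((item_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy catalog with made-up embeddings.
items = {
    "dress_1": {"categories": {"fashion", "dresses"}, "embedding": [1.0, 0.0]},
    "dress_2": {"categories": {"fashion", "dresses"}, "embedding": [0.8, 0.6]},
    "coat_1": {"categories": {"fashion", "coats"}, "embedding": [0.6, 0.8]},
    "lamp_1": {"categories": {"home"}, "embedding": [0.0, 1.0]},
}
ranking = similar_items("dress_1", items, excluded={"home"}, boosted={"coats"}, boost=1.5)
```

Because the ranking depends only on item attributes, it also works for a cold-start item that has no interactions yet. Without the boost, `dress_2` would rank above `coat_1`; with it, the order flips, and `lamp_1` never appears because its category is excluded.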

User Catalog

Like the item catalog, the user catalog holds attributes and properties of users. The most important user attributes are the following:

Location of user - The geographic location of users is important in recommendation scenarios where users are interested in items located nearby (such as real estate, job, or event recommendations). Even users with no interaction history can then get relevant recommendations, such as popular items in their region.
User search history - One can suggest relevant items based on historical user search queries. Also, user search history is instrumental for personalized query suggestions, where reminding users about their past similar queries is very helpful.
User bio, interests or skills - In some domains, it is important to take into consideration not just user interactions with items, but also additional background information that can reveal user interests and help to select relevant items. Again, this is particularly important in cold start scenarios where we need to recommend to users without historical interactions.
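
As an illustration of the location attribute, the following sketch returns the most popular items among users in the same region, which could serve as a fallback for a user with no interaction history. The function name `popular_in_region`, the `region` field, and the sample data are hypothetical, and counting raw interactions is only one possible definition of popularity.

```python
from collections import Counter


def popular_in_region(region, users, interactions, k=3):
    """Return up to k item ids with the most interactions from users in `region`.

    users: user id -> {"region": str}
    interactions: list of (user id, item id) pairs
    """
    counts = Counter(
        item_id
        for user_id, item_id in interactions
        if users.get(user_id, {}).get("region") == region
    )
    return [item_id for item_id, _ in counts.most_common(k)]


# Hypothetical user catalog and interaction log for a real-estate portal.
users = {
    "u1": {"region": "Prague"},
    "u2": {"region": "Prague"},
    "u3": {"region": "Brno"},
    "new_user": {"region": "Prague"},  # no interactions yet
}
interactions = [
    ("u1", "flat_1"),
    ("u2", "flat_1"),
    ("u2", "flat_2"),
    ("u3", "flat_3"),
]

# A cold-start user is served what is popular in their own region.
fallback = popular_in_region(users["new_user"]["region"], users, interactions, k=2)
```

The new user has no history, so nothing personalized can be computed, but the region stored in the user catalog is enough to pick listings that nearby users interacted with most.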

Conclusion

Data stands at the core of modern recommender systems, fueling the generation of personalized and precise recommendations. The effectiveness of these systems hinges on their ability to leverage diverse data sources, including item catalogs, user catalogs, and user-item interactions. By understanding the attributes of both items and users, along with their interaction history, recommender systems can navigate the complexities of personalization, privacy, and changing user preferences to provide relevant recommendations. However, challenges such as the difficulty of user identification and the dynamic nature of user attributes call for advanced strategies to keep the data useful for improving the user experience. Furthermore, accurate collection and interpretation of user feedback are essential for refining recommendation algorithms and enhancing user satisfaction.

Here are the main takeaways from the article:

Data Categorization: Recommender systems rely on item catalogs, user catalogs, and the history of user-item interactions to generate recommendations.

Item Catalog Importance: Attributes stored in the item catalog, like categories, text descriptions, and images, help in understanding item relationships and preferences for better recommendations.

User Catalog Challenges: Data privacy, user identification difficulties, and the need for up-to-date user attributes present significant challenges in maintaining useful and current user profiles.

User to Item Interactions: The most crucial data source for recommender systems, enabling the creation of personalized recommendations based on user behavior.

Feedback Collection Challenges: Issues such as caching recommendations, biased user interactions, and lack of explicit feedback pose challenges to the effectiveness of recommender systems.

Privacy and Personalization Balance: Modern platforms must navigate the delicate balance between providing personalized experiences and respecting user privacy.

Advanced Data Quality Strategies: Employing advanced techniques to address biases, improve data quality, and adapt to user behavior changes is essential for the continued relevance and effectiveness of recommender systems.

