About recommendation systems (1/3)

septembre, 9, 2009
Sylvain

This post is highly inspired from the introduction of a research paper, written by Sébastien Hémon, Thomas Largillier and myself, entitled Partial ranking of products for recommendation systems.

Recommending products to customers is a field far older than computer science.  With the tremendous increase of the commercial potential of the e-commerce it has become of the utmost importance to perform well in this field.  Moreover, companies have now the ability to store information about consumption habits of customer: their previous purchases, their tastes. They also have access to information that often allows to cluster customers into communities that share some interests.

Recommendation systems are aimed to select products, for a particular user, from a list shared by all (available products for instance) according to known previous preferences of users. Roughly, there are two classical ways to recommend products: one consists in learning the preferences of a particular user based on the products he liked before and recommending products similar to these ones, while the other approach is to recommend products other users that seem to have the same preferences have liked.  The first approach is clearly based on the product itself and is generally called the content based approach in the literature while the second depends on the user and not on the products and is mentioned as the collaborative filtering approach.  Beneath the notion of collaborative filtering the following assumption is made: users who agreed in the past tend to agree again in the future. The recommendation is done for each user but one need, in order to do it, to collect information about the preferences of many users.

With Sébastien and Thomas, We proposed a recommendation system which follows the collaborative filtering paradigm. This algorithm allows to recommend to a particular user a product that will be in the set of its favorite products with high probability.Our definition of a good recommendation is thus recommending products that are in the $latex x$ favorites products (where $latex x$ is a fixed integer). Such recommendations are done according to a categorization of users into equivalence classes with respect to the relation « having the same preferred products ».  Our algorithm is inventive because it does not need to know the exact rating of each product by each user in order to make a good recommendation with high probability. It only needs a partial list of users’ taste like the ranking of the products w.r.t. an utility function rather than the utility values of some products.  It means that we only need the users to say « I prefer X to Y » rather than « X is worth 10, Y is worth 8 ».  This is the main difference with approaches that use a rating for each product and whose goal is the reconstruction of the utility function of a user, we also think that this is more realistic. This allows us to consider that the number of products may be very high. Moreover, our recommendation system uses information about the preferences of only a very small subset of users (called a committee) on a very small set of products called the witness products set.  For instance, in practical experiments, we use subsets of both users and products of size between 10 and 20.  Our method also uses partial information about the preferences of all users on this witness products set.

Although this method is new, it follows the basic definitions given by:
Petros Drineas, Iornadis Kerenidis, and Prabhakar Raghavan. Competitive recommendation systems. Annual ACM Symposium on Theory of Computing, Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, pages 82-90, 2002.

In a next post I will give the principle of our method and the hypothesis we made to make sure that the method is working in a real life framework.

Picture: courtesy of Abby Blank