Comments:

# In short

Short article about cross-domain recommendation (i.e. using a recommendation profile built on one type of product to recommend products of another type), and in particular the gain it brings in cold-start situations. The article is short and easy to read; the experimental results do not allow to identify a clear trend, in particular concerning the diversity-related results.

# Summary

### Introduction

##### usual solutions to cold-start problems:
* ask users directly
* exploit additional information, for example by combining collaborative filtering with content-based recommendation

##### literature on the second option (cross-domain RS)
* ref 15 (Winoto, Tang): conjecture that CD-RS may degrade accuracy but improve diversity (they test this assumption in the paper)
* ref 11 (Sahebi, Brusilovsky): the quality of recommendations improves when the domains are semantically connected

##### article assumption
profile size (quantity of information) and diversity in the source domain have an impact on accuracy in the target domain with CD recommendation

##### 3 research questions
* does CD recommendation improve accuracy for cold-start users?
* is CD recommendation really useful to improve diversity?
* what is the impact of the size and diversity of the user profile in the source domain on the recommendation in the target domain?

### Experimental setting

##### Datasets:
* Facebook likes on music and movies, metadata from DBpedia
* likes only, so the dataset contains positive feedback only
* typical data entry: id, name, category, timestamp
* disambiguation with DBpedia (technically challenging)

##### Recommendation algorithms evaluated
* popularity-based (pop)
* user-based CF using nearest neighbors with Jaccard similarity (unn)
* item-based CF using nearest neighbors with Jaccard similarity (inn)
* item-based CF using matrix factorization (imf)
* hybrid HeteRec (see ref 16)
* hybrid PathRank (see ref 8)

##### Evaluation methodology
* user-based 5-fold cross-validation strategy (see ref 7)
* elaborate preprocessing (only users with > 16 likes, ...)
* after preprocessing: music has 50K users, 5,800 artists, 2M likes; movies has 27K users, 4,000 movies, 875,000 likes
* quality estimators: Mean Reciprocal Rank for accuracy; intra-list diversity and binomial diversity (see ref 14, Vargas et al.) for diversity; also catalogue coverage
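To make the two main quality estimators concrete, here is a minimal sketch of MRR and intra-list diversity; the names and signatures are mine, not the paper's, and `dist` stands for any item-item distance (e.g. 1 minus Jaccard similarity on like sets).

```python
import numpy as np

def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """ranked_lists: one recommendation list per user;
    relevant_sets: one set of held-out liked items per user."""
    rr = []
    for ranking, relevant in zip(ranked_lists, relevant_sets):
        # rank of the first relevant item, or None if absent
        rank = next((pos + 1 for pos, item in enumerate(ranking)
                     if item in relevant), None)
        rr.append(1.0 / rank if rank else 0.0)
    return float(np.mean(rr))

def intra_list_diversity(ranking, dist):
    """Average pairwise distance between the items of one list."""
    pairs = [(a, b) for pos, a in enumerate(ranking) for b in ranking[pos + 1:]]
    return float(np.mean([dist(a, b) for a, b in pairs])) if pairs else 0.0
```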
### Results

##### most results in Tab. 1

##### Cross-domain recommendation accuracy (RQ1)
* specific focus on cold-start situations (profile in the target domain unknown or close to unknown)
* case 1: Music source, Movies target
  * CD-unn is the most accurate for cold-start users, but its performance strongly decreases as soon as the target profile grows; this would be caused by the choice of Jaccard as similarity metric, which is unreliable in cold-start situations (see the sketch after this summary)
  * in terms of accuracy, only inn and imf benefit from Music feedback
  * in terms of coverage, only unn benefits from Music feedback
* case 2: Movies source, Music target
  * CD-unn again performs worse when the profile size increases
  * coverage: same trend as case 1
* summary: CD recommendation may be useful in cold-start situations; some methods are much more efficient when using only the source domain rather than the source domain plus a little information from the target domain (a result which should be explored further)

##### Cross-domain recommendation diversity (RQ2)
* binomial diversity and ILD follow similar trends
* case 1: in general, CD recommendation brings less diversity
* case 2: opposite trend, most CD recommendations bring more diversity

##### Size and diversity of source domain user profiles (RQ3)
* group users by 20-like intervals in the source domain, and by ILD quartiles
* compute the average MRR as a function of these categories
* results in Fig. 1, focusing on cold-start targets (few or no likes in the target domain)
* observation: improvement with profile size in the source domain (left panel) [not surprising]
* observation: the best results are obtained for profiles that are very focused in terms of diversity
* interpretation: the chosen RS have a hard time finding inter-domain correlations, in particular from Music to Movies
* conclusion: the user profiles in both the source and target domains matter for the recommendation
* remark: CD-inn performs better than the other RS in many of the scenarios considered
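Since the fragility of Jaccard in cold-start situations comes up in both cases, here is a minimal sketch of the Jaccard-based user-nearest-neighbor scoring (unn) as I understand it; the names are mine. With a one-like target profile, the similarity can only take a handful of coarse values, which makes the neighborhood noisy.

```python
def jaccard(a, b):
    """a, b: sets of liked items."""
    return len(a & b) / len(a | b) if a | b else 0.0

def unn_score(target_profile, candidate, profiles):
    """Score one candidate item as the summed similarity of the
    neighbors who liked it. profiles: dict user -> set of likes."""
    return sum(jaccard(target_profile, likes)
               for likes in profiles.values() if candidate in likes)
```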

Comments:

Very good article on segregation phenomena as measured on online news consumption.

### Introduction
* question asked: the impact of technological changes on ideological segregation
* two conflicting hypotheses: either increased consumption of like-minded opinions (echo chambers), e.g. Sunstein, 2009; or access to a broader spectrum of information implies more consumption of opposite opinions, e.g. Benkler, 2006
* proposed work: study 50,000 anonymous US users who regularly consume online news
* ML algorithms identify hard news, then divide them into descriptive reporting vs opinion pieces
* defines ideological segregation as the expected difference in the share of conservative news consumption between two random individuals
* observes that segregation tends to be higher when users come from social media
* observes that individual users tend to read news from only one side of the spectrum
* observes, counter-intuitively, that reading from the opposite side tends to happen more often through the channels with the highest segregation (social, search)
* descriptive reporting corresponds to about 75% of the traffic
* online news consumption is still dominated by mainstream media

### Data and methods
* data collection: from the Bing toolbar for IE => 1.2M US citizens from March to May 2013
* focus on 50,000 regular news readers => 2.3 billion pages (median: ~1,000 pages per user)
* selection bias: individuals who accept to share their information; IE users are older on average
* representativeness tested by measuring the Spearman coefficient between consumption on the dataset and the Quantcast and Alexa rankings: 0.67 and 0.7, while Spearman(Quantcast, Alexa) ~ 0.64

##### identifying news and opinion articles:
* use the Open Directory Project => identify ~8,000 domains as news, politics, etc.
* contains major national sources, important regional outlets, important blogs
* isolate 4.1M articles, but not all are relevant in terms of ideology (e.g. sports, weather, ...) => isolate with ML 1.9M "front section news" articles, among which 200,000 opinion stories (Tab. 1 lists terms highly predictive of each category)

##### measuring the political slant of publishers
* impossible to do manually, and no easy way to do it automatically for all 1.9M articles => assign the slant of the outlet to each article
* use the slant of the outlet's readers, inferred from their vote at the presidential election, which is itself inferred from their location through the IP address
* robustness check: the top 20 listed in Tab. 2 is consistent with common knowledge and with previous studies (Gentzkow and Shapiro, 2011)

##### inferring consumption channels
* 4 information channels: direct (visit to the domain), social (FB, Twitter, mail), search (Google, Bing, Yahoo), aggregator (Google News)
* use the referrer domain to define the channel (interpretation problems to solve, e.g. if referrer = Facebook and 4 articles are read, are all of them of social origin?)

##### limiting to active news consumers:
* threshold = 10 news articles and 2 opinion pieces during the 3-month period => from 1.2M down to 50,000 users (so 4%)
* remark: some conclusions still hold with a looser threshold

### Results

#### overall segregation
* individual polarity = average of the polarities of the outlets consumed
* segregation = distance between polarity scores
* naive estimation insufficient => use of a hierarchical Bayesian model

##### Bayesian model
* the process seems standard in the literature, see Gelman and Hill, 2007
* looks for the global dispersion sigma_d
* the polarity of user i is supposed to follow a normal distribution with latent variables
* parameters are evaluated using an approximate marginal likelihood estimate

##### segregation
* the distribution of user polarities is obtained: see Fig. 2
* segregation = sqrt(2) * sigma_p = 0.11 (sketched after this summary)
* 2/3 of the scores lie between 0.41 and 0.54 => most people are moderate

#### segregation by channel and article subjectivity
* the data scarcity problem is exacerbated by dividing the data into channels
* but for a given user, polarity is probably correlated across channels
* same type of Bayesian model but with an 8-dimensional vector
* 8 dimensions = 4 channels x 2 classes (opinion and report)

##### results in Fig. 3: segregation per channel
* trend: the segregation effect is stronger for opinions
* trend: social media tend to increase segregation effects
* strongest segregation for search; possible explanations: 1) search queries are already oriented, 2) once the results are returned, users pick like-minded media
* as access to a large variety of media comes through these technologies, they cause the segregation effect
* trend: aggregators => less segregation
* interpretation of the weakness of the overall segregation effect: even after pre-filtering, many news items are not polarizing
* general conclusion: there is a filter bubble effect, but it remains limited

#### ideological isolation

##### two conflicting hypotheses
* moderate polarization and individuals consume a large spectrum of opinions
* moderate polarization but individuals consume a thin spectrum of opinions
* dispersion sigma_d = 0.06, very small => rather the second hypothesis
* explanation: 78% of users use only 1 source, 94% one or two sources
* remark: still true for users with a larger number of sources

##### dispersion per user and per channel: Fig. 4a
* more or less identical for news and opinions
* direct: lowest dispersion; search: highest dispersion

##### dispersion per individual polarity: Fig. 4b
* the most polarized individuals are also the ones with the highest dispersion

##### does this mean that highly polarized individuals see opposite opinions? (Fig. 5)
* test by ranking media with a slant l from left (0) to right (1)
* define the opposing partisan exposure o_i = min(l_i, 1 - l_i)
* Fig. 5: percentage of exposure to articles of the opposite opinion, depending on the channel and on the user's polarity
* lower than 20% in all cases
* weaker for opinion pieces than for reports
* lowest for the most partisan users
* conclusion: users read ideologically homogeneous outlets, and partisan users are in general exposed only to their side of the spectrum

### Discussion and conclusion

##### Overall
* with social media and web search, in general more segregation than with direct consumption
* however, the channels with more segregation are, counter-intuitively, also related to a wider range of opinions
* the majority of online behaviors mimic reading habits: most users go to their favorite outlets (which are predominantly mainstream)

##### Limits
* measures the slant of the outlet, not of an article
* focuses only on consumption, not on the vote itself
* no measure of the amplifying effect of social media or search engines
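To fix ideas on the quantities above, a minimal sketch under a simplifying assumption of mine: user polarity is taken as the plain mean slant of the outlets read (the paper fits a hierarchical Bayesian model instead); the function names are mine.

```python
import numpy as np

def user_polarity(outlet_slants):
    """outlet_slants: array of slant scores of the outlets a user read."""
    return np.mean(outlet_slants)

def segregation(polarities):
    """sqrt(2) * sigma_p: the standard deviation of the difference in
    polarity between two randomly drawn users."""
    return np.sqrt(2) * np.std(polarities)

def opposing_exposure(l):
    """o_i = min(l_i, 1 - l_i), with outlets ranked from left (0) to right (1)."""
    return np.minimum(l, 1 - l)
```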
I agree that it is a very good article. I wish the Algodiv project had the same data… The main contribution is the result on echo chambers: **people _are_ in echo chambers, but the influence remains limited because all users massively consume news content from mainstream media**. I find the "segregation" metric interesting; it should be compared (and merged?) with other diversity-related metrics. I see a few more limits to the methodology. The main one is the reliability of the _slant_ of a news outlet, obtained from the location of the IP addresses of its readers (not exactly reliable) matched with poll results for a given county at the 2012 presidential election(!). Besides the reliability of the metric, it is very hard to build any intuition for this made-up scale. Is a 0.11 interval large or small? What does it mean to have the BBC at 0.3 and Fox News at 0.59? Is a difference between 0.3 and 0.32 truly the same as between 0.48 and 0.5?

Comments:

# Short summary
- The authors propose a recommender system which is a hybrid between collaborative filtering and content-based recommendation
- The content-based part essentially relies on a HIN structure, with various kinds of content nodes (Figure 2 shows an explicit example)
- Broadly, their approach consists in running a random walk on the HIN structure
- More precisely, it is a random walk with restart (which allows personalizing the results), and a Vertex-Reinforced Random Walk (VRRW), i.e. a specific kind of random walk where future transitions to a node are more probable if this node has been visited in the past
- VRRWs are not Markovian processes, which translates into the fact that the transition matrix has to be updated along the walk (a minimal sketch is given after this list)
- There are a lot of technical details about the implementation, which could be useful to someone who wants to apply the method in practice
- The learning process is achieved through Stochastic Gradient Descent
- They evaluate the efficiency of their approach (Div-HeteRec) on two Meetup datasets: one in Bangalore, the other in Hyderabad (general stats in Tab. 1)
- Performance is compared to RWR, CF-NMF and two versions of their method with fewer parameters (Uni-HeteRec, without personalization; Learn-HeteRec, with a static transition matrix)
- Results in terms of accuracy (precision, recall and NDCG @k, with k = 1, 2, 3, 5, 10) are compiled in Tables 2 to 5 for two different recommendation problems: group-to-user and tag-to-group
- Div-HeteRec performs better for group-to-user recommendation (and performs better on the Hyderabad dataset than on the Bangalore one)
- Learn-HeteRec performs better for tag-to-group recommendation
- They provide hypothetical justifications for their observations
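A minimal sketch of such a vertex-reinforced random walk with restart, as I understand it from the summary above; all names, the smoothing choice and the hyperparameters are mine, not the paper's.

```python
import numpy as np

def vrrw_with_restart(A, start, alpha=0.15, steps=10_000, rng=None):
    """A: adjacency matrix of the HIN; start: restart node;
    alpha: restart probability. Returns the visit frequencies."""
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    visits = np.ones(n)            # smoothing so unvisited nodes stay reachable
    node = start
    for _ in range(steps):
        if rng.random() < alpha:   # restart: this is what personalizes the walk
            node = start
        else:
            w = A[node] * visits   # reinforcement: edges weighted by past visits
            if w.sum() == 0:       # dead end: restart
                node = start
                continue
            node = rng.choice(n, p=w / w.sum())
        visits[node] += 1          # non-Markovian: transition weights change
    return visits / visits.sum()
```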

Comments:

# Summary

### Introduction

##### about hybrid recommendation:
- many RS have an underlying HIN structure and achieve hybrid recommendation (in the sense of using both user feedback and content-based information)

##### difference with existing methods:
- use different types of relations; benefit: exploit the fact that users consume an item for different reasons (e.g. a movie for its genre, its director, etc.)

##### Recommender System:
- combines user feedback and various types of information in a collaborative filtering style
- uses meta-paths in the HIN to generate recommendations
- the technical implementation uses Matrix Factorization

##### Datasets:
- MovieLens 100K combined with IMDb, and Yelp; implicit feedback only

##### Contributions:
- study recommendation with implicit feedback in a HIN
- use network heterogeneity to spread preferences along the meta-paths
- generate personalized recommendations
- specific case study: ML100K and Yelp

### Background and preliminaries

##### binary user feedback
- explain how to generate the bipartite adjacency matrix

##### Heterogeneous Information Network
- definition (using an entity type mapping function and a link type mapping function)
- vocabulary to describe HINs

##### Matrix Factorization for implicit feedback
- describe the principle of MF (decomposing the feedback matrix)
- resolution using NMF

##### Problem definition
- how to make personalized recommendations based on implicit feedback, in the form of a list of recommendations

### Meta-path based latent features

##### meta-path
- definition and interest (types of paths in a HIN)
- can be used to measure similarity and proximity between entities
- ex: user [watches] movie [watched by] user [following] actor [starring] movie

##### user preference diffusion
- type of meta-paths considered in the paper: user -> item -> * -> item (* may be tag, genre, director, plot for ML100K; category, customer, location for Yelp)
- define the user preference score: normalized weighted sum of the number of paths to a given item (eq. 2)
- with L types of meta-paths, this gives L user preference matrices R
- use these scores to build the recommendation model

##### global recommendation model
- define the recommendation mechanics, inspired by MF
- (?) MF may be applied to each user preference matrix taken separately: find a pair of reduced matrices with NMF, then the prediction model is given by equation 4 (eq. 2 and eq. 4 are sketched just below)
- remark: not personalized, as the coefficients are the same for every user
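Here is a possible reading of the diffusion step (eq. 2) and of the global model (eq. 4), sketched with my own names; the attribute matrix `M` and the NMF rank are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.decomposition import NMF

def diffuse_preferences(W, M):
    """W: binary user x item feedback; M: binary item x attribute matrix
    (attribute = tag, genre, director, ... depending on the meta-path).
    Entry (u, i) counts u -> j -> a -> i paths, row-normalized (eq. 2)."""
    paths = (W @ (M @ M.T)).astype(float)
    norms = paths.sum(axis=1, keepdims=True)
    return np.divide(paths, norms, out=np.zeros_like(paths), where=norms > 0)

def global_scores(R_list, thetas, k=20):
    """R_list: the L diffused preference matrices; thetas: one weight per
    meta-path (learned by the ranking objective, sketched further down);
    k: rank of the factorization (20 was used for IM100K)."""
    scores = np.zeros_like(R_list[0], dtype=float)
    for R, theta in zip(R_list, thetas):
        nmf = NMF(n_components=k, init="nndsvda", max_iter=500)
        U = nmf.fit_transform(R)          # users x k
        V = nmf.components_               # k x items
        scores += theta * (U @ V)         # eq. 4: theta-weighted low-rank sum
    return scores
```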
### Personalized recommendation model
- same principle as the global recommendation method, except that a clustering is performed first, and the learning is achieved cluster by cluster
- the number of clusters is a parameter of the method

### Model learning with implicit feedback
- learn the model parameters (the thetas in equation 4)
- use implicit feedback to do so (1 = user browses the item / 0 = user does not)
- prediction is usually done with either classification or learning-to-rank, but their approach ranks 1s above 0s (in the spirit of ref 21)

##### Bayesian ranking-based optimization
- assumption: a user's ranking is independent from the others' (allows getting eq. 7)
- assumption on the probability expressed in equation 8
- allows deriving the expression of the objective function O

##### optimization algorithm
- optimization: find the thetas such that dO/dtheta = 0
- method: Stochastic Gradient Descent (a sketch closes these notes)

##### learning personalized recommendation models
- this technique is not personalized
- to personalize the recommendation: cluster users with a k-means method

### Empirical study

##### Data
- dataset 1: IMDb + ML100K (IM100K); 1 if the user has seen the movie, else 0
- dataset 2: Yelp; 1 if the user has reviewed the business, else 0
- d2 much sparser than d1 (see the feedback distributions in figure 5)
- temporal 80% / 20% split between training and test

##### Competitors and evaluation metrics
- RS benchmarks: popularity-based, co-click, NMF (collaborative filtering baseline), hybrid SVM
- for their method: 10 different meta-paths (see Table 6.2)
- evaluation: as it is based on implicit feedback, precision at position k and top-10 mean reciprocal rank (MRR)

##### Performance comparison
- Table 3 for a summary
- very few items interact with a lot of users
- parameters for NMF: dimension of the reduced matrix: 20 (IM100K), 60 (Yelp)
- Hybrid-SVM uses the same information as their method (HeteRec) and uses PathSim
- in general HeteRec is better than all benchmark methods
- in particular HeteRec > Hybrid-SVM (while using similar information)
- the improvement is higher for Yelp than for IM100K, possibly a consequence of Yelp's sparsity
- HeteRec-p (personalized version): even better than HeteRec-g

##### Performance analysis
- more precise analysis of the performances, on IM100K only, for HeteRec-g, HeteRec-p, NMF, Co-Click
- divide into 6 different training sets, depending on various parameters
- performance increases with the number of movies watched for all methods except co-click
- performance decreases with movie popularity for all methods except co-click

##### Parameter tuning
- HeteRec has more parameters
- regularization parameter lambda (eq. 9) computed with cross-validation
- sampling necessary for Yelp (as there are 10^12 elements); performance variations with sampling are represented in Fig. 7 (relatively stable)
- for HeteRec-p: number of clusters, see figure 6c

### Related work

##### CF-based hybrid RS

##### Information network analysis
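Closing these notes, a hedged sketch of the ranking-based learning step (eqs. 7 to 9) as I read it: the thetas are learned by SGD on a BPR-like objective that ranks browsed items (1s) above non-browsed ones (0s); `feats`, the sampling scheme and all hyperparameters are my own assumptions.

```python
import numpy as np

def learn_thetas(feats, positives, n_items, lr=0.01, lam=0.1,
                 epochs=10, rng=None):
    """feats: (L, n_users, n_items) array of per-meta-path latent scores;
    positives: dict user -> set of items with feedback 1.
    Returns the L meta-path weights theta."""
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(feats.shape[0])
    for _ in range(epochs):
        for u in positives:
            i = rng.choice(list(positives[u]))           # a positive item
            j = rng.integers(n_items)                    # sampled negative
            while j in positives[u]:
                j = rng.integers(n_items)
            x = feats[:, u, i] - feats[:, u, j]          # per-path score gap
            d = theta @ x
            grad = x / (1.0 + np.exp(d)) - lam * theta   # grad of ln sigmoid(d), regularized
            theta += lr * grad                           # stochastic gradient ascent step
    return theta
```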

Comments:

Very good article from the GroupLens team about the diversity-narrowing effect and the role of the recommender system. Experimental measurements on the MovieLens platform; the content of the movies is described with the tag genome, and the RS is an item-item collaborative filtering.

### Introduction

##### two research questions:
- Do recommender systems expose users to narrower content over time?
- How does the experience of users who take recommendations differ from that of users who do not regularly take recommendations?

##### method
- users in two categories (following / ignoring ~ control group)
- then a sort of A/B testing to measure consumption diversity and enjoyment at the individual level

##### 4 contributions:
- a method to analyze the effects of a RS on users
- quantitative evidence that users using the RS have a better experience than the others
- show that the diversity reduction effect is small
- show that users using the RS tend to consume more diverse content

### Related work
- Pariser (11) on the filter bubble
- Tetlock (16) measures the bias induced by bubbles (among economists)
- Sunstein (15) argues that personalization tends to reduce the space of shared experience among users
- Negroponte (MIT Media Lab co-founder) argues that algorithms may also open horizons
- Linden (Amazon RS contributor, 5) thinks that recommendation can generate some form of serendipity, and opposes Pariser's thesis
- Fleder et al. (2) use simulations to show that RS tend to make experiences uniform / Hosanagar et al. (3) use measurements; but there are limitations to these studies, e.g. a simplistic model of human behavior (Fleder)

### Data & metrics

##### MovieLens dataset (September 2013):
- 220,000 users
- 20M ratings, 20,000 movies
- the RS (item-item Collaborative Filtering, similar to Amazon's) proposes "top picks" per user (15 by default)

##### Tag genome to describe movie content
- an information space which describes movies with tags given by users
- 9,500 movies in the tag genome (April 2013) and 1,100 tags => 10-11M pairs

##### Time period
- 21 months from Feb 2008 to Aug 2010 (because less missing data)

##### preprocessing:
- the first 15 ratings are taken out (platform propositions)
- then the first 3 months are taken out (as many ratings are "catching up", and to give users time to get used to ML and ML time to get information from the user)

##### definition of rating blocks:
- several possibilities: per login session, per period of time, but users don't have the same activity
- => blocks of 10 consecutive ratings; 10 is roughly the median number of ratings per 3 months

##### users and items:
- only users who have their first rating during the period of interest and who have at least 3 rating blocks => 1,400 users with 3 to 200 rating blocks
- 173,000 ratings on 10,500 movies, all in the tag genome (small inconsistency: there are only 9,500 movies in the tag genome)

##### Identifying consumed recommendations in a rating block
- groups based on the number of recommendations followed by users
- criterion to consider that a recommendation was followed: the movie was in the list of top picks between 3 hours and 3 months before the rating

##### Ignoring group vs following group
- 2 groups of users
- problem: some users take a lot of recommendations during a period and then very few
- => rank users by the proportion of rating blocks during which they followed a recommendation (Fig. 4)
- then if > 50%: following group (286 users); if 0%: ignoring group (430 users) (seems quite ad hoc, but makes sense as the differences are not very large and following a recommendation must be quite rare)

##### Measuring content diversity: tag genome (see figure 5)
- movie-tag matrix with relevance scores from 1 to 5
- a movie is then a vector of scores
- the distance between movies is the euclidean distance in this space (rather than cosine similarity, because of the matrix density)
- orders of magnitude: min distance 5.1 (Halloween 4 - Halloween 5), max distance 44.2 (Matrix - Paris Was a Woman), mean 23.4
- the authors argue that the tag genome is very expressive (more than cosine similarity), and also benefits from the continuous input from users (illustrated on examples)

##### Measuring the effect of the recommender system

###### content diversity metrics (standard)
- average pairwise distance of the movies in the list
- maximum pairwise distance of the movies in the list
- on recommended movies (15 top picks)
- on rated movies
- more or less normally distributed (see Fig. 6)

###### user experience:
- average rating per user
- more or less normally distributed

###### analysis:
- mean shift for users in both groups
- measure the difference between groups and within groups at different times
- within a group: standard t-test (same size)
- between groups: Welch t-test (different sizes; sketched below together with the diversity metrics)
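A minimal sketch of these diversity measures and of the between-group test, assuming each movie is a vector of tag-genome relevance scores; the function names are mine.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import ttest_ind

def block_diversity(tag_vectors):
    """tag_vectors: (n_movies, n_tags) array for one rating block or one
    top-picks list, with n_movies >= 2.
    Returns (average, maximum) pairwise euclidean distance."""
    d = pdist(tag_vectors, metric="euclidean")
    return d.mean(), d.max()

# Between-group comparison with a Welch t-test (groups of different sizes):
# following, ignoring = per-user diversity scores of the two groups
# t, p = ttest_ind(following, ignoring, equal_var=False)
```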
### Results

##### Research question 1: do RS expose users to narrower content over time?
- diversity of recommended movies: see Tab. 2
- statistically significant drop for the following group
- significant drop (?) for the ignoring group
- following more diverse than ignoring (significant), but the gap narrows over time

##### Research question 2: does the experience of users who take recommendations differ from that of users who do not regularly take recommendations?
- diversity of rated movies using the mean distance: see Tab. 3
- beginning: no significant difference
- end: significant drop for both groups
- diversity of rated movies using the max distance: see Tab. 4
- same trend
- enjoyment evaluated with the ratings between the first and last blocks of both groups: see Tab. 5
- the following group gives higher ratings, and the rating drop is lower in the following than in the ignoring group
- similar trend with rating means (see Tab. 7)
- refined analysis in Tab. 6 with ratings depending on whether the rated movies were recommended or not => better experience if the movie was recommended (complement)
- rank together all blocks of all groups at all times by increasing average rating
- => each block is located in this ranking by its percentile (ex: first block, ignoring group ~ 63rd percentile)
- in Tab. 8: percentile drop for both groups => the percentile drop is high for the ignoring group (-19) but not for the following group (-1)

### Discussion

##### summary
- diversity tends to narrow over time in any case
- the effect is subdued for the group which follows recommendations
- users following recommendations get more diverse recommendations
- ratings seem to encourage the RS to broaden recommendations

##### prospects
- is there a natural trend to narrowing consumption diversity?
- they think that item-item CF slows this effect down
- the RS could also inform the user about his or her consumption diversity
- finally, RS can be designed to intentionally force diversity

##### limitations
- restriction to top picks
- restriction to one dataset and one system (item-item CF) (note: on the current platform (May 2019), there are 4 different RS: peasant/bard/warrior/wizard; the one described in the article seems to be warrior, the default RS)
Very interesting article. I'd like to get their data on the "Top Picks for you". There is probably a correlation between movie releases, what users watch and what pops into "Top Picks for you", a source of (minor?) bias. They use a single kind of metric for content diversity; there is room to explore other ones. They build on Ziegler's work on intra-list diversity ([21]).