Alt-Tab — Papersᵞ


Coverage, Redundancy and Size-Awareness in Genre Diversity for Recommender Systems

Comments:

# In short

The paper aims to introduce a diversity notion for recommendation that combines several existing notions of diversity (intra-list diversity, coverage, redundancy), and then applies a re-ranking technique.

# Summary

### Introduction

* approach based on genre Intra-List Similarity
* they aim at 3 different properties: genre coverage, (non-)redundancy, list size awareness
* dataset: movie recommendation with the Netflix Prize

### Related Work

##### Diversity in Recommender Systems

* Herlocker et al.: accuracy alone is insufficient to assess satisfaction
* McNee et al.: define properties related to satisfaction (coverage, diversity, novelty, serendipity)
* ref 14 (Pu et al.): increasing diversity increases satisfaction
* ref 22 (Ziegler et al.): introduce Intra-List Diversity
* ref 7 (Clarke et al.): ILD is limited when considering query results, as queries are short and ambiguous
* ref 15 (Santos et al.): propose to cover a maximum of subtopics in the first results (as for a web search)

##### Measuring and enhancing diversity

* frameworks to improve diversity largely rely on re-ranking
* usual approach: greedy selection, which assumes the definition of an objective function (see algo1, à la Ziegler); pairwise framework, with a measure based on the ILS (or ILD); ref 21 (Zhang and Hurley) uses the same kind of strategy
* intent-aware framework: optimization of coverage (particularly to circumvent ambiguity problems); ref 15 proposes xQuAD, for example
* proportionality framework: aims at covering topics proportionally to the user interest; ref 9 (Dang and Croft), for example

### Characterizing genres

* what characterizes a genre
* genres have the following limitations: hierarchy of meaning, unbalanced distribution, overlap between genres, ...
* Netflix dataset: 100M ratings (1 to 5), 480,000 users, around 18,000 movies; genres extracted from IMDb => info on 9,300 movies (accounting for 83% of the ratings)

### Measuring genre diversity in recommendation lists

A diversity measure should capture:

* genre coverage (covering a maximum of genres, proportionally to user interest)
* redundancy (it is important that items in the list cover a genre, but also that other items do not cover this same genre again)
* size-awareness (the previous two should take into account the size of the rec list, e.g. if the list is short, only the most important genres should appear)

Limitations of the literature:

* Ziegler's ILS and ref 5's MMR are pairwise notions, which are not well suited to evaluate notions such as genre generality
* intent-aware frameworks (refs 2 and 15) do not fully account for the idea that it is important that items do not cover a genre already represented in the list; they assume that genres are independent from each other
* ref 9 (Dang and Croft) uses the notion of proportionality to the user interest but does not penalize redundancy
* no existing method takes the length of the list into account

### Binomial framework for genre diversity

* general principle: a random distribution is taken as the reference for optimality => model the likelihood for a genre to randomly appear in a list according to a binomial distribution

##### Binomial diversity metric

* the selection of an item from a genre is seen as a Bernoulli trial
* n.b.: theoretically this is selection without replacement, in practice it is nearly equivalent to selection with replacement
* formal definitions:
  * item i covers the genres G(i)
  * k_g^s = number of successes on set s, i.e. the number of items of s that have genre g
  * p_g" is the proportion of interactions of a user with genre g (local importance)
  * p_g' is the proportion of interactions of all users with genre g (global importance)
  * p_g = (1-alpha).p_g' + alpha.p_g" is the expected probability of a genre g to be in rec list R
* coverage score: product, over the genres not represented in R, of the probabilities that they are not selected randomly following the Bernoulli process (eq9)
* non-redundancy score: measures how probable it is that a genre appears at least k times in R (so it's a kind of remaining tolerance) (eq10)
* binomial diversity = coverage × non-redundancy
* BinomDiv has appropriate properties: it maximizes coverage as a function of p_g, penalizes over-representation of genres, and adapts to the list length through the number of trials needed to create R

##### Binomial re-ranking algorithm

* greedy re-ranking to optimize a trade-off function between relevance and diversity (eq13), parametrized by lambda

##### Qualitative analysis

* results in Table 3: show how various diversity metrics behave in 4 different specific ranking situations; the principal conclusion is that BinomDiv is the only one which works well in all these situations
* results in Table 4 (item-based kNN + re-ranking): we observe the qualitative results of the re-ranking, depending on the user tastes

### Experiments

* two experiments with two datasets:
  * Netflix Prize + IMDb genres (83M ratings, 480K users, 9,300 movies, 28 genres)
  * MovieLens 1M

##### Setup

* 5-fold cross-validation
* the RS ranks all movies above a given threshold (grade) for the user considered + 1000 random movies of the dataset
* RS tested: item-based CF kNN; CF implicit Matrix Factorization; item popularity; random
* re-ranking optimization is done with a grid search on lambda (diversity/relevance trade-off parameter)
* diversity evaluation with all indices in the literature (EILD, ERR-IA, CPR) + subtopic recall + subtopics per item
* relevance evaluation with nDCG

##### Results for baseline diversity

* Tab5: results without diversification re-ranking (reminder: alpha reflects the degree of personalization)
* random: very low relevance; strong diversity
* popularity: better relevance; weaker diversity
* personalized RS tend to have weaker non-personalized diversity scores but improve when the user history is taken into account

##### Results for diversified results

* Tab6: results after re-ranking, cutoff 20 items; alpha = 0.5; best lambda found with grid search
* with all diversifications, accuracy decreases
* each diversification method scores best when diversity is evaluated with its own metric
* xQuAD and ERR-IA tend to accumulate genres without penalizing redundancy
* ERR-IA and CPR-rel are correlated with SPI (subtopics per item)
* Fig3: shows the improvement relative to the baseline
* BinomDiv can improve on the baseline for nearly every diversity metric
* general conclusion: BinomDiv is able to bring more coverage while limiting redundancy
* Tab7: explores size-awareness by changing the cut-off value; diversification relative to lambda is always best with the corresponding size [?]
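
The binomial diversity metric summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-genre geometric-mean exponents (1/|G| for coverage, 1/|G(R)| for non-redundancy) and the conditioning on "the genre appears at least once" are assumptions that the notes do not spell out, and every p_g is assumed to lie strictly between 0 and 1.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def genre_prob(p_global, p_user, alpha):
    """p_g = (1 - alpha) * p_g' + alpha * p_g" (global vs. user importance)."""
    return (1 - alpha) * p_global + alpha * p_user


def coverage(rec_genres, p):
    """Product over uncovered genres of P(X_g = 0), with N = list length.

    The 1/|G| exponent (geometric mean over all genres) is an assumption.
    """
    n = len(rec_genres)
    covered = set().union(*rec_genres)
    score = 1.0
    for g, p_g in p.items():
        if g not in covered:
            score *= binom_pmf(0, n, p_g) ** (1 / len(p))
    return score


def non_redundancy(rec_genres, p):
    """Product over covered genres of P(X_g >= k_g | X_g > 0).

    The conditioning and the 1/|G(R)| exponent are assumptions.
    """
    n = len(rec_genres)
    counts = {}
    for genres in rec_genres:
        for g in genres:
            counts[g] = counts.get(g, 0) + 1
    score = 1.0
    for g, k in counts.items():
        p_zero = binom_pmf(0, n, p[g])
        tail = sum(binom_pmf(j, n, p[g]) for j in range(k, n + 1))
        score *= (tail / (1 - p_zero)) ** (1 / len(counts))
    return score


def binom_div(rec_genres, p):
    """Binomial diversity = coverage x non-redundancy."""
    return coverage(rec_genres, p) * non_redundancy(rec_genres, p)


# Toy example: two genres of equal importance, lists of two items.
p = {"drama": 0.5, "comedy": 0.5}
balanced = [{"drama"}, {"comedy"}]   # covers both genres once
redundant = [{"drama"}, {"drama"}]   # repeats one genre, misses the other
```

In this toy setting the balanced list reaches the maximum score, while the redundant list is penalized twice: once for missing a genre (coverage) and once for repeating one (non-redundancy).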

Alt-Tab at 2019-08-23 09:52:44


Accuracy and Diversity in Cross-domain Recommendations for Cold-start Users with Positive-only Feedback

Comments:

# In short

Short article about the use of cross-domain recommendation (i.e. using the recommendation profile on one type of product to recommend products of another type) and in particular the gain when considering cold-start problems. The article is short and easy to read, but the experimental results do not allow a clear trend to be identified, in particular concerning diversity-related results.

# Summary

### Introduction

##### usual solutions to cold-start problems:

* ask users directly
* exploit additional information, for example by combining collaborative filtering with content-based rec

##### literature on the second possibility (cross-domain RS)

* ref 15 (Winoto, Tang): conjecture that CD-RS may degrade accuracy but improve diversity (this assumption is tested in the paper)
* ref 11 (Sahebi, Brusilovsky): quality of rec improves when domains are semantically connected

##### article assumption

* profile size (quantity of info) and diversity in the source domain have an impact on accuracy in the target domain with CD-rec

##### 3 research questions

* does CD rec improve accuracy for cold-start users?
* is CD rec really useful to improve diversity?
* what is the impact of the size and diversity of the user profile in the source domain on the rec in the target domain?

### Experimental setting

##### Datasets:

* Facebook likes on music and movies, metadata from DBpedia
* likes only, so the dataset contains positive feedback only
* typical data entry: id, name, category, timestamp
* disambiguation with DBpedia (technically challenging)

##### Rec algorithms evaluated

* popularity-based (pop)
* CF user-based using nearest neighbors with Jaccard (unn)
* CF item-based using nearest neighbors with Jaccard (inn)
* CF item-based using matrix factorization (imf)
* hybrid HeteRec (see ref 16)
* hybrid PathRank (see ref 8)

##### Evaluation methodology

* user-based 5-fold cross-validation strategy (see ref 7)
* elaborate preprocessing (only users with > 16 likes..)
* after preprocessing: music is 50K users, 5,800 artists, 2M likes; movies is 27K users, 4,000 movies, 875,000 likes
* quality estimators: Mean Reciprocal Rank for accuracy; Intra-list diversity and Binomial diversity (see ref 14, Vargas et al.) for diversity; also catalogue coverage

### Results

* most results are in Tab1

##### Cross-domain recommendation accuracy (RQ1)

* specific focus on cold-start situations (profile in the target domain unknown or close to unknown)
* case1: Music source, Movies target
  * CD-unn is the most accurate for cold-start users, but its performance strongly decreases as soon as the target profile grows; this would be caused by the choice of Jaccard as similarity metric (which is unreliable in cold-start situations)
  * in terms of accuracy, only inn and imf benefit from Music feedback
  * in terms of coverage, only unn benefits from Music feedback
* case2: Movies source, Music target
  * CD-unn again performs worse when the profile size increases
  * coverage: same trend as case1
* summary: CD rec may be useful in cold-start situations; some methods are much more efficient when using only the source domain rather than the source domain + a little info from the target domain (a result which should be explored more)

##### Cross-domain recommendation diversity (RQ2)

* binomial diversity and ILD follow similar trends
* case1: in general, CD rec brings less diversity
* case2: opposite trend, most CD rec brings more diversity

##### Size and diversity of source domain user profiles (RQ3)

* users are grouped by intervals of 20 likes in the source domain, and by ILD quartiles
* compute the average MRR as a function of these categories
* results on Fig1, focusing on cold-start targets (few or no likes in the target domain)
* observation: improvement with profile size in the source domain (left panel) [not surprising]
* observation: best results obtained for source profiles that are very focused in terms of diversity
* interpretation: the RS chosen have a hard time finding inter-domain correlations, in particular from Music to Movies
* conclusion: user profiles in both source and target domains are important for rec
* remark: CD-inn has better performance than the other RS in many of the scenarios considered
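
The evaluation quantities named in these notes (Jaccard similarity between like profiles, Mean Reciprocal Rank, intra-list diversity) can be sketched as follows. This is a minimal illustration with hypothetical function names and toy data; in particular, using Jaccard distance between item feature sets as the pairwise distance inside ILD is an assumption, since the notes do not say which item distance the paper uses.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two sets of likes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over users (one ranking per user)."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


def intra_list_diversity(items, features):
    """Average pairwise distance between the items of one list.

    Distance is taken as 1 - Jaccard over feature sets (an assumption).
    """
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return sum(1 - jaccard(features[i], features[j]) for i, j in pairs) / len(pairs)


# Toy data (hypothetical): two users, two recommended items each.
rankings = [["a", "b"], ["c", "d"]]
relevants = [{"b"}, {"c"}]
features = {"x": {"rock"}, "y": {"pop"}, "z": {"rock", "pop"}}
```

With very small profiles the Jaccard value jumps between a handful of coarse levels (for one like against one like it is either 0 or 1), which is consistent with the remark above that it is unreliable in cold-start situations.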

Alt-Tab at 2019-08-08 16:12:01


Filter bubbles, echo chambers, and online news consumption

Comments:

Very good article on segregation phenomena as measured on online news consumption.

### Introduction

* question asked: impact of technological changes on ideological segregation
* two conflicting hypotheses: either increased consumption of like-minded opinions (echo chambers), e.g. Sunstein, 2009, or access to a broader spectrum of information implies more consumption of opposite opinions, e.g. Benkler, 2006
* work proposed: study 50,000 anonymous users from the US who regularly consume online news
* ML algorithms identify hard news, then divide it into descriptive reporting vs opinion pieces
* defines ideological segregation as the expected difference in the share of conservative news consumption between two random individuals
* observes that segregation tends to be higher when users come from social media
* observes that individual users tend to read news only from one side of the spectrum
* observes, counter-intuitively, that reading of opposite sides tends to happen more often through the channels with the highest segregation (social, search)
* descriptive reporting corresponds to about 75% of the traffic
* online news consumption is still dominated by mainstream media

### Data and methods

* data collection: from the Bing toolbar for IE => 1.2M US citizens from March to May 2013
* focus on 50,000 regular newsreaders => 2.3 billion pages (median: ~1000 pages per user)
* selection bias: individuals who accept to share their info; IE users are in general older
* representativeness tested by measuring the Spearman coefficient of consumption in the dataset against Quantcast and Alexa rankings: 0.67 and 0.7, while Spearman(Quantcast, Alexa) ~ 0.64

##### identifying news and opinion articles:

* use the Open Directory Project => identify ~8000 domains as news, politics, etc.
* these contain major national sources, important regional outlets, important blogs
* isolate 4.1M articles, but not always relevant in terms of ideology (e.g. sports, weather, ...) => isolate with ML 1.9M "front section news" articles, among which 200,000 opinion stories (Tab1 indicates terms highly predictive of the categories)

##### measuring the political slant of publishers

* impossible to do it manually, and no easy way to do it automatically for all 1.9M articles => assign the slant of the outlet
* use the slant of the outlet's readers, inferred from the vote at the presidential election, which is itself inferred from the location through the IP address
* robustness check: Tab2 lists a top 20 consistent with common knowledge and with previous studies (Gentzkow and Shapiro, 2011)

##### inferring consumption channels

* 4 info channels: direct (visit the domain), social (Facebook, Twitter, mail), search (Google, Bing, Yahoo), aggregator (Google News)
* use the referrer domain to define the channel (interpretation problems to solve, e.g. if ref=Facebook and 4 articles are read, are all of them of social origin?)

##### limiting to active news consumers:

* threshold = 10 news articles and 2 opinion pieces during the 3-month period => from 1.2M to 50,000 users (so 4%)
* RK: some conclusions are still true with a looser threshold

### Results

#### overall segregation

* individual polarity = average of the polarities of the outlets consumed
* segregation = distance between polarity scores
* naive estimation insufficient => use of a hierarchical Bayesian model

##### Bayesian model

* is the process standard in the literature? see Gelman and Hill, 2007
* look for the global dispersion sigma_d
* the polarity of user i is supposed to be distributed according to a normal law with latent variables
* parameters are evaluated using an approximate marginal likelihood estimate

##### segregation

* distribution of user polarities obtained: see fig2
* segregation = sqrt(2).sigma_p = 0.11
* 2/3 of the scores are between 0.41 and 0.54 => most people are moderate

#### segregation by channel and article subjectivity

* the data scarcity problem is exacerbated by dividing the data into channels
* but for a given user, polarity is probably correlated across channels
* same type of Bayesian model but with an 8-dimensional vector
* 8 dimensions = 4 channels x 2 classes (opinion and report)

##### results on fig3: segregation per channel

* trend: the segregation effect is stronger for opinions
* trend: social media tend to increase segregation effects
* strongest segregation for search; possible explanations: 1) search formulations are already oriented, 2) once the search is formulated, users read like-minded media
* as access to a large variety of media comes from these technologies, they cause the segregation effect
* trend: aggregators => less segregation
* interpretation of the weakness of the overall segregation effect: even after pre-filtering, many news articles are not polarizing
* general conclusion: there is a filter bubble effect, but it is still limited

#### ideological isolation

##### two conflicting hypotheses

* moderate polarization and individuals consume a large spectrum of opinions
* moderate polarization but individuals consume a thin spectrum of opinions
* dispersion sigma_d = 0.06, very small => rather the second hypothesis
* explanation: 78% of users use only 1 source, 94% one or two sources
* RK: still true for users with a larger number of sources

##### Dispersion per user and per channel: Fig4a

* more or less identical for news and opinions
* direct: lowest dispersion, search: highest dispersion

##### Dispersion per individual polarity: Fig4b

* the most polarized individuals are also the ones with the highest dispersion

##### Does it mean that highly polarized individuals see opposite opinions? (Fig5)

* test by ranking media with l from left (0) to right (1)
* define opposing partisan exposure o_i = min(l_i, 1-l_i)
* fig5: percentage of exposure to opposite-opinion articles, depending on the channel and on the user polarity
* lower than 20% in all cases
* weaker for opinion pieces than for reports
* lowest for the most partisan users
* conclusion: users read ideologically homogeneous outlets, and partisan users are in general exposed only to their side of the spectrum

### Discussion and conclusion

##### Overall

* with social media and web search, there is in general more segregation than with direct consumption
* however, channels with more segregation are counter-intuitively related to a wider range of opinions
* the majority of online behaviors mimic reading habits: most users go to their favorite outlets (which are predominantly mainstream)

##### Limits

* measures the slant of the outlet, not of an article
* focuses only on consumption, not on the vote itself
* no measure of the amplifying effect of social media or search engines
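
The two quantities defined in these notes, segregation = sqrt(2).sigma_p and opposing partisan exposure o_i = min(l_i, 1-l_i), can be illustrated with a naive computation. This sketch is not the paper's method: the notes stress that naive estimation is insufficient and that the authors fit a hierarchical Bayesian model instead. Function names are hypothetical, and reading l_i as the share of user i's articles coming from left-leaning outlets is an assumption about a definition the notes leave implicit.

```python
from math import sqrt
from statistics import mean, pstdev


def user_polarity(outlet_slants):
    """Individual polarity = average slant of the outlets the user read."""
    return mean(outlet_slants)


def naive_segregation(polarities):
    """sqrt(2) * sigma_p, computed directly on observed polarities.

    For two users drawn independently from the same distribution,
    sqrt(E[(X - Y)^2]) = sqrt(2) * sigma, hence the sqrt(2) factor.
    This is the naive estimate, not the hierarchical Bayesian one.
    """
    return sqrt(2) * pstdev(polarities)


def opposing_exposure(left_share):
    """o_i = min(l_i, 1 - l_i); l_i is assumed to be the user's share of
    articles read from left-leaning outlets."""
    return min(left_share, 1 - left_share)


# Toy data (hypothetical slants in [0, 1], 0 = left, 1 = right).
reads = {
    "u1": [0.30, 0.50],  # reads two different outlets
    "u2": [0.60, 0.60],  # reads a single outlet twice
}
polarities = [user_polarity(slants) for slants in reads.values()]
```

A user who reads 90% of articles from one side gets o_i = 0.1 whichever side it is, so the measure captures how much of the minority side a user sees, not the direction of the lean.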

Alt-Tab at 2019-04-26 16:21:25

I agree that it is a very good article. I wish the Algodiv project had the same data… The main contributions are the results on echo chambers: **people _are_ in echo chambers, but the influence remains limited because all users massively consume news content from mainstream media**. I find the "segregation" metric interesting; it should be compared (and merged?) with other diversity-related metrics. I see a few more limits with the methodology. The main one is the reliability of the _slant_ of a news outlet, obtained from the location of the IP addresses of the readers (not exactly reliable) matched with the voting results for a given county at the 2012 presidential election(!). Besides the reliability of the metric, it is very hard to get any intuition for this made-up scale. Is a 0.11 interval large or small? What does it mean to have BBC at 0.3 and FoxNews at 0.59? Is a difference between 0.3 and 0.32 truly the same as between 0.48 and 0.5?

rfs at 2019-07-30 13:37:15

