The paper describes the architecture of YouTube's recommendation system as of 2016. It is based on deep learning and replaced the previous matrix-factorization-based architecture (ref. 23). The architecture is built with Google Brain's TensorFlow and performs significantly better than the former one (see Fig. 6 for an example).

The paper is a high-level description of the architecture and sometimes lacks the technical details that would allow a precise understanding. However, it offers very interesting insights into the problems faced and the solutions considered by YouTube engineers, and into the actual constraints of industrial recommendation systems. It also helps one realize that there is as much craft as science in the process.

Overall, the architecture consists of two main phases:

- coarse candidate generation (creating a set of a few hundred videos per user from the corpus of several million videos available),
- precise ranking of the candidates.

Both steps use a DNN architecture.

Some technical details:

- The user profile is summarized as a heterogeneous embedding of identifiers of videos watched, demographic information, search tokens, etc.
- A decisive advantage of DNNs highlighted in the paper is their ability to deal with heterogeneous signals (sources of information of various kinds); another is that they (partly) circumvent manual feature engineering.
- While development of the method relies on offline evaluation metrics (precision, recall, etc.), the final evaluation relies on live A/B testing experiments. The discussion of this point in Sec. 3.4 is very interesting.
- For candidate generation, YouTube uses implicit feedback (e.g. watch times) rather than explicit feedback (e.g. thumbs up) because far more of it is available.
- Taking into account the "freshness" of a video has an important impact on the efficacy of candidate generation (Fig. 4).
- Taking into account the context of a watch (meaning the sequence of watches) is also important, as co-watch probabilities are highly asymmetric. In particular, the user's previous actions on similar items matter.
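The two-phase structure summarized above can be illustrated with a toy sketch. This is not the paper's model: both stages there are deep networks, whereas here a plain dot product stands in for candidate generation and a hand-written score stands in for ranking. All names (`generate_candidates`, `rank`, `freshness`, the 0.1 weight, the random embeddings) are hypothetical and chosen only for illustration.

```python
import random

def average(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def generate_candidates(user_vec, video_vecs, k):
    """Coarse stage: keep the k videos whose embeddings score highest
    against the user vector (a stand-in for the candidate-generation DNN)."""
    by_score = sorted(video_vecs,
                      key=lambda vid: dot(user_vec, video_vecs[vid]),
                      reverse=True)
    return by_score[:k]

def rank(candidates, user_vec, video_vecs, freshness):
    """Precise stage: re-score only the candidates with a richer function.
    The freshness bonus and its 0.1 weight are assumptions of this sketch."""
    def score(vid):
        return dot(user_vec, video_vecs[vid]) + 0.1 * freshness[vid]
    return sorted(candidates, key=score, reverse=True)

random.seed(0)
DIM = 8
# Toy corpus: random embeddings instead of learned ones.
video_vecs = {f"v{i}": [random.gauss(0, 1) for _ in range(DIM)]
              for i in range(1000)}
freshness = {vid: random.random() for vid in video_vecs}

# User profile: average of the embeddings of the videos watched.
watched = ["v1", "v2", "v3"]
user_vec = average([video_vecs[v] for v in watched])

candidates = generate_candidates(user_vec, video_vecs, k=50)
ranking = rank(candidates, user_vec, video_vecs, freshness)
print(ranking[:5])
```

The point of the split is cost: the cheap stage looks at the whole corpus, while the more expensive scoring only ever sees the short candidate list.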
There is no 👍-style explicit feedback for comments yet, so I'll use the implicit kind. Alt-Tab's comment is a very condensed one, and that matters: I even think it is better than the original article's abstract. Now I really want to drop everything, sit down, and read this article in detail.

### Style

I like Levin's style: he writes slightly provocative phrases that look, at the same time, incredibly truthful (at least to the author himself). Consider, for example, what he says about the invention of positional numeral systems and quantum computers:

> Archimedes made a great discovery that digital representation of numbers is exponentially
> more efficient than analog ones (sand pile sizes). Many subsequent analog devices yielded
> unimpressive results. It is not clear why QCs [quantum computers] should be an exception.

### One-way functions and the axiom of choice

One-way functions and the axiom of choice both deal with the practical or conceptual (im)possibility of solving [inverse problems](https://en.wikipedia.org/wiki/Inverse_problem) arising in Computer Science and Mathematics.

### Kolmogorov complexity and one-way functions

Two primes $p$ and $q$ have almost the same [informational content](https://en.wikipedia.org/wiki/Algorithmic_information_theory) as their product $pq$. However, the map [$p, q \mapsto pq$](https://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations#Arithmetic_functions) is much easier to compute than [$pq \mapsto p,q$](https://en.wikipedia.org/wiki/Integer_factorization#Prime_decomposition).
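
The asymmetry between multiplying and factoring can be felt even at toy scale. The sketch below is only an illustration: it inverts the product with naive trial division, which is far from the best known factoring method, and the function names and the two example primes (the 10,000th and 100,000th primes) are choices made for this example.

```python
def multiply(p, q):
    """Forward direction: a single multiplication."""
    return p * q

def factor_semiprime(n):
    """Inverse direction by trial division.
    Returns (d, n // d) for the smallest divisor d > 1 of n,
    or (1, n) if n is prime."""
    if n % 2 == 0:
        return 2, n // 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 2  # skip even candidates
    return 1, n

p, q = 104729, 1299709      # two known primes
n = multiply(p, q)          # one step
factors = factor_semiprime(n)  # tens of thousands of trial divisions
print(n, factors)
```

With primes of cryptographic size the forward step stays cheap, while this kind of search becomes hopeless; whether every possible inversion method must be slow is exactly what remains unproven.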