Wellcome to Maximimi's library,

  You can find here all papers liked or uploaded by Maximimi
  together with brief user bio and description of her/his academic activity.


[Link to my homepage](https://sites.google.com/view/danisch/home) ## I will read the following papers. - [Quasi-Succinct Indices](https://papers-gamma.link/paper/130) - [PageRank as a Function of the Damping Factor](https://papers-gamma.link/paper/106) - [Graph Stream Algorithms: A Survey](https://papers-gamma.link/paper/102) - [Network Sampling: From Static to Streaming Graphs](https://papers-gamma.link/paper/122) - [The Protein-Folding Problem, 50 Years On](https://papers-gamma.link/paper/78) - [Computational inference of gene regulatory networks: Approaches, limitations and opportunitie](https://papers-gamma.link/paper/77) - [Graph complexity analysis identifies an ETV5 tumor-specific network in human and murine low-grade glioma](https://papers-gamma.link/paper/79) - [Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis](https://papers-gamma.link/paper/76) - [The non-convex Burer–Monteiro approach works on smooth semidefinite programs](https://papers-gamma.link/paper/80) - [Solving SDPs for synchronization and MaxCut problems via the Grothendieck inequality](https://papers-gamma.link/paper/81) - [Influence maximization in complex networks through optimal percolation](https://papers-gamma.link/paper/70) - [Motifs in Temporal Networks](https://papers-gamma.link/paper/61) - [Deep Sparse Rectifier Neural Networks](https://papers-gamma.link/paper/69) - [Sparse Convolutional Neural Networks](https://papers-gamma.link/paper/67) - [A fast and simple algorithm for training neural probabilistic language models](https://papers-gamma.link/paper/58) - [Adding One Neuron Can Eliminate All Bad Local Minima](https://papers-gamma.link/paper/71) ## I read the following papers. ### 2019-2020 - [Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs](https://papers-gamma.link/paper/162) - [Karp-Sipser based kernels for bipartite graph matching](https://papers-gamma.link/paper/160) - [Speedup Graph Processing by Graph Ordering](https://papers-gamma.link/paper/159) - [Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering](https://papers-gamma.link/paper/154) ### 2018-2019: - [SWeG: Lossless and Lossy Summarization of Web-Scale Graphs](https://papers-gamma.link/paper/139) - [Smoothed Analysis: An Attempt to Explain the Behavior of Algorithms in Practice](https://papers-gamma.link/paper/129) - [Are stable instances easy?](https://papers-gamma.link/paper/128) - [Hierarchical Taxonomy Aware Network Embedding](https://papers-gamma.link/paper/116) - [Billion-scale Network Embedding with Iterative Random Projection](https://papers-gamma.link/paper/110) - [HARP: Hierarchical Representation Learning for Networks](https://papers-gamma.link/paper/109/) - [Layered Label Propagation: A MultiResolution Coordinate-Free Ordering for Compressing Social Networks](https://papers-gamma.link/paper/105) ### 2017-2018: - [Link Prediction in Graph Streams](https://papers-gamma.link/paper/101) - [The Community-search Problem and How to Plan a Successful Cocktail Party](https://papers-gamma.link/paper/74) - [A Nonlinear Programming Algorithm for Solving Semidefinite Programs via Low-rank Factorization](https://papers-gamma.link/paper/55) - [Deep Learning](https://papers-gamma.link/paper/68) - [Reducing the Dimensionality of Data with Neural Networks](https://papers-gamma.link/paper/65) - [Representation Learning on Graphs: Methods and Applications](https://papers-gamma.link/paper/60) - [Improved Approximation Algorithms for MAX k-CUT and MAX BISECTION](https://papers-gamma.link/paper/56) - [Cauchy Graph Embedding](https://papers-gamma.link/paper/53) - [Phase Transitions in Semidefinite Relaxations](https://papers-gamma.link/paper/57) - [Graph Embedding Techniques, Applications, and Performance: A Survey](https://papers-gamma.link/paper/52) - [VERSE: Versatile Graph Embeddings from Similarity Measures](https://papers-gamma.link/paper/48) - [Hierarchical Clustering Beyond the Worst-Case](https://papers-gamma.link/paper/45) - [Scalable Motif-aware Graph Clustering](https://papers-gamma.link/paper/18) - [Practical Algorithms for Linear Boolean-width](https://papers-gamma.link/paper/40) - [New Perspectives and Methods in Link Prediction](https://papers-gamma.link/paper/28/New%20Perspectives%20and%20Methods%20in%20Link%20Prediction) - [In-Core Computation of Geometric Centralities with HyperBall: A Hundred Billion Nodes and Beyond](https://papers-gamma.link/paper/37) - [Diversity is All You Need: Learning Skills without a Reward Function](https://papers-gamma.link/paper/36) - [When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors](https://papers-gamma.link/paper/23) - [Fast Approximation of Centrality](https://papers-gamma.link/paper/35/Fast%20Approximation%20of%20Centrality) - [Indexing Public-Private Graphs](https://papers-gamma.link/paper/19/Indexing%20Public-Private%20Graphs) - [On the uniform generation of random graphs with prescribed degree sequences](https://papers-gamma.link/paper/26/On%20the%20uniform%20generation%20of%20random%20graphs%20with%20prescribed%20d%20egree%20sequences) - [Linear Additive Markov Processes](https://papers-gamma.link/paper/21/Linear%20Additive%20Markov%20Processes) - [ESCAPE: Efficiently Counting All 5-Vertex Subgraphs](https://papers-gamma.link/paper/17/ESCAPE:%20Efficiently%20Counting%20All%205-Vertex%20Subgraphs) - [The k-peak Decomposition: Mapping the Global Structure of Graphs](https://papers-gamma.link/paper/16/The%20k-peak%20Decomposition:%20Mapping%20the%20Global%20Structure%20of%20Graphs) - [A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem](https://papers-gamma.link/paper/24)

Comments:

Hard to understand for a newbie in Deep RL. A formal definition of what is a skill ("A skill is simply a policy.") would help. ### Typos: - "guaranteeing that is has maximum entropy" - "We discuss the the log p(z) term in Appendix B." - "so it much first gather momentum" - "While are skills are learned"
I agree with the previous comment. The article seems to aim at people who are already familiar with reinforcement (not necessarily deep, or based on neural networks I guess) and its usual benchmark. The implementations are not detailed, the authors lay stress on the general idea (which is relatively simple to get), and its visual results which look quite spectacular.
The problem from my point of view is that a scientist has close to no incentive to give additional context information. For example, an author would not add a link to a Wikipedia page, as it is considered (rightly or wrongly) under the scientific quality standards. Even worse: I have observed that in most scientific communities, it is expected that you don't give too much details on the some definitions, assumptions or computations if they are considered as "common knowledge". In other words, adding too much details is understood as a sign that you are an outsider of the community. In anyway, hyperlinks would not solve everything, as the writer still has to target a given "level of knowledge" of the reader, he would make the information understandable by a specific audience. Typically, when I read about a field which is new to me, I first go to Wikipedia. The level of details of a Wikipedia page is widely heterogeneous: some pages are too basic for what I already know, but others are too elaborate, and as you said, I am not willing to spend the "focusing time" necessary for me to understand such a page if I cannot even evaluate if I will have a much better understanding of the paper afterwards. The author is certainly more capable of evaluating what are the context information which are useful to understand what he writes, but you are right that it costs time to him as well. So, I think that in a way, the problem can be seen as an economic one, of which focusing time is the main resource.
> For example, an author would not add a link to a Wikipedia page, as it is considered (rightly or wrongly) under the scientific quality standards I would not agree, some scientists already write blog-posts about their beloved subjects using a lot of hyperlinks to Wikipedia and other sites, consider for example [Igor Pak's](https://igorpak.wordpress.com/2015/05/26/the-power-of-negative-thinking-part-i-pattern-avoidance/) or [Terence Tao's](https://terrytao.wordpress.com/2009/04/26/szemeredis-regularity-lemma-via-random-partitions/) blogs. Currently, such blog-posts are considered complementary to "real" scientific publications. But but one day everything will change.
Read the paper, add your comments…

Comments:

### The Louvain algorithm: "In some cases, it may be possible to have a very fast algorithms based on heuristics to compute partitions, however, we are unaware of any such methods that would have provable guarantees for the kinds of graphs that appear in hierarchical clustering" The Louvain algorithm: https://perso.uclouvain.be/vincent.blondel/research/louvain.html is such an efficient algorithm. It outputs a hierarchical clustering. The algorithm optimizes the "modularity" in a greedy way, it then merges clusters into "hypernodes" once a local minimum is found and iterates on the newly formed graph. In most cases, only the partition corresponding to the last level (the larger clusters) is used. It is known for being fast and accurate in practice. However, to the best of my knowledge, it does not have provable approximation guarantees. ### Wikipedia is a graph with a ground truth hierarchical clustering: The set of Wikipedia pages and the hypertext links among them form a graph. The Wikipedia categories can be seen as a ground-truth hierarchical clustering of this graph. It would be interesting to see whether the suggested algorithm can find them. Datatset: http://cfinder.org/wiki/?n=Main.Data#toc1 ### Points in an ultrametric space have a natural hierarchical clustering: What is the intuition behind this fact? ### Minors: - in Defnition 1.1 "Let G be a graph generated by a minimal ultrametric". "ultrametric" is defined earlier in the paper, but "minimal ultrametric" is not. ### Typos: - "a very fast algorithms based on" - "The expected graph as the is the weighted complete graph"
> It outputs a hierarchical clustering. The algorithm optimizes the "modularity" in a greedy way, it then merges clusters into "hypernodes" once a local minimum is found and iterates on the newly formed graph. It seems that the trees considered in the article are binary trees, while Louvain algorithm yields a hierarchical clustering tree which is not necessarily binary.
Video of the talk: https://www.lincs.fr/events/hierarchical-clustering-objective-functions-and-algorithms/
> ### Wikipedia is a graph with a ground truth hierarchical clustering: > > The set of Wikipedia pages and the hypertext links among them form a graph. The Wikipedia categories can be seen as a ground-truth hierarchical clustering of this graph. It would be interesting to see whether the suggested algorithm can find them. > > Datatset: http://cfinder.org/wiki/?n=Main.Data# Actually, this hierarchy is a DAG (directed acyclic graph), not a tree.
Read the paper, add your comments…

Comments:

An iconic paper dealing with autoencoders and layer by layer pretraining of deep neural networks. Layer by layer pretraining of a deep neural network allows finding a good initialization for the subsequent backprop fine tuning of the hole network. It nicely concludes "It has been obvious since the 1980s that backpropagation through deep autoencoders would be very effective for nonlinear dimensionality reduction, provided that computers were fast enough, data sets were big enough, and the initial weights were close enough to a good solution. All three conditions are now satisfied.". ### Scales quadratically with the dimensionality of the input? I read: "autoencoders [...] can be applied to very large data sets because both the pretraining and the fine-tuning scale linearly in time and space with the number of training cases." Indeed it does scale linearly with the number of training cases. However, I think that the method scales quadratically with the dimensionality of the input data as the two first layers seem to have similar sizes. Is there any way to adapt the method for very high dimensional data? Indeed, if the input dimensionality is 1M, then the method may not scale as it will require a number of connections in the order of $(1M)^2$. Is it ok if the second layer is already small compared to the dimensionality of the input data? Or is it ok if the connections between the first two layers are sparse (not all connections are there, just a linear number of them)?
Read the paper, add your comments…
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26