Welcome to Maximimi's library,

  Here you can find all the papers liked or uploaded by Maximimi,
  together with a brief user bio and a description of their academic activity.


[Link to my homepage](https://sites.google.com/view/danisch/home)

## I will read the following papers.

- [Quasi-Succinct Indices](https://papers-gamma.link/paper/130)
- [PageRank as a Function of the Damping Factor](https://papers-gamma.link/paper/106)
- [Graph Stream Algorithms: A Survey](https://papers-gamma.link/paper/102)
- [Network Sampling: From Static to Streaming Graphs](https://papers-gamma.link/paper/122)
- [The Protein-Folding Problem, 50 Years On](https://papers-gamma.link/paper/78)
- [Computational inference of gene regulatory networks: Approaches, limitations and opportunities](https://papers-gamma.link/paper/77)
- [Graph complexity analysis identifies an ETV5 tumor-specific network in human and murine low-grade glioma](https://papers-gamma.link/paper/79)
- [Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis](https://papers-gamma.link/paper/76)
- [The non-convex Burer–Monteiro approach works on smooth semidefinite programs](https://papers-gamma.link/paper/80)
- [Solving SDPs for synchronization and MaxCut problems via the Grothendieck inequality](https://papers-gamma.link/paper/81)
- [Influence maximization in complex networks through optimal percolation](https://papers-gamma.link/paper/70)
- [Motifs in Temporal Networks](https://papers-gamma.link/paper/61)
- [Deep Sparse Rectifier Neural Networks](https://papers-gamma.link/paper/69)
- [Sparse Convolutional Neural Networks](https://papers-gamma.link/paper/67)
- [A fast and simple algorithm for training neural probabilistic language models](https://papers-gamma.link/paper/58)
- [Adding One Neuron Can Eliminate All Bad Local Minima](https://papers-gamma.link/paper/71)

## I read the following papers.

### 2018-2019:

- [SWeG: Lossless and Lossy Summarization of Web-Scale Graphs](https://papers-gamma.link/paper/139)
- [Smoothed Analysis: An Attempt to Explain the Behavior of Algorithms in Practice](https://papers-gamma.link/paper/129)
- [Are stable instances easy?](https://papers-gamma.link/paper/128)
- [Hierarchical Taxonomy Aware Network Embedding](https://papers-gamma.link/paper/116)
- [Billion-scale Network Embedding with Iterative Random Projection](https://papers-gamma.link/paper/110)
- [HARP: Hierarchical Representation Learning for Networks](https://papers-gamma.link/paper/109/)
- [Layered Label Propagation: A MultiResolution Coordinate-Free Ordering for Compressing Social Networks](https://papers-gamma.link/paper/105)

### 2017-2018:

- [Link Prediction in Graph Streams](https://papers-gamma.link/paper/101)
- [The Community-search Problem and How to Plan a Successful Cocktail Party](https://papers-gamma.link/paper/74)
- [A Nonlinear Programming Algorithm for Solving Semidefinite Programs via Low-rank Factorization](https://papers-gamma.link/paper/55)
- [Deep Learning](https://papers-gamma.link/paper/68)
- [Reducing the Dimensionality of Data with Neural Networks](https://papers-gamma.link/paper/65)
- [Representation Learning on Graphs: Methods and Applications](https://papers-gamma.link/paper/60)
- [Improved Approximation Algorithms for MAX k-CUT and MAX BISECTION](https://papers-gamma.link/paper/56)
- [Cauchy Graph Embedding](https://papers-gamma.link/paper/53)
- [Phase Transitions in Semidefinite Relaxations](https://papers-gamma.link/paper/57)
- [Graph Embedding Techniques, Applications, and Performance: A Survey](https://papers-gamma.link/paper/52)
- [VERSE: Versatile Graph Embeddings from Similarity Measures](https://papers-gamma.link/paper/48)
- [Hierarchical Clustering Beyond the Worst-Case](https://papers-gamma.link/paper/45)
- [Scalable Motif-aware Graph Clustering](https://papers-gamma.link/paper/18)
- [Practical Algorithms for Linear Boolean-width](https://papers-gamma.link/paper/40)
- [New Perspectives and Methods in Link Prediction](https://papers-gamma.link/paper/28/New%20Perspectives%20and%20Methods%20in%20Link%20Prediction)
- [In-Core Computation of Geometric Centralities with HyperBall: A Hundred Billion Nodes and Beyond](https://papers-gamma.link/paper/37)
- [Diversity is All You Need: Learning Skills without a Reward Function](https://papers-gamma.link/paper/36)
- [When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors](https://papers-gamma.link/paper/23)
- [Fast Approximation of Centrality](https://papers-gamma.link/paper/35/Fast%20Approximation%20of%20Centrality)
- [Indexing Public-Private Graphs](https://papers-gamma.link/paper/19/Indexing%20Public-Private%20Graphs)
- [On the uniform generation of random graphs with prescribed degree sequences](https://papers-gamma.link/paper/26/On%20the%20uniform%20generation%20of%20random%20graphs%20with%20prescribed%20d%20egree%20sequences)
- [Linear Additive Markov Processes](https://papers-gamma.link/paper/21/Linear%20Additive%20Markov%20Processes)
- [ESCAPE: Efficiently Counting All 5-Vertex Subgraphs](https://papers-gamma.link/paper/17/ESCAPE:%20Efficiently%20Counting%20All%205-Vertex%20Subgraphs)
- [The k-peak Decomposition: Mapping the Global Structure of Graphs](https://papers-gamma.link/paper/16/The%20k-peak%20Decomposition:%20Mapping%20the%20Global%20Structure%20of%20Graphs)
- [A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem](https://papers-gamma.link/paper/24)

Comments:

Interesting paper around the question: "Can machine learning be used to cheaply detect and exploit structure in practically relevant instances of NP-hard problems that come from the same distribution?" The problem of listing all maximum cliques in a graph is considered. A binary classifier is trained to discriminate between nodes that belong to a maximum clique and nodes that do not: logistic regression over node features (such as the total number of edges in the graph, the node's degree and its clustering coefficient), trained on a set of graphs. The classifier is then used to prune the search space of a test graph: nodes labeled "not belonging to a maximum clique" by the classifier are removed. Some speedup is observed when comparing the running time of state-of-the-art methods on the original graph (or on a graph pruned using a baseline method, degree pruning) with their running time on the pruned graph. (A hedged sketch of this pipeline is given after this comment.)

# A more challenging baseline pruning

For the baseline pruning, a heuristic is used to find a large clique; if the found clique has size $k$, all nodes of degree $k-2$ or less are then removed. It would have been better to remove such nodes iteratively until none remain, thus keeping only the $(k-1)$-core of the graph (it seems that only nodes whose degree is $k-2$ or less in the original graph are removed). That would lead to much smaller graphs and a more challenging baseline pruning; a sketch of this k-core pruning also follows this comment. How good is it in practice? In Table 1, the speedup obtained with degree pruning is not reported for igraph and EmMCE. Is the speedup obtained by the proposed pruning much better than the one obtained by degree pruning for these two programs?

# Setting the confidence threshold

The confidence threshold used for the logistic regression classifier is arbitrarily fixed to $q=0.55$. If $q$ is higher, then the list of maximum cliques computed on the pruned graph may be incorrect (i.e. it may differ from the one computed on the original graph, since nodes belonging to a maximum clique would be pruned). It is not clear to me why all results in Table 1 are exact: just luck? How about using $q=0.6$ or $0.7$? Parameter sensitivity is not discussed. Note that if there is a single maximum clique of size $k$ and many cliques of size $k-1$, then removing a node from the maximum clique completely changes the output (which may then be many cliques of size $k-1$ instead of a single clique of size $k$).
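A minimal sketch of the pruning pipeline described above, assuming networkx and scikit-learn. The feature set and the threshold $q$ follow the comment; everything else (training graphs, how labels are produced, the exact reading of the threshold) is illustrative, not the paper's actual setup:

```python
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def node_features(G):
    """Features named in the comment: total edge count of the graph,
    node degree, clustering coefficient (one row per node)."""
    m = G.number_of_edges()
    clust = nx.clustering(G)
    return np.array([[m, G.degree(u), clust[u]] for u in G.nodes()])

def max_clique_labels(G):
    """1 iff the node belongs to some maximum clique (exact labels,
    affordable only on small training graphs)."""
    cliques = list(nx.find_cliques(G))            # all maximal cliques
    k = max(len(c) for c in cliques)
    in_max = set().union(*(c for c in cliques if len(c) == k))
    return np.array([1 if u in in_max else 0 for u in G.nodes()])

# Train on a set of random graphs (a stand-in for the paper's training set).
train = [nx.gnp_random_graph(60, 0.3, seed=s) for s in range(20)]
X = np.vstack([node_features(G) for G in train])
y = np.concatenate([max_clique_labels(G) for G in train])
clf = LogisticRegression(max_iter=1000).fit(X, y)

def prune_with_classifier(G, clf, q=0.55):
    """Drop nodes the classifier labels 'not in a maximum clique' with
    confidence >= q (one plausible reading of the threshold q)."""
    p_in = clf.predict_proba(node_features(G))[:, 1]
    keep = [u for u, p in zip(G.nodes(), p_in) if 1.0 - p < q]
    return G.subgraph(keep).copy()
```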
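And a sketch of the stronger baseline suggested above. Since every node of a clique of size at least $k$ has degree at least $k-1$ inside the clique, it is safe to peel nodes of degree $\le k-2$ iteratively and keep only the $(k-1)$-core; plain-Python peeling is shown here (`networkx.k_core` would give the same result):

```python
from collections import deque

def kcore_prune(adj, k):
    """Keep only the k-core: iteratively delete nodes of degree < k.

    adj: dict node -> set of neighbours (undirected, symmetric).
    Returns a pruned copy; the input is left untouched."""
    adj = {u: set(nbrs) for u, nbrs in adj.items()}
    queue = deque(u for u, nbrs in adj.items() if len(nbrs) < k)
    while queue:
        u = queue.popleft()
        if u not in adj:
            continue                      # already removed
        for v in adj.pop(u):
            adj[v].discard(u)
            if len(adj[v]) < k:           # v may now fall below the threshold
                queue.append(v)
    return adj

# If the heuristic finds a clique of size k, keep the (k-1)-core:
# pruned = kcore_prune(adj, k - 1)
```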

Comments:

Great paper by the founders of Smoothed Analysis! Some nice quotes:

- "if one can prove that an algorithm performs well in the worst case, then one can be confident that it will work well in every domain. However, there are many algorithms that work well in practice that do not work well in the worst case. Smoothed analysis provides a theoretical framework for explaining why some of these algorithms do work well in practice."
- "The performance profiles of algorithms across the landscape of input instances can differ greatly and can be quite irregular."
- "If a single input instance triggers an exponential run time, the algorithm is called an exponential-time algorithm."
- "While polynomial time algorithms are usually viewed as being efficient, we clearly prefer those whose run time is a polynomial of low degree, especially those that run in nearly linear time."
- "Developing means for predicting the performance of algorithms and heuristics on real data and on real computers is a grand challenge in algorithms."
- "We hope that theoretical explanations will be found for the success in practice of many of these algorithms, and that these theories will catalyze better algorithm design."
- "there are many problems that need to be solved in practice for which we do not know algorithms with good worst-case performance. Instead, scientists and engineers typically use heuristic algorithms to solve these problems. Many of these algorithms work well in practice, in spite of having a poor, sometimes exponential, worst-case running time. Practitioners justify the use of these heuristics by observing that worst-case instances are usually not “typical” and rarely occur in practice. The worst-case analysis can be too pessimistic."
- "heuristics are often used to speed up the practical performance of implementations that are based on algorithms with polynomial worst-case complexity. These heuristics might in fact worsen the worst-case performance, or make the worst-case complexity difficult to analyze."
- "While one would ideally choose the distribution of inputs that occur in practice, this is difficult as it is rare that one can determine or cleanly express these distributions, and the distributions can vary greatly between one application and another. Instead, average-case analyses have employed distributions with concise mathematical descriptions, such as Gaussian random vectors, uniform {0, 1} vectors, and Erdős–Rényi random graphs. The drawback of using such distributions is that the inputs actually encountered in practice may bear very little resemblance to the inputs that are likely to be generated by such distributions."
- "Because of the intrinsic difficulty in defining practical distributions, we consider an alternative approach to modeling real data. The basic idea is to identify typical properties of practical data, define an input model that captures these properties, and then rigorously analyze the performance of algorithms assuming their inputs have these properties. Smoothed analysis is a step in this direction. It is motivated by the observation that practical data are often subject to some small degree of random noise."
- "At a high level, each input is generated from a two-stage model: In the first stage, an instance is generated and in the second stage, the instance from the first stage is slightly perturbed. The perturbed instance is the input to the algorithm."
- "we hope insights gained from smoothed analysis will lead to new ideas in algorithm design [...] we suggest that it might be possible to solve some problems more efficiently by perturbing their inputs."
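For reference, the two-stage model in the last quotes is usually formalized as follows (a standard formulation, assumed here rather than quoted from the paper): the adversary chooses the instance $x$, then nature perturbs it with noise of magnitude $\sigma$, and the smoothed complexity is the worst expected cost

$$C_{\text{smoothed}}(n, \sigma) \;=\; \max_{x \in \mathcal{X}_n} \; \mathbb{E}_{g \sim \mathcal{N}(0,\, \sigma^2 I)}\big[\, T(x + g) \,\big],$$

where $T$ is the running time of the algorithm. Letting $\sigma \to 0$ recovers worst-case analysis, while dropping the $\max$ and fixing a distribution over $x$ gives an average-case analysis.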