### Comments:

I'm in love with this paper! ### Non-worst-case analysis: In practice, "we are not interested in all problem instances, but only in those which can actually occur in reality." "The notion of stability [...] is a concrete way to formalize the notion that the only instances of interest are those for which small perturbation in the data (which may reflect e.g. some measurement errors) do not change the optimal partition of the graph." This Stable Analysis is different from "Smoothed Analysis" where "one shows that the hard instances form a discrete and isolated subset of the input space". ### Open problems: **Conjecture:** There exists some constant $\gamma^{*}$ such that $\gamma^{*}$-stable instances can be solved in polynomial time. **Question:** it is shown that $\gamma$-stable instances, with $\gamma>\sqrt{\Delta n}$, can be solved in polynomial time. Can this be improved (without further assumptions such as a lower bound on the minimum degree)? As $\sqrt{\Delta n}$ is usually large, this may not be useful in practice. **Question:** How does the algorithm "FindMaxCut" (page 6) perform in practice on real-world instances??? **Question:** How about the greedy heuristic?: Start from a random cut and do passes on the nodes moving each node to the other side of the cut if the size of the cut increases until convergence. Does it have some guarantee on $\gamma$-stable instances? ### Extended spectral clustering: "Let D be a diagonal matrix. Think of W + D as the weighted adjacency matrix of a graph, with loops added. Such loops do not change the weight of any cut, so that regardless of what D we choose, a cut is maximal in W iff it is maximal in W + D. Furthermore, it is not hard to see that W is $\gamma$-stable, iff W + D is. Our approach is to first find a “good” D, and then take the spectral partitioning of W + D as the maximal cut. These observations suggest the following question: Is it true that for every $\gamma$-stable instance W with γ large enough there exists a diagonal D for which extended spectral partitioning solves Max-Cut? If so, can such a D be found efficiently? Below we present certain sufficient conditions for these statements." I did not fully understand what is presented below that paragraph. Let G be a $\gamma$-stable graph, how do I get $D$? ### Goemans-Williamson algorithm: The approximation guarantee of the Goemans-Williamson algorithm is better on $\gamma$-stable instances than in general. ### Random model: With high probability, the extended specral clustering leads to the optimal cut on $\gamma$-stable instances generated from a certain random model for $\gamma\geq 1+\Omega(\sqrt{\frac{\log(n)}{n}})$. ### Typos: - page 3, Proposition 2.1: " A graph G graph" - page 4: "this follows from Definition 2.1", should be "Proposition 2.1". - page 5, Definition 2.2: should be "E" instead of "e" in the equation. - page 5: "which must to be on the" and "of the optional cut" -> "optimal". - page 8: "we multiply it be a PSD matrix"
Read the paper, add your comments…

### Comments:

Nice paper building on top of [the WebGraph framework](https://papers-gamma.link/paper/31) and [Chierichetti et al.](https://papers-gamma.link/paper/126) to compress graphs. ### Approximation guarantee I read: "our algorithm is inspired by a theoretical approach with provable guarantees on the final quality, and it is designed to directly optimize the resulting compression ratio.". I misunderstood initially, but the proposed algorithm actually does not have any provable approximation guarantee other than the $\log(n)$ one (which is also obtained by a random ordering of the nodes). Designing an algorithm with (a better) approximation guarantee for minimizing "MLogA", "MLogGapA" or "BiMLogA" seems to be a nice open problem. ### Objectives Is there any better objective than "MLogA", "MLogGapA" or "BiMLogA" to have a proxy of the compression obtained by the BV-framework? Is it possible to directly look for an ordering that minimizes the size of the output of BV compression algorithm?
Read the paper, add your comments…

### Comments:

Read the paper, add your comments…

### Comments:

Hello, I need the key point of your algorithm is not really clear > • We start with the root such that all edges are possibly > here or not. The upper bound is the sum of the $n \choose 2$ > heaviest edges, while the associated solution is the empty > subgraph, the lower bound is thus 0. > > • Then at each iteration we create two children for the node > with maximum lower bound (i.e. density of the associated > solution). Suppose the node is at depth $i$ in the tree, we > keep the decisions made on the first $i-1$ edges and create > two children, one where the $i^\text{th}$ edge is included and one > where it is not. I need to understand this part in clear way if that is possible please
> I need to understand this part in clear way if that is possible please I see. I agree it is not perfectly clear, sorry about that. Can you try to understand it with the branch and bound [wikipedia page](https://en.wikipedia.org/wiki/Branch_and_bound), [the slides](https://drive.google.com/file/d/0B6cGK503Ibt0Qlg3bUVKRnFBTG8/view) and [the code](https://github.com/maxdan94/HkS)? If it is still not clear after that, please come back and I'll try to phrase a better explanation ASAP.
Read the paper, add your comments…

### Comments:

##To take away:## - This paper is about a slight improvement of the $k$-clique Algorithm of Chiba and Nishizeki - The performance in practice on sparse graphs is impressive - The parallelization is non-trivial and the speedup is nearly optimal up to 40 threads - Authors generate a stream of k-cliques to compute "compact" subgraphs - A parallel C code is available here: https://github.com/maxdan94/kClist ##Suggestions to extend this work:## - Can we find a node ordering better than the core ordering? - Generate a stream of $k$-cliques to compute other quantities? - Generalize the algorithm to $k$-motifs? - Parallelization on higher order $k$-cliques if more threads are available?
Slides of the talk: https://drive.google.com/file/d/15MVJ2TzkdsHcyF6tE4VeYQqH8bU0kzDE/view
> Another extension: can we guarantee a given order on the output stream? that it is uniformly random, for instance? I think that this is a very interesting and open question! I have tried to generate a stream of k-cliques such that the order is random by modifying the kClist algorithm. But I was not able to do so. I wanted to do that in order to optimize a function depending on the k-cliques using stochastic gradient descent. I have found that using a random ordering lead to a faster convergence than using the order the k-cliques are outputed by the kClist algorithm. Here is what I've tried: - If you have enough RAM, then you can of course store all k-cliques and do a [random permutation](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle). But, since you mention "steam", I do not think that this is the case for you. - You can use another node-ordering (different from the core-ordering) to form the DAG. You can use, for instance, a random node ordering. You may lose the theoretical upperbound on the running time, but you will see that, in practice, the algorithm is still very fast (say twice slower than with the core ordering (but this depends on the input graph and k, you may also find some settings where it is actually faster than with the core ordering)). The order the k-cliques are stream will then change, but it will not be uniform at random. - Once you have formed the DAG using the node ordering (core ordering or any other ordering), you do not need to process the nodes in that same order. You can use another random ordering for that. It will add some randomness in the stream, but the order will still not be uniform at random. Please let me know if you have any better ideas.
> Another extension: can we guarantee a given order on the output stream? that it is uniformly random, for instance? One possible way to do is using a buffer, however, it remains non uniform. A buffer of size n/100 can be filled at first using the first n/100 output. Afterwards, one K-clique is randomly selected to be « outputted » and replaced with a new k-clique. The larger the buffer, the closer the ouput will be to a uniformly random output.
Read the paper, add your comments…