## To take away:
- This paper is about a slight improvement of the $k$-clique listing algorithm of Chiba and Nishizeki
- The performance in practice on sparse graphs is impressive
- The parallelization is non-trivial and the speedup is nearly optimal up to 40 threads
- Authors generate a stream of k-cliques to compute "compact" subgraphs
- A parallel C code is available here: https://github.com/maxdan94/kClist
## Suggestions to extend this work:
- Can we find a node ordering better than the core ordering?
- Generate a stream of $k$-cliques to compute other quantities?
- Generalize the algorithm to $k$-motifs?
- Parallelization on higher order $k$-cliques if more threads are available?

Slides of the talk: https://drive.google.com/file/d/15MVJ2TzkdsHcyF6tE4VeYQqH8bU0kzDE/view

> Another extension: can we guarantee a given order on the output stream? that it is uniformly random, for instance?
I think that this is a very interesting and open question! I have tried to modify the kClist algorithm to generate a stream of k-cliques in random order, but I was not able to do so.
I wanted to do that in order to optimize a function depending on the k-cliques using stochastic gradient descent. I found that using a random ordering led to faster convergence than using the order in which the k-cliques are output by the kClist algorithm.
Here is what I've tried:
- If you have enough RAM, then you can of course store all k-cliques and do a [random permutation](https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle). But, since you mention "stream", I do not think that this is the case for you.
- You can use another node ordering (different from the core ordering) to form the DAG, for instance a random node ordering. You may lose the theoretical upper bound on the running time, but in practice the algorithm is still very fast: say, twice as slow as with the core ordering, though this depends on the input graph and on k, and you may even find settings where it is faster than with the core ordering. The order in which the k-cliques are streamed will then change, but it will not be uniformly random.
- Once you have formed the DAG using some node ordering (the core ordering or any other), you do not need to process the nodes in that same order: you can process them in a random order instead. This adds some randomness to the stream, but the order will still not be uniformly random.
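For the first option above (store everything, then shuffle), a minimal sketch of a uniformly random permutation looks like this. This is an illustrative Python helper, not part of kClist (which is written in C), and `shuffled_cliques` is a hypothetical name:

```python
import random

def shuffled_cliques(cliques, seed=None):
    """Return the k-cliques in uniformly random order via Fisher-Yates.

    Requires all cliques to fit in RAM, which is exactly the limitation
    discussed above for a true streaming setting.
    """
    rng = random.Random(seed)
    cliques = list(cliques)
    # In-place Fisher-Yates shuffle: every one of the n! orderings
    # is produced with equal probability.
    for i in range(len(cliques) - 1, 0, -1):
        j = rng.randint(0, i)
        cliques[i], cliques[j] = cliques[j], cliques[i]
    return cliques
```

The output is a permutation of the input, so downstream code (e.g. an SGD loop over k-cliques) sees every clique exactly once, in uniformly random order.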
Please let me know if you have any better ideas.

> Another extension: can we guarantee a given order on the output stream? that it is uniformly random, for instance?
One possible way to do this is to use a buffer; however, the output remains non-uniform. A buffer of size n/100 is first filled with the first n/100 k-cliques that are output. Afterwards, one k-clique is randomly selected from the buffer to be output and is replaced with a newly generated k-clique. The larger the buffer, the closer the output will be to a uniformly random order.
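The buffer idea above can be sketched as follows. This is an illustrative Python generator (`buffered_shuffle` is a hypothetical name, and the buffer size is a parameter rather than a fixed n/100), wrapping any k-clique stream such as the one kClist produces:

```python
import random

def buffered_shuffle(stream, buffer_size, seed=None):
    """Approximately shuffle a stream using a fixed-size buffer.

    Not uniformly random: an item can only be delayed or advanced by
    roughly buffer_size positions, so long-range order is preserved.
    """
    rng = random.Random(seed)
    buf = []
    for item in stream:
        if len(buf) < buffer_size:
            buf.append(item)          # fill the buffer first
        else:
            j = rng.randrange(buffer_size)
            yield buf[j]              # output a random buffered clique...
            buf[j] = item             # ...and replace it with the new one
    rng.shuffle(buf)                  # flush the remaining buffer
    yield from buf
```

Each clique is output exactly once, so the result is a permutation of the stream; larger `buffer_size` gives an output closer to uniformly random, at the cost of more memory.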

Great title!

> Great title!
Seconded!

I think the paper is also nice. :)

I think this is a great paper. I like how the subject is presented; in particular, the state of the art and the motivation are well written. In my humble opinion, more papers should have an introduction as well written as this one. Although one has to readily accept that comparing large graphs is an important task, once this is accepted there is no ambiguity about where the authors want to lead the reader. There is a good overview of the problems with common methods in the literature, clear enough that the reader knows what is going on, yet simple enough not to feel overwhelming. The authors then clearly state how they position themselves with regard to these problems. The problem statement, in Section 3, follows the same pattern of clearly exposing what is what and then using this information to present their contribution.

I am not an expert on the subject, so I cannot make any serious judgment on the scope of the contribution. Some parts about scaling to large graphs seem a bit underwhelming to me, as the authors themselves point out that the Taylor expansion "provides a rather dubious approximation". Still, the theoretical work is well exposed and well developed, so I expect anyone in this field could use this as a great starting point to improve on the experimental results.
Overall, I wish more papers were written with such clarity. The authors took time to clearly state their problem, which enabled me, an outsider to such questions of graph comparison techniques, to easily follow their argument. This was an enjoyable read.

## Comments: