Very well written paper proposing an interesting algorithm to solve (lossless and lossy) graph summarization problems (cf problem definition). The algorithm is parallelizable in a shared memory environment and can be distributed using mapreduce. The theoretical analysis is sound and shows that the algorithm has a linear time complexity in practical scenarios. The experimental evaluation shows that the algorithm achieves a better trade-of between time and quality than existing approaches. In particular, the algorithm is faster than the greedy heuristic while leading to similar results. The proposed algorithm can be seen as a scalable implementation of the greedy heuristic: using shingle hashing of nodes allows to prune the search space and consider only some relevant pairs of nodes instead of considering all pairs of nodes (or all pairs of nodes sharing at least one neighbor). ### No theoretical approximation guarantees: The graph summarization problem is well defined and has an optimal solution. The proposed algorithm does not seem to have any theoretical approximation guarantees. According to the experimental evaluation, the quality of the output is similar to the one of the (straightforward) greedy heuristic, but we do not know how far from optimality it is. Is there any algorithm for the problem with some theoretical approximation guarantees? ### Absolute performance: While the performance relatively to other algorithms is good, the absolute performance of the algorithm is somehow disappointing. Figure 3: we see that the size of the output graphs is larger than 40% the size of the input graphs (in terms of (super) edges) for all graphs (and in many cases larger than 70%) except for the two web graphs (it is known that web graphs can be compressed efficiently (cf the webgraph framework)) and surprisingly a protein interaction graph (only 10% of edges are kept).

Well, apparently nobody knows how to enumerate directed animals according to the number of edges. It is an open question of combinatorics. The following table from "Directed Animals on Two Dimensional Lattices" article by A. R. Conway, R. Brak and A. J. Guttmann presents results for n<40 **Number of bond animals on the square lattice...**  1 1 2 2 3 5 4 14 5 42 6 130 7 412 8 1326 9 4318 10 14188 11 46950 12 156258 13 522523 14 1754254 15 5909419 16 19964450 17 67618388 18 229526054 19 780633253 20 2659600616 21 9075301990 22 31010850632 23 106100239080 24 363428599306 25 1246172974048 26 4277163883744 27 14693260749888 28 50516757992258 29 173812617499767 30 598455761148888 31 2061895016795926 32 7108299669877836 33 24519543126693604 34 84623480620967174 35 292204621065844292 36 1009457489428859322 37 3488847073597306764 38 12063072821044567580 39 41725940730851479532 40 144383424404966638976