ESCAPE: Efficiently Counting All 5-Vertex Subgraphs
Authors: Ali Pinar
Liked by: Maximimi, Open Reading Group
Domains: Graph mining
Tags: WWW2017
Uploaded by: Open Reading Group
Upload date: 2018-01-12 15:14:19


### Very nice application to graph classification / clustering: Frequencies of motifs are graph's features that can be plugged in a Machine Learning algorithm in order to perform classification or clustering of a set of graphs. ### Very nice application to link prediction / recommendation: page 8 "Trends in pattern counts". Some motifs are rarely observed as induced motifs. Thus, if such a motif is seen at time $t$ in a dynamic graph, then an edge between a pair of its vertices might be created (or be deleted) at time $t+1$. Note that the framework only counts motifs without listing/enumerating them. Some adaptations are thus needed to deal with link prediction and recommendation. ### Application to graph generation: Generate a random graph with a given number of each of the 21 possible 5-motifs. ### Some statements might be a little too strong: - "As we see in our experiments, the counts of 5-vertex patterns are in the orders of billions to trillions, even for graphs with a few million edges. An enumeration algorithm is forced to touch each occurrence, and cannot terminate in a reasonable time." - The framework is absolutely critical for exact counting, since enumeration is clearly infeasible even for graphs with a million edges. For instance, an autonomous systems graph with 11M edges has $10^{11}$ instances of pattern 17 (in Fig. 1)." - "For example, the tech-as-skitter graph has 11M edges, but 2 trillion 5-cycles. Most 5-vertex pattern counts are upwards of many billions for most of the graphs (which have only 10M edges). Any algorithm that explicitly touches each such pattern is bound to fail." The following C code generates a program that counts from 1 to one trillion. This program terminates within 30 minutes on a commodity machine (with -O3 option for the compiler optimization). ```c main() { unsigned long long i=0; while (i < 1e12) { i++; } } ``` This shows that "touching" billions or trillions of motifs might actually be possible in a reasonable amount of time. ### Minors: - "Degeneracy style ordering" might be confusing as a degree ordering is used and not a degeneracy ordering (i.e. k-core decomposition). - Figure 7: "The line is defined by...", should be "The prediction is defined by..." as the line is just "$y=x$". ### Typos: - "they do not scale graphs of such sizes." - "For example, the 6th 4-pattern in the four-clique and the 8th 5-pattern is the five-cycle." - "and $O\big(n+m+T(G)$ storage." - "An alternate (not equivalent) measure is the see the fraction of 4-cycle" - "an automonous systems graph with" ### Others: - - Is it possible to generalize the framework to any k?
A program for listing all k-motifs:

Please consider to register or login to comment on the paper.