A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem

☆

Download

Authors: Shweta Jain, C. Seshadhri
Liked by:
Domains: Graph mining
Tags: WWW2017

Uploaded by: Open Reading Group
Upload date: 2018-01-12 15:20:03

Comments:

### Arboricity is different from degeneracy: - "A classic bound on the degeneracy asserts that

$\alpha(G)\leq \sqrt{2m}$ (Lemma 1 of [12])." [12] is related to the arboricity! The degeneracy

$\alpha$ is the maximum of the minimum degree of any subgraph. The arboricity

$\beta$ is the minimum number of forests needed to cover all edges in the graph. In [12],

$\beta\leq ceil(\sqrt{2m+n})/2$ is proven. We have,

$\beta \leq \alpha \leq 2 \beta − 1$ . - "By a theorem of Chiba-Nishizeki [12], we can argue that the number of vertices in any set of the Turan shadow is at most

$\sqrt{2m}$ (where

$m$ is the number of edges in G)." We can actually state that the number of vertices in any set of the Turan shadow is at most

$\alpha$ where

$\alpha$ is the degeneracy of the input graph (in large and sparse real-world graphs we generally have

$\alpha<<\sqrt{2m}$ ). ### Tight bound on the shadow size: "Indeed, we can design instances where Claim 5.7 is tight (a set of

$n/\alpha$ Erdos-Renyi graphs

$G(\alpha,1/3))$ ." It is not clear why this is a tight bound.

$\alpha$ is the degeneracy of the input graph (the graph "

$n/\alpha$ Erdos-Renyi graphs

$G(\alpha,1/3))$ " may not have degeneracy

$\alpha$ ). ### Scalability of an enumeration procedure: "The primary challenge is combinatorial explosion. An autonomous system network with ten million edges has more than a trillion 10-cliques. Any enumeration procedure is doomed to failure." The fact that there can be "a trillion 10-cliques" may not be enough to discard an enumeration procedure. For instance, the program (counting from 1 to one trillion) generated by the following C code terminates within few minutes on a commodity machine. ```c main() { unsigned long long i=0; while (i ＜ 1e12) { i++; } } ``` ### Minors: - "Denote the number of k-cliques by

$C$ , then the success probability is

$C/{ n \choose k}$ . Thus, we can estimate this probability using

${ n \choose k}/C$ samples." This may need some explanations. ### Typos: - "It will be convenient to express this result in terms on k-clique densities."

### Arboricity is different from degeneracy: - "A classic bound on the degeneracy asserts that $\alpha(G)\leq \sqrt{2m}$ (Lemma 1 of [12])." [12] is related to the arboricity! The degeneracy $\alpha$ is the maximum of the minimum degree of any subgraph. The arboricity $\beta$ is the minimum number of forests needed to cover all edges in the graph. In [12], $\beta\leq ceil(\sqrt{2m+n})/2$ is proven. We have, $\beta \leq \alpha \leq 2 \beta − 1$. - "By a theorem of Chiba-Nishizeki [12], we can argue that the number of vertices in any set of the Turan shadow is at most $\sqrt{2m}$ (where $m$ is the number of edges in G)." We can actually state that the number of vertices in any set of the Turan shadow is at most $\alpha$ where $\alpha$ is the degeneracy of the input graph (in large and sparse real-world graphs we generally have $\alpha<<\sqrt{2m}$). ### Tight bound on the shadow size: "Indeed, we can design instances where Claim 5.7 is tight (a set of $n/\alpha$ Erdos-Renyi graphs $G(\alpha,1/3))$." It is not clear why this is a tight bound. $\alpha$ is the degeneracy of the input graph (the graph "$n/\alpha$ Erdos-Renyi graphs $G(\alpha,1/3))$" may not have degeneracy $\alpha$). ### Scalability of an enumeration procedure: "The primary challenge is combinatorial explosion. An autonomous system network with ten million edges has more than a trillion 10-cliques. Any enumeration procedure is doomed to failure." The fact that there can be "a trillion 10-cliques" may not be enough to discard an enumeration procedure. For instance, the program (counting from 1 to one trillion) generated by the following C code terminates within few minutes on a commodity machine. ```c main() { unsigned long long i=0; while (i ＜ 1e12) { i++; } } ``` ### Minors: - "Denote the number of k-cliques by $C$, then the success probability is $C/{ n \choose k}$. Thus, we can estimate this probability using ${ n \choose k}/C$ samples." This may need some explanations. ### Typos: - "It will be convenient to express this result in terms on k-clique densities."

Stranger at 2018-05-10 16:51:28

Papers^γ

Comments:

Please consider to register or login to comment on the paper.

Papersγ

Comments:

Please consider to register or login to comment on the paper.

Papers^γ