together with brief user bio and description of her/his academic activity.

[Link to my homepage](https://sites.google.com/view/danisch/home)
## I will read the following papers.
- [PageRank as a Function of the Damping Factor](https://papers-gamma.link/paper/106)
- [Graph Stream Algorithms: A Survey](https://papers-gamma.link/paper/102)
- [Network Sampling: From Static to Streaming Graphs](https://papers-gamma.link/paper/122)
- [The Protein-Folding Problem, 50 Years On](https://papers-gamma.link/paper/78)
- [Computational inference of gene regulatory networks: Approaches, limitations and opportunitie](https://papers-gamma.link/paper/77)
- [Graph complexity analysis identifies an ETV5 tumor-specific network in human and murine low-grade glioma](https://papers-gamma.link/paper/79)
- [Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis](https://papers-gamma.link/paper/76)
- [The non-convex Burer–Monteiro approach works on smooth semidefinite programs](https://papers-gamma.link/paper/80)
- [Solving SDPs for synchronization and MaxCut problems via the Grothendieck inequality](https://papers-gamma.link/paper/81)
- [Influence maximization in complex networks through optimal percolation](https://papers-gamma.link/paper/70)
- [Motifs in Temporal Networks](https://papers-gamma.link/paper/61)
- [Deep Sparse Rectifier Neural Networks](https://papers-gamma.link/paper/69)
- [Sparse Convolutional Neural Networks](https://papers-gamma.link/paper/67)
- [A fast and simple algorithm for training neural probabilistic language models](https://papers-gamma.link/paper/58)
- [Adding One Neuron Can Eliminate All Bad Local Minima](https://papers-gamma.link/paper/71)
## I read the following papers.
### 2018-2019:
- [Are stable instances easy?](https://papers-gamma.link/paper/128)
- [Hierarchical Taxonomy Aware Network Embedding](https://papers-gamma.link/paper/116)
- [Billion-scale Network Embedding with Iterative Random Projection](https://papers-gamma.link/paper/110)
- [HARP: Hierarchical Representation Learning for Networks](https://papers-gamma.link/paper/109/)
- [Layered Label Propagation: A MultiResolution Coordinate-Free Ordering for Compressing Social Networks](https://papers-gamma.link/paper/105)
### 2017-2018:
- [Link Prediction in Graph Streams](https://papers-gamma.link/paper/101)
- [The Community-search Problem and How to Plan a Successful Cocktail Party](https://papers-gamma.link/paper/74)
- [A Nonlinear Programming Algorithm for Solving Semidefinite Programs via Low-rank Factorization](https://papers-gamma.link/paper/55)
- [Deep Learning](https://papers-gamma.link/paper/68)
- [Reducing the Dimensionality of Data with Neural Networks](https://papers-gamma.link/paper/65)
- [Representation Learning on Graphs: Methods and Applications](https://papers-gamma.link/paper/60)
- [Improved Approximation Algorithms for MAX k-CUT and MAX BISECTION](https://papers-gamma.link/paper/56)
- [Cauchy Graph Embedding](https://papers-gamma.link/paper/53)
- [Phase Transitions in Semidefinite Relaxations](https://papers-gamma.link/paper/57)
- [Graph Embedding Techniques, Applications, and Performance: A Survey](https://papers-gamma.link/paper/52)
- [VERSE: Versatile Graph Embeddings from Similarity Measures](https://papers-gamma.link/paper/48)
- [Hierarchical Clustering Beyond the Worst-Case](https://papers-gamma.link/paper/45)
- [Scalable Motif-aware Graph Clustering](https://papers-gamma.link/paper/18)
- [Practical Algorithms for Linear Boolean-width](https://papers-gamma.link/paper/40)
- [New Perspectives and Methods in Link Prediction](https://papers-gamma.link/paper/28/New%20Perspectives%20and%20Methods%20in%20Link%20Prediction)
- [In-Core Computation of Geometric Centralities with HyperBall: A Hundred Billion Nodes and Beyond](https://papers-gamma.link/paper/37)
- [Diversity is All You Need: Learning Skills without a Reward Function](https://papers-gamma.link/paper/36)
- [When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors](https://papers-gamma.link/paper/23)
- [Fast Approximation of Centrality](https://papers-gamma.link/paper/35/Fast%20Approximation%20of%20Centrality)
- [Indexing Public-Private Graphs](https://papers-gamma.link/paper/19/Indexing%20Public-Private%20Graphs)
- [On the uniform generation of random graphs with prescribed degree sequences](https://papers-gamma.link/paper/26/On%20the%20uniform%20generation%20of%20random%20graphs%20with%20prescribed%20d%20egree%20sequences)
- [Linear Additive Markov Processes](https://papers-gamma.link/paper/21/Linear%20Additive%20Markov%20Processes)
- [ESCAPE: Efficiently Counting All 5-Vertex Subgraphs](https://papers-gamma.link/paper/17/ESCAPE:%20Efficiently%20Counting%20All%205-Vertex%20Subgraphs)
- [The k-peak Decomposition: Mapping the Global Structure of Graphs](https://papers-gamma.link/paper/16/The%20k-peak%20Decomposition:%20Mapping%20the%20Global%20Structure%20of%20Graphs)
- [A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem](https://papers-gamma.link/paper/24)

☆

2

### Very nice application to graph classification / clustering:
Frequencies of motifs are graph's features that can be plugged in a Machine Learning algorithm in order to perform classification or clustering of a set of graphs.
### Very nice application to link prediction / recommendation:
page 8 "Trends in pattern counts". Some motifs are rarely observed as induced motifs. Thus, if such a motif is seen at time $t$ in a dynamic graph, then an edge between a pair of its vertices might be created (or be deleted) at time $t+1$.
Note that the framework only counts motifs without listing/enumerating them. Some adaptations are thus needed to deal with link prediction and recommendation.
### Application to graph generation:
Generate a random graph with a given number of each of the 21 possible 5-motifs.
### Some statements might be a little too strong:
- "As we see in our experiments, the counts of 5-vertex patterns are in the orders of billions to trillions, even for graphs with a few million edges. An enumeration algorithm is forced to touch each occurrence, and cannot terminate in a reasonable time."
- The framework is absolutely critical for exact counting, since enumeration is clearly infeasible even for graphs with a million edges. For instance, an autonomous systems graph with 11M edges has $10^{11}$ instances of pattern 17 (in Fig. 1)."
- "For example, the tech-as-skitter graph has 11M edges, but 2 trillion 5-cycles. Most 5-vertex pattern counts are upwards of many billions for most of the graphs (which have only 10M edges). Any algorithm that explicitly touches each such pattern is bound to fail."
The following C code generates a program that counts from 1 to one trillion. This program terminates within 30 minutes on a commodity machine (with -O3 option for the compiler optimization).
```c
main() {
unsigned long long i=0;
while (i ＜ 1e12) {
i++;
}
}
```
This shows that "touching" billions or trillions of motifs might actually be possible in a reasonable amount of time.
### Minors:
- "Degeneracy style ordering" might be confusing as a degree ordering is used and not a degeneracy ordering (i.e. k-core decomposition).
- Figure 7: "The line is defined by...", should be "The prediction is defined by..." as the line is just "$y=x$".
### Typos:
- "they do not scale graphs of such sizes."
- "For example, the 6th 4-pattern in the four-clique and the 8th 5-pattern is the five-cycle."
- "and $O\big(n+m+T(G)$ storage."
- "An alternate (not equivalent) measure is the see the fraction of 4-cycle"
- "an automonous systems graph with"
### Others:
- https://oeis.org/A001349
- Is it possible to generalize the framework to any k?

A program for listing all k-motifs: https://github.com/maxdan94/kmotif

☆

1

### Nice validation of the obtained embedding:
The obtained embedding seems useful for: (i) node clustering, (ii) link prediction and (iii) node classification.
### Concerns about the scalability of Louvain:
The Louvain algorithm can process Orkut in a few seconds on a commodity machine, contrary to what Table 2 suggests. The obtained modularity is of 0.68 such as written in Table 3 (the table presenting the dataset). The value of 0.68 is much more than the modularity obtained thanks to embeddings and k-means.
### Table 2 reporting the average and worst case running time is confusing:
In Table 2, $\Theta$ is used to denote the average running time and $O$ is used to denote the worst case running time. This is very confusing as worst-case running time can be given in terms of big-oh or theta.
Definitions are given here: https://en.wikipedia.org/wiki/Big_O_notation
It is also not clear how these asymptotic complexities were obtained, in particular, the asymptotic time and memory complexities of the suggested algorithm are not proven.

### VERSE Embedding Model
"We define the unnormalized distance between two nodes u,v in the embedding space as the dot product of their embeddings Wu·W⊤v. The similarity distribution in the embedded space is then normalized with softmax."
What is the intuition behind the softmax?

> Perhaps, an interesting research direction would be something like embedding-guided modularity optimization.
Maybe using the modularity-matrix (https://en.wikipedia.org/wiki/Modularity_(networks)#Matrix_formulation) as the similarity matrix used in VERSE?
This matrix can have negative entries though...

Hi!
> If you see the matrix $W$ used in LINE as the similarity matrix you use in VERSE, then yes I think that what I have suggested is similar to LINE. So VERSE can be seen as using LINE on the input similarity matrix? Is that correct?
VERSE can be seen as going beyond this ;)
If we just plug similarities into LINE training objective, we observe that (Fig. 3, NS line) the objective does not recover similarities at all! This is very surprising, and suggests that current embedding methods such as node2vec or LINE do not optimize the actual objective they claim to optimize. When I discovered that, I thought this is just a bug in the code, so I have made a version of VERSE that optimizes the random walk based similarity from DeepWalk. It turned out that DeepWalk does not preserve the similarity ranking of these walks.
> I've missed that. You mean that in LINE, "negative sampling" is used for the optimization; while in VERSE, "Noise Contrastive Estimation" is used?
Yes! To overcome the aforementioned problem, we have proposed an NCE-based sampling solution. It seems to work much better than plain negative sampling, and show that it is close to the full neural factorization of the matrix (which is only possible for VERSE model)!
>
>> Perhaps, an interesting research direction would be something like embedding-guided modularity optimization.
>
> Maybe using the modularity-matrix (https://en.wikipedia.org/wiki/Modularity_(networks)#Matrix_formulation) as the similarity matrix used in VERSE?
> This matrix can have negative entries though...
Yes, modularity matrix would probably not suit the algorithm well, as it is not probabilistic per node. Maybe some reformulation would help, I am not aware of any to this day.

## Comments: