Comments:

### The Louvain algorithm: "In some cases, it may be possible to have a very fast algorithms based on heuristics to compute partitions, however, we are unaware of any such methods that would have provable guarantees for the kinds of graphs that appear in hierarchical clustering" The Louvain algorithm: https://perso.uclouvain.be/vincent.blondel/research/louvain.html is such an efficient algorithm. It outputs a hierarchical clustering. The algorithm optimizes the "modularity" in a greedy way, it then merges clusters into "hypernodes" once a local minimum is found and iterates on the newly formed graph. In most cases, only the partition corresponding to the last level (the larger clusters) is used. It is known for being fast and accurate in practice. However, to the best of my knowledge, it does not have provable approximation guarantees. ### Wikipedia is a graph with a ground truth hierarchical clustering: The set of Wikipedia pages and the hypertext links among them form a graph. The Wikipedia categories can be seen as a ground-truth hierarchical clustering of this graph. It would be interesting to see whether the suggested algorithm can find them. Datatset: http://cfinder.org/wiki/?n=Main.Data#toc1 ### Points in an ultrametric space have a natural hierarchical clustering: What is the intuition behind this fact? ### Minors: - in Defnition 1.1 "Let G be a graph generated by a minimal ultrametric". "ultrametric" is defined earlier in the paper, but "minimal ultrametric" is not. ### Typos: - "a very fast algorithms based on" - "The expected graph as the is the weighted complete graph"
> It outputs a hierarchical clustering. The algorithm optimizes the "modularity" in a greedy way, it then merges clusters into "hypernodes" once a local minimum is found and iterates on the newly formed graph. It seems that the trees considered in the article are binary trees, while Louvain algorithm yields a hierarchical clustering tree which is not necessarily binary.
Video of the talk: https://www.lincs.fr/events/hierarchical-clustering-objective-functions-and-algorithms/
> ### Wikipedia is a graph with a ground truth hierarchical clustering: > > The set of Wikipedia pages and the hypertext links among them form a graph. The Wikipedia categories can be seen as a ground-truth hierarchical clustering of this graph. It would be interesting to see whether the suggested algorithm can find them. > > Datatset: http://cfinder.org/wiki/?n=Main.Data# Actually, this hierarchy is a DAG (directed acyclic graph), not a tree.

Please consider to register or login to comment on the paper.