YouTube is a video-sharing website that includes a social network. In the YouTube social network, users form friendships with each other and can create groups that other users can join. We treat these user-defined groups as ground-truth communities. The data is provided by Alan Mislove et al.
We regard each connected component in a group as a separate ground-truth community, and we remove ground-truth communities that have fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, as described in our paper. For the network, we provide the largest connected component.
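The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the function name `split_group_into_communities`, the adjacency-dict representation, and the `min_size` parameter are assumptions made for the example.

```python
from collections import deque


def split_group_into_communities(group, adjacency, min_size=3):
    """Split one user-defined group into its connected components.

    group: iterable of node ids belonging to the group.
    adjacency: dict mapping node id -> set of neighbouring node ids
               (undirected friendship graph).
    Returns the components (as sorted lists) with at least min_size nodes.
    """
    members = set(group)
    seen = set()
    communities = []
    for start in sorted(members):
        if start in seen:
            continue
        # Breadth-first search restricted to edges between group members.
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency.get(node, ()):
                if neighbour in members and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        # Drop components below the size threshold (3 nodes on this page).
        if len(component) >= min_size:
            communities.append(sorted(component))
    return communities


# Toy example (hypothetical data): nodes 1-2-3 form a path, 4-5 are a pair,
# and node 6 has no friends inside the group.
toy_adjacency = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: {7}, 7: {6}}
toy_group = [1, 2, 3, 4, 5, 6]
toy_communities = split_group_into_communities(toy_group, toy_adjacency)
```

In the toy example only the component containing nodes 1, 2, and 3 survives; the pair and the isolated member fall below the 3-node threshold.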
| Network statistics | |
|---|---|
| Nodes | 1134890 |
| Edges | 2987624 |
| Nodes in largest WCC | 1134890 (1.000) |
| Edges in largest WCC | 2987624 (1.000) |
| Nodes in largest SCC | 1134890 (1.000) |
| Edges in largest SCC | 2987624 (1.000) |
| Average clustering coefficient | 0.0808 |
| Number of triangles | 3056386 |
| Fraction of closed triangles | 0.002081 |
| Diameter (longest shortest path) | 20 |
| 90-percentile effective diameter | 6.5 |
| Community statistics | |
| Number of communities | 8,385 |
| Average community size | 13.50 |
| Average membership size | 0.10 |
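
As a rough consistency check on the community statistics, the membership figure can be approximated from the other numbers in the table. This assumes that "average membership size" means the average number of communities a node belongs to, which the page does not define. The variable names below are illustrative.

```python
# Figures copied from the statistics table above.
num_nodes = 1134890
num_communities = 8385
avg_community_size = 13.50

# Total number of (node, community) memberships, approximated from the
# rounded average community size.
total_memberships = num_communities * avg_community_size

# Assumed definition: memberships per node, averaged over all nodes.
avg_membership = total_memberships / num_nodes
print(f"{avg_membership:.2f}")
```

Under that assumed definition the result rounds to 0.10, which agrees with the table and implies that at most about one node in ten belongs to any ground-truth community.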

| File | Description |
|---|---|
| com-youtube.ungraph.txt.gz | Undirected YouTube network |
| com-youtube.all.cmty.txt.gz | YouTube communities |
| com-youtube.top5000.cmty.txt.gz | YouTube communities (Top 5,000) |
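
A minimal sketch of reading these files in Python follows. It assumes a file format that this page does not spell out: that the graph file holds one whitespace-separated edge per line, possibly preceded by `#` comment lines, and that each line of a community file lists the node ids of one community. The helper names `parse_edges`, `parse_communities`, and `open_lines` are made up for this example.

```python
import gzip


def parse_edges(lines):
    """Yield (u, v) integer pairs from edge-list lines, skipping comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        yield int(u), int(v)


def parse_communities(lines):
    """Yield each community as a list of integer node ids, one per line."""
    for line in lines:
        ids = line.split()
        if ids:
            yield [int(x) for x in ids]


def open_lines(path):
    """Open a gzip-compressed text file for line-by-line reading."""
    return gzip.open(path, "rt", encoding="utf-8")


# Small in-memory samples (made-up data) in the assumed format.
sample_edges = list(parse_edges(["# comment line", "1\t2", "1\t5", ""]))
sample_communities = list(parse_communities(["1\t2\t5", "3\t4\t6\t7"]))
```

With the real files downloaded into the working directory, the same parsers could be applied to a file handle, for example `with open_lines("com-youtube.ungraph.txt.gz") as f: edges = list(parse_edges(f))`. If the actual layout differs from the assumptions above, the parsers would need adjusting.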