fivebythree.net

The clustering coefficient of Nostr's follower network

2023-09-29
Abstract
An example estimate of the clustering coefficient of Nostr’s follower network

Preface

This article is part of the “Nostrasia 2024 Inverse Advent Calendar.” Please see that page for other exciting readings.

Nostrasia 2024 Inverse Advent calendar

This article presents an example estimate of the clustering coefficient of Nostr’s follower network. The clustering coefficient is an essential indicator of a “Complex Network.” It is also strongly related to my lightning talk at Nostrasia 2024.

Definition: Clustering coefficient of a directed graph

In this article, in the context of a follower network, I define a cluster as the situation where “one of my follower’s followers is also one of my followers.”

example of follower network

(An arrow points from a follower to the node it follows.)

In the picture above, node 1 is myself, and nodes 2, 3, 4, and 5 are followers of 1. The picture also shows that 3 is a follower of 2. I define a cluster as the triangular relationship between 1, 2, and 3.

cluster

To calculate the clustering coefficient of node 1, I checked every possible pair of node 1’s followers for this relationship and counted how many cluster triangles exist. The clustering coefficient of node 1 is then the number of clusters divided by the number of follower pairs.

$$ C_i = \frac{\sum_{(i;j,k)} g \left(i;j,k\right) }{n_i(n_i-1)} $$

$(i;j,k)$ denotes that the summation is taken over all pairs $j$ and $k$ of node $i$’s followers.

$g(i;j,k)$ equals $1$ if $j$ and $k$ form a cluster with $i$, and $0$ otherwise.

$n_i$ is the number of followers of node $i$.

Generally, the more the followers follow each other, the larger the clustering coefficient is.
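As a rough illustration, the definition above could be sketched in Python as follows. This is not the program used for the results in this article. The function names `build_followers` and `local_clustering` are hypothetical, and the sketch assumes that the sum runs over ordered pairs $(j,k)$ of followers, which matches the denominator $n_i(n_i-1)$, and that $g = 1$ when $k$ follows $j$.

```python
def build_followers(edges):
    """Map each node to the set of nodes that follow it.

    edges: iterable of (follower, followee) pairs.
    """
    followers = {}
    for follower, followee in edges:
        followers.setdefault(followee, set()).add(follower)
        followers.setdefault(follower, set())
    return followers


def local_clustering(followers, i):
    """C_i under the assumptions above; None if node i has fewer than 2 followers."""
    f_i = followers.get(i, set()) - {i}
    n = len(f_i)
    if n < 2:
        return None
    # Count ordered pairs (j, k) of i's followers where k also follows j.
    clusters = sum(len((followers.get(j, set()) & f_i) - {j}) for j in f_i)
    return clusters / (n * (n - 1))


# The example network from the picture: 2, 3, 4, 5 follow 1, and 3 follows 2.
edges = [(2, 1), (3, 1), (4, 1), (5, 1), (3, 2)]
followers = build_followers(edges)
c1 = local_clustering(followers, 1)
print(c1)  # one cluster out of 4 * 3 = 12 ordered pairs
```

In this toy network only the pair (2, 3) forms a cluster, so node 1 gets $1/12$ under the ordered-pair assumption. If unordered pairs were intended instead, the count and the denominator would both need to be adjusted.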

Finally, the average clustering coefficient over the network is defined as below.

$$ C = \frac{1}{N}\sum_{i=1}^N{C_i} $$

Please note that the average is taken only over the nodes with more than two followers, so $N$ is the number of such nodes.
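The averaging step could look like the sketch below. The helper name `average_clustering`, the sample values, and the reading of “more than two followers” as a minimum of three are all assumptions made for illustration.

```python
def average_clustering(local_coeffs, follower_counts, min_followers=3):
    """Mean of C_i over nodes with at least `min_followers` followers.

    min_followers=3 encodes "more than two followers" (an interpretation).
    Returns None when no node qualifies.
    """
    eligible = [
        c for node, c in local_coeffs.items()
        if follower_counts.get(node, 0) >= min_followers
    ]
    if not eligible:
        return None
    return sum(eligible) / len(eligible)


# Made-up values: "bob" has only two followers and is left out of the average.
local_coeffs = {"alice": 0.5, "bob": 0.1, "carol": 0.3}
follower_counts = {"alice": 10, "bob": 2, "carol": 3}
avg = average_clustering(local_coeffs, follower_counts)
print(avg)
```

Here $N$ is the number of eligible nodes (two in the sample), not the total number of nodes in the network.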

Result

I calculated the average clustering coefficient of Nostr’s follower network from npubs that were active globally from August 1st to 31st. Npubs that were not active during this period were excluded, even if the collected npubs follow them.
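The exclusion rule described above can be sketched as a filter on the follow edges. The function name `restrict_to_active` and the placeholder npub strings are hypothetical; the actual collection pipeline is not shown in this article.

```python
def restrict_to_active(edges, active_npubs):
    """Keep only follow edges whose follower and followee were both active.

    edges: iterable of (follower, followee) pairs.
    active_npubs: npubs observed as active during the collection period.
    """
    active = set(active_npubs)
    return [(a, b) for a, b in edges if a in active and b in active]


# Placeholder identifiers, not real npubs.
active_npubs = ["npub_a", "npub_b", "npub_c"]
edges = [
    ("npub_a", "npub_b"),
    ("npub_b", "npub_c"),
    ("npub_a", "npub_inactive"),  # dropped: the followee was not active
]
filtered = restrict_to_active(edges, active_npubs)
print(filtered)
```

Dropping inactive npubs changes both the follower counts and the number of clusters, so the resulting coefficient describes the active part of the network only.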

I also searched the internet for similar studies of X/Twitter and compared their results with my calculation.

| Network | Average Clustering Coefficient |
| --- | --- |
| Nostr | 24.3% |
| X/Twitter | 1.9% (*1) |

(*1) From Interest Clustering Coefficient: a Metric for Directed Networks like Twitter (PDF)

My result shows that the average clustering coefficient of Nostr’s follower network is more than ten times as large as that of X/Twitter (24.3% versus 1.9%, a ratio of about 12.8).

(To be honest, I still suspect that my program may not be working correctly…)

This result is consistent with the tendency of Nostr users to follow each other more often than X/Twitter users do, as I mentioned in the lightning talk at Nostrasia 2024.

Here is the distribution of the clustering coefficient. The chart shows that most users have a clustering coefficient between 0.2 and 0.4.

histogram
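A distribution like the one in the chart can be obtained by counting coefficients per bin. The sketch below uses ten equal-width bins and made-up sample values; the bin width used for the actual chart is not stated here, so both are assumptions, and `histogram_counts` is a hypothetical helper name.

```python
def histogram_counts(values, bins=10):
    """Count values in [0, 1] into `bins` equal-width bins.

    A value of exactly 1.0 is placed in the last bin.
    """
    counts = [0] * bins
    for v in values:
        index = min(int(v * bins), bins - 1)
        counts[index] += 1
    return counts


# Made-up clustering coefficients for illustration only.
sample = [0.05, 0.25, 0.31, 0.38, 0.95, 1.0]
counts = histogram_counts(sample)
for b, count in enumerate(counts):
    print(f"{b / 10:.1f}-{(b + 1) / 10:.1f}: {count}")
```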

I made a scatter chart with the x-axis as the clustering coefficient and the y-axis as the number of followers.

The scatter plot has a clear upper boundary, and the reason for this boundary is still elusive.

I am still checking whether the boundary comes from a mistake in the code or in the data processing.

scatter_plot

Conclusion

  • The clustering coefficient of Nostr’s follower network is larger than that of X/Twitter.
  • The reason is not yet clear, but it may have something to do with the tendency of Nostr users to follow each other more than X/Twitter users do.
  • This result can also be related to Nostr’s user base being considerably smaller than that of X/Twitter.
  • The reason for the clear boundary on the scatter chart is still unclear.

Reference