Saturday, 9 November 2013

105. Small-World Networks and the Six Degrees of Separation

'If we want to understand life – and ultimately cure disease – we must think networks' (Barabási, Linked).

It is observed that in most of the large networks, the distance or path-length between any two nodes does not increase in proportion to the number of nodes in the network. Perhaps the best known example of this is the 'six degrees of separation' feature discovered by Stanley Milgram in 1967. He found that between most pairs of people in the USA there is typically a network distance of just six acquaintances. This is commonly referred to as a small-world feature in a network. It can occur even in random networks (cf. Part 104), for which Erdős and Rényi had shown that the typical distance between any two nodes scales as the logarithm of the total number N of nodes.

Milgram’s idea stood discredited for some time, but got extensive endorsement later. A Microsoft team (E. Horvitz and J. Leskovec) studied the addresses of 30 billion Microsoft Messenger instant messages sent during a single month (June) in 2006. Two people were taken to be acquaintances if they had sent each other an instant message. It was found that any two persons on average are linked by seven or fewer acquaintances.

Another feature of social and many other networks is the occurrence of cliques. In social networks, cliques are groups of friends or acquaintances in which everyone knows every other member of the group. The clustering coefficient of a network serves as a good measure of this tendency to form clusters or cliques (Wasserman and Faust 1994; Watts 1999). Consider a particular node i. Suppose it is connected to ki other nodes. If these other nodes are part of a clique (which naturally includes the node i also), there would be ki(ki - 1)/2 edges among them. Suppose the actual number of edges among them is Ei. Then the clustering coefficient of the node i is defined as

Ci = [2Ei]/[ki(ki - 1)].

And the clustering coefficient C for the entire network is the average of all the Ci’s.

As we have seen in Part 104, for a random network, C = p. For practically all real networks, C is much larger than p, indicating that the small-world characteristic is indeed very common in complex networks.

Small-world networks have short path lengths like random networks, and have clustering coefficients that are quite independent of network size, like regular networks. This last feature of regular networks or graphs is well illustrated by a crystal lattice, in which the clustering coefficient is fixed, no matter what the size of the crystal is. Consider, for simplicity, a ring lattice, i.e. a 1-dimensional lattice with periodic boundary conditions. Let each node be connected to k other nodes nearest to it. It has a high clustering coefficient because most of the immediate neighbours of any lattice point are also neighbours of one another. Its clustering coefficient can be shown to be

C = [3(k-2)]/[4(k-1)]

A low-dimensional lattice like this does not have short path lengths. For a d-dimensional cubic lattice, the average distance between two nodes scales as N1/d; this is a much faster increase with N than the logarithmic increase typical of random networks, as also of most of the real-life networks.

The Watts-Strogatz model

As stated above, real small-world graphs or networks have short path lengths unlike ordered or regular networks, and have clustering coefficients that are quite independent of network size, unlike random networks. Watts and Strogatz (WS) (1998) proposed a one-parameter model that explored the regime intermediate between random graphs and regular graphs. They started with a ring lattice of order N, in which each node is connected to its nearest k nodes, k/2 on either side. For ensuring that they were dealing with a sparse but connected network, they assumed that N>>k>>ln(N)>>1. Then they introduced randomness by rewiring each conceivable edge with a certain probability p. Thus even distant nodes had a chance of being connected. In fact, the procedure introduced pNk/2 long-distance edges connecting nodes which would have been parts of different neighbourhoods otherwise.

Connection probability p = 0 corresponds to perfect order or regularity, and p = 1 to randomness. Somewhere between these two limits a transition occurs from order to randomness (or chaos). This transition is of great interest in the context of complex systems, because this ‘edge of chaos’ is where complexity thrives. It is in the vicinity of the edge of chaos that the network exhibits a coexistence of clustering and short path lengths.

For a ring lattice, i.e. when p = 0, the average path length is

L(0) ≈ N/(2k) >> 1

And the average clustering coefficient is

C(0) ≈ 3/4

Thus the average path-length scales linearly with order, and the average clustering coefficient is large.

At the other extreme in the WS model, i.e. when p 1, the model describes a random graph, for which

C(1) ≈ Crandom ~ k/N << 1


L(1) ≈ Lrandom ~ ln(N)/ln(k)

Thus the clustering coefficient decreases with increasing order of the net, and the path length increases logarithmically with order.

The WS model has the feature that there is a substantial range of probabilities p for which large C values coexist with small L values, in agreement with what is observed in a variety of real networks. However, it is not able to reproduce the power-law degree-distribution exhibited by many real networks for large k. I shall deal with this aspect of networks in the next post.

No comments:

Post a Comment