'If we want
to understand life – and ultimately cure disease – we must think networks'
(Barabási, Linked).
It is observed
that in most of the large networks, the distance or path-length between any two
nodes does not increase in proportion to the number of nodes in the network.
Perhaps the best known example of this is the 'six degrees of separation' feature discovered by Stanley Milgram in 1967.
He found that between most pairs of people in the USA there is typically a
network distance of just six acquaintances. This is commonly referred to as a
small-world feature in a network. It can occur even in random networks (cf.
Part 104), for which Erdős and Rényi had shown that the typical distance
between any two nodes scales as the logarithm of the total number N of nodes.
Milgram’s idea stood discredited for some time, but got extensive
endorsement later. A Microsoft team (E. Horvitz and J. Leskovec) studied the addresses of 30 billion Microsoft Messenger instant
messages sent during a single month (June) in 2006. Two people were taken to be
acquaintances if they had sent each other an instant message. It was found that
any two persons on average are linked by seven or fewer acquaintances.
Another feature of social and many other networks is the occurrence of cliques. In social networks, cliques are
groups of friends or acquaintances in which everyone knows every other member
of the group. The clustering coefficient
of a network serves as a good measure of this tendency to form clusters or
cliques (Wasserman and Faust 1994; Watts 1999). Consider a particular node i.
Suppose it is connected to ki
other nodes. If these other nodes are part of a clique (which naturally
includes the node i also), there
would be ki(ki - 1)/2 edges among them.
Suppose the actual number of edges among them is Ei. Then the clustering coefficient of the node i is defined as
Ci =
[2Ei]/[ki(ki - 1)].
And the
clustering coefficient C for the
entire network is the average of all the Ci’s.
As we have
seen in Part 104, for a random
network, C = p. For practically all real networks, C is much larger than p,
indicating that the small-world characteristic is indeed very common in complex
networks.
Small-world
networks have short path lengths like random networks, and have clustering
coefficients that are quite independent of network size, like regular networks.
This last feature of regular networks or graphs is well illustrated by a
crystal lattice, in which the clustering coefficient is fixed, no matter what
the size of the crystal is. Consider, for simplicity, a ring lattice, i.e. a
1-dimensional lattice with periodic boundary conditions. Let each node be
connected to k other nodes nearest to
it. It has a high clustering coefficient because most of the immediate
neighbours of any lattice point are also neighbours of one another. Its
clustering coefficient can be shown to be
C =
[3(k-2)]/[4(k-1)]
A
low-dimensional lattice like this does not have short path lengths. For a d-dimensional cubic lattice, the average
distance between two nodes scales as N1/d; this is a much faster increase
with N than the logarithmic increase
typical of random networks, as also of most of the real-life networks.
The Watts-Strogatz
model
As stated
above, real small-world graphs or networks have short path lengths unlike
ordered or regular networks, and have clustering coefficients that are quite
independent of network size, unlike random networks. Watts and Strogatz (WS) (1998) proposed a one-parameter model that explored
the regime intermediate between random graphs and regular graphs. They started
with a ring lattice of order N, in
which each node is connected to its nearest k
nodes, k/2 on either side. For
ensuring that they were dealing with a sparse but connected network, they
assumed that N>>k>>ln(N)>>1. Then they
introduced randomness by rewiring each conceivable edge with a certain
probability p. Thus even distant
nodes had a chance of being connected. In fact, the procedure introduced pNk/2 long-distance edges connecting
nodes which would have been parts of different neighbourhoods otherwise.
Connection
probability p = 0 corresponds to perfect order or regularity, and p
= 1 to
randomness. Somewhere between these two
limits a transition occurs from order to randomness (or chaos). This
transition is of great interest in the context of complex systems, because this
‘edge of chaos’ is where complexity thrives. It is in the
vicinity of the edge of chaos that the network exhibits a coexistence of
clustering and short path lengths.
For a ring
lattice, i.e. when p = 0, the average path length is
L(0)
≈ N/(2k) >> 1
And the
average clustering coefficient is
C(0)
≈ 3/4
Thus the
average path-length scales linearly with order, and the average clustering
coefficient is large.
At the other
extreme in the WS model, i.e. when p → 1,
the model describes a random graph, for which
C(1)
≈ Crandom ~ k/N << 1
and
L(1)
≈ Lrandom ~ ln(N)/ln(k)
Thus the
clustering coefficient decreases with increasing order of the net, and the path
length increases logarithmically with order.
The WS model
has the feature that there is a substantial range of probabilities p for which large C values coexist with small L
values, in agreement with what is observed in a variety of real networks.
However, it is not able to reproduce the power-law degree-distribution exhibited by many real
networks for large k. I shall deal
with this aspect of networks in the next post.
No comments:
Post a Comment