Saturday, 26 October 2013

103. Networks and Network Flows

'The mystery of life begins with the intricate web of interactions, integrating the millions of molecules within each organism. The enigma of the society starts with the convoluted structure of the social network. The unpredictability of economic processes is rooted in the unknown interaction map behind the mythical market. Therefore, networks are the prerequisite for describing any complex system, indicating that complexity theory must inevitably stand on the shoulders of network theory' (A.-L.Barabási, Linked)


 A network is a directed graph in which every edge or arc has been assigned a label. Networks may be isotropic, or otherwise. Isotropic networks do not have well-defined inputs and outputs. By contrast, neural networks are examples of networks with well-defined inputs, which they convert into certain desired outputs.

Networks can be of different types. A network with a well-defined input and output may be formally defined as follows: A network is a directed graph D in which there is vertex s (called the source) and a vertex t (called the sink), and for which a function c (called the capacity) is defined which assigns to each edge of the network a nonnegative number. Such a network is relevant, for example, for building a model for maximizing the flow of goods from one point to another in a transportation system. Although I am assuming a single-input single-output network here, such considerations can be generalized to multiple sources and/or multiple sinks.

A flow on D is a function f from E(D) into nonnegative numbers such that the following two conditions are satisfied. The first is that the flow cannot exceed the capacity of an edge; i.e., for an edge e, f(e) ≤ c(e) for all e belonging to E (e є E). The second condition is that the input into any intermediate vertex must be equal to the output from that vertex. That is, for all vertices v belonging to the subgraph D – {s,t}, we must have ∑uv є E f(u,v) = ∑vw є Ef(v,w).

A cut in a network is a partition of its nodes N into two subsets S1 and S2 such that the source node s belongs to S1 and the sink node t belongs to S2. The capacity of the cut is the sum of the capacities of the edges from S1 to S2.

An important theorem in the theory of network-flows states that in a network the value of a maximum flow is equal to the capacity of a minimum cut. It is known as the max-flow min-cut theorem, or the maximin theorem. 

r-regular networks

The notion of an r-regular graph, or regular graph, was introduced in Part 102. An r-regular network is one which has an underlying r-regular graph. All vertices of an r-regular network have the same degree r. In other words, in a regular network all nearest-neighbour nodes are connected in a regular manner. Such networks have local groups of highly interconnected nodes. For them the number of edges per node has an approximately Gaussian distribution, with a mean value that is a measure of the ‘scale’ of the network (Bray 2003). The width (the 'full width at half maximum') of the Gaussian-distribution curve is a measure of the scale. The important thing is that such a scale is identifiable, unlike the situation in a power-law distribution.

I consider an example from systems biology to illustrate the notion of a network. The processes occurring in a biological cell can be modelled in terms of a variety of molecular networks (Alon 2008; Bray 2003). Transcription networks are an example. In a transcription network the nodes or vertices are the genes, and the edges are the transcriptional interactions (cf. Part 43). Consider two nodes u and v. If there is a transcription interaction between them, it is represented by a directed edge or arrow: u v. In general, more than one arrows may point at a particular node in some networks, in which case a suitable input function has to be defined (e.g. an AND or OR function). Similarly, two nodes may be simultaneously involved in more than one types of interaction (e.g. on different time scales). One can then use different colours for the arrows joining them. The strength of the interaction between two nodes is represented by assigning a weight to the arrow joining them.

Network motifs 

In spite of the fact that evolutionary processes involve a fair amount of random tinkering, Nature has zeroed-in again and again on certain recurring circuit elements, called network motifs (NMs). Each such motif in the molecular network performs a specific information-processing job. Here are some examples of such jobs in the case of the transcriptional networks of the cell comprising the bacterium E. coli (cf. Alon 2003): filtering out spurious input fluctuations; generating temporal programs of gene expression; accelerating the throughput of the network. Similar NMs have also been found in the transcription networks of yeast (Milo et al. 2002). Operational amplifiers and memory registers in human-made electronic circuits are some of the equivalents of NMs.

NMs are formally defined as follows (Milo et al. 2002): Given a network with N nodes and E edges, one can construct a randomized network or random network from it which has the same single-node characteristics as the actual network (i.e. the numbers N and E are unchanged), but where the connections between nodes are made at random. In other words, we construct a spanning random network from the actual network. NMs are those patterns that occur in the actual network significantly more frequently than in the spanning random network.

There are a large number of possibilities by which a given set of nodes can be interconnected. I shall consider some of them in the next few posts.



Saturday, 19 October 2013

102. Graphs and Networks

In physics one can, in principle, work out the properties of a noncomplex system in terms of the structure of the system and the distance-dependent interactions among its constituents. For instance, one can understand the emergence of macroscopic magnetism in a crystal in terms of the distance-dependent interactions among the spins. But there are a large number of complex systems in which the physical distances among the constituents are largely irrelevant, and in which there is ambiguity about whether or not there is interaction between a chosen pair of constituents. Social systems are one such example, as are ecological systems, neurons in a brain, investors in a stock market, and participants in the Internet. Network theory and statistical mechanics provide powerful tools for dealing with such systems.

I introduced networks briefly in Part 48. The experts make a distinction between networks and graphs although, sometimes, the two terms do tend be used synonymously in a loose sort of way.

A graph is a collection of points (called nodes or vertices) together with a collection of lines (called edges or links) that connect certain pairs of these points. A directed graph is a graph in which the edges are ordered pairs of vertices; that is, every edge in a directed graph has a direction.

A network is a directed graph in which every edge is given a label. The term 'network' is used in several different contexts, and can have different, context-dependent meanings. Electrical networks, social networks, communication networks, and neural networks are some examples.

Networks with a large number of nodes and edges generally exhibit complex behaviour, and are called complex networks (Albert and Barabási 2002). Complex systems embody organizing principles encoded in the topology of the underlying network.

A graph G is a nonempty finite set N(G) together with a finite set E(G) of distinct unordered pairs of elements of N(G). Each element in N(G) is called a node or a vertex, and each element in E(G) is called an edge.

The order |N| of G is the number of elements in N, and the size |E| of G is the number of elements in E.

In practical applications of graph theory, it is often necessary to assign a weight to each edge. For example, if the vertices in a graph represent cities, and the edges the roads between the cities, then a weight proportional to the length of a road can be assigned to the edge representing that road.

Consider two vertices u and v of G which are joined by an edge e. The edge e = {u, v} is often represented as just uv (or u-v). We say that u is adjacent with v. And e is said to be incident to u and v. Two edges incident to a common vertex are said to be adjacent with each other.

The neighbourhood n(u) of a vertex u is the set of vertices in the graph that are adjacent with u. The number of vertices in the set n(u) is called the degree of u. It is denoted by d(u), and is the number of edges of the graph incident to u.

The figure below illustrates these definitions. For the graph G shown in it, N = {u, v, w, x, y}; E = {ux, uy, vy, xy}; order |N| = 5; and size |E| = 4. The degrees of the vertices are: d(u) = d(x) = 2, d(v) = 1, d(y) = 3, d(w) = 0. The vertex w is said to be an isolated vertex.

(a) Graph G. (b) Subgraph G - y. (c) Spanning graph G - ux . (d) Spanning supergraph G + vw. (e) Graph H isomorphic to the graph G. (f) Multigraph. (g) Directed graph. (h) Königsberg graph, depicting the Königsberg-bridges problem posed and solved by Euler (see below).

A graph H is a subgraph of a graph G if N(H) is a subset of N(G) and E(H) is a subset of E(G). G is called a supergraph of H. In Fig. (a) above, if we delete the vertex y and also all the edges adjacent with y, then we obtain the subgraph G-y shown in Fig. (b).

If H and G have the same set of vertices, then H is called a spanning subgraph of G. For example, Fig. (c) has the same set of vertices as Fig. (a), but the edge ux is missing; consequently, G-ux is a spanning subgraph of G.

Fig. (d) depicts a spanning supergraph of G. In it, the vertices are the same as for G, but an edge vw has been added. It is denoted by G+vw.

Two graphs G and H are said to be isomorphic if there is a one-to-one mapping from the vertices of one graph onto the vertices of the other graph such that the edges are preserved. For example, the graph shown in Fig. (e) is isomorphic to that in Fig. (a).

A graph or subgraph in which every pair of vertices is adjacent is called a complete graph or subgraph. That is, each vertex of the graph or subgraph is connected to every other vertex of the graph or subgraph. If it is of order n, then each vertex in it is of degree (n-1), and its size is n(n-1)/2. A complete subgraph in a graph is said to constitute a cluster in the graph.

If the vertices of a graph G can be divided into two disjoint subsets G1 and G2 such that every edge in the graph G connects a vertex in G1 only to a vertex in G2, it is called a bipartite graph. If every vertex in G1 is connected to every vertex in G2, then G is a complete bipartite graph.

If all vertices of a graph have the same degree r, then it is called an r-regular graph, or regular graph.

A random graph is one in which the edges are distributed randomly.

If a graph has multiple edges between two vertices, or if it has edges in the form of loops (e.g. vv), it is called a multigraph (Fig. (f)).

A walk W in a graph G from a vertex u to a distinct vertex v is a finite sequence of alternating vertices and edges (W = v0e1v1...vkekvk+1, with v0 = u and vk+1 = v) such that each edge in the sequence is incident to each of the two vertices on either side of it in the sequence. The length of the walk is the number k, i.e. the number of edges in it.

If all the edges in W are distinct, it is called a trail.

A trail in which all the vertices are distinct is called a path.

A walk, trail, or path is said to be closed if the initial and the final vertex are the same (u = v). A closed path with k ≥ 3 is called a cycle.

If there is a path between two vertices of a graph, then the vertices are said to be connected. If each pair of vertices in a graph is connected, then the graph is said to be connected. If only a subset of the vertices in a graph are connected, and if they are connected to a particular vertex, then this subset constitutes a connected subgraph (or a cluster).

A graph that is not a connected graph can be partitioned into maximal connected subgraphs called the components or clusters of the graph. The graph in Fig. (a) has two components: {u, v, x, y} and {w}.

We define the distance between any two vertices u and v of a graph as the length of the shortest path between them. It is denoted by d(u, v). If the edge connecting these two vertices has been assigned a weight w, the weighted distance is wd(u, v).

A tree is a connected graph that does not contain any cycles. It is readily argued that any connected graph has a spanning tree.

The maximal distance between any pair of nodes of a graph defines the diameter of the graph. A disconnected graph is one that is made of two or more isolated clusters. Strictly speaking, for such a graph the diameter is infinite, so a modified definition of diameter is used by defining it as the maximum diameter of its clusters.

The first paper in graph theory was published in 1836 by Euler (cf. Barabási 2002). Euler solved the famous Königsberg bridges problem in that paper. There were seven bridges in Königsberg over a river, which connected pairs of regions from among the four which we may denote by, say, A, B, C, D. The problem posed was whether it is possible to traverse the seven bridges without crossing any bridge twice. If we represent the tour regions as the vertices in a graph, and the bridges as the edges connecting pairs of vertices, we obtain the multigraph shown in Fig. (h) above. Euler established that the problem posed had an answer in the negative. In other words, he showed that the graph in Fig. (h) does not have a trail that uses all the edges in the graph.


An Eulerian trail of a graph is defined as a trail that uses all the edges of the graph. A graph that contains a closed Eulerian trail is called an Eulerian graph. Euler showed that there is no Eulerian trail in the graph shown in Fig. (h). For such a trail to exist, a graph must be a connected graph, and all its vertices (except possibly the first and the last vertices of the trail) must be of even degree.

Euler proved the theorem that a connected graph or multigraph G has a closed Eulerian trail if and only if each vertex has even degree, and that G has an open Eulerian trail if and only if precisely two of its vertices are of odd degree.

So much for graphs. I shall discuss networks of various types in the next few posts.

Saturday, 12 October 2013

101. Evolutionary Arms Races and the Life-Dinner Principle

'If adaptation were solely to the inanimate environment, it is easy to believe that evolution will simply track Darwin’s ‘elements of air and water’ in their random walk through time. Selection would be stabilizing until a change in the climate or an accidental geographical displacement introduced a brief interlude of directionality. Each such directional interlude would seem to be as likely to reverse as to continue the previous one. But in fact consistent directionality is introduced because the environment of any one evolving lineage includes other evolving lineages. Above all, it is because adaptations in one lineage call forth counter-adaptations in others, setting in motion the unstable evolutionary progressions we call arms races' (Dawkins and Krebs 1979).

Symmetric conflicts were assumed in the case study I described in Part 100. In reality, both symmetric and asymmetric conflicts between species, or within a species, can occur, and often lead to ‘evolutionary arms races’: Adaptations in one lineage can alter the selection pressure and call forth counter-adaptations in other interacting lineages. If this occurs reciprocally, a runaway escalation of complexity of behaviour may occur, rather like a spiralling arms race between two rival nations.

Arms races take place on evolutionary time scales. It is lineages that evolve, not individuals.

Several factors can make an arms race asymmetric:

1. The specialist vs. the generalist factor. A predator species that is good at hunting several species of prey is unlikely to be strongly effective against any one of them, and therefore any of these prey species stands a greater chance of outrunning the predator in the evolutionary adaptation race. If, on the other hand, the predator has specialized in hunting a particular prey, any adaptively gained advantage by the prey is more likely to be matched by an evolutionary increase in the hunting capability of the predator. Of course, it is also true that a predator able to hunt several types of prey may, on the whole, make a good living; it runs many races.

2. Unequal selection pressure. ‘The rabbit runs faster than the fox, because the rabbit is running for his life while the fox is only running for his dinner’ (Dawkins 1979). This life-dinner principle can explain the asymmetry in the arms race in favour of the prey. The penalty of failure is much higher for the prey than for the predator, resulting in unequal selection pressures.

3. General inequalities in potential rates of evolution. Mammals evolve faster than bivalve molluscs because of the fiercer interspecific competition in the case of mammals. It may also be true that bivalves and, say, frogs have evolved slowly because they are getting on fine as they are.

4. Learning. Learning is similar to natural selection in that both tend to improve the performance of individuals in a population. The difference, of course, is that learning is adaptation over the life time of an individual, whereas in natural selection the improvement is seen only as an average over many generations of many individuals. But in a predator-prey context, learning by a predator can reduce the chances of survival of the prey, and may reduce substantially the number of generations over which the prey would have had an upper hand because of some genetically transmitted set of improved strategies.

Interspecific asymmetric arms races

An asymmetric arms race is like an attack-defence arms race, an example being the contest between parasites and their hosts; in fact, any predator-prey contest. Offensive adaptations on one side are countered by defensive adaptations on the other.

Why are cuckoo hosts so good at detecting cuckoo eggs, but so bad at detecting and rejecting cuckoo nestlings? Dawkins and Krebs (1979) explained it by analogy with the case of the heroin addict who knows that the drug is killing him, and yet cannot stop taking it because the drug is able to manipulate his nervous system, making him crave for the drug. There is evidence which suggests that the orange gape and the loud begging calls of the cuckoo chick have a ‘supernormal’ (irresistible) effect on the brain of the cuckoo. It is conceivable that, in the evolutionary arms race, cuckoos have put their adaptive emphasis on two different things in the life cycle of the chick: On mimetic deception at the egg stage (the cuckoo eggs look similar to the eggs of the host); and on manipulation of the nervous system of the host at the late nestling stage. As the host species became better and better at distinguishing cuckoo eggs from its own, cuckoos responded by evolving eggs with increased similarity to the eggs of the host species. And as the hosts evolved to become less and less susceptible to the cuckoo nestlings’ begging calls for food, the nestlings’ calls evolved to become more and more plaintive and irresistible. The life-dinner principle is applicable here. The cuckoo had to do better in the evolutionary arms race for shear survival of the species. The host species, on the other hand, had no threat to its survival by of the presence of cuckoo eggs and nestlings side by side with its own.

Predator-prey dynamics was modelled as early as the 1920s by Lotka (1925) and Volterra (1926). Population densities N1 and N2 of two competing prey and predator species are modelled on an evolutionary time scale by the famous Lotka-Volterra equations:

N1/∂t = N1(r1b1N2),

N2/∂t = N2(-r2b2N1).

In these equations, r1 is the rate of increase of the prey population when there are no predators present; r2 is the death rate of predators in the absence of prey; b1 denotes the rate at which the prey are eaten up by the predators; and b2 is the ability of the predators to catch the prey.

Intraspecific asymmetric arms races

Such arms races are difficult to analyse because the contestants (parents vs. offspring; males vs. females; queen ants vs. worker ants; etc.) belong to the same gene pool. Any advantage gained by either of the contestants goes to the same gene pool. It is an arms race between two branches of the same conditional strategy, rather than an arms race between two independent lineages with different gene pools. There are queen ants and there are worker ants in an antcolony; a similar situation prevails in a beehive. All female honeybees, including the queen bees, develop from larvae which are identical genetically. Those fed on ‘royal jelly’ become fertile queens, while the rest remain sterile workers. Something similar happens in ant colonies. The contest between queen and worker ants over relative parental investment is an example of arms race of this type.

 Suppose there is an ant colony in which there is only one queen, who is the mother of all members of the ant colony. What determines the male-female sex ratio in this population? One possible approach for answering this question is to use a game-theoretic model in which the contestants are genetically related (Maynard Smith 1978). The term haplodiploidy is used in biology for the way in which the sex of the progeny is determined by external factors like selective feeding or presence or absence of fertilization of eggs. In some social insects, for example, females develop from fertilized eggs, and males from unfertilized eggs. This ensures that sisters are more closely related to each other than to their own offspring, meaning that the best chance they can give their own genes of surviving is to look after each other, instead of laying eggs of their own. This ESS has evolved in Nature several times, and forms the basis of the stability enjoyed, for example, by beehives and termite mounds (Douglas 2005).

Interspecific symmetric arms races

In a symmetric arms race, the two sides get better and better at doing the same thing. This type of arms race is unlikely to be important. The competitors in this category would rather diverge than escalate the completion. There may be, for example, evasion of competition by niche separation (Lawlor and Maynard Smith 1976).

Intraspecific symmetric arms races

Symmetric arms races are more likely to occur within the same species. They are essentially of a competitive nature, and are the stuff Darwinian evolution is made of. Adaptation to male-male competition within a species for females comes under this category of evolutionary arms races (Maynard Smith 1977). Even adaptation to being eaten up by predators is an arms race of this type, because individuals less competent at this task will gradually get eliminated by natural selection.

How do arms races end?

There are several scenarios:

1. One lineage may drive the other to extinction.

2. One lineage may reach an optimum, and thus prevent the other from doing so.

3. Both sides may reach a mutual local optimum (as in the flower-bee coevolution).

4. There may be no stable end, and the arms race may cycle continuously.


For a real-life demonstration of the Life-Dinner Principle in action, watch this video: