Saturday, February 23, 2013

68. Cellular Automata and the 'Game of Life'

The notion of cellular automata was put forward in the 1940s by Stanislaw Ulam, a colleague of John von Neumann. Ulam suggested to Neumann to consider a digital programmable universe, in which 'time' is imagined as defined by the ticking of a cosmic digital clock, and 'space' is a discrete lattice of cells, each cell occupied by an abstractly defined very simple computer called a finite automaton. Very simple local rules determine the state of any cell at any discrete point of time. There are only a finite number of states available to a cell or an automaton. These states could be, say, a few colours, or a few integers, or just 'dead' or 'alive', etc. At each tick of the digital clock, every automaton changes over to a new state determined by its present state and the present states of the neighbouring cellular automata (CA). The rules by which the state of any automaton changes at a given instant of digital time are the equivalent of the physical laws of our actual universe.

The subject of cellular automata has got a big boost from the work of Stephen Wolfram. He established that CA have a truly wide-ranging applicability and, in particular, have features of nonlinear dynamical systems. He emphasized that the complex phenomena we see in the real universe can be usually thought of as the running of 'simple programs'. The best and often the only way to understand these phenomena is by modelling them on a computer with the help of CA, rather than by working out the consequences of idealized and approximate mathematical models based on a set of equations. Approximations are generally not a good idea for modelling complex systems: No approximation may be a good enough one when things are really complex.

Wolfram's idea of a 'simple program' typically has the following ingredients:
  • A set of transformation or operation rules (usually 'local' rules).
  • Data to operate on.
  • An engine that applies the rules to the data.
In a cellular automaton, the data enter only at the beginning of the computation ('initial conditions'), and the engine keeps applying the same deterministic rules to the outputs of its previous application of the rules. Wolfram investigated a large number of simple programs, and demonstrated that extremely complex-looking patterns can be generated by many of them. The figure below illustrates the point.

Shown here is a 1-dimensional cellular automaton. It consists of a row of squares, and each square can be either black or white. Starting from just one such row of squares (the top row in the upper figure), each time the system is updated, a new row of squares is created just below the previous row, following a simple rule. The simple transformation rule operative in this figure says that a square in the new row should be black only if one or the other, but not both, of its vertically-above predecessor's neighbours is black.

Shown in a separate figure at the bottom is a graphical depiction of this rule, in which 8 block are drawn, corresponding to the 8 possible configurations of three neighbouring cells, each configuration determining the colour of the cell in the next row. Starting with a single black square in the top row of squares, this rule produces a complex pattern of nested triangles.

Shown at the bottom, below the graphical depiction of the operating rule for this CA, are the numbers 0, 1, 0, 1, 1, 0, 1, 0, where 0 corresponds to a white cell, and 1 to a black cell. Treating this sequence of digits as a binary number, we get 01011010, which is the number 90 in the decimal system. Wolfram therefore calls this cellular automaton as generated by Rule 90. The sequence of the eight blocks for depicting this rule has been so chosen as to generate the smallest possible number for labelling or classifying the rule.

A particularly popular example of CA is the Game of Life (GOL), invented (around 1970) by John Conway. It provides a graphic demonstration of artificial evolution; the evolving structures can be seen on a computer screen. One starts with a 2-dimensional lattice of square cells, each cell randomly taken to be either black or white. Let black mean that the 'creature' denoted by a black cell is 'alive', and let white mean that the corresponding creature is 'dead'. Very simple local rules are introduced for how the cells are to change from one time step to the next.

For example, if a cell has two or three neighbours which are alive (i.e. black), then the cell becomes alive if it was dead to start with, or remains alive if it was already so. If the number of neighbours is less than two, the cell dies of 'loneliness'. And if the number of neighbours is more than three, again the cell dies, this time due to 'overcrowding'. Remarkable patterns emerge on the computer screen when this program is run.

Every run is different, and it is not possible to exhaust all possibilities. Often the 'live' cells organize themselves into coherent and ever-changing patterns. The figure below shows the so-called 'Gosper glider gun'.

A 'gun' is a configuration of the GOL which repeatedly and endlessly shoots out moving objects such as 'gliders'.

An unlimited variety of patterns, including bizarre-looking 'creatures' or manmade contraptions, can emerge as the CA evolve. Here is a 'spaceship':

Will you like a demo? Please click here:

Here is another excellent website for having fun with the GOL:

For more on what one can do with CA, watch this video called the 'Amazing Game of Life Demo':


This post should give you a feel for how life arose out of nonlife. It was just a matter of relentless evolution of complexity. Very simple local rules of interaction were in operation, and one thing led to another. Such evolution has been going on all the time, and will continue to do so.

Saturday, February 16, 2013

67. Chaitin's Omega Number in Algorithmic Information Theory

I introduced algorithmic information theory (AIT) in Part 23. The fractal figure below needs ~1.6 million bits for specification or storage. But it can be generated by a very small program (requiring only a tiny fraction of this number of bits), using the definition of the Mandelbrot set and a few other specifications. We say that the figure has only a small algorithmic information content, even though it looks very complex.

Computation involves three things:

1. The length of the computer program.

2. The time it takes to do the computation.

3. The computer memory needed for the job.

AIT largely ignores the second and the third aspect, and recognizes only the first for defining the information content of a given set of numbers or a given set of data. In other words, it focuses on program-size complexity.

Chaitin, who along with Kolmogorov founded the subject of AIT, makes the point that a theory can be likened to a computer program. The program calculates and explains a certain set of observations, and the smaller this program is (in terms of compression of information), the better is the theory (recall Ockham's razor).

When a set of observations or data cannot be described compactly in terms of axioms and/or theorems, there is no structure or order, or pattern, in the data. Such a set of data is logically random. Something is said to be random if the smallest program that calculates or generates it is the same size as it is, so there is no compression.

Chaitin has shown that certain facts are not just computationally irreducible or incompressible; they are logically irreducible or incompressible as well. The proof of their 'truth' must be in the form of additional axioms, without any reasoning. So there are severe limits to the powers of logic and reason.

Chaitin introduced a number omega (Ω) to quantify the degree of logical randomness of any system, and to show that the powers of reasoning are limited. He demonstrated the existence of an infinite stream of unprovable mathematical facts.

Let the term 'program' imply 'the concatenation of the computer program and the data to be read in by the program'. Consider an ensemble of all such possible programs. What is the probability that a program chosen at random from this set will ever halt (cf. Part 66)? The number Ω is that probability.

How do we choose a program at random for testing this? A program is simply a succession of bits (0s and 1s). Since we are considering all possible programs, any succession of bits is a possible program for testing its halting behaviour. We can flip a coin repeatedly to get a random sequence of bits. We go on adding random bits, one at a time, till the sequence of bits is a program that halts, if at all it can halt. The number Ω is the probability that the halting will indeed occur (if at all) for the tested sequence of randomly generated bits.

These operations, of course, assume the presence of a computing machine for doing the job of testing. We also assume the use of a programming language. But it turns out that the crucial conclusions about halting or otherwise do not depend on these things: the actual values of Ω may depend on them, but not the general conclusions drawn. Our arguments can proceed by assuming a particular computer and a particular language for computing.

Since the number Ω is a probability, it lies between 0 and 1. In binary notation it may look something like 0.110010101… The central point made by Chaitin is that the bits after the decimal point form an irreducible stream. Every 0 or 1 after the decimal point represents a fact, and the totality of these bits represents irreducible mathematical facts.

The number Ω can be regarded as an infinite sum. Each N-bit program that halts contributes 1/2N to this sum. Each such program adds a 1 to the Nth bit. One may think that a precise value of Ω can be computed by adding all the bits for the programs that halt. This is not so. Although Ω is a perfectly well-defined specific number, it is impossible to compute it in its entirety (see below). It is possible to compute only a few digits of Ω. For example, if we know that computer programs 0, 10, and 110 all halt, then Ω = 0.111 up to three digits. But the first N digits of Ω cannot be calculated by a program of length significantly shorter than N. Knowing the first N digits of Ω will enable us to tell whether or not each program up to N digits in size ever halts. This means that at least an N-bit program is needed to calculate N bits of Ω.

Chaitin’s Ω cannot be computed to arbitrarily high precision because if we know Ω exactly, we can solve Turing’s halting problem, which is actually unsolvable.

Given any finite program, no matter how long, we have an infinite number of bits that the program cannot compute. This implies that, given any finite set of axioms, there are an infinite number of truths that are unprovable in that system. Ω is irreducible.

Thus a theory of everything for all of mathematics cannot exist. The number Ω has an infinite number of bits or mathematical facts that cannot be derived from any principles simpler than the string of bits itself.

Gödel’s work had shown that individual formal axiomatic mathematical theories cannot prove the true numerical statement 'This statement is unprovable'. According to Chaitin (2006), Gödel left unanswered the key question: 'Is this an isolated phenomenon, or are there many important mathematical truths that are unprovable?' It now turns out that the number Ω provides an infinite number of true theorems that cannot be proved by any finite system of axioms.

Leibniz had stipulated that if something (a theorem) is true in mathematics, it is true for a reason, the reason being the proof of the theorem. But the bits of Ω are totally random, and therefore these mathematical truths are truths for no reason.

Saturday, February 9, 2013

66. Turing's Halting Problem

Hilbert had identified the so-called decision problem for defining a closed mathematical universe: Is there a decision procedure that, given any statement expressed in the given language, will always produce either a finite proof of that statement, or else a definite finite construction that refutes it, but never both? In other words, can a precise mechanical procedure distinguish between provable and disprovable statements within a given system?

The answer to this question required a way to define 'mechanical procedure'. Alan Turing (1936) provided an answer. He proved, among other things, that the notions of 'mechanical procedure', 'effectively calculable', and 'recursive functions' are actually one and the same thing.

I make a digression here to describe recursive functions (see, e.g., Kurzweil 1998). These are functions that can be defined by the accumulation of elementary component parts; they can be deconstructed into a finite number of elemental steps. A recursive procedure is one that calls on itself. Recursion is a process of defining or expressing a function or procedure in terms of itself. Each iteration of the recursive procedure is designed to produce a simpler version of the problem. This process continues until the algorithm reaches a subproblem for which the answer is already known or simple to work out in a nonrecursive manner.

As an illustration, consider the computation of the factorial function n!. The recursive rule here is: n! = n.(n-1)!. The procedure calls on itself again and again, till it reaches the subproblem 1!, for which the answer is a given: 1! = 1.

Recursion is used extensively in game-playing algorithms. For example, for a chess-playing program (first envisaged by Turing), Kurzweil (1998) defines the following recursive formula:

PICK MY BEST MOVE: Pick my best move, assuming my opponent will do the same. If I've won, I'm done.

At each step of the recursion, consequences of each possible move are worked out. It is assumed that the opponent is equally intelligent, and will do the same. Since the number of possibilities to consider may blow up exponentially, the so-called minimax procedure is adopted to arrive at a strategy in reasonable time. An expanding 'tree' of possible moves and countermoves is constructed (for a preassigned time), and an assessment of the freshest 'leaves' (extremities) of the tree that minimizes the chances of the opponent to win, and maximizes the chances of the game-playing program to win, is carried out. This information is then fed back down the branches of the tree for determining the best move of the program.

Returning to the work of Turing, his most important simplifying and concretizing departure from the approach of Gödel was to bring in the idea of a (conceptual) computer for number crunching. He interpreted Hilbert’s philosophy to mean that there should be a computer and a computer program for implementing Hilbert’s idea of a mechanical procedure for deciding whether a given proof is correct or not.

The most famous notion Turing introduced was that of the universal computer.  He demonstrated that all digital computers are fundamentally equivalent, regardless of what is there in their innards. It is all 1s and 0s underneath. He used this approach to prove that unsolvable mathematical problems do exist: He proved the so-called halting problem:

There can be no general procedure to decide in advance whether a self-contained computer program will eventually halt (i.e., solve a posed problem and then halt). Often the only way to answer this question is by actually running the program to see if it halts or not.

In other words, there is no computer program that can take another computer program and determine with certainty whether the first program halts or not. Thus there is no mechanical procedure that can decide in advance whether a program will eventually halt or not. And this means that there is no set of Hilbertian mathematical axioms that can enable us to prove whether a program will halt or not.

Turing invented an imaginary device now called the Turing machine. It consisted of a black box (it could be a computer or a human being) able to read and write a finite alphabet of symbols to and from an unbounded length of paper tape, and capable of changing its own 'm-configuration' or 'state of mind'. In the context of computational science, Turing introduced the all-important notions of discrete time and discrete state of mind.

This made logic a sequence of cause-and-effect steps, and mathematical proof a sequence of discrete logical steps.

The Turing machine embodies a relationship between a sequence of symbols in space and a sequence of events in time. Each step in the relationship between the tape and the Turing machine is governed by what we call a program. The program provides instructions to the machine for every conceivable situation.

Turing argued that all symbols, all information, all meaning and all intelligence that can be described in words or numbers can be encoded (and transmitted) as binary sequences of finite length. In contrast to this, Gödel’s work on undecidability in mathematics states that in all but the simplest mathematical systems there may be propositions that cannot be proved or disproved by any finite mathematical or logical process; the proof of a given proposition may possibly call for an infinitely large number of logical steps.

However, Turing did prove, like Gödel, that there are self-referential questions a universal Turing machine cannot answer.

But perhaps outsiders can.

Turing's proof that the halting problem in unsolvable is as follows (Chaitin 2006):

Suppose a general procedure H does exist that can decide whether any given computer program will halt. Knowing H, we can construct a program P that uses H as a subroutine. Suppose the size of program P is N bits, and P knows its own size. Clearly, P is large enough to contain the number N. Using the procedure H, the program P can take a look at all programs which are up to, say, 100N bits in size, and find out which of them halt and which do not. After this, P runs all the programs which had halted to see what outputs they had generated. Naturally, this will be a set of all digital objects with a degree of complexity (i.e. the number of bits required for their description) up to 100N. At the end, P outputs the smallest integer not in this set, and then halts.

This means that we have shown that P has halted and it has given an output integer that cannot be produced by a program of size ≤100N. This is logically inconsistent because the size of P in only N. Therefore H does not exist and thence P cannot be constructed; i.e. a general procedure H does not exist that can decide whether or not any given computer program will halt. QED.

Turing’s halting problem is unsolvable.