Pages

Sunday, April 6, 2014

127. Understanding Natural Phenomena: Epilogue


It is time to end this series of blog posts, which I have been writing under the label 'Understanding Natural Phenomena'. A quick recap is in order here.

There are three basic questions we all ask about the world we live in:
  • How can the universe emerge out of nothing?
  • How can life emerge out of nonlife?
  • How can intelligence emerge out of nonintelligence?
Modern science has credible answers to all these questions, and I have tried to give these answers in as simple a language as is possible, without dumbing down the essence of the answers.

I am a scientist, and I take pride in the fact that I value the scientific method, which is the only sensible and logical method for understanding natural phenomena. I feel irritated by a statement I sometimes hear: 'Certain things are beyond science'. Except for certain questions which have to do with ethics or morality (which are policy matters, for the individual or for the state), nothing occurring in Nature can be beyond science. If not the scientific method, what other method can we possibly have for understanding natural phenomena? None that I know of.

For understanding the origin of our universe, we have to begin by reminding ourselves that all phenomena are governed by the laws of quantum mechanics. Classical mechanics is a good approximation in many situations, but it is only a limiting case of quantum mechanics. The laws of quantum mechanics are highly counter-intuitive, but there is nothing we can do about that. The laws of physics in our universe existed long before we humans emerged on the scene. Our brains have evolved to give us a survival advantage (even mastery) over other creatures. Our intelligence has made us the dominant species on Earth, but there has been no evolutionary pressure for our brains to develop the kind of intuition that would make the laws of quantum mechanics seem natural to us.

One has to understand the basics of quantum field theory to get a feel for how our universe emerged out of 'nothing' without violating the law of conservation of energy (with the term 'energy' including the mass part also). We are still some way from developing a widely accepted theory of quantum gravitation, but the essence of the basic idea for rationalizing the emergence of our universe out of 'nothing' is rather simple: The total energy of the universe has a positive contribution and a negative contribution, and the two add up to zero. The positive part comes from all the mass and energy we have around us, and the negative part is the attractive gravitational potential energy. The negative gravitational energy of the ever-expanding universe is the reason why an equal amount of positive mass and energy can emerge out of 'nothing'. We have 'something rather than nothing' because the 'nothing' is unstable, undergoing quantum fluctuations all the time.

The Big Bang model for the origin of our universe has held sway for a long time. Recently there have been murmurs that the model may need revisions, even drastic ones. But that does not bother me one bit. The beauty of the scientific method is that its conclusions are self-correcting. If tomorrow there is a better model for our universe than the Big Bang model, then so be it.

Another widely discussed model or theory in cosmology is M-theory. Some of the best brains in science have been working on it, but it still requires a lot of validation. I do hope it gets confirmed. What I like best about it is that the anthropic principle emerges out of it as a natural corollary. Of course, the anthropic principle is valid even if M-theory is not. This is because the multiverse idea can still survive, via the cosmic-inflation theory.

Apart from quantum mechanics and the principle of conservation of energy/mass, the other big idea one must get the hang of is the second law of thermodynamics. The law says that an isolated system can evolve with time in only one direction, namely that of increasing disorder or entropy. People have no trouble understanding this, but trouble starts when we are dealing with thermodynamically open rather than isolated systems, and most systems of interest are indeed open systems.

An open system is one through which mass and/or energy can flow; an isolated system is one for which this is not possible. For an open system it is meaningless to speak only in terms of entropy for stating the second law. We must bring in the concept of free energy, and the generalized second law of thermodynamics, applicable even to open systems, says that free energy always tends to get minimized.

The free energy has two contributions, namely those from internal energy and entropy: F = U - TS. So F can decrease if the entropy term TS increases, if the internal-energy term U decreases, or if U and TS change together in any way that produces a net decrease in F.
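
To make the bookkeeping concrete, here is a tiny sketch in Python (the numbers are rough, illustrative values for the freezing of water, not precise measured data):

```python
# Free-energy bookkeeping for freezing, F = U - T*S.
# The numbers below are rough illustrative values, not precise thermodynamic data.

T = 263.0        # temperature in kelvin (below the freezing point of water)

# Approximate changes on going from liquid water to ice, per mole:
dU = -6000.0     # internal energy falls: molecules bind more tightly (J/mol)
dS = -22.0       # entropy falls too: the crystal is more ordered (J/(mol*K))

dF = dU - T * dS
print(dF < 0)    # True: below freezing, the fall in U wins and F decreases

# Above the freezing point the TS term wins instead, so freezing would raise F:
print(dU - 283.0 * dS < 0)   # False: at 283 K, ice melting is favoured
```

This is exactly the sign competition between the changes in the U term and the TS term mentioned above: whichever dominates at the given temperature decides the direction in which the system evolves.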

Consider the example of crystal growth, say of ice from water. The ice crystal has a higher degree of order compared to liquid water, so this is not a case of increase of entropy. But this phenomenon (emergence of order out of disorder) occurs because atoms in ice crystals are more tightly bound to one another than they are in liquid water, resulting in a large fall in the U term. And the tradeoff in the changes in the U term and the TS term is such that there is a net lowering of the free energy F, as demanded by the second law for open systems.

This is a very important statement, because it means that order can indeed emerge out of disorder (for example, the emergence of life out of nonlife) if there is an appropriate flow (input/output) of energy and/or mass to or from the system.

Entropy, being a measure of disorder, is also a measure of absence of information. If the entropy of a system increases, we are losing information about it. [We can also say, alternatively, that more information is needed for specifying it, although we do not have that extra information.] By the same token, if a system becomes more ordered (a case of decreased entropy), we can say that its content of available information has increased.

Consider a system that is not in equilibrium. Let S1 be its entropy. Naturally, it will tend to attain a state of equilibrium. When it has succeeded in doing so, let its entropy be S0. Since entropy generally tends to increase, we have S0 > S1. For most practical purposes, what matters is the change in entropy, rather than its absolute value. Therefore there is no harm in shifting the entropy scale such that we associate zero entropy with the state of equilibrium; i.e. we assume that S0 = 0. If we do that, we can say that, for book-keeping purposes, the entropy associated with a state in disequilibrium is negative.
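
This shift of the entropy scale amounts to nothing more than subtracting a constant; a two-line sketch makes the bookkeeping explicit (S1 and S0 are arbitrary illustrative numbers):

```python
# Shift the entropy scale so that equilibrium corresponds to zero entropy.
# S1 and S0 are arbitrary illustrative values, with S0 > S1.
S1 = 40.0    # entropy of the system away from equilibrium
S0 = 75.0    # entropy once equilibrium has been reached

def shifted(S):
    return S - S0    # rescaled entropy: equilibrium maps to zero

print(shifted(S0))   # 0.0   -> equilibrium has zero entropy on the new scale
print(shifted(S1))   # -35.0 -> a disequilibrium state has negative entropy
```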

For similar reasons, whereas entropy is a measure of absence of information, negative entropy is a measure of available information. Let us consider our ecosphere. It is an open system which continues to receive energy from the Sun. Since input of energy pushes a system away from equilibrium, we can say that the Sun has been pumping information or negative entropy into our ecosphere. Some of this information enters the structure and function of biomolecules and other complex molecules.

The structure of any biomolecule carries an enormous amount of information compared to simple molecules like O2, CO2, N2, etc. How and why did the biomolecules evolve out of simple molecules? They did so because, ultimately, some of the negative entropy or information being pumped into our ecosphere by the Sun got embodied in the structure of the biomolecules. This is how life-originating and life-sustaining molecules emerged on Earth. This is how life emerged out of nonlife.

A game-changing idea in science is that of Darwinian evolution. Two factors guided Darwin's formulation of the theory of evolution. One was the Malthusian idea: if resources are limited, the fitter individuals in a population stand a better chance of securing more of them, and such individuals are more likely not only to survive but also to procreate. The other factor that influenced Darwin was the power of gradual change, evidenced by how gigantic formations like the Grand Canyon can emerge simply because enough time has been allowed for water to run its course, chiselling away one grain of rock at a time. Similarly, biological evolution resulted in the appearance of fitter and fitter species, and even new species.

So, life emerged very gradually out of nonlife through chemical evolution, and evolved further because of biological evolution. No miracles involved there.

How can intelligence emerge out of nonintelligence? The beehive provides an answer. Each bee hardly has any intelligence to speak of. Its genetic information enables it to sense pheromones, and it is genetically programmed to react to the behaviour of other bees in the hive in certain simple, automatic ways. And yet the beehive is a veritable superorganism, able to take intelligent decisions. The origin of this swarm intelligence is in the interaction network of the bees. This is also what happens in the human brain. Each neuron is as dumb as can be, but the complex adaptive system comprising billions of neurons and the trillions of interactions among them is capable of developing formidable levels of intelligence.
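
A minimal sketch of this idea in Python (purely illustrative, not a model of real bees): each agent holds a binary 'opinion' (say, which of two candidate nest sites to favour) and follows one dumb rule, namely to adopt the majority view of a few randomly sampled fellow agents. No individual is intelligent, yet the swarm reliably converges on a single collective decision.

```python
import random

# Toy swarm: 200 agents, each with opinion 0 or 1, repeatedly adopting the
# majority opinion of 5 randomly sampled members of the swarm. The rule is
# trivially simple, but the interaction network drives the group to consensus.

random.seed(1)

def step(opinions, sample_size=5):
    new = []
    for _ in opinions:
        sample = random.sample(opinions, sample_size)
        new.append(1 if sum(sample) * 2 > sample_size else 0)
    return new

swarm = [random.randint(0, 1) for _ in range(200)]   # start roughly 50:50
for _ in range(1000):
    swarm = step(swarm)
    if sum(swarm) in (0, len(swarm)):                # unanimity reached
        break

print(sum(swarm) in (0, len(swarm)))   # True: the swarm has 'decided'
```

The individual rule never changes; the collective decision emerges entirely from the interactions, which is the essence of swarm intelligence.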

A grand achievement of the human mind is that, by adopting the scientific method, we have been able to develop science and technology to a level whereby we have already been able to probe the human brain to a fantastic degree of detail. This information has enabled us to understand the mechanism of human intelligence. What is more, we are already on our way to developing artificial brains, comparable in sophistication to the human brain. It is certain that in the next few decades a tipping point will be reached when artificial intelligence will equal human intelligence.

What happens beyond that tipping point ('singularity') is absolutely mind-boggling for a number of reasons:
  • Whereas human intelligence has practically stopped evolving, artificial intelligence will continue to grow at an explosive exponential rate.
  • At present our pattern-recognition capability is far superior to that of artificial brains. But artificial brains are bound to catch up soon.
  • Processes in artificial brains are already millions of times faster than those in the human brain.
  • The human body and brain are too fragile for interstellar travel. There is no such handicap for artificial brains and the robots embodying them.
Progress in computer science is the reason why our intelligence will soon be enhanced by artificial intelligence, developed in a digital cortical brain.

Wolfram Alpha is an answer engine (rather than a search engine like Google Search). It computes answers, rather than directing you to websites where the answers (or the recipes for obtaining the answers) may be available. It consists of ~15 million lines of Mathematica code, and computes answers from ~10 trillion bytes of data curated by Wolfram Research staff. The ever-increasing power of Wolfram Alpha will be available to our children or robots (our 'mind children') in a very routine sort of way. Look at what the scientific method has done to our lives and to our future!

In the present century itself there will be a cosmic network of immensely powerful robots, communicating with one another, and infusing the cosmos with a pervasive superintelligence created by us humans.

By way of acknowledgement of debt I list here the books which influenced my thinking greatly:

Jawaharlal Nehru (1946): The Discovery of India.

Isaac Asimov (1950): I, Robot.

Bertrand Russell (1957): Why I Am Not a Christian.

Kevin Kelly (1994): Out of Control: The New Biology of Machines, Social Systems, and the Economic World.

Murray Gell-Mann (1994): The Quark and the Jaguar: Adventures in the Simple and the Complex.

Daniel Dennett (1995): Darwin's Dangerous Idea: Evolution and the Meanings of Life.

George Dyson (1997): Darwin Among the Machines: The Evolution of Global Intelligence.

Hans Moravec (1999): Robot: Mere Machine to Transcendent Mind.

Moshe Sipper (2002): Machine Nature: The Coming Age of Bio-Inspired Computing.

Albert-Laszlo Barabási (2002): Linked: How Everything is Connected to Everything Else and What It Means for Business, Science, and Everyday Life.

Jared Diamond (2002): The Rise and Fall of the Third Chimpanzee: How Our Animal Heritage Affects the Way We Live.

Bill Bryson (2003): A Short History of Nearly Everything.

Jeff Hawkins (2004): On Intelligence: How a New Understanding of the Brain will Lead to the Creation of Truly Intelligent Machines.

Ray Kurzweil (2005): The Singularity is Near: When Humans Transcend Biology.

Richard Dawkins (2009): The Greatest Show on Earth.

Stephen Hawking & Leonard Mlodinow (2010): The Grand Design: New Answers to the Ultimate Questions of Life.

Lawrence Krauss (2012): A Universe from Nothing: Why There is Something Rather Than Nothing.

Ray Kurzweil (2012): How to Create a Mind: The Secret of Human Thought Revealed.


I am now taking a break from blog writing so that I can focus on turning these posts into a book. Although I have tried to make the science simple and interesting, I am aware that things did get a bit dense in places. My efforts to improve on that when I compile the book will benefit greatly from your feedback, comments, and suggestions. Please feel free to communicate, either by commenting directly on the relevant blog posts, or by writing to me privately at vkw1412@gmail.com.

The blog writing has been a rewarding experience for me. I look forward to your responses.



Saturday, April 5, 2014

126. Kurzweil's Pattern-Recognition Theory of Mind – 2


Let us continue from where we left off in Part 125.



The nature of data flowing into a pattern recognizer

What does the data for a pattern look like? Suppose the pattern is a face, an essentially 2-dimensional set of data. But, as can be seen from the structure of the neocortex, the pattern inputs are only 1-dimensional lists. All the experience in the creation and functioning of artificial pattern-recognition systems also confirms that one can represent 2- or higher-dimensional data streams as 1-dimensional lists. Our memories are patterns organized as lists (that is why we have trouble reciting the alphabet backwards). And, what is more, each item in the list is another pattern, and so on, hierarchically. We have learnt these lists, and we recognize them when an appropriate stimulus is present. Memories exist in the neocortex in order to be recognized.
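
The point that a two-dimensional pattern can be stored as a one-dimensional list is easy to illustrate (the 3×3 'image' below is, of course, a made-up toy):

```python
# A tiny 2-dimensional pattern (say, a 3x3 patch of an image)
# flattened into the kind of 1-dimensional list described above.
patch = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

flat = [pixel for row in patch for pixel in row]  # row-major 1-D list
print(flat)               # [0, 1, 0, 1, 1, 1, 0, 1, 0]

# The 2-D structure is recoverable as long as the width is known:
width = 3
restored = [flat[i:i + width] for i in range(0, len(flat), width)]
print(restored == patch)  # True
```

Nothing is lost in the flattening, which is why a sequential (list-based) memory can still hold patterns that we experience as spatial.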

Autoassociation and invariance

As explained in Part 125, we can recognize a pattern even if it is incomplete. This ability to associate a pattern with a part of itself is called autoassociation.
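
A crude sketch of the idea (an illustration only, not the brain's or Kurzweil's actual mechanism): given a fragment with missing entries, pick the stored pattern that agrees with it best.

```python
# Crude autoassociation sketch: recover a stored pattern from a fragment.
# None marks the missing parts of the input. The stored patterns and their
# names are made up purely for illustration.

stored = {
    "face":  [1, 1, 0, 1, 0, 1],
    "tree":  [0, 0, 1, 1, 1, 0],
    "house": [1, 0, 0, 0, 1, 1],
}

def recall(fragment):
    def score(pattern):
        # Count agreements on the positions we actually have.
        return sum(p == f for p, f in zip(pattern, fragment) if f is not None)
    return max(stored, key=lambda name: score(stored[name]))

print(recall([1, 1, None, None, 0, None]))  # 'face'
```

Half the entries are missing, yet the best-matching stored pattern is still retrieved, which is the essence of associating a pattern with a part of itself.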

Often we are able to recognize patterns that are distorted, or when aspects of them are transformed. This ability is called invariance, and the brain deals with it in four ways.

The first way is through global transformations that are effected before the cortex receives the sensory data.

The second takes advantage of the redundancy in the storage of memory. The memory has many perspectives or variations stored away.

The third is the ability to combine two or more memory lists. That is how we understand metaphors and similes.

The fourth method derives from the 'size parameters' that allow a single module to encode multiple instances of a pattern.

Learning

As Kurzweil (2012) writes: 'Our neocortex is virgin territory when our brain is created. It has the capability of learning and therefore of creating connections between its pattern recognizers, but it gains those connections from experience. . .  Learning and recognition take place simultaneously. We start learning immediately, and as soon as we've learned a pattern, we immediately start recognizing it. . . . patterns that are not recognized are stored as new patterns and are appropriately connected to the lower-level patterns that form them'.

The language of thought

At the heart of the pattern-recognition theory of mind (PRTM) is the neocortical pattern-recognition module, the inputs to and the outputs from which are shown below (diagram taken from Kurzweil 2012).


The brain starts out with a very large number of ‘connections-in-waiting’ to which the pattern-recognition modules can hook up. As we learn and have experiences, these modules connect to pre-established connections that were created when we were embryos. Kurzweil (2012) has summarized his PRTM as follows:

'a) Dendrites enter the module that represents the pattern. Even though patterns may seem to have two- or three-dimensional qualities, they are represented by a one-dimensional sequence of signals. The pattern must be present in this (sequential) order for the pattern recognizer to be able to recognize it. Each of the dendrites is connected ultimately to one or more axons of pattern recognizers at a lower conceptual level that have recognized a lower-level pattern that constitutes part of this pattern. For each of these input patterns, there may be many lower-level pattern recognizers that can generate the signal that the lower-level pattern has been recognized. The necessary threshold to recognize the pattern may be achieved even if not all of the inputs have signalled. The module computes the probability that the pattern it is responsible for is present. This computation considers the "importance" and "size" parameters (see [f] below).

'Note that some of the dendrites transmit signals into the module and some out of the module. If all of the input dendrites to this pattern recognizer are signalling that their lower-level patterns have been recognized except for one or two, then this pattern recognizer will send a signal down to the pattern recognizer(s) recognizing the lower-level patterns that have not yet been recognized, indicating that there is a high likelihood that that pattern will soon be recognized and that lower-level recognizer(s) should be on the lookout for it.

'b) When this pattern recognizer recognizes its pattern (based on all or most of the input dendrite signals being activated), the axon (output) of this pattern recognizer will activate. In turn, this axon can connect to an entire network of dendrites connecting to many higher-level pattern recognizers that this pattern is input to. This signal will transmit magnitude information so that the pattern recognizers at the next higher conceptual level can consider it.

'c) If a higher-level pattern recognizer is receiving a positive signal from all or most of its constituent patterns except for the one represented by this pattern recognizer, then that higher-level recognizer might send a signal down to this recognizer indicating that its pattern is expected. Such a signal would cause this pattern recognizer to lower its threshold, meaning that it would be more likely to send a signal on its axon (indicating that its pattern is considered to have been recognized) even if some of its inputs are missing or unclear.

'd) Inhibitory signals from below would make it less likely that this pattern recognizer will recognize its pattern. This can result from recognition of lower-level patterns that are inconsistent with the pattern associated with this pattern recognizer. . . .

'e) Inhibitory signals from above would also make it less likely that this pattern recognizer will recognize its pattern. This can result from a higher-level context that is inconsistent with the pattern associated with this recognizer.

'f) For each input, there are stored parameters for importance, expected size, and expected variability of size. The module computes an overall probability that the pattern is present based on all of these parameters and the current signals indicating which of the inputs are present and their magnitudes. A mathematically optimal way to accomplish this is with a technique called hidden Markov models. When such models are organized in a hierarchy (as they are in the neocortex or in attempts to simulate a neocortex), we call them hierarchical hidden Markov models.'
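
A drastically simplified sketch of such a module, in the spirit of steps (a) through (f): a real implementation would use (hierarchical) hidden Markov models, as the quote notes, but even plain weighted evidence compared against a threshold shows the roles of the 'importance' parameters and of the top-down expectation signal of step (c). All names and numbers here are illustrative.

```python
# Highly simplified sketch of one pattern-recognition module. Each input is
# 1 if its lower-level pattern has fired, else 0; each input has an
# 'importance' weight; a higher-level expectation lowers the threshold.

def recognize(inputs, importance, threshold=0.7, expected=False):
    if expected:
        threshold -= 0.2   # step (c): top-down expectation lowers the bar
    evidence = sum(i * w for i, w in zip(inputs, importance))
    return evidence >= threshold

importance = [0.4, 0.3, 0.2, 0.1]   # illustrative weights

# Three of four inputs present, including the important ones: recognized.
print(recognize([1, 1, 1, 0], importance))             # True

# Weaker evidence alone falls short of the threshold...
print(recognize([1, 0, 1, 0], importance))             # False
# ...unless a higher-level module signals the pattern is expected (step c):
print(recognize([1, 0, 1, 0], importance, expected=True))  # True
```

This captures, in caricature, why not all inputs need to fire (step a) and how context from above makes recognition more likely (step c), while the real computation is probabilistic rather than a hard threshold.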

Triggered patterns trigger other patterns. Incomplete patterns send signals down the conceptual hierarchy; complete patterns send signals up the hierarchy. These patterns are the language of thought. Like language they are hierarchical, but they are not always language per se, although language-based thoughts are also possible.

There can be two modes of thinking, nondirected and directed. In the former, thoughts trigger one another in a nonlogical way. Dreams are examples of nondirected thoughts. Directed thinking is what we use when we are trying to solve a problem, or when we formulate an organized response.

Thus, according to the PRTM, our intelligence is the result of 'self-organizing, hierarchical recognizers of invariant self-associative patterns with redundancy and up-and-down predictions' (Kurzweil 2012).

It is rightly claimed of Kurzweil's (2012) book that it '. . . is an incredible synthesis of neuroscience and technology and provides a road map for the future of human progress'. The operating principle of the neocortex (explained by the PRTM) 'is arguably the most important idea in the world, as it is capable of representing all knowledge and skills as well as creating new knowledge'.