Saturday, March 29, 2014

125. Kurzweil's Pattern-Recognition Theory of Mind – 1

Kurzweil's (2012) pattern recognition theory of mind (PRTM) is based on a certain model of the neocortex, which I shall outline here. I believe that the model is bound to be correct, for two reasons. One is the immensely graphic and detailed experimental data we now have about the structure of the brain, as I outlined in the previous post. The other reason is the spectacular success already achieved in creating an artificial brain based on the PRTM; the success of, for example, IBM's Watson (cf. Part 118) is proof of that.

The human neocortex is an essentially 2-dimensional structure, ~2.5 mm thick, comprising six layers. The layers are numbered from I (the outermost layer) to VI (cf. Part 122).

The PRTM says, in essence, that all of the many wonders of the neocortex can be reduced to a single type of thought process, involving hierarchical thinking. The structure of the neocortex itself lends credence to this. Its fundamental structure and function have an extraordinarily high degree of uniformity, à la Vernon Mountcastle (cf. Part 124). And this structure is hierarchical in nature.

Mountcastle also postulated the existence of cortical columns along the thickness of the neocortex. The six layers and the cortical columns in them together imply the existence of a grid structure, which has been confirmed by experiment (cf. Part 124).

Kurzweil (2012) hypothesizes that the basic uniform unit of action in the entire neocortex is the so-called pattern recognizer (PR); it is the fundamental component of the neocortex. Deviating a bit from Mountcastle's model, Kurzweil stipulates that the PRs are not separated by specific physical boundaries; rather they are placed closely one to the next in an interwoven fashion. A cortical column is simply an aggregate of a large number of PRs.

The PRs wire themselves to one another throughout the course of a lifetime. Therefore the elaborate connectivity between modules in the neocortex is not specified much by the genetic code; rather, it gets created to embody the patterns we actually learn over time.

Kurzweil estimates that there are ~500,000 cortical columns in the human neocortex, each being ~0.5 mm wide and ~2.5 mm long. Each contains ~60,000 neurons. Since each PR within a cortical column contains ~100 neurons, it follows that there are ~500,000 x 60,000 / 100 or ~300 million PRs in our neocortex.
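The arithmetic behind this estimate can be checked in a few lines. This is only a sketch using the round figures quoted above; the variable names are my own.

```python
# Round figures quoted in the text (Kurzweil's estimates).
cortical_columns = 500_000
neurons_per_column = 60_000
neurons_per_recognizer = 100

# Total neurons in the neocortex implied by these figures.
total_neurons = cortical_columns * neurons_per_column

# Number of pattern recognizers, at ~100 neurons each.
pattern_recognizers = total_neurons // neurons_per_recognizer

print(f"{total_neurons:,} neurons")
print(f"{pattern_recognizers:,} pattern recognizers")
```

The same figures imply about 30 billion neurons in the neocortex as a whole.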

How many patterns can the human neocortex store? With as many as 300 million PRs available, our brain can indulge in a huge amount of redundancy, resulting in our fantastic pattern-recognition capability, which is far in excess of what any computer system has been able to attain so far. [However, let us also remind ourselves that computer processes are millions of times faster than the electrochemical processes that occur in our brains.]

Here is an example of the redundancy with which our brain stores patterns. The face of a loved one is not stored just once, but thousands of times. Some are just repetitions, but most are different perspectives of the face, differing in lighting, facial expressions, etc. And none of these repeated patterns are stored as 2-dimensional arrays of pixels. They are stored as 1-dimensional lists of features, but hierarchically: The constituent elements of a pattern are themselves patterns, and so on.

Even our procedures and actions comprise patterns, and are likewise stored in the neocortex.

Kurzweil's estimate of the total capacity of the human neocortex is on the order of low hundreds of millions of patterns, which is similar to the number of PRs, namely ~300 million.

The structure of a pattern

The PRTM says that patterns are recognized by pattern-recognition modules in the neocortex, and that the patterns and the modules are organized in hierarchies. When a pattern is recognized, there are three parts to this process. To make the description concrete, let us take the example of an APPLE, and also the word 'APPLE' we use for referring to this physical entity.

Part one is the input, consisting of the lower-level patterns that compose the main pattern. The descriptions of each of these lower-level patterns do not need to be repeated for each higher-level pattern that references them. The letter 'A' appears in the pattern for the word APPLE and also in a large number of other words. Each of these patterns need not repeat a description for the pattern of A, but can use a common description stored somewhere. All that is required is a neural connection to that location. There is an axon from the 'A' pattern recognizer that connects to multiple dendrites, one for each word that uses 'A'.

Part two of each pattern is the name of the pattern. This 'name' is simply the axon that emerges from each pattern processor. When the axon fires, its corresponding pattern has been recognized. It is as if the pattern recognizer is shouting: 'Hey guys, I just saw the written word "apple"'.

Part three of each pattern is the set of higher-level patterns that it, in turn, is part of. For the letter 'A' it is all words that include 'A'.
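The three parts of a pattern can be sketched as a small data structure. This is an illustrative toy of my own, not Kurzweil's implementation: the class name `Pattern`, its fields, and the helper `make_word` are all hypothetical. The point it shows is that the description of 'A' exists only once and is shared, by reference, among all the words that use it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity, since patterns refer to one another
class Pattern:
    name: str                                    # part two: the 'name' (the output axon)
    inputs: list = field(default_factory=list)   # part one: lower-level patterns
    parents: list = field(default_factory=list)  # part three: higher-level patterns

    def add_input(self, child):
        """Wire a lower-level pattern into this one (a single connection)."""
        self.inputs.append(child)
        if self not in child.parents:
            child.parents.append(self)


# One shared recognizer per letter; words only hold references to them.
letters = {ch: Pattern(ch) for ch in "AELPX"}


def make_word(spelling):
    word = Pattern(spelling)
    for ch in spelling:
        word.add_input(letters[ch])
    return word


apple = make_word("APPLE")
axle = make_word("AXLE")

# The 'A' pattern is stored once and knows which words it is part of.
print([p.name for p in letters["A"].parents])
```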

Consider apple the object and apple the word. Just as there is a hierarchy for the storage and recognition of the word 'apple', another part of the cortex has a hierarchy of pattern recognizers processing the actual images of objects. If you are looking at an apple, the corresponding pattern recognizer will fire its axon, saying in effect: 'Hey guys, I just saw an actual apple'. Similarly, if somebody utters the word 'apple', the corresponding auditory pattern recognizer will be triggered.

Information flows down the conceptual hierarchy as well as up. To quote Kurzweil (2012): 'If, for example, we are reading from left to right and have already seen and recognized the letters "A", "P", "P", "L", the "APPLE" recognizer will predict that it is likely to see an "E" in the next position. It will send a signal down to the "E" recognizer saying, in effect, "Please be aware that there is a high likelihood that you will see your 'E' pattern very soon, so be on the lookout for it"'. The 'E' recognizer then lowers the threshold for the firing of the neuron that would declare that 'E' has been seen. Even if an incomplete or smudged image of 'E' appears, it would be recognized correctly because it was expected.
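This top-down effect can be mimicked with a toy model. The sketch below is my own illustration under simple assumptions: the class `Recognizer`, the function `predict_next`, and the numerical values of the threshold and the evidence are all made up, and real neurons are of course far subtler.

```python
class Recognizer:
    """A toy recognizer that fires when the evidence reaches its threshold."""

    def __init__(self, name, threshold=0.75):
        self.name = name
        self.threshold = threshold

    def expect(self, amount=0.25):
        # Top-down signal from a higher-level pattern: lower the threshold.
        self.threshold = max(0.0, self.threshold - amount)

    def fires(self, evidence):
        return evidence >= self.threshold


def predict_next(word, seen, recognizers):
    """If `seen` is a proper prefix of `word`, prime the next letter."""
    if word.startswith(seen) and len(seen) < len(word):
        next_letter = word[len(seen)]
        recognizers[next_letter].expect()
        return next_letter
    return None


recognizers = {ch: Recognizer(ch) for ch in "AELP"}
smudged_evidence = 0.6  # an incomplete or smudged 'E'

fired_before = recognizers["E"].fires(smudged_evidence)
primed = predict_next("APPLE", "APPL", recognizers)
fired_after = recognizers["E"].fires(smudged_evidence)

print(primed, fired_before, fired_after)
```

The same smudged input fails to trigger the 'E' recognizer when it is unexpected, but triggers it once the 'APPLE' recognizer has sent its prediction down.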

This prediction feature is one of the primary reasons why we have a neocortex at all. Our brain is making predictions all the time, and at all levels of abstraction.

More on this next time.

Saturday, March 22, 2014

124. Peering Into the Human Brain

As Ray Kurzweil (2005) keeps emphasizing, the law of accelerating returns (LOAR) is always operative in the evolution of all information-based technologies, resulting in their exponential growth (Moore's law is just one example of that). Naturally, progress in the development of better and better brain-probing technologies is no exception to this. In Part 123 I traced the early history of progress in the development of experimental techniques for probing the human brain. Our present capabilities are already mind-boggling, and much more will be coming in the near future. And needless to say, experiment and theory go hand in hand. Two examples will illustrate my point.

In Part 122 I told you about the path-breaking Mountcastle hypothesis, which says:

There is a common function, a common algorithm, that is performed by all the cortical regions.

Kurzweil (2012) has rightly emphasized the fundamental importance of this insight: 'A critically important observation about the neocortex is the extraordinary uniformity of its fundamental structure. This was first noticed by American neuroscientist Vernon Mountcastle (born in 1918). In 1957 Mountcastle discovered the columnar organization of the neocortex. In 1978 he made an observation that is as significant to neuroscience as the Michelson-Morley ether-disproving experiments of 1887 were to physics. That year he described the remarkably unvarying organization of the neocortex, hypothesizing that it was composed of a single mechanism that was repeated over and over again, and proposing the cortical column as that basic unit'.

Another key insight is that the basic unit of learning is a module of roughly 100 neurons (cf. Part 123). Support for this postulate has come from the work of Henry Markram. His ambitious Blue Brain Project aims to both model and simulate the human brain, including the entire neocortex as well as old-brain regions such as the hippocampus, amygdala, and cerebellum: 'Reconstructing the brain piece by piece and building a virtual brain in a supercomputer—these are some of the goals of the Blue Brain Project. The virtual brain will be an exceptional tool giving neuroscientists a new understanding of the brain and a better understanding of neurological diseases'.

This project is using a scanning-technology tool called the automated patch-clamp robot, with which researchers are 'measuring the specific ion channels, neurotransmitters, and enzymes that are responsible for the electrochemical activity within each neuron'. It is 'an automated system with one-micrometer precision that can perform scanning of neural tissue at very close range without damaging the delicate membranes of the neurons'. The scanning technology has already been used for simulating a single neuron (in 2005), a neocortical column consisting of 10,000 neurons (in 2008), and a neural mesocircuit consisting of 100 neocortical columns (in 2011).

The scientists developed this method to automate the process of finding and recording information from neurons in the living brain. It has been shown that a robotic arm guided by a cell-detecting computer algorithm can identify, and record from, neurons in the brain of a living mouse with better accuracy and speed than a human experimenter. The automated process eliminates the need for months of training, and provides long-sought information about the activity of living cells.

Using this technique, scientists could classify the thousands of different types of cells in the brain, map how they connect to each other, and figure out how diseased cells differ from normal cells. To quote the authors (Kodandaramaiah et al.): 'Whole-cell patch-clamp electrophysiology of neurons is a gold-standard technique for high-fidelity analysis of the biophysical mechanisms of neural computation and pathology, but it requires great skill to perform. We have developed a robot that automatically performs patch clamping in vivo, algorithmically detecting cells by analyzing the temporal sequence of electrode impedance changes. We demonstrate good yield, throughput and quality of automated intracellular recording in mouse cortex and hippocampus'.

As quoted by Kurzweil (2012), Markram wrote in a 2011 paper that while he was 'search[ing] for evidence of Hebbian assemblies (collections of neurons that are arranged together) at the most elementary level of the cortex', what he found instead were 'elusive assemblies [whose] connectivity and synaptic weights are highly predictable and constrained'. He concluded that 'these findings imply that experience cannot mold the synaptic connections of these assemblies', and speculated that 'they serve as innate, Lego-like building blocks of knowledge for perception and that the acquisition of memories involves the combination of these building blocks into complex constructs'.

Here is more from Markram: 'Functional neuronal assemblies have been reported for decades, but direct evidence of clusters of synaptically connected neurons . . . has been missing. . . . Since these assemblies will all be similar in topology and synaptic weights, not molded by any specific experience, we consider these to be innate assemblies . . . Experience plays only a minor role in determining synaptic connections and weights within these assemblies . . . Our study found evidence [of] innate Lego-like assemblies of a few dozen neurons . . . Connections between assemblies may combine them into super-assemblies within a neocortical layer, then in higher-order assemblies in a cortical column, even higher-order assemblies in a brain region, and finally in the highest possible order in the whole brain . . . Acquiring memories is very similar to building with Lego. Each assembly is equivalent to a Lego block holding some piece of elementary innate knowledge about how to process, perceive and respond to the world. . . . When different blocks come together, they therefore form a unique combination of these innate percepts that represents an individual's specific knowledge and experience'.

Further evidence for a regular structure of connections across the neocortex was published in the March 2012 issue of the journal Science by Van J. Wedeen et al. They write: 'Basically, the overall structure of the brain ends up resembling Manhattan, where you have a 2-D plan of streets and a third axis, an elevator going in the third dimension'.

As Wedeen said in a Science magazine podcast, 'This was an investigation of the three-dimensional structure of the pathways of the brain. When scientists have thought about the pathways of the brain for the last hundred years or so, the typical image or model that comes to mind is that these pathways might resemble a bowl of spaghetti – separate pathways that have little particular spatial pattern in relation to one another. Using magnetic resonance imaging, we were able to investigate this question experimentally. And what we found was that rather than being haphazardly arranged or independent pathways, we find that all of the pathways of the brain taken together fit together in a single exceedingly simple structure. They basically look like a cube. They basically run in three perpendicular directions, and in each one of these three directions the pathways are highly parallel to each other and arranged in arrays. So, instead of independent spaghettis, we see that the connectivity of the brain is, in a sense, a single coherent structure'.

The grid-like structure of the connections was revealed using a variety of very precise noninvasive scanning technologies, including new forms of MRI, magnetoencephalography, and diffusion tractography (a method to trace the pathways of fibre bundles in the brain).

This is incredible stuff! A great triumph of modern science, and of the scientific method! I re-quote: '. . . we find that all of the pathways of the brain taken together fit together in a single exceedingly simple structure. They basically look like a cube. They basically run in three perpendicular directions, and in each one of these three directions the pathways are highly parallel to each other and arranged in arrays'.

As Kurzweil (2012) explains, 'Whereas the Markram study shows a module of neurons that repeats itself across the neocortex, the Wedeen study demonstrates a remarkably orderly pattern of connections between modules. The brain starts out with a very large number of "connections-in-waiting" to which the pattern recognition modules can hook up. Thus if a given module wishes to connect to another, it does not need to grow an axon from one and a dendrite from the other to span the entire physical distance between them. It can simply harness one of these connections-in-waiting and just hook up to the ends of the fiber. As Wedeen and his colleagues write, "The pathways of the brain follow a base-plan established by . . . early embryogenesis. Thus, the pathways of mature brain present an image of these three primordial gradients, physically deformed by development". In other words, as we learn and have experiences, the pattern recognition modules of the neocortex are connecting to these preestablished connections that were created when we were embryos' (emphasis added).

This is rather like the field-programmable gate arrays (FPGAs) I described in Part 111. Humans developed the technology of FPGAs, not knowing that their own brains have evolved to have a similar configuration and working principle!

Here is a pictorial summary of the present status of tools for imaging the human brain (from Kurzweil 2012):

Wedeen, whose work I mentioned above, is also involved in the truly ambitious Human Connectome Project, which aims at mapping the wiring diagram of the entire, living human brain. The project aims to be completed by 2014.

You may also like to watch this YouTube video reporting some very recent progress in 3D visualization of the brain. Called the Glass Brain, it is a 3D brain visualization that displays source and connectivity data based on real-time EEG, using BCILAB technology and Unity3D.

Saturday, March 15, 2014

123. Probing the Human Brain

If we want to reverse-engineer the human brain, the most important thing to do first is to probe its structure and function with experimental tools that have the highest possible spatial and temporal resolution. I give in this and the next post a very brief historical account of the progress made in achieving this objective. My information is based largely on the work of Kurzweil (2005, 2012).

1. At the beginning of the 20th century, crude tools were developed for examining the physical processes inside the brain. In 1928 E. D. Adrian measured the electrical output of nerve cells, thus demonstrating that there are electrical processes occurring inside the brain. To quote Adrian: 'I had arranged electrodes on the optic nerve of a toad in connection with some experiments on the retina. The room was nearly dark and I was puzzled to hear repeated noises in the loudspeaker attached to the amplifier, noises indicating that a great deal of impulse activity was going on. It was not until I compared the noises with my own movements around the room that I realized I was in the field of vision of the toad's eye and that it was signalling what I was doing'.

As Kurzweil (2005) remarks, 'Adrian's key insight from this experiment remains a cornerstone of neuroscience today: the frequency of the impulses from the sensory nerve is proportional to the intensity of the sensory phenomena being measured. For example, the higher the intensity of light, the higher the frequency (pulses per second) of the neural impulses from the retina to the brain'.

2. Horace Barlow, a student of Adrian, provided another crucial insight, namely 'trigger features' in neurons. He discovered that the retinas of frogs and rabbits have single neurons that trigger on 'seeing' specific shapes, directions, or velocities. This meant that perception involves a series of stages, with each layer of neurons recognizing more sophisticated features of the image.

Even today, electroencephalography (EEG) is a common investigative and diagnostic tool that records the electrical activity occurring along the scalp. It measures the voltage fluctuations resulting from the ionic currents flowing within the neurons (see below). The spectral content of an EEG can provide information about, for example, epileptic activity in the brain of a patient. This technique can provide millisecond-level temporal resolution.

3. In 1939 A. L. Hodgkin and A. F. Huxley began developing an idea of how neurons perform, namely by accumulating their inputs and then producing a spike in membrane conductance: There is a sudden increase in the ability of the neuron's membrane to conduct a signal and the corresponding voltage along the axon of the neuron. This was described by Hodgkin and Huxley as the axon's 'action potential' (voltage). They actually measured the action potential on an animal neuron in 1952. Squid neurons were chosen by them for this, because of their large size and accessibility.

4. Building on the work of Hodgkin and Huxley, W. S. McCulloch and W. Pitts worked out in 1943 a simple model of neurons and neural nets. I described their model in Part 74, under the title 'Artificial Neural Networks'. This model was further refined by Hodgkin and Huxley in 1952. This very basic model for neural nets, whether in the brain or in a computer simulation, introduces the idea of a neural 'weight' which represents the 'strength' of the neural connection (synapse), and also a nonlinearity (firing threshold) in the neural cell body (the soma).
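The two ingredients mentioned here, a weight on each connection and a firing threshold in the cell body, are enough to build simple logic. The following is a minimal sketch of a weighted threshold unit in the spirit of this model; the function names and the particular weights and thresholds are my own choices for illustration.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def and_gate(a, b):
    # Both inputs must be active to reach the threshold of 2.
    return threshold_neuron([a, b], [1, 1], threshold=2)


def or_gate(a, b):
    # A single active input is enough to reach the threshold of 1.
    return threshold_neuron([a, b], [1, 1], threshold=1)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b))
```

Changing only the threshold turns the same unit from an AND gate into an OR gate, which is the nonlinearity at work.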

5. As I described in Part 74, another breakthrough idea was put forward in 1949 by Donald Hebb. His theory of neural learning (the 'Hebbian response theory') said that if a synapse is stimulated repeatedly, it becomes stronger. Over time this conditioning produces a learning response. Such 'connectionist' ideas flourished during the 1950s and 1960s, and led to much research on artificial neural nets.

6. There is another form of Hebbian learning, namely a loop in which the excitation of a neuron feeds back on itself, causing reverberation (a continued reexcitation of the neurons in the loop). Hebb suggested that this type of reverberation could result in short-term memory: 'Let us assume that the persistence or repetition of a reverberatory activity (or 'trace') tends to induce lasting cellular changes that add to its stability. . .  When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased'. This form of Hebbian learning is well captured by the popular phrase 'cells that fire together wire together'. Brain assemblies can create new connections and strengthen them, based on their own activity. The actual development of such connections by neurons has been seen in brain scans.
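In its simplest textbook form, 'fire together, wire together' is a rule that increases a synaptic weight in proportion to the product of the activities of the two cells. The sketch below assumes that simple form; the function name `hebbian_update` and the learning rate are hypothetical choices of mine, not something taken from Hebb or Kurzweil.

```python
def hebbian_update(weight, pre, post, rate=0.25):
    """Strengthen the synapse when the pre- and post-synaptic cells are active together."""
    return weight + rate * pre * post


weight = 0.0

# Cell A repeatedly takes part in firing cell B: the connection grows.
for _ in range(4):
    weight = hebbian_update(weight, pre=1, post=1)
strengthened = weight

# Cell A fires but cell B stays silent: nothing changes.
weight = hebbian_update(weight, pre=1, post=0)

print(strengthened, weight)
```

Note that this bare rule only ever strengthens connections; practical models usually add some form of decay or normalization.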

In Hebb's theory the central assumption is that the basic unit of learning in the neocortex is the neuron: a single neuron. But the current theory, described by Kurzweil (2012), of how the brain functions (I shall describe it in a future post) is based, not on the neuron itself, but rather on an assembly of neurons. This basic unit of learning comprises ~100 neurons. According to Kurzweil (2012) 'the wiring and synaptic strengths within each unit are relatively stable and determined genetically . . . . Learning takes place in the creation of connections between these units, not within them, and probably in the synaptic strengths of those interunit connections'. As we shall see in the next post, experimental evidence has indeed been obtained for the existence of modules of ~100 neurons as the basic units of learning.

7. The connectionist movement suffered a temporary setback in 1969 when Marvin Minsky and Seymour Papert published the book Perceptrons. This book included a theorem which demonstrated that the most common neural net used at that time (namely Rosenblatt's Perceptron) was unable to determine whether or not a line drawing was fully connected.

8. But the neural-net movement staged a resurgence in the 1980s when the 'back propagation' method was invented. In this, the strength of each simulated synapse is governed by a learning algorithm that adjusts the synaptic weight or the strength of the output of each artificial neuron after each training trial, thus enabling the net to learn to match the right answer more closely. This type of self-organization has helped solve a whole range of pattern-recognition problems. But back propagation is not a feasible model for the training occurring in real mammalian biological neural nets.
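The idea of adjusting each weight after each training trial can be seen in the smallest possible net: one input, one hidden unit, and one output. This is only a sketch under assumptions of my own (a tanh hidden unit, a linear output, a squared-error loss, and made-up starting weights and learning rate); practical nets have many units per layer.

```python
import math


def forward(w1, w2, x):
    hidden = math.tanh(w1 * x)  # hidden unit with a smooth nonlinearity
    output = w2 * hidden        # linear output unit
    return hidden, output


def loss(w1, w2, x, target):
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2


def train_step(w1, w2, x, target, rate=0.1):
    """One training trial: propagate the output error back to both weights."""
    hidden, output = forward(w1, w2, x)
    error = output - target
    grad_w2 = error * hidden                              # output-layer weight
    grad_w1 = error * w2 * (1.0 - hidden * hidden) * x    # chain rule through tanh
    return w1 - rate * grad_w1, w2 - rate * grad_w2


w1, w2 = 0.5, 0.5
x, target = 1.0, 1.0

initial_loss = loss(w1, w2, x, target)
for _ in range(100):
    w1, w2 = train_step(w1, w2, x, target)
final_loss = loss(w1, w2, x, target)

print(initial_loss, final_loss)
```

The error measured at the output is what drives the change in the hidden-layer weight `w1`, which is where the name of the method comes from.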

9. Spectacular progress continues to be made in developing experimental techniques for peering into the brain. According to Kurzweil (2005) the resolution of noninvasive brain-scanning devices has been doubling every 12 months or so (per unit volume). There is also a comparable improvement in the speed of brain scanning image reconstruction.

A commonly used brain-scanning technique is fMRI (functional magnetic resonance imaging). This technique is based on the fact that cerebral blood flow and neuronal activation are coupled. When an area of the brain is in use, blood flow to that region increases. fMRI provides a spatial resolution of ~1 mm, and a time resolution of ~1 second (or 0.1 second for a thin brain slice). It measures blood-oxygen levels, and is an indirect technique for recording neuronal activity. Another such indirect technique is PET (positron emission tomography).  It measures the regional cerebral blood flow (rCBF).

Both fMRI and PET reflect local synaptic activity, rather than the spiking of neurons. They are particularly reliable for recording the relative changes in the state of the brain, for example when a particular task is being carried out by the subject.

10. Another brain-scanning technique is MEG (magnetoencephalography). It measures the magnetic fields outside the skull, coming mainly from the pyramidal neurons of the neocortex. It can achieve millisecond-level temporal resolution, but has a very poor spatial resolution (~1 cm).

11. 'Optical imaging' is an invasive technique capable of providing high spatial and temporal resolution. It involves removing a part of the skull, staining the brain tissue with a dye that fluoresces during neural activity, and imaging the emitted light.

12. When it is feasible to destroy a brain for the purpose of scanning it, immensely high spatial resolutions become possible. It has been possible to scan the nervous system of the brain and body of a mouse with a resolution better than 200 nm.

More on probing the brain in the next post.