Neural Networks: The Promise and the Reality
Ben Goertzel
The first time I saw a human brain in a medical laboratory, I was overwhelmed by a
very strange feeling. How, I wondered
mutely, can this three-pound gray mass of nerve cells possibly lead to so
much? How can all the social and
psychological and cultural complexity we see around us and feel within us, come
out of this little lump of meat? This
lump of meat, not so different from the similar lumps of meat found in the
heads of apes and monkeys – or, for that matter, sheep and cows.
The brain is a puzzle, and obviously, one with significance far beyond the ivory
tower.
First of all, understanding the brain is important as part of the human race’s quest
for self-understanding. The
contradictions and dilemmas and great moral questions that follow us from one
culture to the next – aggression, sexuality, gender differences, freedom and
bondage, death and suffering – all have their roots in the patterns by which
electricity courses through these three-pound masses in our heads.
But that’s not all -- the significance of the brain extends beyond even the human
condition. For the human brain is, at
the moment, our unique example of a highly intelligent system. Some people think whales and dolphins are
highly intelligent – but this remains speculative. Stanislaw Lem, in his famous book Solaris, hypothesized an
intelligent ocean on another planet, whose intelligence was revealed to humans
in the complexity of its wave patterns and its strange effects on people’s
minds. But even if Solaris’s real-world
equivalent is out there somewhere, we haven’t voyaged there yet. For the moment, the human brain, flawed as
it is, is our only incontrovertible example of an intelligent system, so if we
want to understand the general nature of this mysterious thing called
intelligence – or, more practically, create intelligent devices to improve our
lives -- the brain is the only really concrete source of inspiration we have.
Neuroscientists try to understand the workings of this amazing three-pound meat hunk, one cell
at a time, one specialized region at a time.
The basic principles are simple, but the complexities are
astounding. As of now, they know a lot
about how the brain is structured – which parts do which sorts of things. And they know a lot about cells and
chemicals in the brain. But they have
remarkably few hard facts about the dynamics of the brain – how the brain
changes its state over time, a process that’s more colloquially referred to
with terms like “thinking” and “feeling.”
The main cause of this situation is that there’s no way, right now, to monitor the
details of what goes on in the brain all day, how it does its business. Brain scanning equipment -- PET and fMRI
scanners -- has come a long way, but is still far too crude to do much beyond
telling us which general regions of the brain a person uses to do which general
kinds of thinking. These scanners don’t yet let
us monitor the course of thinking itself.
A few bold neuroscientists have made their guesses as to the big picture
of how the brain works, but the more timid 99.9% remain focused on their own
small corner of the puzzle, all too aware of how much more there is to be
understood before general theories about brain function can be systematically
verified or falsified.
And the difficulty of understanding brain and mind is not restricted to
neuroscience. Other disciplines
concerned with other aspects of intelligence – psychology, artificial
intelligence, philosophy of mind, linguistics – have run up against dilemmas
every bit as tough as those facing neuroscience. Psychologists bend over backwards to create complex experiments
that will test even their simplest theories, let alone their subtler ones. AI programmers find current computer
hardware inadequate to implement their more ambitious designs. Linguists enumerate rule after rule
describing human language use, but fail to encompass its full variety. This cross-disciplinary difficulty has led
to the creation of a combined interdisciplinary discipline called cognitive
science, which wraps all the difficulties up into one unmanageable but
fascinating package. Many universities
now offer degrees in cognitive science – the study of the baffling phenomenon
of brain-mind from every aspect, the piecing together of diverse fragments of
evidence to try to arrive at a holistic understanding of this most baffling
three-pound meat hunk that lies at the very center of our selves, and holds the
key to untold future technologies.
Not surprisingly, cognitive science itself is a rather diverse endeavor,
encompassing a number of different approaches to understanding mind-brain. Each approach has its own merits. Among the most fascinating conceptual
frameworks under the cognitive science umbrella, however, is the field of artificial
neural networks. Scientists and
engineers working in this area are creating computer programs that, in some
sense, work like the brain does – not just on the overall conceptual level of
being intelligent, but by simulating the dynamics of interaction between brain
cells. So far none of these neural net
programs is anywhere near as intelligent as the brain. But some of them have shed light on various
aspects of brain function (memory, learning, disorders like dyslexia), and
others are serving very useful functions in the world right now, embedded in
other software programs doing everything from translating handwriting into
typescript to filtering out porn on the Internet to helping diagnose problems
in auto engines.
There is a lot to be learned from these neural network programs’ successes – and from
their limitations. We can see just how
much, and how little, of what makes the human brain so powerful and wonderful
can be captured in simplified, small-scale models of its underlying low-level
mechanism.
Neural networks, today, are useful tools for understanding aspects of the dynamics of
the brain. And they’ve proved valuable
at solving some kinds of computer science and engineering problems. But they don’t form a viable approach to
understanding how the brain works in a holistic sense, nor for creating real
computational intelligence. For these
more ambitious goals, we’ll have to wait for a different kind of science.
Neurons
The brain is made of cells called neurons.
There are many other cells there too – glia, for example, that fill up
the space between the neurons. But all
the evidence – dating back to Ramón y Cajal at the end of the 19th century --
shows that neurons are the most important ones. Neurons in groups do amazing things, but neurons individually seem
to be relatively simple. A neuron is a
nerve cell. Nerve cells in your skin send information to the brain about what
you're touching; nerve cells in your eye send information to the brain about
what you're seeing. Nerve cells in the brain send information to the brain
about what the brain is doing. The
brain monitors itself -- that's what makes it such a complex and useful organ!
How the neuron works is by storing and transmitting electricity. We're all running on electricity! To get a
concrete sense of this, look at how electroshock therapy affects the brain – or
look at people who have been struck by lightning. One man who was struck by lightning was never again able to
feel in the slightest bit cold. He'd go
outside in his underwear on a freezing, snowy winter day, and it wouldn't
bother him one bit. The incredible jolt of electricity had done something weird
to the part of his nervous system that experienced cold....
The neuron can be envisioned as an odd sort of electrical machine, which takes
charge in through certain "input connections" and puts charge out
through its "output wire." Some of the wires give positive charge --
these are "excitatory" connections. Some give negative charge --
these are "inhibitory." But the trick is that, until enough charge
has built up in the neuron, it doesn't fire at all. When the magic
"threshold" value of charge is reached, all of a sudden it shoots its
load. From a low-level, mechanistic
view, this "threshold dynamic" is the basis of the incredible
complexity of what happens in the brain.
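The threshold behavior described above can be sketched in a few lines of code. This is a minimal illustrative model (the weights, inputs and threshold value are all invented for the example), not a biological simulation:

```python
# A minimal sketch of a threshold neuron. Positive weights model excitatory
# connections, negative weights model inhibitory ones; nothing happens until
# the accumulated charge reaches the firing threshold.

def neuron_fires(inputs, weights, threshold):
    """Return True if accumulated charge reaches the firing threshold."""
    charge = sum(x * w for x, w in zip(inputs, weights))
    return charge >= threshold

# Two excitatory inputs (+1.0) and one inhibitory input (-1.0), threshold 1.5:
# both excitatory inputs must be active, and the inhibitory one silent, to fire.
print(neuron_fires([1, 1, 0], [1.0, 1.0, -1.0], 1.5))  # True
print(neuron_fires([1, 1, 1], [1.0, 1.0, -1.0], 1.5))  # False: inhibition blocks firing
```

Note the all-or-nothing character of the output: the neuron's response is not proportional to its input, which is exactly the property the text goes on to exploit.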
Of course, the connections between neurons aren’t really as simple as
electrical wires – that’s just a metaphor.
In reality, each inter-neuron connection is mediated by a certain
chemical called a neurotransmitter.
There are hundreds of types of neurotransmitters. When we take drugs, the neurotransmitters in
our brain are affected, and neurons may fire when they otherwise wouldn’t, or
be suppressed from firing when they otherwise would.
And the near-consensus among neuroscientists is that most learning takes
place via modification of the connections between neurons. The idea is simple: not all connections between
neurons conduct charge with equal facility. Some let more charge through than
others; they have a "higher conductance." If these conductances can
be modified even a little, then the behavior of the overall neural network can
be modified drastically. This leads to a picture of the brain as being full of
"self-supporting circuits" -- circuits that reverberate and reverberate,
keeping themselves going. This concept goes back to Donald Hebb in the
1940's, and it's held up since then through all the advances in neuroscience.
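Hebb's principle, that neurons which fire together strengthen the connection between them, can be sketched as a one-line update rule. The learning rate and starting weight below are arbitrary illustrative values, not biological measurements:

```python
# A sketch of Hebbian learning: when the pre-synaptic and post-synaptic
# neurons fire together, the conductance (weight) of the connection grows.

def hebbian_update(weight, pre_active, post_active, rate=0.1):
    """Strengthen the connection when both neurons co-fire; otherwise leave it."""
    if pre_active and post_active:
        weight += rate
    return weight

w = 0.5
for _ in range(5):          # five co-firing events
    w = hebbian_update(w, True, True)
print(round(w, 2))          # 1.0: repeated co-firing has doubled the conductance
```

Small, repeated adjustments like this are what let a reverberating circuit entrench itself: each cycle of co-firing makes the next cycle easier.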
Christof Koch, a well-known neuroscientist, believes there is some
fairly subtle electrochemical information processing going on inside each
neuron, and in the extracellular space between neurons. If this is so, no one knows exactly what role it plays in
intelligence. A few mavericks go even
further, and argue that the neuron itself is a complex molecular computer. Anesthesiologist Stuart Hameroff has
suggested that the essence of intelligence lies in the dynamics of the neural cytoskeleton
-- in the molecular biology of the neuron's internal scaffolding. Roger Penrose, the famous British
mathematician, has taken this one step further, arguing that one needs a fancy theory of quantum
gravity to explain what's going on in the cytoskeletons of neurons, and tying
this in with the idea that quantum theory and consciousness are somehow
interrelated. There’s no real evidence for these
theories, but it’s an indicator of how little we understand about the brain
that these outlier theories can’t be 100% convincingly empirically refuted at
this stage.
High-Level Brain Structure
Neurons are the stuff of the brain; but they live at a much lower level than thoughts,
feelings, experiences, knowledge. Above
the level of interesting mind-stuff, on the other hand, we have the various
regions of the brain, with their various specialized purposes. In between is the interesting part – and the
least known.
One hears a lot about left brain versus right brain, but in fact the really
critical distinction is forebrain versus hindbrain and midbrain. The hindbrain is situated right around the
top of the neck. What it does is mostly to regulate the heart and lungs -- and
control the sense of taste. The
midbrain, resting on top of the hindbrain, integrates information from the
hindbrain and from the ears and eyes, and it appears to play a crucial role in
consciousness. Collectively, the
hindbrain and midbrain are referred to as the brainstem.
The hindbrain is old – it’s basically the same in humans as in reptiles,
amphibians, fish, and so forth. The
forebrain, on the other hand, is much
more complex in mammals than in other animals. It’s where our smarts live, and it has its own highly complex and
variegated structure. The mammalian
forebrain is subdivided into three parts: the hypothalamus, the thalamus and
the cerebrum, whose deeply folded outer layer is the cerebral cortex. Behind
the cerebrum, atop the brainstem, sits the cerebellum, a three-layered
structure that serves mainly to coordinate motor function. The cerebrum is the seat of intelligence -- the integrative center
in which complex thinking, perceiving and planning occur. Of how it works we have only a very vague idea.
And all this is just the coarsest level of high-level brain structure. With PET and fMRI brain scan equipment, scientists can go one level
deeper. By getting a rough picture of
which parts of the brain are becoming more active at what times, they can see
what parts of the brain are involved in what activities. So, for example, attention seems to involve three
different systems: one for vigilance or alertness, one for sort of executive
attention control, and one for disengaging attention from whatever it was
focused on before. Three totally different systems, all coming together to let
us be attentive to something in front of us.... Depending on which of the three
systems is damaged, you get different kinds of awareness deficits, different
neurological disorders.
All these different parts of the brain are made of neurons, passing
electricity around via neurotransmitters, in reverberating cycles and complex
patterns. But the different kinds of
neurons and neurotransmitters in the different parts of the brain, and the
different patterns of arrangement and connectivity of the neurons in the
different parts of the brain, obviously make a huge practical difference.
Atoms and molecules are a reasonable metaphor here. All brain regions are made of neurons, just
as all molecules are made of atoms.
Different atoms have very different properties, as do different types of
neurons. And different molecules,
formed from different types of atoms, can do very different things – just as
different brain regions are specialized for different functions. Only the very simplest molecules can, in
practice, be analyzed in terms of atomic physics – otherwise we have to use the
crude heuristics of theoretical chemistry.
Complex molecules like proteins can barely even be studied with computer
simulations; the physical chemistry is just too tricky. Similarly, so far, only the very simplest
brain regions and functions can be understood in terms of neurons and their
interactions. And complex brain
functions can’t even be understood via computer simulations, yet. There are too many neurons, and the
interconnectivity patterns and ensuing dynamics are just too tricky.
Artificial Neural Networks for Biology and Engineering
The model of the brain as a network of neurons, passing charge amongst each other,
is a crude approximation to the teeming complexity of the brain as it actually
is. But it has a number of
advantages. Simplicity and
comprehensibility, for instance. And,
just as critically, amenability to mathematical analysis and computer
simulation. Modeling all the molecules
in the brain as a set of equations is an impossible task, in practice. It might be necessary for a real
understanding of brain function, but I certainly hope not. Even modeling the chemical dynamics inside a
single neuron is a huge exercise, to which some excellent scientists devote
their entire lives, and which tells you extremely little about the functioning
of the brain as a whole and the emergence of intelligence from matter. On the other hand, if one is willing to view
the brain as a neural network, then one can construct mathematical and
computational models of the brain quite easily. It’s true that the brain contains roughly a hundred billion
neurons, and no mathematician can write down that many equations, and very few
computers can simulate a system that large with reasonable efficiency. But at least one can simulate small neural
networks, to get some kind of feel for how brain dynamics works.
This is the raison d’etre of the “neural networks” field, which was launched by the pioneering work of cyberneticists
Warren McCulloch and Walter Pitts in the early 1940's. What McCulloch and Pitts did in their first
research paper on neural networks was to prove that a simple neural network
model of the brain could do anything – could serve as a "universal computer." At this stage computers were still basically
an idea, but Alan Turing and others had worked out the theory of how computers
could work, and McCulloch and Pitts' work, right there at the beginning,
linked these ideas with brain science.
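The flavor of the universality result can be seen by building Boolean logic gates out of threshold neurons. Since NAND alone suffices to build any Boolean circuit, a network of threshold neurons can in principle compute anything a digital computer can. A sketch, using standard textbook weight choices rather than anything from the original paper:

```python
# Threshold logic units (TLUs) implementing Boolean gates, in the spirit of
# McCulloch and Pitts: each gate is a neuron with fixed weights and threshold.

def tlu(inputs, weights, threshold):
    """Fire (output 1) if the weighted input sum reaches the threshold."""
    return int(sum(x * w for x, w in zip(inputs, weights)) >= threshold)

AND = lambda a, b: tlu([a, b], [1, 1], 2)   # fires only if both inputs fire
OR  = lambda a, b: tlu([a, b], [1, 1], 1)   # fires if either input fires
NOT = lambda a:    tlu([a],    [-1],  0)    # an inhibitory connection

# NAND built from threshold neurons; NAND alone suffices for any Boolean circuit.
NAND = lambda a, b: NOT(AND(a, b))
print([NAND(a, b) for a in (0, 1) for b in (0, 1)])  # [1, 1, 1, 0]
```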
Artificial neural nets have come a long way since McCulloch and Pitts. But even after all these years, the main
thing this kind of work does is to expose our ignorance.
For instance, to decide which artificial neurons should connect to which, in an
artificial brain-simulator program, requires detailed biological knowledge that
is hard to come by. Deciding what types
of neurons to have in one’s toy neural net requires similarly elusive
knowledge, if simulating the brain accurately is really one’s goal. A few neural net specialists focus on
acquiring and utilizing this data – Stephen Grossberg of Boston University is
the chief example.
Gerald Edelman, who won a Nobel for his work in immunology, has taken a different
approach, trying to abstract beyond these difficult issues while remaining
within the computational brain modeling domain, by modeling networks of neuronal
groups rather than networks of individual neurons. He has proposed some detailed and
fascinating theories about the connectivity patterns of neuronal groups, and
the dynamics of neural group networks, and embodied these theories in computer
simulations of vision processing.
Edelman doesn’t consider himself a neural net theorist per se, but the
basic line of thinking and mathematical modeling is not far off.
Next, beyond the issue of the lack of sufficient detailed biological data to guide
detailed computer modeling, there’s the small question of what an artificial
neural network is supposed to do. As
even the most overintellectual of us have surely noticed at some time during
our lives, the brain is connected to a body.
On the other hand, most artificial neural networks are not attached to
any real sensors or actuators. There
are exceptions, such as neural nets used in vision processing or robotics. But other than these cases, neural network
programs must be carefully fed data, and must be carefully studied to decode
their dynamical patterns into meaningful results in the given application
context. It often turns out that
figuring out how to represent data to feed to a neural network program requires
more intelligence than the neural network program itself embodies.
In practice, the biological-modeling motivated research carried out by neural net
researchers like Grossberg and Edelman is fairly separate from more pragmatic,
engineering-oriented neural network research.
Most engineering applications of neural nets aren’t based on serious
brain modeling at all; rather, the brain’s neural network is taken as general
conceptual inspiration for the creation of an artificial neural network in
software, and then computationally efficient learning algorithms are applied to
this artificial neural network. It
seems likely that the standard learning algorithms for artificial neural nets
are actually more efficient at learning than the brain’s learning
algorithms – in the context of very small neural networks, say hundreds to
thousands of neurons. On the other
hand, they don’t scale up well to billions of neurons. The brain’s learning algorithms, by
contrast, work badly at the small scale but remarkably well at the large.
For instance, Rulespace (www.rulespace.com) uses a neural
network program to recognize Internet pornography. America Online uses it to filter out porn for customers who
request this feature. But the neural
net inside their program can’t actually read text. Rather, there’s other code that recognizes key words in Web
pages, and then uses the presence or absence of each key word in a Web page to
control whether a certain neuron in a neural network gets activated (given
simulated electricity) or not. Whether
or not the Web page counts as pornographic is then determined by whether a
special neuron (the “output neuron”) is active, after activation has
been given a while to spread through the network.
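The scheme just described can be sketched as follows. To be clear, this is not Rulespace's actual code; the keywords, weights and threshold here are invented stand-ins for whatever the real system learned:

```python
# A sketch of keyword-driven classification: key words found in a page
# activate input neurons, and a single output neuron sums the weighted
# activations. All keywords and weights below are hypothetical.

KEYWORDS  = ["keyword_a", "keyword_b", "keyword_c"]  # hypothetical trigger words
WEIGHTS   = [0.6, 0.5, 0.3]                          # hypothetical learned weights
THRESHOLD = 0.8

def classify_page(text):
    """Activate one input neuron per keyword present; the output neuron
    fires if the total weighted activation crosses the threshold."""
    activations = [1 if kw in text.lower() else 0 for kw in KEYWORDS]
    output = sum(a * w for a, w in zip(activations, WEIGHTS))
    return output >= THRESHOLD

print(classify_page("page mentioning keyword_a and keyword_b"))  # True (0.6 + 0.5)
print(classify_page("an innocuous page"))                        # False
```

Notice that all the "reading" happens outside the network, in the keyword-matching code; the net itself only weighs the evidence, which is the point made in the next paragraph.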
In applications like Rulespace, a neural network is being used as a simple
mathematical widget. Simulation of the
brain in any serious sense is not even being attempted. The connectivity pattern of the neurons, and
the inputs and outputs of the neurons, are engineered to yield intelligent
performance on the particular problem the neural net was designed to solve. This is pretty typical in the neural net
engineering world.
Real-world neural net engineering gets quite complex.
For instance, to get optimal performance in optical character recognition (OCR), instead of one neural
net, researchers have constructed modular nets, with numerous subnetworks. Each subnetwork learns something very
specific, then the subnetworks are linked together into an overall meta-network. One can train a single network to recognize
a given feature of a character – say a descender, or an ascender coupled
with a great deal of whitespace, or a collection of letters with little
whitespace and no ascenders or descenders.
But it is hard to train a single network to do several different things
– say, to recognize
letters with ascenders only, letters with descenders only, letters with both
ascenders and descenders, and letters with neither. Thus, instead of one large network, it pays to break things up
into a collection of smaller networks in a hierarchical architecture. If the network learned how to break itself
up into smaller pieces, one would have a very impressive system; but currently
this is not the case: the subnets are carefully engineered by humans.
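A minimal sketch of this hierarchical idea, with tiny hand-written feature detectors standing in for trained subnetworks (the letter lists and category labels are invented for illustration):

```python
# A sketch of a modular meta-network: each "subnetwork" detects one character
# feature, and the meta-network combines their outputs into a category.
# The detectors below are hand-written stand-ins, not trained networks.

def has_ascender(glyph):   # stand-in for a subnetwork trained on ascenders
    return any(c in glyph for c in "bdfhklt")

def has_descender(glyph):  # stand-in for a subnetwork trained on descenders
    return any(c in glyph for c in "gjpqy")

def meta_network(glyph):
    """Combine subnetwork outputs into a crude character category."""
    features = (has_ascender(glyph), has_descender(glyph))
    categories = {
        (True,  False): "ascender only",
        (False, True):  "descender only",
        (True,  True):  "both",
        (False, False): "neither",
    }
    return categories[features]

print(meta_network("b"))  # ascender only
print(meta_network("g"))  # descender only
```

The division of labor is fixed by the programmer, which is exactly the limitation the text notes: the network does not discover its own decomposition.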
Some experiments with fairly simple neural nets, such as the ones used in these
practical applications, have had fascinating parallels in human
psychology. In the mid-80’s, for
instance, Rumelhart and McClelland created an uproar with simple neural networks
that learn to conjugate verbs. They
showed that the networks, in the early stages of learning, made the same kinds
of errors as small children. Other
people, a few years later, trained neural nets to translate written words into sounds. If they destroyed some of the neurons and
connections in the network, they showed, they obtained a dyslexic neural
net -- which is the same thing that
happens if you lesion someone's brain in the right area. You can do the same sort of thing for epilepsy:
by twiddling the parameters of a neural network you can get it to have an
epileptic seizure. Of course, none of
this proves that these neural nets are good brain models in any detailed sense,
but it shows that they belong to a general class of “brain-like systems.”
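The lesioning experiments can be mimicked in miniature: destroy a random fraction of a toy network's connection weights and observe how its behavior degrades. The weight matrix and damage fraction below are arbitrary; this illustrates the method, not a model of reading:

```python
# A sketch of network "lesioning": zero out a random fraction of the
# connections, in rough analogy to the dyslexia experiments described.

import random

def lesion(weights, fraction, rng):
    """Return a copy of the weight matrix with a random fraction of
    connections destroyed (set to zero)."""
    damaged = [row[:] for row in weights]  # copy so the original survives
    for row in damaged:
        for j in range(len(row)):
            if rng.random() < fraction:
                row[j] = 0.0
    return damaged

weights = [[0.5, -0.3], [0.8, 0.1]]       # a toy 2x2 connection matrix
rng = random.Random(0)                    # fixed seed, so damage is reproducible
damaged = lesion(weights, 0.5, rng)
print(damaged)                            # roughly half the connections zeroed
```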
Perhaps the most fascinating neural net project around is Hugo de
Garis’s Artificial Brain project, originally conducted at ATR in Japan
and currently being pursued at Starlab, in Brussels. This is an attempt to create a hardware platform for advanced
artificial intelligence using special chips called Field-Programmable Gate
Arrays to implement neural networks. As
de Garis says, his CBM platform “will allow the assemblage of 10,000’s of
evolved neural net modules. Humanity can then truly start building artificial
brains. I hope to put 10,000 modules into a kitten robot brain….” This research remains unfinished, but it
will be interesting to see where it leads.
The Mystery of Neurodynamics
Most of the neural networks used in practical applications are pretty simple
dynamically. In order to get reliable
functionality in their particular domains, the neural networks involved are
restricted to very limited behavior regimes.
It’s easy to predict what they’ll do overall, although the details of
the activation spreading inside them may be wildly fluctuating. Neural nets constructed for biological
modeling are usually allowed a freer rein to evolve and grow, but this is
rarely the focus of research. In
general, the potential for really complex and subtle dynamics in neural
networks has hardly been explored at all.
And this is a shame, because the “threshold” behavior of a neuron conceals the
potential for immense dynamical complexity, of a very psychologically relevant
nature. Think about it: let's say a
neuron holds almost enough charge to meet the threshold requirement, but not
quite. Then a chance fluctuation increases its charge just a little bit. Its
total store of charge will be pushed over the threshold, and it will shoot its load.
A tiny change in input leads to a tremendous change in output -- the hallmark
of chaos. But this is just the
beginning. This neuron, which a tiny fluctuation has caused to fire, is going
to send its input to other neurons. Maybe some of these are also near the
threshold, in which case extra input will likely influence their behavior. And
these may set yet other neurons off -- et cetera. Eventually some of these
indirectly triggered neurons may feed back to the original neuron, setting it
off yet again, and starting the whole cycle from the beginning. The whole
network, in this way, can be set alive by the smallest fluke of chance! In this way you get chaos and complexity out
of these simple formal networks.
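This cascade is easy to demonstrate. In the sketch below, three threshold neurons are wired in a ring with invented weights; a starting charge of 0.99 dies out, while a tiny nudge to 1.00 sets the ring reverberating indefinitely:

```python
# A tiny recurrent threshold network showing sensitivity to small
# fluctuations. Weights and thresholds are arbitrary illustrative choices.

def step(charges, weights, threshold=1.0):
    """One update: neurons at or above threshold fire, sending charge onward."""
    fired = [c >= threshold for c in charges]
    new = [0.0] * len(charges)
    for i, f in enumerate(fired):
        if f:
            for j, w in enumerate(weights[i]):
                new[j] += w
    return new, fired

# A ring of three neurons, each feeding the next.
weights = [[0.0, 1.2, 0.0],
           [0.0, 0.0, 1.2],
           [1.2, 0.0, 0.0]]

quiet  = [0.99, 0.0, 0.0]   # just below threshold: nothing ever fires
nudged = [1.00, 0.0, 0.0]   # a tiny fluctuation pushes neuron 0 over

for _ in range(6):
    quiet,  f_q = step(quiet,  weights)
    nudged, f_n = step(nudged, weights)

print(f_q)  # [False, False, False]: the quiet network stays dead
print(f_n)  # [False, False, True]: the nudged ring is still reverberating
```

A 0.01 difference in initial charge produces qualitatively different long-run behavior, which is the hallmark-of-chaos point made above; the firing cycle that feeds back to the original neuron is exactly the "self-supporting circuit" of Hebb.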
The one biological researcher who has really grabbed hold of this aspect of neural
nets is Walter Freeman, who has shown, in his work with the olfactory part of
the brain, that real neural networks display the same complex and crazy
dynamics as artificial neural nets, and that this complex dynamics is critical
to how brains solve the problem of identifying smells. Now, if we use complex, near-chaotic
dynamics to identify smells, it’s hardly likely that the neural nets in our
frontal lobes use simple dynamics like those embodied in current neural network
based software products. We have a long
way to go before our toy neural net models catch up with the neural nets in our
heads!
Neural Nets and Real AI
Biologists and pragmatic computer scientists each have their own uses for neural network models,
their own neural network learning schemes, and so forth. My own personal interest in neural nets, on
the other hand, has been mainly
oriented neither toward brain modeling, nor towards immediate practical
engineering applications. Rather, it’s
been oriented toward real AI – toward the creation of truly intelligent
computer programs, programs that, like humans, know who they are and behave
autonomously in the world. Viewed in
this light, current neural network research comes up rather lacking. Now, this isn’t a tremendously shocking
conclusion, since with the exception of de Garis’s brain machine, researchers
are playing around with vastly smaller neural networks than the one the brain
contains. But it’s interesting to delve
into the precise reasons why current neural net work isn’t terribly relevant to
the task of artificial mind creation.
To understand the role of neural nets in the history of AI, one also has to
understand their opposition. When I
first started studying AI in the mid-1980’s, it seemed that AI researchers were
fairly clearly divided into two camps, the neural net camp and the logic-based
or rule-based camp. This isn’t quite so
true anymore, but it’s still a decent first order approximation.
Whereas neural nets try to achieve intelligence by simulating the brain, rule-based
models take a totally different approach.
They try to simulate the mind's ability to make logical, rational
decisions, without asking how the brain does this biologically. They trace back to a century of
revolutionary developments in mathematical logic, culminating in the
realization that Leibniz’s dream of a complete logical formalization of all
knowledge is actually achievable in principle, although very difficult in
practice.
Rule-based AI programs aren’t based on self-organizing networks of autonomous elements
like neurons, but rather on systems of simple logical rules. Intelligence is reduced to following
orders. In spite of some notable
successes in areas like medical diagnosis and chess playing and financial
analysis, the biggest thing this approach has taught us is that it’s really
hard to boil down intelligent behaviors into sets of rules – the sets of rules
are huge and variegated, and the crux of intelligence becomes the dynamic
learning of rules rather than the particular rules themselves.
Now, to most any observer not hopelessly caught up on one or another side of the
debate, it’s obvious that both of these ways of looking at the mind – rules or
neural nets -- are extremely limited.
True intelligence requires more than following carefully defined rules,
and it also requires more than random or rigidly laid-out links between a few
thousand artificial neurons.
My own attempt at a solution to this problem, in my “Webmind AI Engine” software
system, has been somewhat like Gerald Edelman’s. My AI program is based on entities called nodes, which are roughly
of the same granularity as Edelman’s “neuronal groups.” Nodes are a bit like neurons – they have a
threshold rule in them, and they’re connected by “links” that are a bit like
synapses – connections between neurons – in the brain. But the Webmind AI Engine’s nodes have a
lot more information in them than neurons.
And the links between nodes have more to them than the links in neural
net models – they’re not just conduits for simulated electricity; they have
specific meanings, sometimes similar to the meanings of logical rules in a
rule-based AI system.
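As a purely hypothetical illustration of this node-and-link idea (this is not actual Webmind code; the class, the link type and the numbers are invented), one might write:

```python
# A hypothetical sketch of nodes with typed, weighted links. Unlike a bare
# neuron, each node carries structured content, and each link carries a
# semantic type as well as a weight.

class Node:
    def __init__(self, content, threshold=1.0):
        self.content = content        # structured information, not just charge
        self.threshold = threshold
        self.activation = 0.0
        self.links = []               # outgoing typed links

    def link_to(self, target, link_type, weight):
        self.links.append((target, link_type, weight))

    def stimulate(self, amount):
        """Accumulate activation; spread along links when threshold is met."""
        self.activation += amount
        if self.activation >= self.threshold:
            for target, _type, weight in self.links:
                target.stimulate(amount * weight)
            self.activation = 0.0     # reset after firing

cat = Node("cat")
animal = Node("animal")
cat.link_to(animal, "inheritance", 0.8)   # a typed, meaningful link
cat.stimulate(1.0)
print(round(animal.activation, 2))        # 0.8: activation spread along the link
```

The point of the sketch is only the hybrid character of the design: threshold-and-spreading dynamics like a neural net, but links that mean something, like rules.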
The key intuition underlying Edelman’s and my approaches is to focus on the
intermediate level of brain/mind organization: larger than the neuron, smaller
than the abstract concept. The idea is
to view the brain as a collection of clusters of tens or hundreds of thousands
of neurons, each performing individual functions in an integrated way. One module might detect edges of forms in
the visual field, another might contribute to the conjugation of verbs. The network of neural modules is a network
of primitive mental processes rather than a network of non-psychological,
low-level cells (neurons). The key to brain-mind, in this view, lies in the way
the modules are connected to each other, and the way they process information
collectively. The brain is more than a
network of neurons connected according to simple patterns, and the mind is more
than an assemblage of clever algorithms or logical transformation rules. Intelligence is not following prescribed
deductive or heuristic rules, like IBM’s
super-rule-based-chess-player Deep Blue; but nor is intelligence the
adaptation of synapses in response to environmental feedback, as in current
neural net systems.
Intelligence involves these things, but at bottom intelligence is
something different: the self-organization and mutual intercreation of a network
of processes, embodying perception, action, memory and reasoning in a unified
way, and guiding an autonomous system in its interactions with a rich, flexible
environment.
Onward!
It’s far too early to write a conclusion for the field of neural network
research. Right now, things are in all
ways extremely primitive. The dynamics
of the brain has not been tremendously elucidated by experimentation with
neural net models – not yet. But maybe it will be, as these models become
larger (due to faster computers and bigger computer networks) and richer in
structure (due to greater understanding of the brain, or further development in
theoretical AI). So far, there aren’t
any major engineering problems that neural nets solve vastly better than other
non-brain-inspired algorithms. And so
far, nothing close to a fully functioning artificial bug, let alone an
artificial human brain, has been produced using neural net software – or any
other approach to AI. We’re at the
beginning of this fascinating area of cognitive science research, not the end. If neural net researchers are willing to
grow their field to embrace more and more of the multileveled complexity of the
three-pound enigma in our skulls, then we can expect more and more wonders to
emerge from their work as the years roll by.