Structure of Intelligence -- Copyright Springer-Verlag © 1993
The universe is a labyrinth made of labyrinths. Each leads to another.
And wherever we cannot go ourselves, we reach with mathematics.
-- Stanislaw Lem, Fiasco
Over the last century, psychology has become much less of an art and much more of a science. Philosophical speculation is out; data collection is in. In many ways this has been a very positive trend. Cognitive science (Mandler, 1985) has given us scientific analyses of a variety of intelligent behaviors: short-term memory, language processing, vision processing, etc. And thanks to molecular psychology (Franklin, 1985), we now have a rudimentary understanding of the chemical processes underlying personality and mental illness. However, there is a growing feeling -- particularly among non-psychologists (see e.g. Sommerhoff, 1990) -- that, with the new emphasis on data collection, something important has been lost. Very little attention is paid to the question of how it all fits together. The early psychologists, and the classical philosophers of mind, were concerned with the general nature of mentality as much as with the mechanisms underlying specific phenomena. But the new, scientific psychology has made disappointingly little progress toward the resolution of these more general questions.
One way to deal with this complaint is to dismiss the questions themselves. After all, one might argue, a scientific psychology cannot be expected to deal with fuzzy philosophical questions that probably have little empirical significance. It is interesting that behaviorists and cognitive scientists tend to be in agreement regarding the question of the overall structure of the mind. Behaviorists believe that it is meaningless to speak about the structures and processes underlying behavior -- on any level, general or specific. And many cognitive scientists believe that the mind is a hodge-podge of special-case algorithms, pieced together without any overarching structure. Marvin Minsky has summarized this position nicely in his Society of Mind (1986).
It is not a priori absurd to ask for general, philosophical ideas that interlink with experimental details. Psychologists tend to become annoyed when their discipline is compared unfavorably with physics -- and indeed, the comparison is unfair. Experimental physicists have many advantages over experimental psychologists. But the facts cannot be ignored. Physics talks about the properties of baseballs, semiconductors and solar systems, but also about the fundamental nature of matter and space, and about the origin of the cosmos. The physics of baseball is much more closely connected to experimental data than is the physics of the first three minutes after the Big Bang -- but there is a continuum of theory between these two extremes, bound together by a common philosophy and a common set of tools.
It seems that contemporary psychology simply lacks the necessary tools to confront comprehensive questions about the nature of mind and behavior. That is why, although many of the topics considered in the following pages are classic psychological topics, ideas from the psychological literature are used only occasionally. It seems to me that the key to understanding the mind lies not in contemporary psychology, but rather in a newly emerging field which I will call -- for lack of a better name -- "complex systems science." Here "complex" does not mean "complicated", but rather something like "full of diverse, intricate, interacting structures". The basic idea is that complex systems are systems which -- like immune systems, ecosystems, societies, bodies and minds -- have the capacity to organize themselves. At present, complex systems science is not nearly so well developed as psychology, let alone physics. It is not a tightly-knit body of theorems, hypotheses, definitions and methods, but rather a loose collection of ideas, observations and techniques. Therefore it is not possible to "apply" complex systems science to the mind in the same way that one would apply physics or psychology to something. But complex systems science is valuable nonetheless. It provides a rudimentary language for dealing with those phenomena which are unique to complex, self-organizing systems. And I suggest that it is precisely these aspects of mentality which contemporary psychology leaves out.
More specifically, the ideas of the following chapters are connected with four "complex systems" theories, intuitively and/or in detail. These are: the theory of pattern (Goertzel, 1991), algorithmic information theory (Chaitin, 1987), the theory of multiextremal optimization (Weisbuch, 1991; Dixon and Szego, 1978; Goertzel, 1989), and the theory of automata networks (Derrida, 1987; Weisbuch, 1991).
The theory of pattern provides a general yet rigorous way of talking about concepts such as structure, intelligence, complexity and mind. But although it is mathematically precise, it is extremely abstract. By connecting the theory of pattern with algorithmic information theory one turns an abstract mathematical analysis of mind into a concrete, computational analysis of mind. This should make clear the limited sense in which the present theory of mind is computational, a point which will be elaborated below. Most of the ideas to be presented are not tied to any particular model of computation, but they are discussed in terms of Boolean automata for sake of concreteness and simplicity.
Pattern and algorithmic complexity give us a rigorous framework for discussing various aspects of intelligence. The theory of multiextremal optimization, which is closely tied to the abstract theory of evolution (Kauffman, 1969; Langton, 1988), gives us a way of understanding some of the actual processes by which intelligences recognize and manipulate patterns. Perception, control, thought and memory may all be understood as multiextremal optimization problems; and recent theoretical and computational results about multiextremal optimization may be interpreted in this context. And, finally, the theory of automata networks -- discussed in Appendix 2 -- gives a context for our general model of mind, which will be called the "master network". The master network is not merely a network of simple elements, nor a computer program, but rather a network of programs: an automata network. Not much is known about automata networks, but it is known that in many circumstances they can "lock in" to complex, self-organizing states in which each component program is continually modified by its neighbors in a coherent way, and yet does its individual task effectively. This observation greatly increases the plausibility of the master network.
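To make the term concrete: a multiextremal problem is one whose objective has many local optima, so that a purely local search can stall far from the best solution. The following is a minimal sketch in Python, not a method taken from the works cited above. The landscape values and the names hill_climb and multistart are hypothetical, chosen only for illustration, and restarting from many points is just one simple way of coping with multiple optima.

```python
def hill_climb(landscape, start):
    """Greedy local search on a 1-D discrete landscape.

    Moves to the better neighbor until no neighbor improves,
    then returns the index of the local maximum reached.
    """
    i = start
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
        best = max(neighbors, key=lambda j: landscape[j])
        if landscape[best] <= landscape[i]:
            return i  # no neighbor is strictly better: a local maximum
        i = best


def multistart(landscape, starts):
    """Run hill_climb from every start and keep the best local maximum."""
    return max((hill_climb(landscape, s) for s in starts),
               key=lambda i: landscape[i])


# Hypothetical landscape with three local maxima (indices 1, 5 and 9).
LANDSCAPE = [1, 3, 2, 0, 4, 7, 5, 1, 2, 9, 6]

single = hill_climb(LANDSCAPE, 0)                       # stalls at a poor peak
overall = multistart(LANDSCAPE, range(len(LANDSCAPE)))  # finds the best peak
print(single, LANDSCAPE[single], overall, LANDSCAPE[overall])
```

A single search started at the left edge stops at the first small peak, while the multistart search reaches the highest one. The cost is that every restart repeats the local work, which is why more refined strategies matter on large problems.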
The analysis of mind to be given in the following chapters is expressed in computational language. It is therefore implicitly assumed that the mind can be understood, to within a high degree of accuracy, as a system of interacting algorithms or automata. However, the concept of "algorithm" need not be interpreted in a narrow sense. Penrose (1989), following Deutsch (1985), has argued on strictly physical grounds that the standard digital computer is probably not an adequate model for the brain. Deutsch (1985) has proposed the "quantum computer" as an alternative, and he has proved that -- according to the known principles of quantum physics -- the quantum computer is capable of simulating any finite physical system to within finite accuracy. He has proved that while a quantum computer can do everything an ordinary computer can, it cannot compute any functions besides those which an ordinary computer can compute (however, quantum computers do have certain unique properties, such as the ability to generate "truly random" numbers). Because of Deutsch's theorems, the assertion that brain function is computation is not a psychological hypothesis but a physical, mathematical fact. It follows that mind, insofar as it reduces to brain, is computational.
I suspect that most of the structures and processes of mind are indeed explicable in terms of ordinary digital computation. However, I will suggest that the mind has at least one aspect which cannot be explained in these terms. Chapter 11, which deals with consciousness, is the only chapter which explicitly assumes that the mind has to do with quantum computation rather than simply digital computation.
Many people are deeply skeptical of the idea that the mind can be understood in terms of computation. And this is understandable. The brain is the only example of intelligence that we know, and it doesn't look like it's executing algorithms: it is a largely incomprehensible mass of self-organizing electrochemical processes. However, assuming that these electrochemical processes obey the laws of quantum physics, they can be explained in terms of a system of differential equations derived from quantum theory. And any such system of differential equations may be approximated, to within any desired degree of accuracy, by a function that is computable on a quantum computer. Therefore, those who claim that the human mind cannot be understood in terms of computation are either 1) denying that the laws of quantum physics, or any similar mathematical laws, apply to the brain; or 2) denying that any degree of understanding of the brain will yield an understanding of the human mind. To me, neither of these alternatives seems reasonable.
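The idea of approximating a differential equation to any desired accuracy can be illustrated on a toy case. The sketch below is only an assumption-laden illustration: it uses ordinary digital arithmetic rather than a quantum computer, a single equation dy/dt = -y rather than anything derived from quantum theory, and the simplest numerical scheme (Euler's method). The function name euler_decay is hypothetical.

```python
import math


def euler_decay(steps):
    """Approximate y(1) for dy/dt = -y, y(0) = 1, using Euler's method."""
    dt = 1.0 / steps
    y = 1.0
    for _ in range(steps):
        y += dt * (-y)  # one Euler step: y_next = y + dt * f(y)
    return y


exact = math.exp(-1.0)  # the true solution at t = 1
for steps in (10, 100, 1000):
    error = abs(euler_decay(steps) - exact)
    print(steps, error)  # the error shrinks as the step count grows
```

For this equation the error falls roughly in proportion to the step size, so any target accuracy can be met by taking enough steps. Real physical systems are vastly harder, but the principle of computable approximation is the same.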
Actually, there is a little more to the matter than this simple analysis admits. Quantum physics is not a comprehensive theory of the universe. It seems to be able to deal with everything except gravitation, for which the General Theory of Relativity is required. In fact, quantum theory and general relativity are in contradiction on several crucial points. The effect of gravity on processes occurring within individual organisms is small and easily accounted for, so these contradictions would seem to be irrelevant to the present considerations. But some scientists -- for instance, Roger Penrose, in his The Emperor's New Mind (1989) -- believe that the combination of quantum physics with general relativity will yield an entirely new understanding of the physics of the brain.
It is worth asking: if Penrose were right, what effect would this have on the present considerations? Quantum theory and general relativity would be superseded by a new Grand Unified Theory, or GUT. But presumably it would then be possible to define a GUT computer, which would be capable of approximating any system with arbitrary accuracy according to the GUT. Logically, the GUT computer would have to reduce to a quantum computer in those situations for which general relativistic and other non-quantum effects are negligible. It would probably have all the capacities of the quantum computer, and then some. And in this case, virtually none of the arguments given here would be affected by the replacement of quantum physics with the GUT.
To repeat: the assumption that brain processes are computation, if interpreted correctly, is not at all dubious. It is not a metaphor, an analogy, or a tentative hypothesis. It is a physical, mathematical fact. If one assumes -- as will be done explicitly in Chapter 4 -- that each mind is associated with the structure of a certain physical system, then the fact that a sufficiently powerful computer can approximate any physical system with arbitrary precision guarantees that any mind can be modeled by a computer with arbitrary precision. Whether this is a useful way to look at the mind is another question; but the validity of the computational approach to mind is not open to serious scientific dispute.
Since the arguments to follow are somewhat unorthodox, it seems best to state the main goals in advance:
1) To give a precise, general mathematical definition of intelligence which is "objective" in that it does not refer to any particular culture, species, etc.,
2) To outline a set of principles by which a machine (a quantum computer, not necessarily a Turing machine) fulfilling this definition could be constructed, given appropriate technology,
3) To put forth the hypothesis that these same principles are a crucial part of the structure of any intelligent system,
4) To elucidate the nature of and relationships between the concepts involved in these principles: induction, deduction, analogy, memory, perception, motor control, optimization, consciousness, emotion, and so on.
The line of argument leading up to these four goals is as follows. Chapters 1 through 4 lay the conceptual foundations for the remainder of the book. Basic mathematical concepts are reviewed: Turing machines, algorithmic information, pattern, and aspects of randomness and optimization. This theoretical framework is used to obtain precise definitions of "intelligence", "complexity", "structure", "emergence," and other crucial ideas.
For instance, the structure of an entity is defined as the set of all patterns in that entity; and the structural complexity of an entity is defined as (roughly speaking) the total algorithmic complexity of all the patterns comprising the structure of that entity. The concept of unpredictability is analyzed according to the theory of pattern, and intelligence is defined as the ability to optimize complex functions of unpredictable environments.
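A rough feel for algorithmic complexity can be had from ordinary data compression. The sketch below is an illustration under stated assumptions, not the formal definition developed in the later chapters: true algorithmic complexity is uncomputable, so the length of a zlib-compressed encoding is used here as a crude upper-bound stand-in. The name complexity_proxy and the two sample byte strings are hypothetical.

```python
import random
import zlib


def complexity_proxy(data):
    """Length in bytes of the zlib-compressed form of data.

    A crude, computable stand-in for algorithmic complexity:
    data with strong regularities compresses to far fewer bytes.
    """
    return len(zlib.compress(data, 9))


# 1000 bytes with an obvious pattern: "ab" repeated 500 times.
regular = b"ab" * 500

# 1000 pseudo-random bytes (fixed seed so the run is repeatable).
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1000))

print(len(regular), complexity_proxy(regular))
print(len(noisy), complexity_proxy(noisy))
```

The repetitive string shrinks to a small fraction of its length, because a short description ("repeat ab 500 times") captures it, while the pseudo-random string barely compresses at all. In the language of the text, the first has a simple pattern in it and the second, as far as this compressor can tell, does not.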
In Chapters 5 through 8, the framework erected in the previous chapters is applied to what Peirce called the three fundamental forms of logic: induction, deduction and analogy. Each of the forms is characterized and explored in terms of algorithmic information theory and the theory of pattern. Induction is defined as the construction, based on the patterns recognized in the past, of a coherent model of the future. It is pointed out that induction cannot be effective without a reliable pattern recognition method to provide it with data, and that induction is a necessary component of pattern recognition and motor control.
Analogy is characterized, roughly, as reasoning of the form "where one similarity has been found, look for more". Three highly general forms of analogy are isolated, analyzed in terms of the theory of pattern, and, finally, synthesized into a general framework which is philosophically similar to Gentner's (1983) "structure-mapping" theory of analogy. Edelman's theory of Neural Darwinism is used to show that the brain reasons analogically.
The structure of long-term memory is analyzed as a corollary of the nature of analogical reasoning, yielding the concept of a structurally associative memory -- a memory which stores each entity near other entities with similar structures, and continually reorganizes itself so as to maintain this structure.
Finally, deduction is analyzed as a process which can only be useful to intelligence insofar as it proceeds according to an axiom system which is amenable to analogical reasoning. This analysis is introduced in the context of mathematical deduction, and then made precise and general with the help of the theory of pattern.
Chapters 9 and 10 deal with the perceptual-motor hierarchy, the network of pattern-recognition processes through which an intelligence builds a model of the world. This process makes essential use of the three forms of reasoning discussed in the previous chapters; and it is also extremely dependent on concepts from the theory of multiextremal optimization.
The perceptual hierarchy is, it is proposed, composed of a number of levels, each one recognizing patterns in the output of the level below it. This pattern recognition is executed by applying an approximation to Bayes' rule from elementary probability theory, which cannot be effective without aid from induction and deduction. The activity of the various levels is regulated according to a "multilevel methodology" (Goertzel, 1989) which integrates top-down and bottom-up control. Neurological data supports this general picture, and recent computer vision systems based on miniature "perceptual hierarchies" have been very effective.
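For readers who want the rule itself in front of them, here is a minimal sketch of an exact Bayes' rule update over a small set of hypotheses. It is an illustration only: the text speaks of an approximation to Bayes' rule, whereas this sketch applies the exact rule, and the hypothesis labels, the probabilities and the name bayes_update are all hypothetical.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior P(h | data) for each hypothesis h.

    prior[h] is P(h); likelihood[h] is P(data | h).
    Bayes' rule: P(h | data) = P(h) * P(data | h) / P(data).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data)
    if evidence == 0:
        raise ValueError("data has zero probability under every hypothesis")
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical example: does a patch of the input contain an edge?
prior = {"edge": Fraction(1, 4), "no_edge": Fraction(3, 4)}
# Probability of the observed lower-level output under each hypothesis.
likelihood = {"edge": Fraction(9, 10), "no_edge": Fraction(1, 10)}

posterior = bayes_update(prior, likelihood)
print(posterior["edge"], posterior["no_edge"])
```

In a hierarchy of the kind described, the posterior computed at one level could serve as part of the input to the level above. How the priors and likelihoods are obtained is exactly where, on the account given here, induction and deduction would have to enter.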
The motor control hierarchy is closely linked with the perceptual hierarchy and operates somewhat similarly, the difference being that its task is not to recognize patterns but rather to select the actions which best fulfill the criteria assigned to it. Building on the brain model given in Chapter 6, a specific model for the brain's perceptual-motor hierarchy is proposed.
Chapter 11 deals with consciousness and emotion -- the two essential aspects of the construction of the subjective, interior world. Consciousness is analyzed as a process residing on the higher levels of the perceptual hierarchy, a process whose function is to make definite choices from among various possibilities. It is suggested that complex coordination of the perceptual hierarchy and the motor control hierarchy may not be possible in the absence of consciousness. And, following Goswami (1990) and others, it is argued that an ordinary computer can never be conscious -- but that if a computer is built with small enough parts packed closely enough together, it automatically ceases to function as a Turing machine and becomes fundamentally a "quantum computer" with the potential for consciousness. The problem of reconciling this quantum theory of consciousness with the psychological and biological conceptions of consciousness is discussed.
Following Paulhan (1887) and Mandler (1985), emotion is characterized as something which occurs when expectations are not fulfilled. It is argued that human emotion has a "hot" and a "cold" aspect, and that whereas the "cold" aspect is a structure that may be understood in terms of digital computation, the "hot" aspect is a peculiar chemical process that is closely related to consciousness.
Finally, Chapter 12 presents the theory of the master network: a network of automata which achieves intelligence by the integration of induction, deduction, analogy, memory, perception, control, consciousness and emotion. It is pointed out that, according to the definition of intelligence given in Chapter 4, a sufficiently large master network will inevitably be intelligent. And it is also observed that, if one is permitted to postulate a "sufficiently large" network, nearly all of the structure of the master network is superfluous: intelligence can be achieved, albeit far less efficiently, by a much simpler structure. Finally, it is suggested that, in order to make sense of this observation, one must bring physics into the picture. It is not physically possible to build an arbitrarily large network that functions fast enough to survive in reality, because special relativity places restrictions on the speed of information transmission and quantum theory places restrictions on the minimum space required to store a given amount of information. These restrictions give rise to the hypothesis that it is not physically possible to build an intelligent machine which lacks any one of the main components of the master network.
It must be emphasized that these various processes and structures, though they are analyzed in separate chapters here, need not be physically separate in the body of any given intelligence. For one thing, they are intricately interdependent in function, so why not in implementation? And, furthermore, it seems unlikely that they are physically separate in the human brain. In the final section, I give a demonstration of how one may design an intelligent machine by combining the theory of the master network with Edelman's Neural Darwinism. In this demonstration, the various components of the master network are bound together according to an implementation-specific logic.
Finally, it must also be emphasized that the master network is not a physical structure but a pattern, an abstract logical structure -- a pattern according to which, or so I claim, the system of patterns underlying intelligent behavior tends to organize itself. It consists of two large networks of algorithms (the structurally associative memory and the perceptual-motor hierarchy), three complex processes for transmitting information from one network to another (induction, deduction, analogy), and an array of special-purpose auxiliary optimization algorithms. Each of these networks, processes and algorithms may be realized in a variety of different ways -- but each has its own distinctive structure, and the interconnection of the five also has its own distinctive structure. Of course, an intelligence may also possess a variety of other structures -- unrelated structures, or structures intricately intertwined with those described here. My hypothesis is only that the presence of the master network in the structure of an entity is a necessary and sufficient condition for that entity to be intelligent.
A scientific theory must be testable. A test can never prove a theory true, and since all but the simplest theories can be endlessly modified, a test can rarely prove a complex theory false. But, at the very least, a test can indicate whether a theory is sensible or not.
I am sorry to say that I have not been able to design a "crucial experiment" -- a practical test that would determine, all at once, whether the theory of the master network is sensible or not. The situation is rather similar to that found in evolutionary biology. There is no quick and easy way to test the theory of evolution by natural selection. But there are numerous pieces of evidence, widely varying in nature, reliability and relevance. How to combine and weight these various pieces of evidence is a matter of intuition and personal bias.
I certainly do not mean to imply that the theory of the master network is as well supported as the theory of evolution by natural selection -- far from it. But it is not implausible that, in the near future, various sorts of evidence might combine to form a fairly convincing case for the theory. In this sense, I think the ideas proposed here are testable. Whether there will ever be a more effective way to test hypotheses about self-organizing systems such as minds and ecosystems is anybody's guess.