Ben Goertzel
October 7, 2004
OK, for starters I’ll get
the conventional-book-review-ish stuff out of the way. Jeff Hawkins’ new book On Intelligence
is an excellent book, well worth reading whether you’re a total novice to the
cognitive and neural sciences or an expert researcher in one or more related
domains. Don’t let the fact that
Hawkins is a wealthy computer entrepreneur (a founder of Palm and Handspring) fool
you – he’s also a very serious thinker with some innovative and sophisticated
things to say about the nature of brain and intelligence. With this book, he joins a small, elite crew
of theorists who have spanned the conceptual and empirical aspects of brain and
mind theory. His concepts are both
philosophically significant, and intimately tied to the empirical realities of
scientific research. And furthermore,
the book is extremely well-written and readable, no doubt in part due to the
efforts of coauthor Sandra Blakeslee, a well-known science journalist.
If you want more details on
the book, check out the many glowing reviews
on Amazon.com, and also the official website associated with the book:
What I’m going to give you
in the rest of this “reaction/review” is my personal take on Hawkins’ ideas –
where they’re right, where they’re wrong, and where they’re significantly
incomplete. That is, this is going to
be a critical review of Hawkins’ ideas, rather than of his book per
se. Actually, I think the book is
pretty much a perfect exposition of the ideas contained in it; and this is a
big piece of praise, because there aren’t too many books about which I’d make
such a statement (I definitely wouldn’t say anything similar about any of my
own books so far, though I have some hopes for one of my current
works-in-progress).
My perspective here is
blatantly opinionated: I have my own particular view on the nature of
intelligence, which I’ve been refining for many years during my career in
academia and industry. The reason I’ve
considered it worthwhile to write down my perspective on Hawkins’s ideas is
precisely that his thinking is, in many respects, pretty damn similar to my
own. The issues he’s been thinking
through are, to a large extent, the same as the ones I’ve been thinking
through. He has more of a neuroscience
slant, similar to my perspective in the mid-1990’s; lately I’ve taken my work
in a more AI-ish direction, but even so, our conceptual landscapes are not all
that differently shaped. Because of
this, I think I’ve been able to pretty quickly suss out the strengths and
weaknesses in his approach to understanding biological and digital
intelligence.
One way to get a quick idea
of my own perspective is to read the following brief paper on the Novamente AI
system, currently under development by myself and my colleagues:
I’ll begin with some
comments on Hawkins’ overall philosophy, then discuss his comments on the
nature of conventional neuroscience and AI research, and then delve into the
details of his theories. At the end
I’ll give some reflections on the relationship between neuroscience and AI.
I believe Hawkins’ overall
philosophy of mind is correct. It’s far
from original, and he doesn’t claim it to be -- it’s one that he shares with
many others, including (to make a fairly arbitrary list) myself, Ray Kurzweil,
Douglas Hofstadter, Marvin Minsky, Gregory Bateson, Eric Baum and Charles S.
Peirce. The basic philosophy is that
minds are sets of patterns associated with certain physical systems, aka
“brains.” A mind is a particular
pattern-of-arrangement of patterns. And
what a mind does is to recognize patterns in the world around it, and in
itself, and then to enact patterns in the world and in itself. The wording I’ve used in the last few
sentences is my own – Hawkins uses his own phraseology rather than explicit
Goertzellian patternism – but he gets across the same basic ideas. Of course, this philosophy in itself doesn’t
tell you very much about the mind – to
get anywhere significant one needs to make specific commitments about what kind
of arrangement-of-patterns a mind is, and how it is that certain physical
systems cause this kind of arrangement of patterns to emerge from themselves. But at least he gets the philosophy right in
the first place, which is necessary in order for any further progress to be
made.
A consequence of the
patternist philosophy of mind – be it of the Goertzellian, Hawkinsian,
Hofstadterian or any other variety – is that minds are not tied to human or
mammalian brains or any other particular type of physical system. Minds can be made to emerge from any
sufficiently flexible physical substrate – such as digital computers. This general statement also doesn’t tell you
very much of use – it doesn’t tell you how to create a mind from a digital
computer, for example.
It’s consistent with the
patternist philosophy to believe, for instance, that although digital minds are
possible, engineering a digital mind by any means other than slavishly
imitating the details of the human brain is impractically difficult. This is basically the perspective taken by
Eric Baum in his recent book What Is
Thought? Baum argues that mind
is in essence a fairly simple thing, concerned with finding programs that
encapsulate data in minimally complex forms – but that the definition of
“minimally complex” for any realistic intelligence relies on a huge amount of
in-built knowledge. Humans get this
in-built knowledge through evolution, encoded in our genes; and Baum conjectures that there
is no way for computer software to display human-level intelligence unless it’s
supplied with a vast mass of in-built implicit background knowledge similar to
the knowledge he thinks humans obtain through their genes. This knowledge, in his theory, isn’t
explicit knowledge like that in the Cyc database; it’s implicit knowledge
regarding subtle patterns of how the perceptual, manipulable, social and
internal world of humans on Earth is structured.
Hawkins, on the other hand,
is more optimistic than this, regarding the hope for software
intelligence. Hawkins doesn’t think we
need to emulate the human brain in detail.
What he thinks is that we need to understand the basic principles by
which the brain works – and then, once we’ve gotten these principles down pat,
we’ll be able to figure out how to make computer software programs that obey
these same principles, and human-level or superior computer intelligence will be
the result. I tend to agree with his
intuition in this regard – so long as one remains on an abstract level as was
done in the previous few sentences. But
once one digs a little deeper to see what this actually means, some “hidden
rocks” emerge.
The real question with this
approach is just how far one has to abstract the principles guiding the human
brain, in order to get something that can then be specialized to the computer
software domain, in a way that yields tractably implementable software. Hawkins seems to think one doesn’t need to
abstract all that far – so far as I can tell from the way he words things in
his book, he seems to think that there’s some “algorithm” of brain function
that can be ported to software, yielding a practical approach to software
intelligence. On the other hand, I
suspect that the level of “algorithms” (at least, taking this word as typically
used in computer science) is way too concrete to serve the intended purpose. I think that the algorithms used in the
brain are highly specialized to the particular properties of neural wetware,
and that intelligent computer software – if it’s going to make at all
reasonably efficient use of the hardware it runs on – is going to have to use
totally different algorithms, embodying the same basic conceptual
processes. (Now of course, one can play
with words here and say that these “basic conceptual processes” are in fact
abstract algorithms, so that, even according to my perspective, there are algorithms
spanning neural and software intelligence.
I can accept this perfectly well – the wording isn’t important to me –
my point is that one is going to have to abstract very far from neural
processes to get something that’s efficiently implementable on modern or near-future
computing technology. I will
make this point in a very clear and concrete way a little further on.)
Before leaving philosophy I
should mention a couple somewhat peripheral philosophical points on which I
don’t agree with Hawkins: consciousness and the Singularity.
His take on consciousness is
basically in the reductionist “What’s all the fuss about?” vein. I don’t think this detracts from Hawkins’
brain theory at all, any more than Dennett’s similar perspective detracts from
his cognitive psychology theorizing in his mal-named book Consciousness
Explained. In essence, Hawkins
argues that, to whatever extent the concept of “consciousness” can’t be boiled
down to brain theory, it’s simply a bunch of hooey.
My own view on consciousness
is subtler and has proved difficult for me to communicate effectively; if you
want to try to understand what the heck I’m talking about in this area, see my
online essay Patterns
of Awareness. In short, my view is
that the physical and patternist views of the mind are valid ones, but there
are also other valid views of the mind, such as a purely experiential view, in
which conscious experience is more foundational than physical systems or
linguistic communications. Conscious
experiences are associated with patterns, and patterns are associated with
physical systems, but none of these is fully reducible to the other.
But anyway, one doesn’t need
to accept Hawkins’ reductionist view of consciousness to like his brain/mind
theories; and nor do you have to accept my multi-perspectival view of
consciousness to accept my cognitive theories or my approach to AI.
The other philosophical
topic on which I take issue with Hawkins is the Singularity (a term that he
doesn’t mention, which I think is an error in itself, in a book dealing to an
extent with the long-term future of AI).
Perplexingly, he brushes aside questions of the long-term risks of
superhuman AI, focusing instead on the useful applications that human-level AI
will have for humanity, in the medium term.
He makes the correct point that roughly-human-level AI’s will have
dramatically different strengths and weaknesses from human beings, due to
different sensors and actuators and different physical infrastructures for their
cognitive dynamics. But he doesn’t even
touch the notion of self-modifying AI – the concept that once an AI gets smart
enough to modify its own code, it’s likely to get exponentially smarter and
smarter until it’s left us humans in the dust.
He doesn’t even bother to argue against this notion, he just ignores it
entirely!
Conjecturally, I attribute
this oversight to his focus on software intelligence that closely emulates the
human brain. The human brain doesn’t
radically self-modify (because we can’t easily see our neurodynamics, and
messing with our brains is generally directly life-threatening), therefore if
AI closely enough emulates the human brain it won’t radically self-modify
either. But of course, even if AI were
to start out by fairly closely emulating the human brain, it would still have
much greater capability for self-modification than a human brain due to its
digital implementation, and then it would likely gradually revise itself
further and further away from its human initial condition, in a direction of greater
and greater intelligence.
Hawkins spends a while, in
the beginning of the book, meditating on the shortcomings of neuroscience and
artificial intelligence as currently practiced by the majority of researchers. I tend to agree with most of his
critiques. However, the ultimate
conclusions he draws are strikingly different in regard to the two disciplines. In the case of neuroscience, he advocates a
different way of doing neuroscience – and he’s even founded a small institute
to pursue neuroscience Hawkins-style.
In the case of AI, on the other hand, he basically advocates that AI
researchers should give up and wait for the neuroscientists to show them the way.
Regarding neuroscience,
Hawkins gives his own version of the familiar complaint that nearly all
mathematicians, physical scientists or computer scientists have about biology:
too many details, too much fragmentation, and not enough questing for simple
unifying principles and theories! To an
extent this complaint is unjustified – often the biological world just isn’t as
orderly and clean and abstractly-encapsulable as those of us trained in the
“harder sciences” would like it to be.
And to an extent it is justified – in some ways the discipline of
biology has overreacted to the diverse messiness of its subject matter and
developed a culture of detail-focus that has trouble finding beautiful
biological generalizations in the rare cases where they do exist. In the domain of genomics, the existence of
a network of centralized databases has helped to partially overcome the tendency
of biology toward fragmentation; similar databasing efforts are underway in
brain science.
In 2002, Hawkins founded the
Redwood Neuroscience Institute (RNI), with the explicit goal of paying a few
outstanding scientists to seek an integrative understanding of brain function –
as opposed to doing what most brain scientists are paid to do, which is to
analyze some particular aspect of brain function in microscopic detail. This is, in my opinion, a really wonderful
thing for a wealthy software entrepreneur to do with some of his money. Many of the ideas in On Intelligence
seem to have achieved concrete form via Hawkins’ interactions with the
scientists he hired to work with him at RNI.
Regarding AI, on the other
hand, Hawkins correctly chastises classical logic-oriented AI theorists for
focusing on toy problems and ignoring the real nature of intelligence, which
has to do with the ability to generalize broadly into different domains of
experience. He correctly points out
that neural net AI, after getting off to an interesting but slow start,
degenerated precisely as it became more popular – turning primarily into a
study of narrowly-focused computer algorithms with scant relevance to either
brain function or general intelligence – algorithms deserving the name “neural network” only in
that their basic computational elements embody crude mathematical models of
neurons, yet having little to do with neuroscience in either network structure
or dynamics. He correctly points out
that most of the famous recent successes of AI, such as the chess champion
program Deep Blue, were achieved via fairly straightforward heuristic
algorithms, which have little to do with human intelligence or any kind of general
intelligence.
I agree with all these
critiques of his regarding AI as currently practiced, but I don’t consider them
to invalidate the pursuit of AI separately from neuroscience. Rather, I think the problem is that the
field of AI has come to focus on “narrow AI” – programs that solve
particular, narrowly-defined problems – rather than “artificial general
intelligence” (AGI). Of course, there
is no such thing as a truly general intelligence, except in the theoretical
domain of AI on infinitely powerful computers as explored by Marcus Hutter,
Juergen Schmidhuber and others in their recent mathematical work. But clearly humans display much greater
general scope of intelligence than any current AI program, and clearly the AI
field has not been nearly enthusiastic enough about pursuing AI programs
displaying a reasonable degree of general intelligence.
I think the AI field has
come to focus on narrow AI for much the same reason that the neuroscience field
has come to focus on the analysis of highly particular biological
phenomena. Namely: it’s easier to get
quick results if you’re doing something relatively narrow and simple. And whether in academia or a corporate setting,
scientists are always under a lot of pressure to get quick results, for publication
or productization as well as general reputation-building. And in difficult areas like these, even
relatively “narrow and simple” research is often pretty damn interesting.
I have experienced the
pressure toward narrow AI research quite powerfully in my own career. As an academic for 8 years, I wrote a number
of books and papers on the nature of intelligence, taking an integrative point
of view involving cognitive science, artificial general intelligence, philosophy
of mind and abstract mathematics. I
conceived some interesting designs for AGI systems but they were too much for
me to implement on my own. I
implemented some narrow AI programs to generate interesting results to publish
papers on. I tried and failed to get
government grant money to build AGI systems; but did get some bits and pieces
of grant money for narrow AI work, which was more popular with funding sources
due to its apparently greater chance of success. Then, I left academia for industry and started an ambitious AGI
R&D project in the context of a dot-com era startup company. But that company ran out of money and died,
and the project had to be abandoned halfway through for legal reasons. So I started over again with a team of
software-engineer colleagues from the previous company. Funded by a one-year one-man grant from the
Jeffrey Epstein Foundation, I spent a year working out a new, crisper
mathematical design for an AGI system (Novamente);
my colleagues started a software consulting company, and together we worked out
an efficient software framework for implementing my new design. But even the new simpler design is a
multi-year, multi-programmer project: AGI just isn’t all that simple! No one stepped forward to fund our new AGI project,
so we decided to fund it ourselves via creating commercial software programs
based on our software. But of course,
you can’t create commercial software based on your AGI system until your AGI
system is all done. So what we’re doing
is creating commercial narrow AI programs, using the software framework that
we’re building out with our AGI design in mind. Each commercial narrow AI program uses some stuff that’s part of
our general AI design, and some other stuff that’s particular to the domain
where the program is being applied.
Slowly, we’re building out our AGI system at the same time as making a
modest amount of money from our narrow AI applications. If we should make a large sum of money from
any of these applications, then to be sure, we’ll use the profits to fund our
AGI research, and progress toward AGI will occur at a much accelerated pace!
Ok -- the point of that
digression into my personal history was not just to let off steam (no,
really!) – the point is that, in AI just as in neuroscience, a combination of
practical pressures and disciplinary-psychology issues has tended to push
research in a non-optimal direction, and away from a general and integrative
understanding of the topic matter.
And yet, in both AI and
neuroscience, research has still been highly productive even though
nonoptimal. We neither understand the
brain nor do we have a thinking machine.
But neuroscientists have understood a lot about the brain; and AI
scientists have developed a lot of great technology, such as probabilistic
inference algorithms, evolutionary learning algorithms, and so forth. Hawkins is using the detail-knowledge
neuroscientists have gathered to help him build his integrative brain theory,
and I’m using some of the algorithms narrow-AI scientists have created to try
to build an integrative AGI system.
And, interestingly, just as
places like RNI are slowly bringing some focus onto integrative brain
understanding, it seems that the mainstream AI community is slowly regaining an
interest in artificial general intelligence.
One piece of evidence is that later this month (October 2004) in
Washington DC, the American Association for AI is organizing a symposium on Achieving
Human-Level Intelligence through Integrated Systems. I’ll be attending and presenting a talk
there, along with a couple other Novamente folks. So far as I can tell none of the other speakers there are going
to be describing approaches to AGI that are nearly as “serious” as Novamente,
in terms of the combination of a viable engineering implementation and a
detailed mathematical and cognitive theory.
But at least this sort of research is becoming almost legitimate
again, rather than being completely relegated to the sidelines.
Now, we finally get to the
meat of the matter! The centerpiece of
Hawkins’ book is his brain/mind theory.
His discussions on the future of AI, and of consciousness and creativity
and related ideas, are neither particularly original nor tremendously deep – the
only new points he has in those areas are ones that ensue pretty directly from
his hypotheses about neuroscience and basic cognitive structure/dynamics.
Where is his brain/mind
theory right, and where is it wrong?
Actually, I don’t think it is
significantly wrong, at least not in terms of sins of commission. I tend to largely agree with his take on
the brain -- which is not surprising because he and I have been inspired by
much of the same neuroscience work, and I’ve previously published some
speculations on brain function that overlap significantly with his ideas. My only real complaints about his
brain/mind theory regard sins of omission. I think he oversimplifies some things fairly seriously – giving
them very brief mention when they’re actually quite long and complicated stories. And some of these omissions, in my view, are
not mere “biological details” but are rather points of serious importance for
his program of abstracting principles from brain science and then
re-concretizing these principles in the context of digital software.
Let me start with Hawkins’
mind theory, and then segue into his brain theory. The two fit tightly together, as well they should.
On the pure cognitive level,
he refers to his approach as a “memory-prediction model.” This means he thinks the key features of the
human brain-mind are: the storage of memories consisting of patterns abstracted from experience, and the use of these stored memories to make predictions about what will happen next.
Indeed, this is hard to
disagree with. Marcus Hutter’s general
mathematical theory of intelligence – see his book Universal Artificial
Intelligence
-- agrees with this perspective, as does my own quasi-mathematical
theory of intelligence articulated in my 1993 book The Structure of
Intelligence:
The questions then become: how does the brain actually carry out memory and prediction; and how can the same processes be realized in digital computer software?
Both of these questions
break down into multiple subquestions, including some easier and some harder
subquestions. My feeling is that
Hawkins does not address the harder subquestions.
On a conceptual level, it
seems to me that the “memory-prediction” view should be explicitly traced back
to the patternist philosophy of mind.
Hawkins does this implicitly but he doesn’t really make a point of
it. Storing memories that are
abstracted versions of percepts, actions and thoughts (Hawkins refers to these
abstracted versions as “invariants”) involves the recognition of patterns among
and within percepts, actions and thoughts.
On the other hand, making predictions involves recognizing temporal
patterns and then assuming they’ll continue.
So his memory-prediction approach is quite consistent with his overall
patternist philosophy.
One point Hawkins doesn’t
really cover is how a mind/brain chooses which predictions to make, from among
the many possible predictions that exist.
Three sensible potential answers exist here: that the mind favors the simplest predictions; that it favors predictions fitting an in-built, genetically supplied inductive bias; and that it favors predictions relevant to achieving its goals.
Eric Baum focuses on the
simplicity aspect, but also on the genetic-heritage aspect, which he refers to
as “inductive bias.” My own approach to
mind-modeling, similarly to Marcus Hutter’s, focuses on the simplicity and
goal-orientation aspects – I define an intelligent system as one that achieves
complex goals in complex environments, and posit that a system tends to make
predictions oriented toward predicting which of its actions will help it to fulfill
its goals. An important point is that,
once one goes beyond really trivial predictions, the search space for
“reasonably simple” patterns becomes very large – so that one needs either
specific goals or a specific inductive bias to guide one’s search. Hawkins doesn’t really deal with this
issue, though he doesn’t dismiss it as irrelevant either.
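To put the point in slightly more concrete form – and I stress that what follows is my own schematic formalization, not a formula taken from Hawkins’ book or from Hutter’s – one can picture the choice among candidate predictive patterns as a trade-off between goal-relevance and simplicity:

$$
P^{*} \;=\; \arg\max_{P \in \mathcal{P}} \Big[\, \Pr(\text{goals achieved} \mid \text{actions guided by } P\,) \;-\; \lambda\, C(P) \,\Big]
$$

Here $\mathcal{P}$ is the space of candidate predictive patterns, $C(P)$ is some measure of a pattern’s complexity, and $\lambda$ sets the strength of the simplicity bias (roughly, Baum’s “inductive bias” corresponds to restricting or reweighting $\mathcal{P}$ itself). The hard part, of course, is searching $\mathcal{P}$ efficiently – which is exactly the issue left largely untouched.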
The essence of On
Intelligence is a set of hypotheses regarding how the brain carries out
memory and prediction. Following a
train of thought pioneered by Vernon Mountcastle and others, Hawkins focuses on
the columnar structure of the neocortex, and he proposes that this corresponds
to the hierarchical structure of mind – lower levels of the columnar hierarchy
corresponding to less abstract patterns, and higher and higher levels
corresponding to progressively more and more abstract patterns. The dynamics of perception, action and
cognition then involves information moving up and down this hierarchy of
patterns among patterns among patterns.
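To convey the flavor of such a hierarchy, here is a deliberately tiny toy sketch in Python – my own illustration of the general memory-prediction idea, not Hawkins’ actual cortical algorithm. Each node memorizes transitions among its inputs, predicts the most familiar successor, and passes a compressed “name” of what it sees up to the level above.

```python
# Toy two-level memory-prediction hierarchy (illustrative only).
from collections import defaultdict

class Node:
    """One level of the hierarchy: memorize transitions, predict successors."""
    def __init__(self):
        # transition counts: previous input -> next input -> frequency
        self.transitions = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def observe(self, symbol):
        """Memorize the transition from the previous input to this one."""
        if self.prev is not None:
            self.transitions[self.prev][symbol] += 1
        self.prev = symbol
        return symbol              # the "name" passed upward (just the symbol itself here)

    def predict(self):
        """Predict the most frequently observed successor of the current input."""
        successors = self.transitions.get(self.prev)
        if not successors:
            return None
        return max(successors, key=successors.get)

# The lower node sees raw symbols; the upper node sees pairs of
# lower-level names, grouped into single higher-level tokens.
lower, upper = Node(), Node()
buffer = []
for s in "abcabcab":
    buffer.append(lower.observe(s))
    if len(buffer) == 2:
        upper.observe(tuple(buffer))
        buffer = []

print("lower level predicts next symbol:", lower.predict())   # 'c'
print("upper level predicts next token:", upper.predict())    # ('c', 'a')
```

Real cortex, on Hawkins’ account, does vastly more than this – invariant representations, predictions flowing back down the hierarchy, sequence learning over time – but the sketch shows the bare shape of the structure being described.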
One of the book’s more
innovative proposals involves the mechanics of pattern nesting. Hawkins proposes that there are neurons or
neuronal groups that represent patterns as “tokens,” and that these tokens are
then incorporated along with other neurons or neuronal groups into larger
groupings representing more abstract patterns.
This seems clearly to be correct, but he doesn’t give much information
on how these tokens are supposed to be formed.
It’s not clear that Hebbian learning is a sufficient mechanism. The brain may contain some additional
dynamic, which causes attractors (see below) on one level in the cortical
hierarchy to link to individual neurons one level higher in the hierarchy. This is an extremely important point,
because it’s a link between neurodynamics and symbolic processing – symbol
creation and manipulation being one of the hallmarks of abstract, higher-level
human thought. This is one area where I
have some hope Hawkins’ team at the RNI will provide deep new insights over the
next few years: it’s at the heart of Hawkins’ theory and it’s also palpably
central to the mind-brain connection.
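For concreteness, here is a toy Python illustration of the bare notion of tokenization – again my own sketch, not a mechanism proposed in the book and not something I would claim the brain literally does: once a pattern of lower-level activity has recurred often enough, a new higher-level unit is allocated to stand for it.

```python
# Toy "tokenization": allocate a new symbolic unit for a recurring
# pattern of lower-level activity (illustrative only).
from collections import Counter

pattern_counts = Counter()
tokens = {}            # frozen activity pattern -> token id
THRESHOLD = 3          # recurrences required before a token is allocated

def observe(active_units):
    """active_units: the set of lower-level units currently firing."""
    key = frozenset(active_units)
    pattern_counts[key] += 1
    if pattern_counts[key] >= THRESHOLD and key not in tokens:
        tokens[key] = f"T{len(tokens)}"     # allocate a new higher-level unit
    return tokens.get(key)                  # the token, once one exists

stream = [{1, 2, 5}, {3, 4}, {1, 2, 5}, {1, 2, 5}, {3, 4}, {1, 2, 5}]
print([observe(s) for s in stream])   # [None, None, None, 'T0', None, 'T0']
```

The interesting scientific question, which this sketch obviously dodges, is what neural dynamic plays the role of the counter and the threshold – that is, how attractors at one level come to be bound to individual units at the next level up.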
I presented a somewhat
similar theory of cortical function in an article in Complexity in 1996,
also included as a chapter in From Complexity to Creativity -- see crude
HTML dump at
What’s similar in my theory
and Hawkins’ is the alignment of the cortical-column hierarchy with the
cognitive hierarchy that guides pattern recognition and pattern creation. What Hawkins adds are a lot of excellent
details regarding the behavior of this hierarchy in different parts of the
brain, the use of the hierarchy for temporal prediction, the interaction
between the cortical-column hierarchy and the hippocampus, and so forth. Many
of these details came out of neurophysiological research done in the interval
between 1996 and 2004, including some knowledge-integration done at the RNI.
So, what’s wrong with
Hawkins’ picture of brain function?
Nothing’s exactly wrong with it, so far as I can tell. I think it’s wonderful – as far as it
goes. It’s an outstanding
extrapolation from the wild mess of available data. But it’s badly incomplete.
Specifically, it seems to me to leave out three very important things:
1. the role of attractors in neurodynamics
2. the evolutionary dynamics of neural maps (in the sense of Edelman’s Neural Darwinism)
3. the details of Hebbian learning, and how they may give rise to abstract reasoning
In fact, none of these is
quite left out in Hawkins’ book (though
2 comes very close), but they’re only very briefly and shallowly mentioned –
whereas from my point of view, they’re essential to understanding the
brain and how it gives rise to the mind.
Attractors are a staple
concept from nonlinear dynamics. An
attractor is basically a dynamical pattern in a system, which tends to continue
once it gets started. Attractors don’t
need to be permanent; they can be “terminal attractors” that stay around for a
while and then disappear. A number of neuroscientists
have written about the probable key role of attractors in the human brain;
Walter Freeman is probably the most distinguished example. Hawkins doesn’t refer to this literature at
all in his book; but in fact I think it’s highly relevant to his brain theory. Attractors among cortical columns are, in my
opinion, critical to human intelligence.
I talked a great deal about the potential role of attractors in mind and
brain in my books Chaotic Logic and From Complexity to Creativity,
written in the mid-1990’s; rough and messy HTML dumps of these books are
available online at
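For readers who haven’t met the concept, a minimal Hopfield-style example – my own choice of illustration, not anything drawn from Hawkins’ book – shows what an attractor is: a stored activity pattern acts as a fixed point that the network falls back into even from a corrupted starting state.

```python
# Minimal Hopfield-style attractor demo (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
pattern = rng.choice([-1, 1], size=32)           # the activity pattern to store
W = np.outer(pattern, pattern).astype(float)     # Hebbian outer-product weights
np.fill_diagonal(W, 0)

state = pattern.copy()
flip = rng.choice(32, size=8, replace=False)     # corrupt 8 of the 32 units
state[flip] *= -1

for _ in range(5):                               # let the dynamics run
    state = np.sign(W @ state)
    state[state == 0] = 1

print("recovered the stored pattern:", np.array_equal(state, pattern))   # True
```

The point for brain theory is that cell assemblies hosting such self-reinforcing activity patterns can persist, dissolve and be re-entered – which is roughly the kind of dynamics Freeman and others have argued is central to cortical function.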
Though Edelman himself doesn’t favor the
language of “attractors,” this notion is central to his theory of
Neural Darwinism – a theory that I reviewed extensively in my 1993 book The
Evolving Mind, see
Hawkins was inspired by
Vernon Mountcastle’s theories about cortical columns, developed in the
1970’s. I first learned about
Mountcastle’s work from Gerald Edelman’s books. Edelman talks about “neuronal
groups” rather than columns, but it’s largely the same idea. But Edelman then takes the concept one step
further and talks about “neural maps” – assemblies of neuronal groups that
carry out particular perception, cognition or action functions. Neural maps, in essence, are sets of
neuronal groups that host attractors of neurodynamics. And Edelman then observes, astutely, that
the dynamics of the population of neuronal groups, over time, is likely to obey
a form of evolution by natural selection.
This line of thinking is
very interesting to me because one of the most exciting developments in the AI
field during the last few decades is the notion of “evolutionary programming” –
learning-algorithms that operate via emulating the process of evolution by
natural selection. How fascinating if
the brain also operates in this way!
This kind of evolutionary
cognition, if it exists as Edelman postulates, has deep implications for the
nature of creativity; see my conceptual theory of creativity in From Complexity
to Creativity:
Hawkins argues that
creativity is essentially just metaphorical thinking, generalization based on
memory. While this is true in a grand
sense, it’s not a very penetrating statement.
There are a lot of possible generalizations based on memory, so the
question is how does the mind choose the good generalizations from the mass of
bad ones? In his discussion, Hawkins evokes
images of random search, but obviously he knows the search isn’t random. Evolutionary learning is the most powerful
general search mechanism known to computer science, and is also hypothesized by
Edelman to underlie neural intelligence.
This sort of idea, it seems to me, should be part of any synthetic
approach to brain function.
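To fix ideas, here is a bare-bones evolutionary search in Python – a generic sketch of the family of algorithms being referred to, not Edelman’s neural model and not any particular evolutionary-programming system: candidate solutions are selected, recombined and mutated according to how well they score on a fitness function.

```python
# Bare-bones evolutionary search over bit-strings (illustrative only).
import random
random.seed(1)

TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]          # stand-in for "the good generalization"
POP, GENS = 30, 60

def fitness(bits):
    return sum(b == t for b, t in zip(bits, TARGET))

def mutate(bits, rate=0.1):
    return [1 - b if random.random() < rate else b for b in bits]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in TARGET] for _ in range(POP)]
for gen in range(GENS):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == len(TARGET):
        break
    parents = population[: POP // 2]             # truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

print("best individual:", population[0], "fitness:", fitness(population[0]))
```

Whether the brain’s selection of “good generalizations” really works like this is exactly the Edelman-style hypothesis at issue; the sketch only shows why variation-plus-selection is such a powerful general-purpose search strategy.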
Finally, the third point
that’s drastically underplayed in Hawkins’ treatment: Hebbian learning. Hawkins mentions the notion, and observes
correctly that Hebbian learning in the brain is a lot subtler than the simple
version that Donald Hebb laid out in the late 40’s. But he largely portrays these variations as biological details,
and then shifts focus to the hierarchical architecture of the cortex. Of course, the architecture is important –
but I believe that the details of Hebbian learning are extremely important
too. My reason is a conjecture: if its
parameters are tuned right, then Hebbian learning on the neuron level can lead
to probabilistic logical reasoning on the level of neural maps. I’ve argued this point of view in some
depth in my essay “Hebbian Logic”:
I must stress that the above
essay is speculative rather than rigorous, and is somewhat rough-drafty in
form. I haven’t proved anything there –
I’ve just outlined some evocative ideas, and done some simple algebra to show
that they’re plausible. A lot of
mathematical and computer-simulation work needs to be done to validate the Hebbian
Logic idea. However, regarding Hawkins’
work, my point is that this type of idea is entirely missing from his
treatment. He doesn’t address at all the
question of how abstract symbolic reasoning arises from neuron-level or
neural-cluster-level dynamics. He lumps
abstract symbolic reasoning into the grab-bag categories of “memory” and
“prediction” – which is correct in a general sense, but not very
informative.
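A trivially simple simulation conveys the core intuition here – and I stress this is my own schematic of the intuition, not the actual “Hebbian Logic” derivation: if a synapse’s weight tracks how often the target assembly fires together with the source assembly, then the normalized weight behaves like an estimate of a conditional probability, which is the raw material of probabilistic inference.

```python
# Hebbian co-activation counts as a conditional-probability estimate
# (schematic illustration only).
import random
random.seed(0)

count_A = 0          # how often assembly A fires
count_A_and_B = 0    # how often B fires together with A (the "Hebbian" count)

for _ in range(10000):
    A = random.random() < 0.4                # A fires with probability 0.4
    B = A and (random.random() < 0.7)        # when A fires, B follows 70% of the time
    if A:
        count_A += 1
        if B:
            count_A_and_B += 1

weight_estimate = count_A_and_B / count_A    # normalized "synaptic weight"
print("estimated P(B|A):", round(weight_estimate, 3))    # close to 0.7
```

Getting from such pairwise conditional probabilities to full probabilistic term and predicate logic on the level of neural maps is, of course, where all the real work lies – and where the quantifier-binding difficulty mentioned below comes in.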
For the reader with a
technical background, it’s worth briefly noting the weak link in my Hebbian
Logic proposal. I haven’t yet figured
out an elegant, biologically plausible way to make Hebbian learning give rise
to a particular phenomenon in formal logic: the binding of existential
quantifiers to other quantifiers. Every
other aspect of logic seems to me to boil down nicely to Hebbian learning based
phenomena, but not that one. Of course,
it’s worth noting that the human mind is pretty bad at reasoning with
this type of complex logical rule; so it may be that the brain implements this
sort of thing using some kind of “horrible hack.” But anyway, Hawkins doesn’t really get to the point of banging
his head against tricky issues like this one, because he doesn’t try to
elaborate the basic Hebbian learning mechanism in a way that will explain how
it can give rise to high-level cognitive behaviors.
This discussion of reasoning
and neurodynamics ties in with Hawkins’ critique of AI, which in my view is
overly harsh. He dismisses work on
formal logic based reasoning as irrelevant to “real intelligence.” But if it turns out that, as I claim,
Hebbian learning on the neural map level in fact is a statistical approximation
to probabilistic logic, then in fact probabilistic logic is not at all
irrelevant to real intelligence – not even to real biological intelligence,
let alone to real digital intelligence (which may potentially be engineered to
rely on probabilistic logic more directly than the brain does).
Maybe my speculations about
Hebbian learning and probabilistic logic are wrong (though, OK, personally I’m
confident that they’re so sensible they have to be right in essence, even if the
details are significantly different from what I’ve speculated). But some such theory has got to be posited,
if one is going to have a theory of brain-mind. It’s not enough just to say that logical inference comes about as
a consequence of memory and prediction.
Of course, Hawkins realizes this isn’t enough – but nor does he
highlight the fact that this is an area where a lot more work is needed.
So – to sum up – I think
Hawkins’ statements about brain function are pretty much correct … i.e., his
analysis and integration of the known data is adept, and his intuitive
speculations generally agree with my own intuitive speculations! However, he leaves out some important
things, and there is a pattern to these omissions. What he omits are, for instance, the evolutionary dynamics among neural maps, and the way Hebbian learning may give rise to probabilistic inference.
These omissions are,
interestingly, two areas where brain dynamics appears to overlap with
contemporary AI research – two of the main strains of which are evolutionary
computing and probabilistic inference.
So, Hawkins’ near-complete dismissal of modern AI research goes hand in
hand with his sidestepping of those ways in which brain function resonates well
with modern AI research!
Relative to his
conceptualization of cognitive function in terms of memory and prediction, my
suggestion is that Hawkins’ omissions are key points regarding knowledge
representation and the learning of predictive patterns. Memory of abstract patterns requires a
suitably abstract knowledge representation – probabilistic logic is one such
representation, which plausibly may emerge from neural structures and dynamics;
Hawkins does not propose any concrete alternative. Learning of predictive patterns requires an explicit or implicit
search through a large space of predictive patterns; evolutionary learning
provides one approach to this problem, with computer science foundations and
plausible connections to brain function; again, Hawkins does not propose any
concrete alternative.
By way of conclusion, I’ll
now get back to the crucial question of how far one has to abstract away from
brain function, to get to something that can be re-specialized into efficient
computer software. My intuition is that
this will require a higher level of abstraction than Hawkins seems to
believe. But I stress that this is a matter
of intuitive judgment – neither of us really knows.
For instance, my own feeling
is that implementing Hebbian learning in software on contemporary computers is
likely to be a big waste of effort. If it’s
really true that Hebbian learning emergently gives rise to probabilistic
inference, then why not just implement probabilistic inference instead? Similarly, if what’s important about the
dynamics of neural maps is that they display evolutionary learning, then why
not just implement evolutionary learning in some way that’s more natural for
digital computers? The hierarchical
structure of the cortex is clearly critical – but why not create a
hierarchically structured network adapting itself via probabilistic inference and
evolutionary learning, in a way that’s natural for digital computers? Well, this is exactly what we’re trying to
do with our Novamente AI system.
Of course, to interpret the
Novamente design as an “abstraction from the brain” is to interpret this phrase
in a fairly extreme sense – we’re abstracting general processes like
probabilistic inference and evolutionary learning and general properties
like hierarchical structure from the brain, rather than particular
algorithms. One may well be able to
create reasonably efficient digital AGI systems that don’t abstract quite this
far from what the brain does. On the
other hand, perhaps one can’t. It’s a
mistake to underestimate the extent to which the brain’s “algorithms” are fit
to the particular strengths and weaknesses of the wetware for which they
evolved.
Although I’m
(unsurprisingly) most psyched about the Novamente approach, I think it’s also
quite worthwhile to pursue AGI approaches that are closer to the brain level –
there’s a large space between detailed brain simulation and Novamente, including
neuron-level simulations, neural-cluster-level simulations, and so forth. In fact, the Novamente software framework
could even be used as a platform for the creation of such AGI systems.
It’s true that the creation
of these “intermediate-neural-fidelity” systems will become easier once more is
understood about the brain. But even
so, as a matter of research planning, I don’t agree with Hawkins that it’s
optimal for AGI research to wait for neuroscience to point the way. Quite the contrary: my suspicion is that the
two disciplines should work together.
Insights into the nature of evolutionary learning and probabilistic
inference, obtained via AI work, are likely to lead to insights into how these
processes are implemented in the brain; just as a deeper understanding of
neurodynamics will doubtless help us structure and refine our AGI programs.
Finally, I’ve asked myself
whether Hawkins’ ideas, so far, contain any insights helpful to my work with
Novamente. But I haven’t come up with
anything yet. The Novamente AI
design, at this point, embodies solutions to a lot of hard problems in AGI
design, and I believe that only one significant problem is left (in addition to
a whole bunch of engineering and tuning regarding various problems that have
been “solved in essence”): the problem of learning moderately large, complex procedures via breaking them down appropriately into hierarchies of smaller procedures.
(A snippet of technical
detail: The definition of “moderately large” here is: representable using a few
hundred nodes in a knowledge representation called a “Novamente combinator tree.” Each node represents some Novamente concept
or procedure, or some basic programming construct like a number or a sum or a
loop.) The approach taken to this
problem in Novamente involves a combination of probabilistic evolutionary
programming (the Bayesian Optimization Algorithm applied to “combinator trees”)
and probabilistic inference. But the
control of probabilistic inference is itself a significant problem – the
learning of inference control procedures is in itself a problem of “learning
(moderately) large, complex procedures via breaking them down appropriately
into hierarchies of smaller procedures.”
Teaching Novamente will be a matter of guiding the system through a
bootstrapping process by which moderately complex inference control procedures
allow the system to learn slightly more complex inference control procedures,
etc.
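For flavor, here is a purely illustrative Python sketch of a tiny tree-structured procedure representation – the node types and the little evaluator are my own inventions for this example; actual Novamente combinator trees are far richer, and nothing here should be read as their real definition.

```python
# Illustrative tree-structured procedure representation and evaluator.
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Node:
    op: str                         # e.g. "+", "*", or some concept/procedure name
    children: List["Tree"]

Tree = Union[Node, float, str]      # leaves are constants or named inputs

def evaluate(tree: Tree, env: dict) -> float:
    """Evaluate a tiny arithmetic subset of such trees against named inputs."""
    if isinstance(tree, float):
        return tree
    if isinstance(tree, str):
        return env[tree]
    vals = [evaluate(c, env) for c in tree.children]
    if tree.op == "+":
        return sum(vals)
    if tree.op == "*":
        result = 1.0
        for v in vals:
            result *= v
        return result
    raise ValueError(f"unknown op: {tree.op}")

# A very small procedure: f(x, y) = x * (y + 2)
proc = Node("*", ["x", Node("+", ["y", 2.0])])
print(evaluate(proc, {"x": 3.0, "y": 4.0}))    # 18.0
```

The hard problem referred to above is not representing such trees but learning large ones: searching the space of trees with a few hundred nodes, guided by probabilistic evolutionary learning and inference, without the search exploding combinatorially.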
My conclusion, however, is
that Hawkins’ book doesn’t really present a solution to this tough problem, except
in a very vague and general way. The
problem we’re wrestling with is clearly reflected in Hawkins’ brain theory – in
his language, it’s essentially the problem of how Hebbian learning acts in a
coordinated way through the cortical hierarchy, allowing unified learning of
complex neural maps spanning multiple layers and columns. But at this stage all he has to offer
regarding this problem is some suggestive hand-waving. Clearly he thinks the solution involves
tokenization of patterns represented as patterns of activity among multiple
neurons – but the question then becomes how the multi-level tokenization
process is guided, something he doesn’t really address. If he does solve this problem within his
brain theory, the solution may well be very useful for Novamente. For the present time, however, we’ll
continue to explore our own solutions to the problem – we have some in mind that
seem highly promising, though refining and testing them will be a lot of work.
Naahhh…. I don’t have a conclusion. Everything important regarding these topics
is still unresolved!
Jeff Hawkins has written a
very nice book, about some very interesting theories, some his own and some
extracted from the neuroscience research literature. He hasn’t penetrated all the way into the mysteries of the
brain/mind yet – but then, he doesn’t claim he has. He’s reasonable all the way through – which is disappointingly
rare in these domains – and he’s innovative in some important respects. I think his research program is fairly
likely to lead to fundamental progress in understanding the brain – and I
wouldn’t say that about most neuroscience research initiatives I read
about. As yet his research program hasn’t
led to anything that seems really striking to me -- based on my prior knowledge
and prior thinking and writing on related topics -- but heck, RNI has only
existed since 2002.
As I’ve noted above in
detail, I think Hawkins’ current theories leave out some important points about
brain dynamics -- but I also think that, by following his methodology of
integrating all available scientific data into a solid mathematical and
conceptual picture, he’ll probably eventually incorporate these points into a
future version of his theory (most likely expressed in his own language rather
than mine!). I find his comments on AI
much less penetrating than his comments on neuroscience, which is probably
because he’s right in the thick of neuroscience research, whereas he’s
commenting on AI as a relative outsider (albeit a fairly highly informed
outsider). What I’ve done in this
review/reaction is mostly to elaborate on Hawkins’ ideas in the context of my
own work in AI and theoretical cognitive science -- highlighting both the
similarities and the differences.
Hopefully this exercise has been of some interest to you!