The Webmind AI Engine – a True Digital Mind in the Making
Ben Goertzel
April 2001
Artificial intelligence is a burgeoning subdiscipline of computer science these days. But it almost entirely focuses on highly specialized problems constituting small aspects of intelligence. “Real AI” – the creation of computer programs with general intelligence, self-awareness, autonomy, integrated cognition, perception and action – is still basically the stuff of science fiction.
But it doesn’t have to be that way. In our work at Webmind Inc. from 1997-2000, my colleagues and I transformed a promising but incomplete conceptual and mathematical design for a real AI into a comprehensive, detailed software design, and implemented a large amount of the code needed to make this design work. At its peak, the team working on this project numbered 50 scientists and engineers, spread across four continents. In late March 2001, three and a half years after I and a group of friends founded it, Webmind Inc. succumbed to the bear market for tech start-ups. But the core of the AI R&D team continues working and is seeking funding to continue their work. Assuming this funding is secured, we believe we can complete the first version of the Webmind AI Engine – a program that knows who and what it is, can hold intelligent conversations, and progressively learns from its experiences in various domains – within 18-24 months. Within another couple of years after that, we should be able to give the program the ability to rewrite its own source code for improved intelligence, thus setting up a trajectory of exponentially increasing software intelligence that may well set the Singularity in motion big-time.
This article gives an overview of the Webmind AI Engine – the philosophical and psychological concepts underlying it, the broad outlines of the software design itself, and how this AI program fits into the broader technological advances that surround us, including the transformation of the Internet into a global brain and the Singularity.
Is AI Possible?
Not everyone believes it’s possible to create a real
AI program. There are several varieties
of this position, some more sensible than others.
First, there is the idea that only creatures granted
minds by God can possess intelligence.
This may be a common perspective, but isn’t really worth discussing in a
scientific context.
More interesting is the notion that digital computers
can’t be intelligent because mind is intrinsically a quantum phenomenon. This is actually a claim of some subtlety,
because David Deutsch has proved that quantum computers can’t compute anything
beyond what ordinary digital computers can.
But still, in some cases, quantum computers can compute things much
faster on average than digital computers.
And a few mavericks like Stuart Hameroff and Roger Penrose have argued
that non-computational quantum gravity phenomena are at the core of biological
intelligence. Of course, there is as
yet no solid evidence of cognitively significant quantum phenomena in the
brain. But a lot of things are unknown
about the brain, and about quantum gravity for that matter, so these points of
view can’t be ruled out. My own take on
this is: Yes, it’s possible (though unproven) that quantum phenomena are used
by the human brain to accelerate certain kinds of problem solving. On the other hand, digital computers have
their own special ways of accelerating problem solving, such as super-fast,
highly accurate arithmetic.
Another even more cogent objection is that, even if
it’s possible for a digital computer to be conscious, there may be no way to
figure out how to make such a program except by copying the human brain very
closely, or running a humongously time-consuming process of evolution roughly
emulating the evolutionary process that gave rise to human intelligence. We don’t have the neurophysiological
knowledge to closely copy the human brain, and simulating a decent-sized
primordial soup on contemporary computers is simply not possible. This objection to AI is not an evasive
tactic like the others; it’s a serious one.
But I believe we’ve gotten around it, by using a combination of
psychological, neurophysiological, mathematical and philosophical cues to
puzzle out a workable architecture and dynamics for machine intelligence. As mind engineers, we have to do a lot of
the work that evolution did in creating the human mind/brain. An engineered mind like the Webmind AI
Engine will have some fundamentally different characteristics from an evolved
mind like the human brain, but this isn’t necessarily problematic since our
goal is not to simulate human intelligence but rather to create an intelligent
digital mind that knows it’s digital and uses the peculiarities of its
digitality to its best advantage.
The basic philosophy of mind underlying the AI Engine
work is that mind is not tied to any particular set of physical processes or
structures. Rather, “mind” is shorthand for a certain pattern of
organization and evolution of patterns.
This pattern of organization and evolution can emerge from a brain, but
it can also emerge from a computer system.
A digital mind will never be exactly like a human mind, but it will manifest many of the same
higher-level structures and dynamics.
To create a digital mind, one has to figure out what the abstract
structures and dynamics are that characterize “mind in general,” and then
figure out how to embody these in the digital computing substrate. We came into the Webmind Inc. AI R&D
project in 1997 with a lot of ideas about the abstract structures and dynamics
underlying mind and a simple initial design for a computer implementation; now
in 2001, after copious analysis and experimentation, the mapping between mind
structures and dynamics and computational structures and dynamics is crystal
clear.
What is “Intelligence”?
Intelligence
doesn’t mean precisely simulating human intelligence. The Webmind AI Engine won’t ever do that, and it would be
unreasonable to expect it to, given that it lacks a human body. The Turing Test, “write a computer program
that can simulate a human in a text-based conversational interchange,” serves
to make the theoretical point that intelligence is defined by behavior rather
than by mystical qualities, so that if a program could act like a human, it
should be considered as intelligent as a human. But it is not useful as a guide
for practical AI development.
We don’t have an IQ test for the Webmind AI Engine. The creation of such a test might be an
interesting task, but it can’t even be approached until there are a lot of
intelligent computer programs of the same type. IQ tests work fairly well within a single culture, and much worse
across cultures – how much worse will they work across species, or across
different types of computer programs, which may well be as different as
different species of animals? What we
needed to guide our “real AI” development was something much more basic than an
IQ test: a working, practical understanding of the nature of intelligence, to
be used as an intuitive guide for our work.
The working definition of intelligence that I started
out the project with builds on various ideas from psychology and engineering,
as documented in a number of my
academic research books. It was,
simply, as follows:
Intelligence is the ability to achieve complex goals in a complex
environment
The
Webmind AI Engine work was also motivated by a closely related vision of
intelligence, provided by Pei Wang,
Webmind Inc.’s first paid employee and Director of Research. Pei understands intelligence as, roughly
speaking, “the ability of working and adapting to the environment with
insufficient knowledge and resources.”
More concretely, he believes that an intelligent system is one that
works under the Assumption of Insufficient Knowledge and Resources (AIKR),
meaning that the system must be, at the
same time,
·
a finite system --- the system's computing power, as well as its working and storage space, is
limited;
·
a real-time system --- the tasks that the system has to process,
including the assimilation of new knowledge and the making of decisions, can
emerge at any time, and all have deadlines attached to them;
·
an ampliative system --- the system not only can retrieve available
knowledge and derive sound conclusions from it, but also can make refutable
hypotheses and guesses based on it when no certain conclusion can be drawn;
·
an open system --- no restriction is imposed on the relationship between
old knowledge and new knowledge, as long as they are representable in the
system's interface language;
·
a self-organized system --- the system can accommodate itself to new
knowledge, and adjust its memory structure and mechanism to improve its time
and space efficiency, under the assumption that future situations will be
similar to past situations.
Obviously,
Pei’s definition and mine have a close relationship. My “complex goals in complex environments” definition is purely
behavioral: it doesn’t specify any particular experiences or structures or
processes as characteristic of intelligent systems. I think this is as it should be.
Intelligence is something systems display; how they achieve it under the
hood is another story.
On
the other hand, it may well be that certain structures and processes and
experiences are necessary aspects of any sufficiently intelligent system. My guess is that the science of 2050 will
contain laws of the form: Any sufficiently intelligent system has got to
have this list of structures and has got to manifest this list of processes. Of course, a full science along these lines
is not necessary for understanding how to design an intelligent system. But we need some results like this in order
to proceed toward real AI today, and Pei’s definition of intelligence is a step
in this direction. For a real physical
system to achieve complex goals in complex environments, it has got to be
finite, real-time, ampliative and self-organized. It might well be possible to prove this mathematically, but this
is not the direction we have taken; instead we have taken this much to be clear
and directed our efforts toward more concrete tasks.
So,
then, when I say that the AI Engine is an intelligent system, what I mean is
that it is capable of achieving a variety of complex goals in the complex
environment that is the Internet, using finite resources and finite knowledge.
To
go beyond this fairly abstract statement, one has to specify something about
what kinds of goals and environments one is interested in. In the case of biological intelligence, the
key goals are survival of the organism and its DNA (the latter represented by
the organism’s offspring and its relatives).
These lead to subgoals like reproductive success, status among one’s
peers, and so forth – which lead to refined cultural subgoals like career success,
intellectual advancement, and so forth.
In our case, the goals that the AI Engine version 1.0 is expected to
achieve are:
1. Conversing with
humans in simple English, with the goal not of simulating human conversation,
but of expressing its insights and inferences to humans, and gathering information
and ideas from them
2. Learning the
preferences of humans and AI systems, and providing them with information in
accordance with their preferences.
Clarifying their preferences by asking them questions about them and
responding to their answers.
3. Communicating with
other AI Engines, similar to its conversations with humans, but using an
AI-Engine-only language called KNOW
4. Composing
knowledge files containing its insights, inferences and discoveries, expressed
in KNOW or in simple English
5. Reporting on its
own state, and modifying its parameters based on its self-analysis to optimize
its achievement of its other goals
6. Predicting
economic and financial and political and consumer data based on diverse
numerical data and concepts expressed in news
Subsequent versions of the system are expected to
offer enhanced conversational fluency, and enhanced abilities at knowledge
creation, including theorem proving and scientific discovery and the
composition of knowledge files consisting of complex discourses. And then of course the holy grail:
progressive self-modification leading to exponentially accelerating artificial
superintelligence! These lofty goals
can be achieved step by step, beginning with a relatively simple Baby Webmind
and teaching it about the world as its mind structures and dynamics are
improved through further scientific study.
Are these goals complex enough that the AI Engine
should be called intelligent?
Ultimately this is a subjective decision. My belief is, not shockingly, yes. This is not a chess program or a medical diagnosis program, which
is capable in one narrow area and ignorant of the world at large. This is a program that studies itself and
interacts with others, that ingests information from the world around it and
thinks about this information, coming to its own conclusions and guiding its
internal and external actions accordingly.
How smart will it be, qualitatively? My sense is that the first version will be
significantly stupider than humans overall though smarter in many particular
domains; that within a couple years from the first version’s release there may
be a version that is competitive with humans in terms of overall intelligence;
and that within a few more years there will probably be a version dramatically
smarter than humans overall, with a much more refined self-optimized design
running on much more powerful hardware.
Artificial superintelligence, step by step.
(The Lack of) Competitors in the Race to Real AI
I’ve
explained what “creating a real AI” means to those of us on the Webmind AI
project: Creating a computer program
that can achieve complex goals in a complex environment – the goal of socially
interacting with humans and analyzing data in the context of the Internet, in
this case – using limited computational resources and in reasonably rapid time.
Another
natural question is: OK, so if AI is possible, how come it hasn’t been done
before? And how come so few people are
trying?
Peter
Voss, a freelance AI theorist (and cryonics pioneer) whose ideas I like very
much, has summarized the situation as follows:
·
80% of people in the AI field don’t really want to work on general
intelligence, they’re more drawn to working on very specialized subcomponents
of intelligence
·
80% of the AI people who would like to work on general intelligence are
pushed to work on other things by the biases of academic journals in which they
need to publish, or of grant funding bodies
·
80% of the AI people who actually do work on general intelligence are
laboring under incorrect conceptual premises
·
And nearly all of the people operating under basically correct
conceptual premises lack the resources to adequately realize their ideas
The
presupposition of the bulk of the work in the AI field is that solving
subproblems of the “real AI” problem, by addressing individual aspects of
intelligence in isolation, contributes toward solving the overall problem of
creating real AI. While this is of
course true to a certain extent, our experience with the AI Engine suggests
that it is not so true as is commonly believed. In many cases, the best approach to implementing an aspect of
mind in isolation, is very different from the best way to implement this same
aspect of mind in the framework of an integrated, self-organizing AI system.
So
who else is actually working on building generally intelligent computer
systems, at the moment? Not very many
groups. Without being too egomaniacal
about it, there is simply no evidence that anyone else has a serious and
comprehensive design for a digital mind.
However, we do realize that there is bound to be more than one approach
to creating real AI, and we are always open to learning from the experiences of
other teams with similar ambitious goals.
One
intriguing project on the real AI front is Artificial Intelligence Enterprises
(www.a-i.com), a small Israeli company whose
engineering group is run by Jason Hutchens, a former colleague of mine from
the University of Western Australia in Perth.
They are a direct competitor in that they are seeking to create a
conversational AI system somewhat similar to the Webmind Conversation
Engine. However, they have a very small
team and are focusing on statistical learning based language comprehension and
generation rather than on deep cognition, semantics, and so forth.
Another
project that relates to our work less directly is Katsunori Shimohara and Hugo
de Garis’s Artificial Brain project, initiated at ATR in Japan (see http://citeseer.nj.nec.com/1572.html)
and continued at Starlab in Brussels, and Genotype Inc. in Boulder,
Colorado. This is an attempt to create
a hardware platform (the CBM, or CAM-Brain Machine) for real AI using
Field-Programmable Gate Arrays to implement genetic programming evolution of
neural networks. We view this
fascinating work as somewhat similar to the work on the Connection Machine
undertaken at Danny Hillis’s Thinking Machines Corp. – the focus is on the
hardware platform, and there is not a well-articulated understanding of how to
use this hardware platform to give rise to real intelligence. It is highly possible that the CBM could be
used inside the Webmind AI Engine, as a special-purpose genetic programming
component; but CBM and the conceptual framework underlying it appear not to be
adequate to support the full diversity of processing needed to create an
artificial mind.
A
project that once would have appeared to be competitive with ours, but changed
its goals well before Webmind Inc. was formed, is the well-known CYC project (www.cyc.com).
This began as an attempt to create true AI by encoding all common sense
knowledge in first-order predicate logic.
They produced a somewhat useful knowledge database and a fairly ordinary
inference engine, but appear to have no R&D program aimed at creating
autonomous, creative interactive intelligence.
Another
previous contender who has basically abandoned the race for true AI is Danny
Hillis, founder of the company Thinking Machines Corp. This firm focused on the creation of an
adequate hardware platform for building real artificial intelligence – a
massively parallel, quasi-brain-like machine called the Connection Machine
(Hillis, 1987). However, their pioneering
hardware work was not matched with a systematic effort to implement a truly
intelligent program embodying all the aspects of the mind. The magnificent hardware design vision was
not correlated with an equally grand and detailed mind design vision. And at this point, of course, the Connection
Machine hardware has been rendered obsolete by developments in conventional
computer hardware and network computing.
On
the other hand, the well-known Cog project at MIT is aiming toward building
real AI in the long run, but their path to real AI involves gradually building
up to cognition after first getting animal-like perception and action to work
via “subsumption architecture robotics.”
This approach might eventually yield success, but only after decades.
Allen
Newell’s well-known SOAR project (http://ai.eecs.umich.edu/soar/) is another
project that once appeared to be grasping at the goal of real AI, but seems to
have retreated into a role of an interesting system for experimenting with
limited-domain cognitive science theories.
Newell tried to build “Unified Theories of Cognition”, based on ideas
that have now become fairly standard: logic-style knowledge representation, mental
activity as problem-solving carried out by an assemblage of heuristics,
etc. The system was by no means a total
failure, but it was not constructed to have a real autonomy or
self-understanding. Rather, it’s a
disembodied problem-solving tool. But
it’s a fascinating software system and there’s a small but still-growing
community of SOAR enthusiasts in various American universities.
Of
course, there are hundreds of other AI engineering projects in place at various
universities and companies throughout the world, but, nearly all of these involve
building specialized AI systems restricted to one aspect of the mind, rather
than creating an overall intelligent system.
The only significant attempt to “put all the pieces together” would seem
to have been the Japanese 5th Generation Computer System
project. But this project was doomed by
its pure engineering approach, by its lack of an underlying theory of
mind. Few people mention this project
these days. The AI world appears to
have learned the wrong lessons from it – they have taken the lesson to be that
integrative AI is bad, rather than that integrative AI should be approached
from a sound conceptual basis.
To
build a comprehensive system, with perception, action, memory, and the ability
to conceive of new ideas and to study itself, is not a simple thing. Necessarily, such a system consumes a lot
of computer memory and processing power, and is difficult to program and debug
because each of its parts gains its meaning largely from its interaction with
the other parts. Yet, is this not the
only approach that can possibly succeed at achieving the goal of a real
thinking machine?
We
now have, for the first time, hardware barely adequate to support a
comprehensive AI system. Moore’s law
and the advance of high-bandwidth networking mean that the situation is going
to keep getting better and better.
However, we are stuck with a body of AI theory that has excessively
adapted itself to the era of weak computers, and that is consequently divided
into a set of narrow perspectives, each focusing on a particular aspect of the
mind. In order to make real AI work, I
believe, we need to take an integrative perspective, focusing on
·
The creation of a “mind OS” that embodies the basic nature of mind, and
allows specialized mind structures and algorithms dealing with specialized
aspects of mind to happily coexist
·
The implementation of a diversity of mind structures and dynamics (“mind
modules”) on top of this mind OS
·
The encouragement of emergent phenomena produced by the
interaction/cooperation of the modules, so that the system as a whole is
coherently responsive to its goals
This
is the core of the Webmind vision. It
is backed up by a design and implementation of the Mind OS, and a detailed
theory, design and implementation for a minimal necessary set of mind
structures and dynamics to run on top of it.
It’s about time to delve into the nitty-gritty of
digital mind. But first, I feel obliged
to inject at least a little bit about that great AI bugaboo, consciousness. Sure, the skeptics will say, a computer
program can solve hard problems, and maybe even generalize its problem-solving
ability across domains, but it can’t be conscious, it can’t really feel and
experience, it can’t know that it is.
It’s easy to dismiss this kind of complaint by
observing that none of us really knows if other humans are conscious -- we just
assume, because living as a solipsist is a lot more annoying. But the concept of consciousness is worth a
little more attention than this.
What we call “consciousness” has several aspects,
including
·
self-observation and awareness of options
·
“free will” – choice-making behavior
·
inferential and empathetic powers
·
perception/action loops
Within these various aspects, two more general
kinds of consciousness can be distinguished:
Structured consciousness: There are certain
structures associated with consciousness, which are deterministic, cognitive
structures, involved with inference, choice, perception/action,
self-observation, and so on. These
structures are manifested a bit differently in the human brain and in the AI
Engine, but they are there in both places.
Raw consciousness: The “raw feel” of consciousness, which I
will discuss briefly here.
What is often called the “hard problem” of
consciousness is how to connect the two.
Although few others may agree with me on this point, I now
believe I know how to do this. The
answer can be found in many places; one of my favorite among these is the
philosophy of Charles S. Peirce.
Peirce’s Law of Mind
Peirce,
never one for timid formulations, declared that:
Logical
analysis applied to mental phenomena shows that there is but one law of mind,
namely, that ideas tend to spread continuously and to affect certain others
which stand to them in a peculiar relation of affectability. In this spreading
they lose intensity, and especially the power of affecting others, but gain
generality and become welded with other ideas.
This
is an archetypal vision of mind which I call "mind as relationship"
or "mind as network." In
modern terminology Peirce's "law of mind" might be rephrased as
follows: "The mind is an associative memory network, and its dynamic
dictates that each idea stored in the memory is an active agent, continually
acting on those other ideas with which the memory associates it."
Peirce
proposed a universal system of philosophical categories:
·
First: pure being
·
Second: reaction, physical response
·
Third: relationship
Mind
from the point of view of First is raw consciousness, pure presence, pure being. Mind from the point of view of Second is a
physical system, a mess of chemical and electrical dynamics. Mind from the point of view of Third is a
dynamic, self-reconstructing web of relations, as portrayed in the Law of
Mind.
Following
a suggestion of my friend, the contemporary philosopher Kent Palmer, I have
added an additional element to the Peircean hierarchy:
·
Fourth: synergy
In
the AI Engine, ideas begin as First, as distinct software objects called Nodes
or Links. They interact with each
other, which is Second, producing patterns of relationships, Third. In time, stable, self-sustaining ideas
develop, which are Fourth. In Peirce’s
time it was metaphysics; today it is computer science!
And consciousness?
Raw consciousness is “pure, unstructured experience,” an aspect of
Peircean First, which manifests itself in the realm of Third as
randomness. Structured consciousness on
the other hand is a process that coherentizes mental entities, makes them more
“rigidly bounded,” less likely to diffuse into the rest of the mind. The two interact in this way: structured
consciousness is the process in which the randomness of raw consciousness has
the biggest effect. Structured
consciousness amplifies little bits of raw consciousness, as are present in
everything, into a major causal force.
The AI Engine implements structured consciousness in its system of
perception and action schema and short-term memory; the raw consciousness comes
along for free.
Obviously, this solution to the “hard problem” is by no means universally accepted in the cognitive science or philosophy or AI communities! There is no widely accepted view; the study of consciousness is a chaos. I anticipate that many readers will accept my theory of structured consciousness but reject my views on raw consciousness. This is fine: everyone is welcome to their own errors! The two are separable, although a complete understanding of mind must include both aspects of consciousness. Raw consciousness is a tricky thing to deal with because it is really outside the realm of science. Whether my design for structured consciousness is useful – this can be tested empirically. Whether my theory of raw consciousness is correct cannot. Ultimately, the test of whether the AI Engine is conscious is a subjective test. If it’s smart enough and interacts with humans in a rich enough way, then humans will believe it’s conscious, and will accommodate this belief within their own theories of consciousness.
So let’s cut to the chase. Prior to the formation of Webmind Inc., inspired by Peirce,
Nietzsche, Leibniz and other philosophers of mind, I spent many years of my
career creating my own ambitious, integrative philosophy of mind. After years searching for a good name, I
settled on “the psynet model” – psy for mind, net for network.
According to the psynet model of mind:
1. A mind is a system
of agents or "actors" (our currently preferred term) which are able
to transform, create & destroy other actors
2. Many of these
actors act by recognizing patterns in the world, or in other actors; others
operate directly upon aspects of their environment
3. Actors pass
attention ("active force") to other actors to which they are related
4. Thoughts, feelings
and other mental entities are self-reinforcing, self-producing, systems of
actors, which are to some extent useful for the goals of the system
5. These
self-producing mental subsystems build up into a complex network of attractors,
meta-attractors, etc.
6. This network of
subsystems & associated attractors is "dual network" in
structure, i.e. it is structured according to at least two principles:
associativity (similarity and generic association) and hierarchy
(categorization and category-based control).
7. Because of finite
memory capacity, mind must contain actors able to deal with
"ungrounded" patterns, i.e. actors which were formed from
now-forgotten actors, or which were learned from other minds rather than at
first hand – this is called "reasoning" (Of course, forgetting is
just one reason for abstract (or “ungrounded”) concepts to happen. The other is generalization --- even if the
grounding materials are still around, abstract concepts ignore the historical
relations to them.)
8. A mind possesses
actors whose goal is to recognize the mind as a whole as a pattern – these are
"self"
A system of actors, each having relationships with the others and performing interactions on one another.
According to the psynet model, at bottom the mind is a
system of actors interacting with each other, transforming each other,
recognizing patterns in each other, creating new actors embodying relations
between each other. Individual actors
may have some intelligence, but most of their intelligence lies in the way they
create and use their relationships with other actors, and in the patterns that
ensue from multi-actor interactions.
We need actors that recognize and embody similarity
relations between other actors, and inheritance relations between other actors
(inheritance meaning that one actor in some sense can be used as another one,
in terms of its properties or the things it denotes). We need actors that recognize and embody more complex
relationships, among more than two actors.
We need actors that embody relations about the whole system, such as
“the dynamics of the whole actor system tends to interrelate A and B.”
This swarm of interacting, intercreating actors leads
to an emergent hierarchical ontology, consisting of actors generalizing other
actors in a tree; it also leads to a sprawling network of interrelatedness, a
“web of pattern” in which each actor relates to some others. The balance between the hierarchical and
heterarchical aspects of the emergent network of actor interrelations is
crucial to the mind.
Dual
network of actors involving hierarchical and heterarchical relationships.
The overlap of hierarchy and heterarchy gives the mind
a kind of “dynamic library card catalog” structure, in which topics are linked
to other related topics heterarchically, and linked to more general or specific
topics hierarchically. The creation of
new subtopics or supertopics has to make sense heterarchically, meaning that
the things in each topic grouping should have a lot of associative,
heterarchical relations with each other.
Macro-level mind patterns like the dual network are
built up by many different actors; according to the natural process of mind
actor evolution, they’re also “sculpted” by the deletion of actors. All these actors recognizing patterns and
creating new actors that embody them – this creates a huge combinatorial
explosion of actors. Given the finite
resources that any real system has at its disposal, forgetting is crucial to
the mind – not every actor that’s created can be retained forever. Forgetting means that, for example, a mind
can retain the datum that birds fly, without retaining much of the specific
evidence that led it to this conclusion. The generalization "birds
fly" – a pattern A in a large collection of observations B – is retained,
but the observations B are not.
Obviously, a mind's intelligence will be enhanced if
it forgets strategically, i.e., forgets those items which are the least intense
patterns. And this ties in with the
notion of mind as an evolutionary system.
A system which is creating new actors, and then forgetting actors based
on relative uselessness, is evolving by natural selection. This evolution is
the creative force opposing the conservative force of self-production, actor
intercreation.
Forgetting ties in with the notion of grounding. A pattern X is "grounded" to the
extent that the mind contains entities in which X is in fact a pattern. For instance, the pattern "birds
fly" is grounded to the extent that the mind contains specific memories of
birds flying. Few concepts are completely grounded in the mind, because of the
need for drastic forgetting of particular experiences. This leads us to the need for
"reasoning," which is, among other things, a system of transformations
specialized for producing incompletely grounded patterns from incompletely
grounded patterns.
Consider, for example, the reasoning "Birds fly,
flying objects can fall, so birds can fall." Given extremely complete
groundings for the observations "birds fly" and "flying objects
can fall", the reasoning would be unnecessary – because the mind would
contain specific instances of birds falling, and could therefore get to the
conclusion "birds can fall" directly without going through two
ancillary observations. But, if specific memories of birds falling do not exist
in the mind, because they have been forgotten or because they have never been
observed in the mind's incomplete experience, then reasoning must be relied
upon to yield the conclusion.
The necessity for forgetting is particularly intense
at the lower levels of the system. In particular, most of the patterns picked
up by the perceptual-cognitive-active loop are of ephemeral interest only and
are not worthy of long-term retention in a resource-bounded system. The fact
that most of the information coming into the system is going to be quickly
discarded, however, means that the emergent information contained in perceptual
input should be mined as rapidly as possible, which gives rise to the
phenomenon of "short-term memory."
What
is short-term memory? A mind must
contain actors specialized for rapidly mining information deemed highly
important (information recently obtained via perception, or else identified by
the rest of the mind as being highly essential). This is "short-term memory." It must be strictly
bounded in size to avoid combinatorial explosion, since the number of combinations
(possible grounds for emergence) of N items is exponential in N. The short-term memory is a space within the
mind devoted to looking at a small set of things from as many different angles
as possible.
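To make the boundedness concrete: here is a minimal sketch, in Java (the system's implementation language), of a strictly size-bounded short-term memory. The class and field names are my own illustrative inventions, not the actual Webmind data structures.

import java.util.Comparator;
import java.util.PriorityQueue;

// A strictly size-bounded short-term memory: when the capacity is
// exceeded, the least important item is forgotten first.
class ShortTermMemory<T> {
    private static class Scored<U> {
        final U item;
        final double importance;
        Scored(U item, double importance) { this.item = item; this.importance = importance; }
    }

    private final int capacity;
    // min-heap by importance: the root is the least important item
    private final PriorityQueue<Scored<T>> items =
        new PriorityQueue<>(Comparator.comparingDouble((Scored<T> s) -> s.importance));

    ShortTermMemory(int capacity) { this.capacity = capacity; }

    void add(T item, double importance) {
        items.add(new Scored<>(item, importance));
        if (items.size() > capacity) {
            items.poll(); // forget the least important item
        }
    }
}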
From what I’ve said so far, the psynet model is a
highly general theory of the nature of mind. Large aspects of the human mind,
however, are not general at all, and deal only with specific things such as
recognizing visual forms, moving arms, etc. This is not a peculiarity of humans
but a general feature of intelligence.
The generality of a transformation may be defined as the variety of
possible entities that it can act on; and in this sense, the actors in a mind
will have a spectrum of degrees of specialization, frequently with more
specialized actors residing lower in the hierarchy. In particular, a mind must contain procedures specialized for
perception and action; and when specific such procedures are used repeatedly,
they may become “automatized”, that is, cast in a form that is more efficient
to use but less flexible and adaptable.
This brings the Webmind AI Engine into line with
contemporary neuroscience, which has found
evidence both for global generic neural structures and highly domain-specific
localized processing.
Another thing that actors specialize for is
communication. Linguistic communication is carried out by stringing together
symbols over time. It is hierarchically based in that the symbols are grouped
into categories, and many of the properties of language may be understood by
studying these categories. More
specifically, the syntax of a language is defined by a collection of
categories, and "syntactic transformations" mapping sequences of
categories into categories. Parsing is the repeated application of syntactic
transformations; language production is the reverse process, in which
categories are progressively expanded into sequences of categories. Semantic transformations map structures
involving semantic categories and particular words or phrases into actors
representing generic relationships like similarity and inheritance. They take structures in the domain of
language and map them into the generic domain of mind.
And language brings us to the last crucial feature of
mind: self and socialization. Language
is used for communicating with others, and the structures used for semantic
understanding are largely social in nature (actor, agent, and so forth);
language is also used purely internally to clarify thought, and in this sense
it’s a projection of the social domain into the individual. Communicating about oneself via words or
gestures is a key aspect of building oneself.
The "self" of a mind (not the “higher self”
of Eastern religion, but the “psychosocial” self) is a poorly grounded pattern
in the mind's own past. In order to have a nontrivial self, a mind must
possess, not only the capacity for reasoning, but a sophisticated
reasoning-based tool (such as syntax) for transferring knowledge from strongly
grounded to poorly grounded domains. It must also have memory and a knowledge
base. All these components are clearly strengthened by the existence of a
society of similar minds, making the learning and maintenance of self vastly easier.
The self is useful for guiding the
perceptual-cognitive-active information-gathering loop in productive
directions. Knowing its own holistic strengths and weaknesses, a mind can do
better at recognizing patterns and using these to achieve goals. The presence
of other similar beings is of inestimable use in recognizing the self – one
models one's self on a combination of: what one perceives internally, the
external consequences of actions, evaluations of the self given by other
entities, and the structures one perceives in other similar beings. It would be
possible to have self without society, but society makes it vastly easier, by
leading to syntax with its facility at mapping grounded domains into ungrounded
domains, by providing an analogue for inference of the self, by external
evaluations fed back to the self, and by the affordance of knowledge bases, and
informational alliances with other intelligent beings.
Clearly there is much more to mind than all this – as
we’ve learned over the last few years, working out the details of each of these
points uncovers a huge number of subtle issues. But, even without further specialization, this list of points
does say something about AI. It
dictates, for example,
·
that an AI system must be a dynamical system, consisting of entities
(actors) which are able to act on each other (transform each other) in a
variety of ways, and some of which are able to evaluate simplicity (and hence
recognize pattern).
·
that this dynamical system must be sufficiently flexible to enable the
crystallization of a dual network structure, with emergent, synergetic
hierarchical and heterarchical subnets
·
that this dynamical system must contain a mechanism for the spreading of
attention in directions of shared meaning
·
that this dynamical system must have access to a rich stream of
perceptual data, so as to be able to build up a decent-sized pool of grounded
patterns, leading ultimately to the recognition of the self
·
that this dynamical system must contain entities that can reason
(transfer information from grounded to ungrounded patterns)
·
that this dynamical system must contain entities that can manipulate
categories (hierarchical subnets) and transformations involving categories in a
sophisticated way, so as to enable syntax and semantics
·
that this dynamical system must recognize symmetric, asymmetric and
emergent meaning sharing, and build meanings using temporal and spatial
relatedness, as well as relatedness of internal structure, and relatedness in
the context of the system as a whole
·
that this dynamical system must have a specific mechanism for paying
extra attention to recently perceived data ("short-term memory")
·
that this dynamical system must be embedded in a community of similar
dynamical systems, so as to be able to properly understand itself
·
that this dynamical system must act on and be acted on by some kind of
reasonably rich world or environment.
It is interesting to note that these criteria, while
simple, are not met by any previously designed AI system, let alone any existing
working program. The Webmind AI Engine
strives to meet all these criteria.
The Mind OS
The Webmind AI Engine embodies the psynet model
of mind by creating a “self-organizing
actors OS” (“Mind OS”), a piece of software that we call the Webmind Core, and
then creating a large number of special types of actors running on top of
this.
Most abstractly, we have Node actors, which embody
coherent wholes (texts, numerical data series, concepts, trends, schema for
acting); we have Link actors, which embody relationships between Nodes
(similarity, logical inheritance or implication, data flow, etc.); we have
Stimulus actors that spread attention between Nodes and Links; and we have
Wanderer actors that move around between Nodes building Links.
These general types of actors are then specialized
into 100 or so node types, and a dozen link types, which carry out various
specialized aspects of mind – but all within the general framework of mind as a
self-organizing, self-creating actor system.
There are also some macro level actors, Data-Structure Specialized
MindServers that simulate the behaviors of special types of nodes and links in
especially efficient, application-specific ways. These too are mind actors, though of specialized kinds.
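In Java terms (the language the system is implemented in), the skeleton of this actor hierarchy might look roughly like the following bare-bones sketch; all names here are illustrative assumptions, not the actual Webmind Core classes.

import java.util.ArrayList;
import java.util.List;

// Every actor holds some attention (activation), has an importance
// that determines its share of CPU time, and knows how to act.
abstract class Actor {
    double activation;
    double importance;
    abstract void act();
}

// Nodes embody coherent wholes; Links embody relationships between Nodes.
class Node extends Actor {
    final List<Link> links = new ArrayList<>();
    void act() { /* e.g. spread activation along outgoing links */ }
}

class Link extends Actor {
    final Node source, target;
    double weight;
    Link(Node source, Node target, double weight) {
        this.source = source; this.target = target; this.weight = weight;
    }
    void act() { /* e.g. carry stimulus from source to target */ }
}

// Stimulus and Wanderer actors would be further subclasses, spreading
// attention between Nodes and building new Links respectively.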
Each actor in the whole system is viewed as having its
own little piece of consciousness, its own autonomy, its own life cycle – but
the whole system has a coherence and focus as well, eliminating component
actors that are not useful and causing useful actors to survive and intertransform
with other useful actors in an evolutionary way.
The self-organizing actors OS, relying on perceptual data coming through
short-term memory, translated into system patterns. In turn, these patterns are the subjects of meaning sharing and
reasoning, driven by the system dynamics toward the emergence of self.
There are a number of special languages that the Mind OS uses to talk to other software programs. This kind of communication is mediated by a program called the CommunicatorSpace. Languages called MindScript, MindSpeak, and KNOW are used for various purposes, and messages in all these languages are transmitted in a special format called the Agent Interaction Protocol (AIP).
Communications between Webmind, its clients and other Webminds
On top of the Mind OS, we’re building a large and complex software system, with many different macro-level parts, most (but not quite all) based on the same common data structures and dynamics, and interacting using the Mind OS’s messaging system, and controlled overall using the Mind OS’s control framework.
The following diagram roughly summarizes the breakdown of high-level components in the system:
Of course, a
complete review of all these parts and what they do would be far beyond the
scope of this brief article. A quick
summary will have to suffice!
The Cognitive Core
Many of the components in the above architecture diagram are based on something called the “cognitive core.” This is a collection of nodes and links and processes that embodies what we feel is the essence of the mind, the essence of thinking. Different cognitive cores may exist in the same AI Engine, with different focuses and biases.
The AI processes going on in the cognitive core are:
First, activation spreading is simple enough. This process cycles through everything in the RT, spreading activation according to a variant on familiar neural network mathematics. There is some subtlety here in that the AI Engine contains various types of links that determine their weights in different ways, and they need to be treated fairly.
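Using the Node and Link classes sketched in the Mind OS section above, one cycle of activation spreading might look like this in outline (the decay parameter and the uniform treatment of link weights are simplifying assumptions):

// One pass of activation spreading over a set of nodes; a method to
// live alongside the Node and Link sketch above. In the real system,
// each link type would normalize its weight in its own way.
void spreadActivation(java.util.List<Node> nodes, double decay) {
    for (Node node : nodes) {
        for (Link link : node.links) {
            link.target.activation += node.activation * link.weight;
        }
        node.activation *= decay; // activation fades as it spreads
    }
}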
Importance updating is a process that determines how important each node or link is, and thus how much CPU time it gets. There’s a special formula for the importance of a node, based on its activation, the usefulness that’s been gained by processing it in the recent past, and so forth. Activation spreads around in the network, and this causes nodes that are linked to important nodes to tend to become important in the future – a simple manifestation of Peirce’s Law of Mind mentioned above.
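The actual formula isn't given here, but its flavor can be suggested by a toy version like the following, blending current activation with recently observed usefulness (the mixing weight is an assumption):

// Toy importance update: a weighted blend of how active the node is
// now and how useful processing it has recently proven to be.
void updateImportance(Node node, double recentUsefulness) {
    final double blend = 0.5; // assumed mixing coefficient
    node.importance = blend * node.activation + (1.0 - blend) * recentUsefulness;
}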
Wanderer-based link formation is a process that selects pairs of nodes and, for each pair selected, builds links of several types between them, based on comparing the links of other types that the nodes already have. This is used to build inheritance and similarity links representing logical relationships, and also more Hebbian-style links representing temporal associations. There are also wanderers especially designed to create links based on higher-order reasoning: links between CompoundRelationNodes representing logical combinations of links, and so forth. Wanderers are guided in their wandering by links representing loose associations between nodes.
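As a sketch under the same assumed classes, a wanderer that compares two nodes' existing links and builds a similarity link when the overlap is high enough might look like:

// Compare the targets of two nodes' links; if enough are shared,
// build a new similarity link between the nodes. The overlap
// heuristic and the threshold are illustrative guesses.
Link maybeBuildSimilarityLink(Node a, Node b, double threshold) {
    int shared = 0;
    for (Link la : a.links) {
        for (Link lb : b.links) {
            if (la.target == lb.target) { shared++; break; }
        }
    }
    double similarity = a.links.isEmpty() ? 0.0 : (double) shared / a.links.size();
    if (similarity <= threshold) return null;
    Link link = new Link(a, b, similarity);
    a.links.add(link);
    return link;
}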
Reasoning is a process that chooses important nodes that haven’t been reasoned on lately, and seeks logical relations between them using the rules of uncertain logic. Similarly, association formation is a process that chooses important nodes and finds other nodes that are loosely associated with them.
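The truth values attached to the KNOW sentences later in this article pair a strength with a confidence, e.g. (1.0 0.9). A deduction rule over such values might be sketched as follows; the combination formulas are stand-ins, not Webmind's actual uncertain-logic rules:

// A (strength, confidence) truth value, as in the KNOW examples below.
class TruthValue {
    final double strength, confidence;
    TruthValue(double strength, double confidence) {
        this.strength = strength;
        this.confidence = confidence;
    }

    // From "A inherits from B" (this) and "B inherits from C" (that),
    // infer a tentative "A inherits from C", discounting confidence
    // at each inference step.
    TruthValue deduce(TruthValue that) {
        final double discount = 0.9; // assumed per-step confidence penalty
        return new TruthValue(this.strength * that.strength,
                              this.confidence * that.confidence * discount);
    }
}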
Node Fission/fusion picks important nodes out of the net and combines them or subdivides them to create new nodes. CompoundRelationNode formation, on the other hand, picks links – relationships – out of the net and creates new nodes joining them via logical formulas. These processes provide the net with its creativity: they put new ideas in the system for the other processes to act upon as they wish. In evolutionary terms, very loosely, one might say that these processes provide reproduction, and the other processes provide fitness evaluation and selection based on fitness.
Context formation is a special process that mines a data structure called the TimeServer – which contains a complete record of all the system’s perceptions and actions – and creates ContextNodes representing meaningful groupings of perceptions and actions.
Goal formation creates new goals for the system. It starts off with some basic goals – to make users happy, to keep its memory usage away from the danger level, to create new information. It then uses heuristics to create new GoalNodes that relate to current GoalNodes as subgoals or supergoals. It picks important GoalNodes and then acts on them, executing schema that are thought to be useful for achieving these goals in current contexts. There is an assignment of credit process that rewards schema that have been useful for important goals in important contexts.
Learning of new schema is perhaps the trickiest task faced by the cognitive core. There is a process that exists just to decide which problems to focus schema learning on. This process finds goals that are important, and that are not adequately satisfied in some important contexts. It uses node formation and reasoning, together, to try to find schema that will satisfy these goals in the given contexts. It has to use inference to evaluate existing SchemaNodes to tell how well they satisfy the target goal in the target context. It has to create new random SchemaNodes using components that have been useful for achieving similar goals in similar contexts. It has to select pairs of SchemaNodes based on their effectiveness at achieving the target goal in the target context, and cross over and mutate them….
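In outline, the evolutionary part of this process resembles a standard genetic-programming loop, as in the sketch below; SchemaNode and the fitness, crossover and mutation operators are stubs standing in for the inference-driven machinery just described.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SchemaLearner {
    static class SchemaNode { /* program tree omitted */ }

    // Stubs: in the real system, fitness would be estimated by inference
    // against the target goal in the target context.
    double fitness(SchemaNode s) { return 0.0; }
    SchemaNode crossover(SchemaNode a, SchemaNode b) { return a; }
    SchemaNode mutate(SchemaNode s, Random rng) { return s; }

    SchemaNode evolve(List<SchemaNode> population, int generations, Random rng) {
        for (int g = 0; g < generations; g++) {
            // rank by estimated goal-achievement, best first
            population.sort((x, y) -> Double.compare(fitness(y), fitness(x)));
            List<SchemaNode> survivors =
                new ArrayList<>(population.subList(0, population.size() / 2));
            List<SchemaNode> next = new ArrayList<>(survivors);
            while (next.size() < population.size()) {
                SchemaNode p1 = survivors.get(rng.nextInt(survivors.size()));
                SchemaNode p2 = survivors.get(rng.nextInt(survivors.size()));
                next.add(mutate(crossover(p1, p2), rng));
            }
            population = next;
        }
        return population.get(0);
    }
}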
Finally, language processing is dealt with via special CompoundRelationNodes representing linguistic information, called FeatureStructureCRNs. Syntax processing is handled via a variant of lexicalized feature structure unification grammar, fully unified with the other AI processes in the system.
And then a few support processes are needed. Statistics about the various AI processes and their successes and failures need to be gathered. Unimportant nodes and links need to be deleted, and replaced with new ones. The short-term memory, a special data structure consisting of nodes and links of current importance, needs to be filled up with newly important things.
All these processes have to work
together, tightly and intimately, on the same set of nodes. That’s how the mind works, that’s how
emergent intelligence is generated from the combination of diverse specialized
AI processes.
After three years of work, we’ve
made all these different aspects of the mind fit together conceptually, in a
wonderfully seamless way. We’ve
implemented software code that manifests this conceptual coherence. That’s the good news.
The bad news is, efficiency-wise,
our experience so far indicates that the AI Engine 0.5 architecture is probably
not going to be sufficiently efficient (in either speed or memory) to allow the
full exercise of all the code that’s been written for the cognitive core. Thus a rearchitecture of the cognitive core
is in progress, based on the same essential ideas.
Essentially, the solidification of
the conceptual design of the system over the last year allows us to make a variety
of optimizations aimed at specializing some of the very general structures in
the current system so as to do better at the specific kinds of processes that
we are actually asking these structures to carry out. The current system is written for extreme generality, and it has
allowed us to experimentally design and implement a wide variety of AI
processes (although, for efficiency reasons, not to test all of them in
realistic situations, or in interesting combinations). Now that, through this experimental process,
we have learned specifically what kinds of AI processes we want, we can morph
the system into something more specifically tailored to carry out these
processes effectively. Fortunately, the current software architecture is
sufficiently flexible that it will almost surely be possible to move to a more
efficient architecture gradually, without abandoning the current general
software framework or rewriting all the code.
I’ve shown you a diagram of the
whole AI Engine and how it breaks down into components, and then I’ve explained
the cognitive core, a key subsystem that underlies many of the system
components. The next step is to tell
you what the various components in the overall diagram of the system actually
do. As with my review of the cognitive
core above, this is going to have to be done at a fairly high level, since
these are deep and tricky concepts and this is just a brief overview article.
The AttentionalFocus, first of
all, is the part of the system
that contains all types of nodes and
links, all brewing together in a highly CPU-intensive way, carrying out diverse
types of activities and manifesting emergent intelligence. It has the most active cognitive core of all
the components in the system.
The STM (short-term memory),
different from the AttentionalFocus, is a little cognitive core that contains
things of relevance to the system’s current situation. The contents of this are changing fast of
course. Recent percepts, recent
actions, planned actions and related concepts live here.
The MindDB is a very large database that contains every node and link the system has ever entertained. It’s static; it doesn’t do anything except receive nodes and links from other components, and give nodes and links out to other components.
The interactions between the MindDB and the AttentionalFocus are
mediated by the AttentionAllocationServer, a component that does nothing but embody Peirce’s Law of Mind
-- activation spreading and importance updating. It contains a large percentage of the nodes and links in the MindDB.
Its job is to take things out of the MindDB and push them into the
attentional focus or the STM, when they become generally important or relevant
to the current situation.
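A minimal sketch of that promotion step, again under the assumed Node class from earlier:

// Push anything in the MindDB whose importance has crossed a threshold
// into the AttentionalFocus (a toy version of the promotion rule).
void promote(java.util.List<Node> mindDb, java.util.List<Node> attentionalFocus,
             double threshold) {
    for (Node node : mindDb) {
        if (node.importance > threshold && !attentionalFocus.contains(node)) {
            attentionalFocus.add(node);
        }
    }
}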
The
“CompoundRelationNode miner” is a specialized process that studies the MindDB
and scans it for “patterns” – i.e. abstract structures that occur frequently in
the net. The nodes it produces are
sent to the higher-order inference server for evaluation. The higher-order inference server is a
cognitive core that devotes most of its attention to higher-order reasoning.
The
SchemaExecutionServer is a cognitive core full of SchemaNodes, and potentially
other nodes as well. It contains little
mind programs that run and do things.
The SchemaLearningServer is a cognitive core focused on learning new
schema.
The Context
formation server is a cognitive core devoted to forming new contexts, as
described earlier. And finally, the
dynamics server is a cognitive core devoted to “lightweight” processes – it
runs through a large percentage of the nodes in the MindDB, and carries out
basic AI processes on them, like forming associations, first-order inference,
wanderer-based link building.
These are all the
basic components needed to make the mind work.
The other components are specialized for specific mental abilities like
language comprehension and generation, language learning, downloading
information from the Web, reasoning about numbers, using higher-order inference
to make plans, etc.
It seems big and
complicated. But so is the human brain,
in spite of its minimal three-pound mass.
There are hundreds of specialized regions in the brain – look at any 3D
brain atlas. All the brain regions use
the same basic structures – neurons, synapses, neurotransmitters – but deployed
in different ways, in different combinations, etc. Similarly, all the specialized components of the AI Engine
operate on the same substrate, nodes and links. Most of them are cognitive cores with special mixes of
processes. But it’s important to have
this componentized structure, to enable reasonably efficient solution of the
various problems the AI Engine is confronted with. The vision of the mind as a self-organizing actor system is
upheld, but enhanced: each mind is a self-organizing actor system that is
structured in a particular way, so that the types of actors that most need to
act in various practical situations, get the chance to act sufficiently
often. The current design is a marriage
of philosophical mind-theory and engineering practicality.
Obstacles on the Path Ahead
The
three big challenges that we seem to face in moving from AI Engine 0.5 to AI
Engine 1.0, and thus creating the world’s first real AI, are:
·
computational (space and time) efficiency.
·
getting knowledge into the system to accelerate experiential learning
·
parameter tuning for intelligent performance
We’ve
already discussed the efficiency issue and the strategic rearchitecting that is
taking place in order to address it.
Regarding
getting knowledge into the system, we are embarking on several related efforts. Several of these involve a formal language
we have created called KNOW – a sort of logical/mathematical language that
corresponds especially well with the AI Engine’s internal data structures. For example, in KNOW, “John gives the book
to Mary” might look like
[give John Mary book1 (1.0 0.9)]
[Inheritance book1 book (1.0 0.9)]
[author book1 John (1.0 0.9)]
This KNOW text is composed of three sentences. Give, inheritance and author are relations (links), and John, Mary, book and book1 are the arguments (nodes). A text in KNOW can also be represented in XML format, which is convenient for various purposes.
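The XML encoding isn't specified in this article, but a plausible rendering of the first KNOW sentence above might look like this (the element and attribute names are my guesses, not the actual format):

<know>
  <relation name="give" strength="1.0" confidence="0.9">
    <arg>John</arg>
    <arg>Mary</arg>
    <arg>book1</arg>
  </relation>
</know>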
Our
knowledge encoding efforts include the following:
·
Conversion of structured database data into KNOW format for import into
the AI Engine (This is for declarative knowledge.)
·
Human encoding of common sense facts in KNOW
·
Human encoding of relevant actions (both external actions like file
manipulations, and internal cognitive actions) using “schema programs” written
in KNOW
·
The Baby Webmind user interface, enabling knowledge acquisition through
experiential learning (this helps with both declarative and procedural
knowledge)
·
Creation of language training datasets so that schema operating in
various parts of the natural language module can be trained via supervised
learning.
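As promised in the first item above, here is a minimal sketch of the database-to-KNOW conversion, again in Java. The table schema, relation names and default truth values are invented for illustration:

import java.util.Map;

class DbToKnow {
    // Render one relation as a KNOW sentence with a default truth value.
    static String toKnow(String relation, double s, double c, String... args) {
        return "[" + relation + " " + String.join(" ", args)
                + " (" + s + " " + c + ")]";
    }

    public static void main(String[] argv) {
        // One row from a hypothetical "books" table: id and author columns.
        Map<String, String> row = Map.of("id", "book1", "author", "John");
        System.out.println(toKnow("Inheritance", 1.0, 0.9, row.get("id"), "book"));
        System.out.println(toKnow("author", 1.0, 0.9, row.get("id"), row.get("author")));
    }
}

This prints the two KNOW sentences [Inheritance book1 book (1.0 0.9)] and [author book1 John (1.0 0.9)], ready for import into the AI Engine.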
Regarding
parameter optimization, there have been several major obstacles to effective
work in this area so far:
·
The system’s slowness has made the testing required for automatic
parameter optimization unacceptably time-consuming
·
The interaction between various parameters is difficult to sort out
·
Complexity of the system makes debugging difficult, so that parameter
tuning and debugging end up being done simultaneously
One
of the consequences of the system rearchitecture described above will be to
make parameter optimization significantly easier, both by improving system
speed and by creating various system components each involving fewer
parameters.
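As an illustration of the kind of automatic tuning this makes feasible, here is a toy coordinate-wise hill climber over a two-parameter component. The score function below is a stand-in for running the component on a test suite, which in practice is the expensive step; everything here is a sketch, not our actual tuning code:

class ParamTuner {
    // Stand-in objective: in reality, run the mind module on test data
    // and measure performance. (Toy peak at parameters 0.5 and 0.2.)
    static double score(double[] p) {
        return -Math.pow(p[0] - 0.5, 2) - Math.pow(p[1] - 0.2, 2);
    }

    public static void main(String[] argv) {
        double[] p = {0.0, 0.0};              // initial parameter values
        double best = score(p), step = 0.05;
        boolean improved = true;
        while (improved) {                    // stop when no step helps
            improved = false;
            for (int i = 0; i < p.length; i++) {
                for (double d : new double[]{step, -step}) {
                    p[i] += d;
                    double s = score(p);
                    if (s > best) { best = s; improved = true; }
                    else { p[i] -= d; }       // revert a useless step
                }
            }
        }
        System.out.println(java.util.Arrays.toString(p));
    }
}

With a handful of parameters per component, even a crude search like this is workable; with dozens of interacting parameters across the whole system, it is not.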
Summing
up the directions proposed in these three problem areas (efficiency, knowledge
acquisition, and parameter tuning), one general observation to be made is that,
at this stage of our design work, analogies to the human mind/brain are playing
less and less of a role, whereas realities of computer hardware and machine
learning testing and training procedures are playing more and more of a
role. In a larger sense, what this
presumably means is that while the analogies to the human mind helped us to
gain a conceptual understanding of how AI has to be done, now that we have this
conceptual understanding, we can keep the conceptual picture fixed, and vary
the underlying implementation and teaching procedures in ways that have less to
do with humans and more to do with computers.
Finally,
while the above issues are the ones that currently preoccupy us, it’s also
worth briefly noting the obstacles that we expect to face in getting
from AI Engine 1.0 to AI Engine 2.0, once the current problems are
surmounted.
The
key goal with AI Engine 2.0 is for the system to be able to fully understand
its own source code, so it can improve itself through its own reasoning, and
make itself progressively more intelligent.
In theory, this can lead to an exponential acceleration of system
intelligence over time. The two
obstacles faced in turning AI Engine 1.0 into such a system are
·
the creation of appropriate “inference control schema” for the
particular types of higher-order inference involved in mathematical reasoning
and program optimization
·
the entry of relevant knowledge into the system.
The
control schema problem appears to be solvable through supervised learning, in
which the system is incrementally led through less and less simplistic problems
in these areas (basically, this means we will teach the system these things, as
is done with humans).
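A toy sketch of what such incremental teaching looks like as a training loop: the learner is advanced to a harder level of problems only once it has mastered the current one. The success model and all the numbers here are invented for illustration:

import java.util.Random;

class CurriculumTrainer {
    public static void main(String[] argv) {
        Random rng = new Random(7);
        double skill = 0.1;                   // toy proxy for learned competence
        for (int level = 1; level <= 5; level++) {
            int successes = 0, trials = 0;
            while (successes < 20) {          // mastery criterion per level
                trials++;
                if (rng.nextDouble() < skill / level) successes++;
                else skill += 0.01;           // supervised correction on failure
            }
            System.out.println("level " + level + " mastered in " + trials + " trials");
        }
    }
}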
The
knowledge entry problem is trickier, and has two parts:
·
giving the system a good view into its Java implementation
·
giving the system a good knowledge of algorithms and data structures
(without which it can’t understand why its code is structured as it is).
Giving
the system a meaningful view into Java requires mapping Java code into a kind
of abstract “state transition graph,” a difficult problem which fortunately has
been solved by some of our friends at Supercompilers LLC
(www.supercompilers.com), in the course of their work creating a Java
supercompiler. Giving the system a
knowledge of algorithms and data structures could be done by teaching the system
to read mathematics and computer science papers, but we suspect this is a
trickier task than it may seem, because such papers are a specialized form of human
discourse, not as formal as they appear at first glance. In order to jump-start the system’s understanding
of scientific literature in these areas, we believe it will be useful to
explicitly encode knowledge about algorithms and data structures into the Mizar
formalized mathematics language, from which it can then be directly translated
into AI Engine nodes and links. (This
is a project that we would undertake now, if we were faced with an
infinite-human-resources situation!)
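Returning for a moment to the state transition graph idea: to give a flavor of the abstraction, here is a bare-bones sketch of such a graph in Java, with a simple reachability query. This is a drastic simplification for illustration, not the representation actually used by the Java supercompiler:

import java.util.*;

class StateTransitionGraph {
    // adjacency: state -> (action label -> successor state)
    private final Map<String, Map<String, String>> edges = new HashMap<>();

    void addTransition(String from, String action, String to) {
        edges.computeIfAbsent(from, k -> new HashMap<>()).put(action, to);
    }

    // All states reachable from a given start state (depth-first search).
    Set<String> reachableFrom(String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>(List.of(start));
        while (!stack.isEmpty()) {
            String s = stack.pop();
            if (seen.add(s))
                for (String t : edges.getOrDefault(s, Map.of()).values())
                    stack.push(t);
        }
        return seen;
    }

    public static void main(String[] argv) {
        StateTransitionGraph g = new StateTransitionGraph();
        // abstracting "i = 0; while (i < n) i++;" into states
        g.addTransition("entry", "i=0", "loopTest");
        g.addTransition("loopTest", "i<n", "loopBody");
        g.addTransition("loopBody", "i++", "loopTest");
        g.addTransition("loopTest", "i>=n", "exit");
        System.out.println(g.reachableFrom("entry"));
    }
}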
Encoding knowledge into the system is all very well, but this can never be the primary way a mind gains information. Knowledge encoding is only useful as an augmentation to learning through direct interaction with the world and with other minds -- learning through experience.
Human infants
learn through experience, and as we all know this is a long and difficult
process. We’ve seen, in the previous
sections of this article, the incredible number of specialized mind-actors that
appear to be necessary in order to implement, within practical computational
constraints, the self-organizing system of intercreating actors that is the
mind. Given this diversity and
complexity, it’s sobering to realize that this integrated AI Engine will not,
when initially completed, be a mature mind: it will be an unformed infant.
The experience of
this AI Engine infant will be diverse, including spidering the Web, answering
user queries, analyzing data series, and so on. But in order for the Baby Webmind to grow into a mature and
properly self-aware system, it will need to be interacted with closely, and
taught, much like a young human. Acting
and perceiving and planning intelligently must begin on the simple “baby”
level, and be learned via interaction with another intelligent mind. These capabilities will then increase
exponentially with the system’s experience.
The result of this
“Baby Webmind” teaching project will be a Webmind AI Engine that can converse
about what it’s doing, not necessarily with 100% English fluency nor with the
ability to simulate a human, but with spontaneity and directness and
intelligence – speaking from its own self like the real living being it
is. We intend to create a system that
will subjectively appear intelligent and self-aware to a majority of
intelligent human beings (including computer scientists). Note that this is NOT the Turing Test: we
are not seeking to fool anyone into believing that Webmind is a human. Creating an AI system with this kind of
acting ability will be saved for later!
The basic ideas of
experiential interactive learning are very general, and would apply to a
Webmind AI Engine with arbitrarily diverse sense organs – eyes, ears, nose,
tongue…. However, we have worked out
the ideas in detail only in the concrete context of the current AI Engine,
whose inputs are textual and numerical only.
Extension of this framework to deal with music, sound, image or video
input could be accomplished fairly naturally, the difficulties being in current
computational processing limitations and file manipulation mechanics rather
than in the data structures and algorithms involved.
Baby Webmind User Interface
The Baby Webmind
User Interface provides a simple yet flexible medium within which the AI Engine
can interact with humans and learn from them.
It has the following components:
·
A chat window, where we can chat with the AI Engine
·
Reward and punishment buttons, which ideally should allow us to vary the
amount of reward or punishment (a very hard smack as opposed to just a plain
ordinary smack…)
·
A way to enter our emotions, along several dimensions
·
A way for the AI Engine to show us its emotions [technically: the
importance values of some of its FeelingNodes]
A comment on the
emotional aspect is probably appropriate here.
Inputting emotion values via GUI widgets is obviously a very crude way
of getting across emotion, compared to human facial expressions. The same is true of Baby Webmind’s list of
FeelingNode importances: this is not really the essence of the system’s
feelings, which are distributed across its internal network. Ultimately we’d like more flexible
emotional interchange, but for starters, I reckon this system gives at least as
much emotional interchange as one gets through e-mail and emoticons.
The next question
is: what is Baby Webmind talking to us, and sharing feelings with us,
about? What world is it acting in?
Initially, Baby
Webmind’s world consists of a database of files, which it interacts with via a
series of operations representing its basic receptors and actuators. It has an automatic ability to perceive
files, directories, and URL’s, and to carry out several dozen basic file
operations.
The system’s
learning process is guided by a basic motivational structure. The AI Engine wants to achieve its goals,
and its Number One goal is to be happy.
Its initial motivation in making conversational and other acts in the
Baby Webmind interface is to make itself happy (as defined by its happiness
FeelingNode).
[Figure: diagram of the EIL information workflow and major processing phases, from the User Interface to the Webmind server and back again, using our Agent Interaction Protocol to mediate communications.]
How is the
Happiness FeelingNode defined? For
starters,
1. If the humans
interacting with it are happy, this increases the WAE's happiness.
2. Discovering
interesting things increases the WAE’s happiness.
3. Getting happiness
tokens (from the user clicking the UI’s reward button) increases the WAE’s
happiness.
The determinants of happiness in humans change as the human becomes more mature, in ways that are evolutionarily programmed into the brain. We will need to effect this in the AI Engine as well, manually modifying the grounding of happiness in the Happiness FeelingNode as the system progresses through stages of maturity. Eventually this process will be automated, once there are many Webmind instances being taught by many other people and Webmind instances, but for the first time around, this will be a process of ongoing human experimentation.
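One crude way to picture this grounding is as a weighted blend of the three determinants above, with the weights re-set by hand at each maturity stage. The following sketch is illustrative only; the weights are invented, and the real FeelingNode dynamics are of course richer than a weighted sum:

class HappinessNode {
    // weights for: user happiness, interestingness, reward tokens
    private final double wUser, wInterest, wReward;

    HappinessNode(int maturityStage) {
        if (maturityStage == 0) {        // infancy: reward tokens dominate
            wUser = 0.2; wInterest = 0.2; wReward = 0.6;
        } else if (maturityStage == 1) { // childhood: the balance shifts
            wUser = 0.4; wInterest = 0.4; wReward = 0.2;
        } else {                         // maturity: discovery dominates
            wUser = 0.3; wInterest = 0.6; wReward = 0.1;
        }
    }

    double activation(double userHappiness, double interestingness, double rewardTokens) {
        return wUser * userHappiness + wInterest * interestingness + wReward * rewardTokens;
    }

    public static void main(String[] argv) {
        System.out.println(new HappinessNode(0).activation(0.5, 0.3, 1.0));
    }
}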
Webworld
One
of the most critical aspects of the AI Engine – schema learning, the learning
of procedures for perceiving, acting and thinking -- is also one of the most
computationally intractable. Based on
our work so far, this is the one aspect of mental processing that seems to
consume an inordinate amount of compute power.
Some aspects of computational language learning are also extremely
computationally intensive, though not quite so much so.
Fortunately,
though, none of these supremely computation-intensive tasks need to be done in
real time. This means that they can be
carried out through large-scale distributed processing, across thousands or
even millions of machines loosely connected via the Net. Our system for accomplishing this kind of
wide-scale “background processing” is called Webworld.
Webworld
is a sister software system to the AI Engine, sharing some of the same codebase,
but serving a complementary function. A
Webworld lobe is a much lighter-weight version of an AI Engine lobe, which can
live on a single-processor machine with a modest amount of RAM, and potentially
a slow connection to other machines.
Webworld lobes host actors just like normal mind lobes, and they
exchange actors and messages with other Webworld lobes and with AI
Engines. AI Engines can dispatch
non-real-time, non-data-intensive “background thinking” processes (like schema
learning and language learning problems) to Webworld, thus immensely enhancing
the processing power at their disposal.
Webworld is a key part of the Webmind Inc. vision of an intelligent
Internet. It allows the AI Engine’s
intelligence to effectively colonize the entire Net, rather than remaining
restricted to small clusters of sufficiently powerful machines.
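The dispatch pattern itself is simple; here is a minimal sketch using a local thread pool as a stand-in for a collection of remote Webworld lobes. The real Webworld protocol, with actors migrating over slow connections, is naturally far more involved:

import java.util.concurrent.*;

class WebworldDispatch {
    public static void main(String[] argv) throws Exception {
        // Each worker thread stands in for a remote Webworld lobe.
        ExecutorService lobes = Executors.newFixedThreadPool(4);

        // A non-real-time "background thinking" job, e.g. schema learning.
        Callable<String> schemaLearningTask = () ->
                "candidate schema learned on " + Thread.currentThread().getName();

        Future<String> result = lobes.submit(schemaLearningTask);
        System.out.println(result.get()); // the AI Engine collects results later
        lobes.shutdown();
    }
}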
We’ve been discussing the AI Engine as a mind, an isolated system – reviewing its internals. But actually this is a limited, short-term view. The AI Engine will start out as an isolated mind, but gradually, as it becomes a part of Internet software products, it will become a critical part of the Internet, causing significant changes in the Internet itself.
To understand the potential nature
of these changes, it’s useful to introduce an additional philosophical concept,
the metasystem transition.
Coined by Valentin Turchin, the term refers, roughly speaking, to the point
in a system’s evolution at which the whole comes to dominate the parts.
According to
current physics theories, there was a metasystem transition in the early
universe, when the primordial miasma of disconnected particles cooled down and
settled into atoms. There was a
metasystem transition on earth around four billion years ago, when the steaming
primordial seas caused inorganic chemicals to clump together in groups capable
of reproduction and metabolism. (Or, as
recent experiments suggest, perhaps this did not first happen in aerobic
environments but deep in crevasses and at high pressures and
temperatures.) Unicellular life
emerged, and once chemicals are embedded in life-forms, the way to understand
them is not in terms of chemistry alone, but rather, in terms of concepts like
fitness, evolution, sex, and hunger. And there was another metasystem
transition when multicellular life burst forth – suddenly the cell is no longer
an autonomous life form, but rather a component in a life form on a higher
level.
Note that the
metasystem transition is not an antireductionist concept, in the strict
sense. The idea isn’t that
multicellular lifeforms have cosmic emergent properties that can’t be explained
from the properties of cells. Of
course, if you had enough time and superhuman patience, you could explain what
happens in a human body in terms of the component cells. The question is one of naturalness and
comprehensibility, or in other words, efficiency of expression. Once you have a multicellular lifeform, it’s
much easier to discuss and analyze the properties of this lifeform by reference
to the emergent level than by going down to the level of the component
cells. In a puddle full of paramecia,
on the other hand, the way to explain observed phenomena is usually by
reference to the individual cells, rather than the whole population – the
population has less wholeness, fewer interesting properties, than the
individual cells.
In the domain of
mind, there are also a couple of levels of metasystem transition. The first one is what we might call the emergence
of “mind modules.” This is when a huge
collection of basic mind components – cells, in a biological brain; “software
objects” in a computer mind – all come together in a unified structure to carry
out some complex function. The whole is
greater than the sum of the parts: the complex functions that the system
performs aren’t really implicit in any of the particular parts of the system,
rather they come out of the coordination of the parts into a coherent
whole. The various parts of the human
visual system are wonderful examples of this.
Billions of cells firing every which way, all orchestrated together to
do one particular thing: map visual output from the retina into a primitive map
of lines, shapes and colors, to be analyzed by the rest of the brain. The best current AI systems are also
examples of this. In fact, computer
systems that haven’t passed this transition I’d be reluctant to call “AI” in
any serious sense.
But, mind modules
aren’t real intelligence, not in the sense that we mean it: Intelligence as the
ability to carry out complex goals in complex environments. Each mind module only does one kind of
thing, requiring inputs of a special type to be fed to it, unable to
dynamically adapt to a changing environment.
Intelligence itself requires one more metasystem transition: the
coordination of a collection of mind modules into a whole mind, each module
serving the whole and fully comprehensible only in the context of the
whole.
The AI Engine
achieves this by allowing the interoperation of these intelligent modules
within the context of a shared semantic representation – nodes, links and so
forth. Through the shared semantic
representation these different intelligent components can interact and thus
evolve a dynamical state which is not possible within any one of the
modules. As in the human brain, each
specialized sub-system is capable of achieving certain complex perceptual (such
as reading a page of text) or cognitive (such as inferring causal relations)
goals which in themselves seem impressive - but when they are integrated, truly
exciting new possibilities emerge.
Taken in combination, these intelligent modules embodying systems such
as reasoning, learning and natural language processing, etc. undergo a
metasystem transition to become a mind capable of achieving complex goals
worthy of comparison to human abilities.
The resulting mind cannot be described merely as a pipeline of AI
process modules, rather it has its own dynamical properties which emerge from
the interactions of these component parts, creating new and unique patterns
which were not present in any of the sub-systems.
Such a metasystem
transition from modules to mind is a truly exciting emergence. But it’s by no means the end of the
line. Turchin, the conceiver of the
metasystem transition concept, proposed in his book The Phenomenon of
Science (Turchin, 1977 is the English-language reference, but as the book is
out of print see http://pespmc1.vub.ac.be/POSBOOK.html)
that the Internet and other communication technologies had the potential to
lead to a new metasystem transition, in which humans are in some sense
subordinate to an emergent “global brain.”
If this is the case, it’s quite likely that systems like the AI Engine
will play a major part in the coming transition. The AI Engine is part of the Web, but can also trigger
self-organization in the Web as a whole, eventually overtaking the Web and
causing the Web to experience its own Webmind-related metasystem
transition.
As we see it, the
path from the Net that we have today to the global brain that Turchin
envisioned – something that envelops humans and machines in a single
overarching superorganism -- involves at least two metasystem transitions. The
first one is the emergence of the global web mind – the transformation of the
Internet into a coherent intelligent system.
Currently the best way to explain what happens on the Net is to talk
about the various parts of the Net: particular Websites, e-mail viruses,
shopping bots, and so forth. But there
will come a point when this is no longer the case, when the Net has sufficient
high-level dynamics of its own that the way to explain any one part of the Net
will be by reference to the whole.
This will come about largely through the interactions of AI systems –
intelligent programs acting on behalf of various Websites, Web users,
corporations, and governments will interact with each other intensively,
forming something halfway between a society of AI’s and an emergent mind whose
lobes are various AI agents serving various goals. The traditional economy will be dead, replaced by a chaotically
dynamical hypereconomy in which there are no intermediaries except for
information intermediaries: producers and consumers (individually or in large
aggregates created by automatic AI discovery of affinity groups) negotiate
directly with each other to establish prices and terms, using information
obtained from subtle AI prediction and categorization algorithms. How far off this is we can’t really tell,
but it would be cowardly not to give an estimate: we’re betting no more than 10
years.
The second metasystem transition will be the one envisioned by Turchin and his colleagues at Principia Cybernetica (http://pespmc1.vub.ac.be/): the effective fusion of the global Web mind and the humans interacting with it. As time goes by, more and more of our interactions will be mediated by the global emergent intelligent Net – every appliance we use will be jacked into the matrix; every word that we say potentially transmitted to anyone else on the planet using wearable cellular telephony or something similar; every thought that we articulate entered into an AI system that automatically elaborates it and connects it with things other humans and AI agents have said and thought elsewhere in the world – or things other humans and AI agents are expected to say based on predictive technology….
Architecture
Diagram for an Intelligent Internet
Taking this vision
one step closer to reality, let’s look at what this might mean in terms of the
Internet of the next 5 years. Of
course, we realize that no such “map of the future” is likely to be extremely
accurate. The Internet is a complex and
rapidly evolving system. No one person,
company or computer program can control it.
But nonetheless, we can all take part in guiding it. And in order to do this intelligently, an
overarching vision is required.
The figure below
is an attempt at an “architecture diagram” for the entire Net, in its
Webmind-infused form. Naturally, any
diagram with such a broad scope is going to skip over a lot of details. The point is to get across a broad global
vision:
First, we have a
vast variety of “client computers,” some old, some new, some powerful, some
weak. Some of these access the
intelligent Net through dumb client applications – they don’t directly
contribute to internet intelligence at all.
Others have smart clients such as WebWorld clients, which carry out two
kinds of operations: personalization operations intended to help the machines
serve particular clients better, and general AI operations handed to them by
sophisticated AI server systems or other smart clients.
Next there are
“commercial servers”, computers that carry out various tasks to support various
types of heavyweight processing – transaction processing for e-commerce
applications, inventory management for warehousing of physical objects, and so
forth. Some of these commercial servers
interact with client computers directly, others do so only via AI servers. In nearly all cases, these commercial
servers can benefit from intelligence supplied by AI servers.
Finally, there is
what I view as the crux of the intelligent Internet: clusters of AI servers
distributed across the Net, each cluster representing an individual
computational mind. Some of these will
be Webminds, others may be other types of AI systems. These will be able to communicate via a common language, and will
collectively “drive” the whole Net, by dispensing problems to client machines
via WebWorld or related client-side distributed processing frameworks, and by providing
real-time AI feedback to commercial servers of various types. Some AI servers will be general-purpose and
will serve intelligence to commercial servers using an ASP (Application Service
Provider) model; others will be more specialized, tied particularly to a
certain commercial server (e.g. Yahoo might have its own AI cluster to back-end
its portal services).
Is this the final configuration for the Global Brain? No way. Is it the only way to do things? No. But this seems the most workable architecture for moving things from where they are now to a reasonably intelligent Net. After this, the dynamics of societies of AI agents become the dominant factor, with the commercial servers and client machines as a context. And after that….
A Path to the Singularity
The Global Brain is a fascinating aspect
of the AI Engine’s likely future – but it’s not the end of the story. Another part of the grand and fabulous
future is the Singularity – a meme that seems to be on the rise these days.
The notion of “the Singularity” is not
specifically tied to AI; it was proposed in the early 1980s by science fiction writer
Vernor Vinge, referring to the notion that the accelerating pace of
technological change would ultimately reach a point of discontinuity. At this point, our predictions are pretty
much useless – our technology has outgrown us in the same sense that we’ve
outgrown ants and beavers.
Eliezer Yudkowsky and Brian Atkins have
founded a non-profit organization called the Singularity Institute (http://www.singinst.org/intro.html) devoted to helping to bring about the Singularity, and making sure
it’s a positive event for humanity rather than the instantaneous end of
humankind. Yudkowsky has put particular
effort into understanding the AI aspects of the singularity, discoursing
extensively on the notion of Friendly AI – the creation of AI systems that, as
they rewrite their own source code achieving progressively greater and greater
intelligence, leave invariant the portion of their code requiring them to be
friendly to human beings (see
http://singinst.org/CaTAI/contents.html).
The notion of the Singularity seems to be a
valid one, and the notion of an AI system approaching it by progressively
rewriting its own source code also seems to be valid. But not all of the details of Yudkowsky’s vision of
Singularity-inducing AI seem to be sufficiently carefully considered. From a Webmind AI Engine perspective, the
following is the sequence of events that seems most likely to lead up to the
Singularity:
1) Someone (most likely the Webmind AI Engine team!) creates a fairly intelligent AI, one that can be taught, conversed with, etc.
2) This AI is taught about programming languages, algorithms and data structures, etc.
3) It begins by being able to write, optimize and rewrite simple programs.
4) After it achieves a significant level of practical software engineering experience and mathematical and AI knowledge, it is able to begin improving itself ... at which point the hard takeoff begins.
My intuition is that, even in this picture,
the “hard takeoff” to superhuman intelligence will take a few years, not
minutes. But that's still pretty fast
by the standards of human progress.
Although his writings are not 100% clear on this point, I think that this picture of the path to the Singularity is a little different from that of Eliezer Yudkowsky. My sense is that he views self-modification as entering into the picture earlier, perhaps in Stage 1, as the best way of getting to the first "fairly intelligent AI." I'm not 100% sure this is wrong, but after a lot of thought I have not seen a good way to do this, whereas I have a pretty clear picture of how to get to the Singularity according to the steps I've outlined here. The Singularity emerges as a consequence of emergence-producing, dynamic feedback between the AI Engine and intelligent program analysis tools like the Java supercompiler. The global brain then becomes not only intelligent but superintelligent, and we, as part of the global brain, are swept up into this emerging global superintelligence in ways that we can barely begin to imagine.
To cast the self-modification problem in the language of Webmind AI, it suffices to observe that self-modification is a special case of the kind of problem we call “schema learning.” The AI Engine itself is just a big procedure, a big program, a big schema. The ultimate application of schema learning, therefore, is the application of the system to learn how to make itself better. The complexity of the schema learning problem, with which we have some practical experience, suggests how hard the “self-modifying AI” problem really is. Sure, it’s easy enough to make a small, self-modifying program. But such a program is not intelligent. It’s closer to being “artificial life” of a very primitive nature. Intelligence within practical computational resources requires a lot of highly specialized structures. These lead to a complicated program – a big, intricate mind-schema – which is difficult to understand, optimize and improve. Creating a simple self-modifying program and expecting it to become intelligent through progressive environment-driven self-modification is an interesting research programme, but it seems more like an attempt to emulate the evolution of life on Earth than an attempt to create a single intelligence within a reasonable time frame.
But just because the “learn my own schema” problem is hard, doesn’t mean it’s unsolvable. A Java or C program can be represented as a schema in the AI Engine’s internal data structures, and hence it can be reasoned about, mutated and crossed over, and so forth. This is what needs to be done, ultimately, to create a system that can understand itself and make itself smarter and smarter as time goes on – eliminating the need for human beings to write AI code and write books like this one. Reasoning about schema representing Java programs requires a lot of specialized intuition, and specialized preprocessing may well be useful here, such as for instance the automated analysis and optimization of program execution flow being done in the Java supercompilation project (www.supercompilers.com). There is a lot of work here, but it’s a fascinating direction, and a necessary one.
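To make this a little more concrete, here is a toy sketch of a program fragment held as a mutable tree schema, with the kind of point mutation that evolutionary schema learning applies. The representation is invented for illustration and is far simpler than the AI Engine’s actual schema structures:

import java.util.Random;

class SchemaNode {
    String op;                  // an operator ("+", "*") or a leaf symbol
    SchemaNode left, right;     // null for leaves
    SchemaNode(String op, SchemaNode l, SchemaNode r) { this.op = op; left = l; right = r; }

    static final Random RNG = new Random(42);

    // Point mutation: flip one operator, recursing down a random branch.
    void mutate() {
        if (left == null) return;                    // leaves kept fixed here
        op = op.equals("+") ? "*" : "+";
        (RNG.nextBoolean() ? left : right).mutate();
    }

    public String toString() {
        return left == null ? op : "(" + left + " " + op + " " + right + ")";
    }

    public static void main(String[] argv) {
        SchemaNode expr = new SchemaNode("+",
                new SchemaNode("x", null, null),
                new SchemaNode("*", new SchemaNode("x", null, null),
                                    new SchemaNode("2", null, null)));
        System.out.println("before: " + expr);
        expr.mutate();
        System.out.println("after:  " + expr);
    }
}

Crossover works analogously, swapping subtrees between two schema; reasoning inferentially about such trees, rather than merely mutating them, is where the real difficulty lies.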
All of us involved in the project believe that the Webmind AI Engine, once fully implemented and tested, will lead to a computer program that manifests intelligence, according to the criterion of being able to carry out conversations with humans that will be subjectively perceived as intelligent. It will demonstrate an understanding of the contexts in which it is operating, an understanding of who it is and why it is doing what it is doing, an ability to creatively solve problems in domains that are new to it, and so forth. And of course it will supersede human intelligence in some respects, by combining an initially probably modest general intelligence with capabilities unique to digital computers like accurate arithmetic and financial forecasting.
All the bases are covered in the design given here: every major aspect of the mind studied in psychology and brain science. They’re all accomplished together, in a unified framework. It’s a big system, it’s going to demand a lot of computational resources, but that’s really to be expected; the human brain, our only incontrovertible example of human-level intelligence, is a complex and powerful information-processing device.
Not all aspects of the system are original in conception, and indeed, this is much of the beauty of the thing. The essence of the system is the provision of an adaptable self-reconstructing platform for integration of insights from a huge number of different disciplines and subdisciplines. In Webmind, aspects of mind that have previously seemed disparate are drawn together into a coherent self-organizing whole. The clichéd Newton quote, “If I’ve seen further than others, it’s because I’ve stood on the shoulders of giants,” inevitably comes to mind here. (As well as the modification I read somewhere: “If others have seen further than me, it’s because giants were standing on my shoulders.”….) The human race has been pushing toward AI for a long time – the Webmind AI Engine, if it is what I think it is, just puts on the finishing touches.
While constructing
an ambitious system like this naturally takes a long time, we were making
steady and rapid progress until Webmind Inc.’s dissolution in early 2001. It seems Arthur C. Clarke was
off by a bit -- Webmind won’t be talking like HAL in the film 2001 until a bit
later in the millennium. But
if the project can be rapidly refunded, before the group of people with AI
Engine expertise dissipates, we can expect Baby Webmind’s first moderately
intelligent conversations sometime in the year 2002, and that’s going to be
pretty bloody cool!
Coda:
Answers to Common Complaints
Finally, what are the complaints and counterarguments most often heard
when discussing this project with outsiders?
First, there are those who just don’t believe AI is possible, or believe
that AI is only possible on quantum computers, or quantum gravity computers,
etc. Forget about them. They’ll see. You can’t argue anyone out of their religion. Science is on the side of digital AI at this
point, as has been exhaustively argued by many people.
Then there are those who feel the system doesn’t go far enough in some
particular aspect of the mind: temporal or causal reasoning, or grammar
parsing, or perceptual pattern recognition, or whatever. This complaint usually comes from people who
have a research expertise in one or another of these specialty areas. The WAE’s general learning algorithms, they
say, will always be inferior to the highly specialized techniques that they
know so well. My feeling is that the
current WAE design is about as specialized as it should be. I don’t think it is so specialized as to become brittle and
non-adaptable, but I worry that this would be the case if it were made more
specialized still. My intuition is that
things like temporal and causal reasoning should be learned by the system as
groundings of the concepts “time” and
“cause” and related concepts, rather than wired in.
On the other side, there are those who feel that the system is “too
symbolic.” They want something more
neural-nettish, or more like a simple self-modifying system as I described in Chaotic
Logic and From Complexity to Creativity. I can relate to this point of view quite well,
philosophically. But a careful analysis
of the system’s design indicates that there is nothing a more subsymbolic
system can do that this one can’t. We
have schema embodying Boolean networks, feeding input into each other, learning
interrelationships via Hebbian learning, and being evolved by a kind of
evolutionary-ecological programming.
This is in fact a subsymbolic network of procedures, differing from an
evolutionary neural net architecture only in that the atomic elements are
Boolean operators rather than threshold operators – a fairly insubstantial
difference which could be eliminated if there were reason to do so. The fact that this subsymbolic evolving
adaptive procedure network is completely mappable into the symbolic,
inferential aspect of the system is not a bad thing, is it? I would say that in the WAE design we have
achieved a very smooth integration of the symbolic and subsymbolic domains,
even smoother than is likely to exist in the human brain. This will serve WAE well in the future.
There’s the complaint that Baby Webmind won’t have a rich enough
perceptual environment with just the Internet.
Maybe. Maybe we’ll need to hook
up eyes and ears to it. But there’s a
hell of a lot of data out there, and the ability to correlate numerical and
textual data is a good analogue of the crossmodal sensory correlation that is
so critical to the human brain. I
really believe that this complaint is just plain old anthropomorphism.
There’s the complaint that there are too many parameters and it will
take forever to get it to actually work, as opposed to theoretically
working. This is indeed a bit of a
worry, I can’t deny it. But we’ve gone
a long way by testing and tuning the individual modules of the system
separately, and so far our experience indicates that the parameter values giving
optimal function for independent activity of a mind module are generally at
least acceptable values for the activity of that mind module in an
integrated WAE context. A methodology
of tuning parameters for subsystems in isolation, then using the values thus
obtained as initial points for further dynamic adaptation, seems very likely to
succeed in general just as it has in some special cases already.
Finally, there are those who reckon the design is about right, but we
just don’t have the processing power and memory to run it, yet. This complaint scares me a little bit
too. But not too much. Based on our experimentation with the system
so far, there are only two things that seem to require vastly more computer
power than is available on a cluster of a few dozen powerful PC’s. The first, schema learning, is
something that can be done offline, running in the background on millions of
PC’s around the world via Webworld. The second, real-time conversation
processing, can likely be carried out on a single supercomputer, serving as the
core of the AI Engine cluster. We have
a very flexible software agents system that is able to support a variety of
different hardware configurations, and we believe that by utilizing available
hardware optimally, we can make a fairly smart computer program even in
2001-2002. Of course, the more hardware
it gets, the cleverer it will become….
Soon enough it will be literally begging us for more, more, more!