Drives:
Problem: How do we
program a robot to explore the world and generalize from it?
Feelings and Characteristics:
Self-preservation
Gentleness
Imitation
Pain
Anger
Pleasure
Curiosity, desire to
explore
Short attention span
Striving for
relationships, understanding
Sequences as
relationships
Simulation
Awareness of self,
not-self
Response to
"no"
Ability to extrapolate,
simulate a sequence, given repeated reinforcement
The Protean
adaptability of the human mind
Visual Properties:
How do
we remember when we've done something before?
How do we
convert what would otherwise be a featureless, continuous time
track into a sequence of memorable events?
The
principal animation track consists of the robot's location,
direction of gaze, and whatever else is happening to it.
Given a
location and a direction of gaze, the expected image is
reconstructable. However, if anything changes, we remember the
scene as it was before the change, together with the scene as it
appeared after the change. Also, if there were any unique sounds
or other happenings, we will remember the scene and the
surrounding circumstances. For example, most people remember
where they were and what they were doing when they heard that JFK
had been shot.
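A minimal Python sketch of this idea, assuming the animation track is a list of (location, gaze, scene) samples and that scenes can be compared for equality. The names `Event` and `segment_track` are hypothetical, and a real robot would need a tolerance-based image comparison rather than string equality.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time: int     # index of the sample at which the change was noticed
    before: str   # the scene as it was before the change
    after: str    # the scene as it appeared after the change


def segment_track(track):
    """Turn a list of (location, gaze, scene) samples into change events."""
    expected = {}   # (location, gaze) -> scene last seen from that viewpoint
    events = []
    for t, (location, gaze, scene) in enumerate(track):
        key = (location, gaze)
        # Only a mismatch with the expected image is worth remembering.
        if key in expected and expected[key] != scene:
            events.append(Event(t, expected[key], scene))
        expected[key] = scene
    return events


track = [
    ("kitchen", "north", "table with cup"),
    ("hall", "east", "closed door"),
    ("kitchen", "north", "table with cup"),
    ("kitchen", "north", "table without cup"),
]
events = segment_track(track)
```

Here the featureless run of samples collapses into a single remembered event: the cup disappearing from the table.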
Objects,
sounds, smells, tastes, and above all, events, can trigger
recollections of particular action sequences when we "went
so-and-so and did such-and-such".
How about
this hypothesis? We remember the unexpected. We also attach
weights to what we remember. If we glance at something or note
its existence out of the corner of an eye, we don't tie it to the
day's animation track (or we do so with such a low weight that it
is soon forgotten). If in the process, we absorb a new level of
detail, we may remember the detail without remembering when we
saw it. However, if something unexpected happens, then we tend to
associate the object with the action sequence and to attach a
higher weight to the association, remembering it better.
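The hypothesis can be sketched in Python as follows. This is only an illustration: the class name `WeightedMemory`, the decay factor, and the forgetting threshold are all assumptions, not values taken from these notes.

```python
class WeightedMemory:
    """Associations between observations and the day's track, with weights."""

    def __init__(self, decay=0.5, forget_below=0.1):
        self.decay = decay                # assumed per-step fading factor
        self.forget_below = forget_below  # assumed forgetting threshold
        self.weights = {}                 # observation -> current weight

    def observe(self, item, surprise):
        """Store an observation; surprise (0..1) sets the initial weight."""
        self.weights[item] = max(self.weights.get(item, 0.0), surprise)

    def tick(self):
        """Fade every weight and drop associations that become too weak."""
        faded = {k: w * self.decay for k, w in self.weights.items()}
        self.weights = {k: w for k, w in faded.items() if w >= self.forget_below}


memory = WeightedMemory()
memory.observe("lamp glimpsed in corner", surprise=0.15)  # a mere glance
memory.observe("cup fell off the table", surprise=0.9)    # unexpected
memory.tick()
remembered = sorted(memory.weights)
```

After one step of fading, the glimpsed lamp has dropped below the threshold and is forgotten, while the unexpected event is still remembered.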
We remember
the animation track surrounding an emotionally-charged event. I
remember the night I spent in the hospital after my
tonsillectomy. I remember the events surrounding the time I had
laughing gas.
The
learning of relationships is independent of remembering the
animation tracks at the times when they were learned.
We have the
ability to project trends. For example, if something is slowing
down or speeding up, we will project a continuation of this
trend. Of course, we have to be able to abstract the general
concept of "slowing down" or "speeding up".
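As a rough Python sketch, trend projection might continue the average change between successive readings. The function names `project_trend` and `describe_trend` are hypothetical, and the straight-line assumption is a simplification chosen here for illustration.

```python
def project_trend(samples, steps_ahead=1):
    """Extrapolate by continuing the average change between samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    deltas = [later - earlier for earlier, later in zip(samples, samples[1:])]
    mean_delta = sum(deltas) / len(deltas)
    return samples[-1] + mean_delta * steps_ahead


def describe_trend(speeds):
    """Abstract a list of speed readings into a general concept."""
    projected = project_trend(speeds)
    if projected < speeds[-1]:
        return "slowing down"
    if projected > speeds[-1]:
        return "speeding up"
    return "steady"


speeds = [10.0, 8.0, 6.0]
next_speed = project_trend(speeds)   # continues the downward trend
label = describe_trend(speeds)
```

The second function is the abstraction step: it reduces the numbers to the general concept, independent of the particular object being watched.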
Sounds:
Once a
sound has become familiar, we become comfortable with it even if
we don't know what it is (unless it's something we deem harmful
or ominous).
With
sounds, as with everything else, we abstract larger and larger
patterns.
Recognizing
timbre and unique voiceprint might be at the lowest level above
speech recognition itself.
Recognizing
accents and speech styles (e.g., whiny, bubbly,
staccato) might be the next level up.
Recognizing
someone's pet phrases and expressions entails a high level of
verbal analysis.
The highest
levels of speech presuppose a general knowledge of the world.
Higher-Level Reasoning:
How do we
go about solving problems and inventing solutions?
For
example, how does the robot grasp the idea of using a concave
shape to hold water? From prior experience, it already knows the concept of
gravity and that water will fall. It also knows that as
long as an object is supported by something, it won't fall. The
robot can see that the water in a glass of water isn't falling.
But how does the robot's little mind generalize to the idea that
water must be cupped to keep it from falling down?
Idea: The robot
might pick up the glass of water and move it around. Then since
other glasses are interchangeable with the given glass, and since
other things that are shaped like a glass may be included in the
generic classification called "glasses", it might be
that the robot would expect that water could be held up by
anything that is classified as a glass. However, this doesn't
really account for the mentation which says "I've got a
problem. How do I solve it?", and then proceeds to invent a
solution.
We would
like something more than a trial-and-error discovery that
cup-shaped things hold water. We would like the realization that
liquids must be held in containers, and then the insight that
says, "Hey! If I use a cup-shaped container, it ought to
hold water!"
The robot
is building a world model.
Purpose
enters in here. The idea of trying to create a tool
Concavity
is not a very obvious common property. But what's really in order
is observing the property and behavior of water and then
The robot
might play with the water. It might tip the water in the course
of examining it and might observe that the water fell down. Then
through repeated trials, it might observe that the water spilled
out and fell down when it was tilted just beyond the edge of the
container. It might shake the container and cause the water to be
spilled out of it. It might (and here's where we get into
invention) pour the water into another container and observe
that it was no longer in the original container. (One of the
lessons it would have to learn would be that after the water
spilled out of the first container, it was no longer there.)
Before we
deal with invention, we must learn verbs, adjectives, and
adverbs.
How would
the robot learn its colors?
We would
show it many different red objects while saying the word
"red". The robot would have to determine that what all
of the objects had in common was "redness".
The robot
could be trained by guiding it in pointing to red objects and
then letting it find and point to red objects on its own.
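One simple way this training could work is sketched below in Python: average the RGB values of the objects shown with each word, then name a new color by its nearest average. The nearest-prototype method and the names `learn_prototypes` and `name_color` are assumptions made for illustration, and the sample RGB values are invented.

```python
def learn_prototypes(examples):
    """Average the RGB values shown with each spoken color word."""
    sums = {}
    for word, (r, g, b) in examples:
        total = sums.setdefault(word, [0, 0, 0, 0])  # r, g, b sums and count
        total[0] += r
        total[1] += g
        total[2] += b
        total[3] += 1
    return {w: (t[0] / t[3], t[1] / t[3], t[2] / t[3]) for w, t in sums.items()}


def name_color(rgb, prototypes):
    """Return the color word whose prototype is nearest in RGB space."""
    def squared_distance(word):
        return sum((a - b) ** 2 for a, b in zip(rgb, prototypes[word]))
    return min(prototypes, key=squared_distance)


# Guided phase: objects shown while the word is spoken (made-up values).
examples = [
    ("red", (250, 10, 10)), ("red", (200, 30, 20)), ("red", (230, 0, 40)),
    ("green", (10, 200, 30)), ("green", (40, 240, 20)),
]
prototypes = learn_prototypes(examples)

# Unguided phase: the robot names a color it has not seen before.
guess = name_color((210, 20, 30), prototypes)
```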
We wouldn't
want to cross-correlate all red objects with each other. This
means that there must be an attribute of redness which exists
independently of any given object. Otherwise, we would have to
cross-correlate "redness" among all the red objects.
(In a way, we'll be doing that, in the sense that we'll have
pointers from every red-colored object or feature to a
"red" attribute stored only once for each remembered
shade of red. To a certain extent, there may be pointers from the
"red" attribute back to the red objects.) It follows
that there will be entities other than unique objects and unique
events in the database. Generic objects and generic events may
also be stored like these attributes, with two-way pointers back
to unique objects and unique events. Here, we may want to allow
pointers back to all the objects and events themselves. After
all, this would only double the number of required pointers. The
pointers will have weights attached to them that will designate
the strength of the association and that will gradually be
reduced over time. We might want to use four bytes for the
pointers to allow up to 4,294,967,296 table entries. (Three bytes
would give us 16,777,216 entries in each table or file and would
probably be sufficient.) "Red" might include a very
approximate range of RGB values and the word "red" in
text and spoken English. Or one might use pointers to the word
"red" in the OCR file and the sound bit of the word
"red" in the speech recognition file. With each shade
of red, we will need to store the RGB values (or alternatively,
the chrominance values) that define it.
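The pointer scheme might look something like the following Python sketch. The class `AttributeStore`, its method names, the decay factor, and the sample objects are all hypothetical, and the RGB triple for "scarlet" is only an illustrative value. Dictionaries stand in for the fixed-width pointers, and `max_entries` checks the table-size arithmetic.

```python
from collections import defaultdict


def max_entries(pointer_bytes):
    """Number of table entries addressable with a pointer of this size."""
    return 256 ** pointer_bytes


class AttributeStore:
    def __init__(self):
        self.shades = {}                   # shade name -> RGB, stored once
        self.forward = defaultdict(dict)   # object -> {shade: weight}
        self.backward = defaultdict(set)   # shade -> objects pointing at it

    def define_shade(self, shade, rgb):
        self.shades[shade] = rgb

    def link(self, obj, shade, weight=1.0):
        """Create the two-way pointer pair between an object and a shade."""
        self.forward[obj][shade] = weight
        self.backward[shade].add(obj)

    def decay(self, factor=0.9):
        """Gradually reduce every association weight over time."""
        for links in self.forward.values():
            for shade in links:
                links[shade] *= factor


store = AttributeStore()
store.define_shade("scarlet", (255, 36, 0))   # illustrative RGB value
store.link("fire truck", "scarlet")
store.link("stop sign", "scarlet")
store.decay()
```

Both objects share the single "scarlet" record, so no object-to-object cross-correlation is needed, and the back pointers cost one extra entry per link.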
Colors,
like most other attributes, are human inventions. The color
spectrum is continuous. There is no such thing as the color
"red". "Red" is an arbitrary abstraction
enforced by language. Furthermore, there are various subdivisions
of "red" such as "carmine",
"scarlet", "crimson", "brick-red"
(whatever color that is), and so forth. And this is true in
general, from colors through numbers to events. (Identifying
colors will be somewhat facilitated by the human propensity to print
the primary colors rather than borderline colors, which can be
handled with appellations such as "yellow-green".)
There will be a hierarchy of colors.
Identifying
objects by an attribute such as color is tantamount to functional
inversion. Given a function, find its inverse.
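A small Python sketch of the inversion, assuming the "function" is a lookup from object to color. Since many objects share a color, the inverse returns a list of objects rather than a single one. The name `invert` and the sample objects are hypothetical.

```python
from collections import defaultdict


def invert(object_to_color):
    """Build the inverse relation: color -> sorted list of objects."""
    color_to_objects = defaultdict(list)
    for obj, color in object_to_color.items():
        color_to_objects[color].append(obj)
    return {color: sorted(objs) for color, objs in color_to_objects.items()}


object_to_color = {"apple": "red", "fire truck": "red", "leaf": "green"}
color_to_objects = invert(object_to_color)
red_things = color_to_objects["red"]
```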
Brief
generic animation tracks, like opening a drawer, pouring water,
and all the other 1,001 common micro-moves we make each day,
would be stored like attributes in a generic file.