Artificial Intelligence

Uncanny AI: Artificial Intelligence in The Uncanny Valley
By David Hayward
I ring the bell and Trip answers the door.
"How's it going asshole?", I ask, and instantly his face falls. My mouth opens and I feel a quick
spike of guilt before telling myself he's not real. In silence, with a heartbreakingly sad look, Trip
slowly shuts the door in my face. Well that was new. I reload.
I expected the AI to break, not look like a kicked puppy at the right moment. On each run
through Façade it pushes back at me a little, understanding a fraction of what I'm doing
and saying. It does feel broken, but within a small and repetitive setting it keeps on creating
microscopic and novel bits of emotional engagement. All of those fall to experience eventually:
my unconscious gradually catches up with my conscious knowledge, learning that Trip and
Grace are things, not people.
The ability of things to fool us is often a question of resolution. Something that can fool you at a
glance will not stand against a close look or a prolonged gaze. Spend long enough watching a
magician doing the same trick, or see it from the right vantage, and eventually you'll unravel it.
This goes for CGI depictions of people, such as photo-realistic vector art or 3D models. What
looks incredibly realistic at a distance may not under closer scrutiny. When we’re accustomed to
or expectant of it, a lack of detail can be stylistic to the point of being painterly, but when
unexpected it can pitch a nearly photographic representation headlong into the uncanny valley.
The valley has enjoyed widespread discussion in relation to the appearance of CGI humans over
the past few years, and while thinking about it recently, the very worst moments of my social life
flashed in front of me interlocked with thoughts of some of the best game AI. I now think that
the uncanny valley applies to behavior too.
There's a small minority of people who are consistently strange in particular ways. You've
probably met a few of them. Human though they are, interaction with them doesn't follow the
usual dance of eye contact, facial expressions, intonations, gestures, conversational beats, and so
forth. For most, it can be disconcerting to interact with such people. Often, it's not their fault, but
even so the most extreme of them can seem spooky, and are sometimes half jokingly referred to
as monstrous or robotic.
I don't mean to pick on them as a group; nearly all of us dip into such behavior sometimes,
perhaps when we're upset, out of sorts, or drunk. Relative and variable as our social skills are, AI
is nowhere near such a sophisticated level of interactive ability. It is, however, robotic,
monstrous, and sometimes unintentionally comedic; the intersection of broken AI and spooky
people is coming.
The problem is compounded by the fact that there's no way to abstract behavior or make it
"cute". Cuteness is visual, so by rendering it as a cartoon even the repellent appearance of an
ichor-dripping elder god can be offset. In a similar way, by its visual characteristics a Tickle Me
Elmo doll pushes a lot of our "cute" buttons. However, when it's set on fire and continues to
giggle, kick its feet, and shout "Stop! Stop! It tickles!" while it burns into a puddle of fuming
goo, it seems horrific, profane and hilarious by turns.
That’s programmed behavior pushed out of context, and the highly specialized fragments of AI
currently integrated into video games easily break in the same way when they stray from their
intended stages.
Strange or sick behavior can't be abstracted into a cuter, more appealing version of itself unless
it's made burlesque, naive, or consequence-free, and of course this would have drastic narrative
effects. While a story can be told through any number of sensory aesthetics, behavior itself
works through time, its meaning often independent of representation. That's extremely important
for interactive media.
There are lots of things across all media that can already fool us. The crucial question, though, is
how well do they do it? Distance and brevity obscure all manner of flaws, but at some point in a
game, the player can always get closer or look for longer.
This applies to absolutely every aspect of simulation, but the aspects centered on other humans
are critical. We're a very social species, and as a result large amounts of our cognitive resources
are thrown into the assessment of other human beings. For instance, we show extraordinary
specialization in recognizing, processing and categorizing the faces of other humans. We're
acutely aware of whether or not other people are looking at us. We spend every second of
interaction inferring the emotional state, values, and likely actions of others.
Of all the sensory data we take in, information about other people is among the most relevant to
our existence, so of course we have some highly specialized capacities to deal with it. Speech, movement, body
language, behavior, and consistency of actions are all things we're well accustomed to.
That means people are much more difficult to simulate than rocks and trees, not just because of
relative complexity, but because we're more wired to scrutinize our fellow humans. In film and
real-time rendering alike, the plastic sheen of '90s CGI has given way to environments my
unconscious mind doesn't balk at and just accepts even if not quite photoreal, but simulated
people continue to pop out of them as fake.
Whether or not something is "realistic" is largely a red herring. The more important test is
whether or not it's convincing, and I suspect behavior will prove to be a much bigger challenge
than appearance.
Simulated appearance can be constructed from various elements that we are presently mastering.
Behavior is a complex, dynamic, context-sensitive system that, in addition to dealing with
immediate situations, can also operate informed by elaborate historical contexts and long-term
aims. Where actions and physicality are based on syntax, the behaviors underlying the vast scope
of human actions, along with the limited repertoires imparted to AI, are often about meaning and
have a rich undercurrent of semantic relations.
Real human behavior, for the most part, seamlessly elicits my empathy, and also tells me that, in
turn, others understand and empathize with me. It also tends to demonstrate consistency, and at
some point can generally be expected to explain any inconsistencies.
At best, such dynamics exist in a fragmented fashion, if at all, in game AI, which generally
follows a very predictable cycle no matter how good it is: When it's new it may surprise me a
few times with various tricks, and will tend to elicit empathy too, but every time a human-seeming
art asset or piece of behavior is instanced or recurs, my empathy diminishes. This
continues until eventually I can let my id go to town on NPCs without feeling bad. The greater
the degree to which AI repeats itself, the more likely this result is.
Beyond patchy AI, the emotional engagement of a game is in the motivation I have to achieve
goals, which are nothing but syntax. Games can and do rise above this. At present, there seem to
be two ways in which they can use NPC behavior to drive emotionally engaging narrative and
social interaction.
The first is traditional, non-interactive storytelling. By putting a game on rails or inserting huge
cutscenes, developers can of course draw on a lot of traditional media techniques.
The second way is to use convincing fragments of interaction. This is more adaptive, but as yet
not sustainable through time. For example, in F.E.A.R., at one point when I did particularly well
at taking down a group of soldiers, the last one exclaimed "No fuckin' way!" just before I
dispatched him. Though it was of course pre-recorded voice acting, the triggering of it was very
well timed and created a brilliant moment, raising the game above the syntax of combat. In that
instant the soldier was a character, not an entity.
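Mechanically, a moment like that can come from a context-triggered bark system: pre-recorded lines gated behind conditions on the current combat state, and rationed by a cooldown so they stay rare enough to feel earned. Below is a minimal Python sketch of that general technique; the class, the context keys, the thresholds, and the asset path are all hypothetical, not F.E.A.R.'s actual dialogue code.

    import time

    class BarkSystem:
        """Plays at most one pre-recorded line when a context rule fires."""

        def __init__(self, cooldown=10.0):
            self.rules = []            # (condition, voice_line) pairs, checked in order
            self.cooldown = cooldown   # seconds between barks, so lines stay rare
            self.last_bark = -cooldown

        def add_rule(self, condition, voice_line):
            self.rules.append((condition, voice_line))

        def update(self, context):
            now = time.monotonic()
            if now - self.last_bark < self.cooldown:
                return                 # too soon since the last line
            for condition, voice_line in self.rules:
                if condition(context):
                    print(f"PLAY: {voice_line}")   # stand-in for the audio engine
                    self.last_bark = now
                    return

    # The "No fuckin' way!" moment: the last soldier standing reacts when his
    # squadmates have all been dropped within a short window.
    barks = BarkSystem()
    barks.add_rule(
        lambda ctx: ctx["squad_alive"] == 1 and ctx["recent_squad_deaths"] >= 3,
        "vo/no_way.wav",   # hypothetical asset path
    )
    barks.update({"squad_alive": 1, "recent_squad_deaths": 3})

The timing is the whole trick: the same line fired at random would read as noise, but gated to the right context it reads as character.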
Of course, any attempt to extend that into a conversation rather than a fight would, at present,
break rapidly. This is exactly what happened repeatedly in Façade. No matter how many sad
looks Trip shot at me, I'd always catch him doing something inhuman shortly after. Many game
AIs have engaged and convinced me for a moment or two, but ultimately a five-second Turing
test isn't a very high benchmark.
Because of this limitation, I automatically assume game AI won't be convincing, and I forgive
any errors it makes, such as running into things, repeating itself, taking unnaturally long pauses
during conversation, and staring at me. Fragmented AI regularly communicates its inhumanity
and punctures immersion.
However, it is becoming increasingly sophisticated, and that means that as it engages more of the
parts of our brain used in socialization, it will pass a point where it will stop looking like good AI
and start looking like bad acting or dysfunctional behavior. When interactive entertainment hits
that point, it won't just be something we can laugh at like a B-movie, because it won't be a
passive experience. It's going to be reaching out to us and pushing all the wrong buttons.
There are limited examples of it happening already. In Half-Life 2, programming Alyx to
look at the player while talking to them, creating a sense of eye contact, was a step above the
previous generation of art and AI, but the illusion snapped when she was talking to me on a
descending lift: Her eyes kept slowly rolling upward then flicking back down to me, because the
point she was scripted to look at wasn't updating as fast as my location. If a real human did that
near me, I'd be concerned for their well-being.
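The failure is easy to reproduce in miniature. The toy Python sketch below assumes the scripted look-at point was stored in world space and refreshed less often than the lift moved; every number and name in it is an illustrative guess, not Valve's actual implementation. Between refreshes the stale target climbs relative to the descending player, then snaps back down, which is exactly the slow-roll-and-flick her eyes showed.

    REFRESH_INTERVAL = 1.0   # seconds between updates of the scripted look-at point
    LIFT_SPEED = 2.0         # descent speed of the lift, units per second
    DT = 0.25                # simulation time step, seconds

    player_y = 100.0
    target_y = player_y      # stale world-space point the NPC's eyes track
    since_refresh = 0.0

    for step in range(1, 13):
        player_y -= LIFT_SPEED * DT    # the lift carries the player downward
        since_refresh += DT
        if since_refresh >= REFRESH_INTERVAL:
            target_y = player_y        # snap: the eyes flick back down to the player
            since_refresh = 0.0
        # Until the next refresh, the target sits ever higher above the player,
        # so the gaze appears to roll slowly upward.
        print(f"t={step * DT:4.2f}s  gaze offset above player: {target_y - player_y:4.2f}")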
In that game, despite every emotionally convincing moment delivered by the combination of
storytelling, AI, and art assets, it took only that one error to unhook a great big wedge of my
empathy and make me laugh.
The closer a representation of a human is to reality, the slighter the flaws that can suddenly deanimate it. AI systems are rather fragile right now, whereas organic intelligence is decidedly
robust, being able to operate in and adapt to a multitude of contexts.
We tend to take our own adaptivity for granted because it's such an everyday thing, and it's often
the oddities of humans that make them more interesting and charming. Only certain subsets of
characteristics make socialization more challenging, and even then it can be offensive to define
them as flaws.
Sometimes it's just that people are a little de-socialized, but even so I think an important and
much more formal connection between people and present level game AI can be found in
psychiatry: the autism spectrum.
This spectrum is a psychiatric construct that defines various behavioral symptoms as disorders,
varying in severity. Stated very simplistically, some positions on the spectrum involve enhanced
specialization and lack of social ability, but it should be stressed that this is not a trade-off.
The extraordinarily talented autistic savants sometimes paraded on TV, and brought into public
consciousness especially by the film Rain Man, comprise only a fraction of autistic people. Also,
while it is well known and obvious that they have limited social ability, an incredibly important
component of autism is rarely discussed in popular culture: The ability of autistic people to
understand the subjective viewpoints of others is drastically impaired.
To illustrate, an autistic child is shown a model in which person A puts an object away in front of
person B, then leaves the room. Person B then takes the object and conceals it in a different
location, then person A re-enters. If told that person A wants the object back and asked to show
where he will go first to get it, an autistic child will likely point straight to the hiding place used
by person B.
Asperger’s syndrome is a less severe part of the spectrum, in which people generally show some
form of above normal mental ability coupled with somewhat obsessive interests, and are
somewhat disconnected from, and uncomprehending of, the emotions of those around them. It's
sometimes claimed that Albert Einstein had Asperger’s.
In its high specialization and complete inability to understand people, game AI shows very
similar symptoms to people on the autism spectrum. It doesn't really have a place on the
spectrum itself though, because it breaks out of the far end, being so narrowly active and
empathically blind as to be beyond autistic.
The more comprehensive it gets, though, the less machine-like it seems and the closer it comes to
behaving like a particular subset of unusual human beings. Advanced AI will probably follow a
reverse trajectory down the autism spectrum before it really fools us.
As a result, I suspect that consultation with and evaluation by psychology departments may
become relevant to game AI in the coming years, given that they're the most comprehensive
resource in existence on human behavior.
Psychology generally has a hard time, often being accused of unscientific practice, and
psychiatry is also accused of prejudice in the way it defines certain things as disorders.
Psychology has been through so many upheavals, and has so many schools and movements, that
to even define it as a single thing can seem like a stretch sometimes.
Furthermore, despite over a century of study, much of the human mind and brain remain a black
box to us. We see what is going on from the outside, but have so far had only limited ability to
measure and peer into what's going on internally.
This is where game developers are going to have some significant advantages over
psychologists. If looked at from the point of view of hard data and proven theories, psychology is
very difficult to penetrate.
However, where a scientist must test, measure, revise and prove things, game developers can
simulate and create systems. Looked at in terms of unproven theories, psychology is a
smorgasbord of ifs, maybes, and analytical skills rather than hard facts.
A lot of what we intuitively know about people remains immeasurable because of limitations on
our technology and knowledge. For instance, the positive and negative valence of emotional
states in others is obvious to most people through facial expression, voice intonation, posture,
and so forth, yet none of these constitutes an impossible-to-fake, objectively reliable measure, and
magnetic resonance imaging has not yet reached a fine enough resolution to allow sufficient
neurological observation.
While reliable enough for everyday interaction, the signs we read by second nature are not
absolute. It is our unconscious knowledge of how humans behave that enables us to pick out the
good fakes, and bringing that knowledge to light will take a lot of study and analysis.
Comprehensive knowledge of the mechanics upon which human behavior operates is a tall order,
but luckily, while still a mountain we're yet to scale, a well-informed AI performance is not so
ambitious. By building towards more convincing AI, game developers are not becoming
scientists, merely better magicians.