
Gazing and Frowning as a New Human–Computer
Interaction Technique
VEIKKO SURAKKA, MARKO ILLI, and POIKA ISOKOSKI
University of Tampere, Tampere University Hospital
The present aim was to study a new technique for human–computer interaction. It combined the use of two modalities, voluntary
gaze direction and voluntary facial muscle activation, for object pointing and selection. Fourteen subjects performed a series of
pointing tasks with the new technique and with a mouse. At short distances the mouse was significantly faster than the new
technique. However, there were no statistically significant differences at medium and long distances between the techniques.
Fitts’ law analyses were performed both using only error-free trials and using data that also included error trials (i.e., with
effective target width). In both cases both techniques seemed to follow Fitts’ law, although with the effective target width the
correlation coefficient for the new technique (R = 0.776) was smaller than for the mouse (R = 0.991). The regression slopes suggested that at very long
distances (i.e., beyond 800 pixels) the new technique might be faster than the mouse. The new technique showed promising
results already after short practice, and in the future it could be especially useful for physically challenged persons.
Categories and Subject Descriptors: H.5.2 [Information Interfaces and Presentation]: User Interfaces—Evaluation/
methodology, input devices and strategies, interaction styles
General Terms: Experimentation, Human Factors, Performance
Additional Key Words and Phrases: Gaze direction, electromyography, facial muscle activity
1. INTRODUCTION
The use of the hands has been and still is the dominant way to control computers. Recently there have
been many attempts to develop alternative human–computer interaction (HCI) techniques. A central
motivation for this has been the attempt to make HCI more multimodal, more natural, and intuitive.
The ultimate goal of these developments is that HCI would be more like human–human interaction.
Another motive for developing alternative user interfaces is improving the possibilities of functionally
challenged persons to use information technology. One line of investigations in the search for alternative
HCI techniques has been the use of eye movements and gaze direction in a computer user interface.
Earlier studies on eye movement tracking have explored human perceptual and cognitive processes
(see a review by Rayner [1998]). Recently, it has been realized that the monitoring of eye movements
with modern eye trackers could be used also for controlling computers.
This research was supported by the Academy of Finland (project 177857), Tampere Graduate School in Information Science and
Engineering, and the University of Tampere Foundation.
Authors’ addresses: V. Surakka, Research Group for Emotions, Sociality, and Computing, Tampere Unit for Computer–
Human Interaction, Department of Computer and Information Sciences, University of Tampere, Finland, FIN-33014 and
Department of Clinical Neurophysiology, Tampere University Hospital, P.O. Box 2000, FIN-33521 Tampere, Finland; email:
[email protected]; M. Illi, P. Isokoski, Research Group for Emotions, Sociality, and Computing, Tampere Unit for Computer–
Human Interaction, Department of Computer and Information Sciences, University of Tampere, Finland FIN-33014.
Interaction techniques that are based on the use of eye movements (i.e., gaze-based techniques) have
been considered promising alternatives for the hand-based techniques for many reasons. In real life it
is natural for people to look at objects and perform other tasks with the hands at the same time; thus, gaze-based techniques could offer a means to extend the bandwidth of input in HCI [Jacob and Karn 2003].
Eye movements require little conscious effort and in real life people look spontaneously at the objects
of interest. Similarly, in interacting with computers people usually direct their gaze at the objects of
interest and so these techniques utilize the modality that people use naturally in information seeking
[Sibert and Jacob 2000].
Further, in hand-based techniques, subjects first have to find the object of interest by directing their gaze
at it, and only after that can the cursor be manually adjusted over the object. In
gaze-based pointing techniques, the pointer is already on the right object at the moment the subject looks
at it. Clearly an advantage in comparison to hand-based techniques is the fact that eye movements can
be very fast. For example, saccadic eye movements can be as fast as 500 deg/s [Rayner 1998]. Thus, eye
movements are inherently superior in speed when compared to any other modality for pointing at objects
[Sibert and Jacob 2000; Ware and Mikaelian 1987]. In addition to the fact that eye movements can be
faster than the use of hands, a further advantage is that the subject can give input to a computer
without using her/his hands [Jacob and Karn 2003]. This leaves the subject’s hands free for other
purposes. The hands free advantage is especially important for disabled persons who are incapable of
using their hands. In terms of user experiences there are some observations that interfaces utilizing
eye movements can be more enjoyable than traditional user interfaces [Sibert and Jacob 2000; Salvucci
and Anderson 2000; Tanriverdi and Jacob 2000]. There is also evidence that combining the use of gaze
direction with other input techniques (e.g., hand) requires little additional effort and users can learn
to use gaze as an input with other input modalities quickly and naturally [Sibert and Jacob 2000; Zhai
et al. 1999; Zhai 2003].
It would be ideal if a computer could be used and controlled simply by looking at it. However, when
voluntarily directed gaze has been used unimodally as the only method for computer input there have
been difficulties in finding a suitable method for selecting objects. In general, these methods suffer from
the problem that objects become selected every time the subject looks at them. This drawback has been
called the Midas touch problem. There have been several attempts to solve this problem. In the
dwell-time protocol, an object is selected when the user’s gaze has dwelled on the object long enough
[Jacob 1991; Ware and Mikaelian 1987]. The use of on-screen selection buttons and the use of a manual
button press have been alternative ways to avoid the Midas touch problem [Ware and Mikaelian 1987].
All these methods suffer from certain problems. With on-screen selection buttons the problem is that
there cannot be any menu items between the target and the selection button. The use of a manual
hardware button results in the loss of hands free advantage. The use of the dwell-time protocol is slow
due to the extra selection time, and the user still cannot keep her or his gaze on an object too long without
selecting it. It is confusing to the user when her/his gaze sometimes leads to a selection of the object and
at other times it does not. Studies that have compared these methods are rare, but Ware and Mikaelian
[1987] found that a hardware button was faster than the dwell time of 0.4 s for all their pointing task
target sizes. The use of a dwell-time protocol resulted in fewer errors than the use of a manual button
press for selecting objects. Jacob [1991] found the dwell-time approach to be more convenient than the
button press in practice. Short dwell times of 150–250 ms can be applied in some cases, for example, in
undoing wrong selections, but ultimately the length of the optimal dwell time depends on the subject
and the task [Jacob 1991; Stampe and Reingold 1995]. As noted, for example, by Zhai [2003] the eye
is primarily a perceptual organ and because of this it is not well suited for control tasks, such as
activating targets. For this reason a method called manual and gaze input cascaded (MAGIC) pointing
has been developed. It combines the use of gaze-direction tracking and manual mouse control. The gaze
is used for rough estimation of the user’s probable pointing and selection intention after which only
small manual mouse operations are needed for target selections. The advantage of this method is that
the system automatically moves the pointer near the objects of interest without any voluntary gaze
operations. On the other hand, the hands free advantage is lost with this method [Zhai 2003; Zhai et al.
1999]. Thus, development of methods that utilize gaze direction without the use of dwell time and that
still preserve the hands free advantage would have many potential applications. One way of addressing
this problem would be to use the bioelectrical or physiological signals of the human body.
Recently, studies that have monitored human physiological signals as an alternative method and as
an extension in HCI have emerged. Human physiological signals have been used in psychophysiological
research for quite a long time, but the idea of using human biological signals in HCI is more recent
[e.g., Kübler et al. 1999; Laakso et al. 2001; Lusted and Knapp 1996; Partala et al. 2001; Partala and
Surakka 2003; Wolpaw et al. 2002]. Interesting and promising experiments have used the monitoring
of voluntarily produced changes in the electrical activity of the human body. The ultimate way of
connecting the user and a computer is the use of brain–computer interfaces (BCIs) (for a first attempt
see Vidal [1973]). These interfaces are useful for users who are so severely disabled that they have
lost all voluntary muscle control and are locked in their bodies [e.g., Moore and Kennedy 2000]. There
are several challenges in the development of BCIs, but there have also been successful experiments
both with scalp recorded signals and intracortically implanted electrodes. These interfaces have been
used, for example, for tracking the direction of gaze and manipulating two-dimensional movements
in a graphical user interface [Kübler et al. 1999; Moore and Kennedy 2000; Wolpaw et al. 2002]. At
present BCIs can reach, at best, a data transfer rate of 25 bits/min. This means that they are useful to
those with the most severe neuromuscular malfunctions (see a review by Wolpaw et al. [2002]). If both
voluntary control of eye movements and at least one or two muscles are intact then much more efficient
data transfer can be acquired with much lighter hardware and software solutions.
There have been successful experiments for recognizing voluntarily produced changes in electrical
activity of facial muscles with neural networks in order to be used for HCI [Laakso et al. 2002]. A
device called “Cyberlink” can be used to monitor eye movements with electrooculography (EOG), muscle
activity with electromyography (EMG), and brain activity with electroencephalography (EEG) [Doherty
et al. 1999; see also Allanson et al. 1999]. Barreto et al. [2000] developed a method that combined the
use of EMG (from cranial muscles) and EEG (from the occipital lobe) signals for two-dimensional cursor
movements by applying amplitude thresholds and power spectral density estimations. Using relatively
small object sizes, they found that the subjects were able to move the cursor from a start button to a
stop button in an average of 16.3 s. More closely related to the work with eye tracking technology
and eye movement measurement, Tecce et al. [1998] recorded vertical and horizontal EOG signals in
order to track gaze direction and convert it into control of a graphical user interface. By using a sort
of dwell time (the cursor remained in a certain position for a certain time frame) they showed that the
users were able to spell words and sentences with their voluntary eye movements. The speed of this
interface was one character every 2.6 s, which was suggested to be quite favorable in comparison to
BCIs (i.e., one character every 26 s) [Tecce et al. 1998].
The idea of combining voluntarily directed eye movements (i.e., voluntarily controlled fixations and
gaze direction) and voluntarily produced changes in the level of electrical activity of facial muscles as a
new HCI technique has recently been studied and theorized [Partala et al. 2001; Surakka et al. 2003].
There are several reasons for these developments. In the area of HCI it is important to model human
communication in order to create interaction techniques that would be natural and versatile. It is known
that part of human communication relies heavily on nonverbal communication, especially on facial
expressions [e.g., Dimberg 1990; Hietanen et al. 1998; Surakka and Hietanen 1998]. It is also known that
many facial actions and expressions are activated spontaneously in human–human communication, but
they can also be activated voluntarily [Fridlund 1991; Surakka and Hietanen 1998]. Facial actions and
expressions result from muscle contractions caused by electric muscle action potentials [Ekman and
Friesen 1978; Fridlund and Cacioppo 1986]. Because changes in electrical activity of facial muscles can
be generated at will, and they can be registered and monitored in real time with EMG, they offer a
tempting alternative for present HCI techniques.
Another motive for the development of this alternative technique has been the fact that the use
of eye movements as a unimodal user interface suffers from certain problems (e.g., the Midas touch
in selecting objects). Thus, the idea has been to test the suitability of voluntary facial activity to act
as a counterpart for the mouse button press (i.e., object selection) in association with voluntary eye
movements. Furthermore, if facial activity can be used as a part of gaze-based techniques the hands
free advantage would be preserved and the technique would be available and usable also to persons
with physical challenges. The possibilities of functionally impaired persons to use facial activity need,
of course, to be evaluated on a case-by-case basis, but clearly people who have a spinal cord lesion at
a level where they have lost even accurate control of their hands can still benefit from the use of facial
muscle activity. Data from the spinal cord injury program at the Washington University School of Medicine show that, for
example in the USA “adolescents and young adults (aged 15–24) are at highest risk of spinal injuries,
which result in lifelong needs for special services.” In the USA about 50% of all spinal cord injuries are
classified as quadriplegia. About 36% of all quadriplegics are classified as complete and about 64%
as incomplete (source: http://www.neuro.wustl.edu/sci/scifyiFS.htm, retrieved December 17, 2003). It is
noteworthy that at least some people who are classified as complete quadriplegic still may have full
facial activity control, but the control of hands can be totally lost or only partially preserved (e.g., one
of the authors of the study at hand). In sum, many people suffering from even severe spinal cord injury
could benefit from hands-free user interfaces.
In a pilot study, voluntarily directed gaze was used for object pointing, and voluntary activation of
the facial muscle corrugator supercilii (the muscle activated in frowning) was used as a counterpart for the
mouse button press. This muscle site was chosen because it is located near the eyes. Proximity to the
eyes was considered to make the integrative use of gaze and muscle activity easier. A second reason
for choosing this muscle site was that frowning activity (as well as gaze direction) can be related to
the changes in cognitive activity. There is some evidence that increased frowning activity is associated
with higher cognitive load [van Boxtel and Jessurum 1993; Hietanen et al. 1998]. This suggests that
in addition to frowning activity reflecting negative emotional response, it may be naturally related
to tasks that require changes in attention. Because gaze direction reflects changes in visual attention,
combining it with frowning activity might be a promising combination for choosing and selecting objects.
In the pilot study the data from the eye tracker and facial activity measured with EMG were combined
and analyzed offline. Pointing task time analyses from three pointing and selecting distances (i.e., 50,
100, and 150 pixels) suggested that the new technique was significantly faster to use than the mouse at
medium and long distances. In the mouse condition the task times increased significantly as the target
distance increased. In the new technique condition the effect of distance was not significant and the
task times were similar at all target distances. Subjective ratings indicated that the users liked the
mouse more than the new technique [Partala et al. 2001].
In order to evaluate new interaction techniques they need to be tested against some other well-known
interaction technique. It is common to compare gaze-based techniques to the mouse. A common method has
been to compare the mean pointing task times (i.e., pointing and selecting) and pointing errors of the
techniques. The results have suggested that gaze-based techniques can be equally fast or even faster to
use than the mouse [e.g., Partala et al. 2001; Sibert and Jacob 2000; Ware and Mikaelian 1987]. Gaze-based techniques have often been less accurate than the mouse [e.g., Miniotas 2000; Partala et al. 2001;
Sibert and Jacob 2000]. Significant inaccuracies result from the deficiencies in eye tracking technology.
First, head movement compensation that often involves a moving camera tends to produce inaccuracies
in the calculations. Second, typically eye trackers do not track dynamical asymmetries of the pupil as
it dilates after calibration. If these problems could be fixed, gaze direction could be calculated more
accurately and might result in better pointing accuracy.
Another method that can be used for evaluating pointing devices is to use experiments based on
Fitts’ law [Fitts 1954]. Fitts’ law establishes a relationship between pointing time and the difficulty
of the pointing task. The task difficulty is quantified by the index of difficulty (ID) such that ID =
log2 (A/W + 1), where A is the moved distance and W is the width of the target area [Fitts 1954; Gillan
et al. 1990; MacKenzie 1995]. In other words, the formula states that selecting targets that are narrower
and farther away is more difficult than selecting targets that are close and wide. The ID is in a linear
relationship to pointing time and can thus be described by a linear regression equation of the form MT =
a + b ID, where MT is the movement time, and a and b are the regression coefficients.
Originally, Fitts tested his equation using data from tasks where objects were manipulated with hands
directly in a reciprocating pattern. Later, Fitts and Peterson [1964] established that the relationships
held for discrete pointing acts. Subsequent research has shown that Fitts’ law is applicable to pointing
with the mouse and other computer pointing devices [e.g., Card et al. 1978; Mackenzie 1995]. The
reciprocal of b (i.e., 1/b), which is also known as an index of performance (IP) is often used to compare
two input devices [e.g., Card et al. 1978; Mackenzie 1995; Miniotas 2000]. The device with higher IP is
considered better because a high IP indicates that making the task more difficult has a smaller effect
on MT than with a device with lower IP. IP values should be compared only with knowledge of the
experimental and computational methods used for acquiring them [Douglas et al. 1999; MacKenzie
1992]. It is not totally clear whether all interaction techniques that utilize eye movements follow Fitts’
law. This is because the testing has been infrequent. However, almost every study that has tested Fitts’
law supports the conclusion that Fitts’ law applies to these techniques [Surakka et al. 2003].
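To make the computation concrete, the following is a minimal sketch of such an analysis (ours, not from the original study; Python with NumPy is an assumption): it computes ID for each distance/width condition, fits MT = a + b ID by least squares, and derives IP = 1/b.

```python
import math
import numpy as np

def fit_fitts(conditions):
    """Fit MT = a + b*ID to (distance_mm, width_mm, mean_time_ms) tuples.

    Returns intercept a (ms), slope b (ms/bit), correlation R, and the
    index of performance IP = 1/b (bits/s).
    """
    ids = [math.log2(d / w + 1) for d, w, _ in conditions]  # ID = log2(A/W + 1)
    times = [mt for _, _, mt in conditions]
    b, a = np.polyfit(ids, times, 1)       # least-squares slope and intercept
    r = np.corrcoef(ids, times)[0, 1]      # correlation coefficient R
    ip = 1000.0 / b                        # b is in ms/bit, so IP = 1000/b bits/s
    return a, b, r, ip
```

Applied to mean task times per condition, a fit of this form yields equations like those reported in Section 3.3 (e.g., MT = 180 + 198 ID for the mouse, giving IP ≈ 5.1 bits/s).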
Speed-accuracy trade-off is a central notion in pointing tasks. One can perform the task very rapidly
if there is no need for accuracy. On the other hand, unlimited time offers the opportunity to proceed very
accurately. The basic Fitts’ experimental setup does not address this issue, but all users are assumed to
perform under the same speed-accuracy condition. This condition is assumed to be adequately described
in the task instructions. This assumption may or may not reflect the reality. It is conceivable that some
participants understand the instructions differently from each other and choose a different approach to
the speed-accuracy trade-off. To counter this problem, methods for normalizing the error rate in the data
analysis phase have been developed. The preferred approach has been to use effective target width instead of the presented target width in the Fitts’ law calculations [MacKenzie 1995]. The effective target
width is computed by first finding the standard deviation of the pointing coordinates and then multiplying it by a factor of 4.133. If the pointing coordinates are normally distributed, this procedure normalizes
the error rate to 4%. Effective target width We is then used instead of the presented target width W
in the ID computation. In other words, the Fitts’ law equation becomes MT = a + b log2 (A/We + 1)
[MacKenzie 1995].
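As an illustration of this normalization (a sketch under the definitions above; the endpoints are assumed to be recorded as deviations from the target center):

```python
import math
import numpy as np

def effective_width(endpoint_deviations_mm):
    """We = 4.133 * SD of the selection endpoints around the target center."""
    return 4.133 * np.std(endpoint_deviations_mm)

def effective_id(distance_mm, endpoint_deviations_mm):
    """IDe = log2(A/We + 1): task difficulty with the error rate normalized to 4%."""
    return math.log2(distance_mm / effective_width(endpoint_deviations_mm) + 1)
```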
Using the effective target width in Fitts’ law calculations for comparing two pointing devices has
been the preferred way. However, sometimes it is unclear which method is better. In our specific case of
comparing the mouse and a new eye-tracker-based pointing technique, difficulty arose from the fact that
we were trying to answer two different questions. First, we wanted to find out how the new technique
would perform on a tracker that would be more accurate than the one we used. Second, we wanted to
do a fair comparison against the mouse with the technology that was available. When the tracker fails,
pointing time increases and the effective target width grows. Consequently, the new technique looks
bad in comparison. Excluding these erroneous tasks allows us to observe the new technique under ideal
conditions, but makes the comparison against the mouse unfair because only part of the data with the
new technique is being used. Because of this we decided to do the Fitts’ law analysis both with and
without error rate normalization to answer both of our questions.
The preliminary findings with combining the two modalities for pointing and selecting targets were
promising. However, the method was not fully functional and data was combined offline [Partala et al.
2001]. For these reasons we wanted to evaluate this technique further with a fully functional system.
Thus, the present aim was to explore combined voluntary gaze direction and voluntarily generated
electrical facial muscle activity with a method that enabled real-time object pointing in a graphical
user interface. This was done by using computer software that combined both the eye tracker and EMG
data in real time. Fourteen subjects performed pointing tasks with both the new technique and with
the mouse. Three pointing distances and target widths were used to measure response times from a
home square to the target. Fitts’ law applicability to the movement times with both techniques was
investigated. Users’ subjective evaluations were compared between techniques.
2. METHODS

2.1 Subjects
Fourteen right-handed users (five females and nine males) participated in the experiment. Their mean
age was 24 (range 19–28). All the subjects were familiar with the mouse, but unfamiliar with the new
technique. All subjects had normal or corrected-to-normal vision. However, if vision was corrected,
only subjects with contact lenses were included, because eyeglasses degrade the accuracy of eye tracking.
2.2 Apparatus
A regular PC mouse (Logitech M-M35) with medium cursor speed was used. The display was a 15″
Nokia 500 Xa LCD monitor in 1024 × 768 resolution mode. The viewing distance was 80 cm. In the
new technique condition, an Applied Science Laboratories Model 4000 corneal reflection eye tracker
was used to measure the gaze coordinates from the user’s right eye at a sampling rate of 50 Hz. The
smoothing rate of the system was 4 (range 1–25). A Grass® Model 15™ differential amplifier was used
for recording the electrical activity from above the user’s corrugator supercilii muscle site (see Figure 1).
The sampling rate of the system was 1,000 Hz. Disposable surface electrodes were used for bipolar
EMG recording. The electrodes were placed on the region of corrugator supercilii, on the left side of the
face according to the guidelines by Fridlund and Cacioppo [1986]. Before electrode attachment the skin
was cleaned with ethanol and slightly abraded with electrode paste and cotton sticks. The interelectrode
impedances were <10 kΩ. Gaze data from the eye-tracker computer was sent through the serial port
and facial muscle activity data from EMG recorder was sent through an isolated 10 Mbps Ethernet
segment to a Pentium III 500 MHz PC computer. The experimental software ran under the Windows 98
operating system and used both data streams in real time. The software rectified and low-pass filtered the
EMG signals and was set up to produce an object selection event (hereafter called a click) whenever the
signal crossed a certain threshold. If the subject had difficulties producing clicks or if she/he felt that
clicks happened unintentionally the threshold level could be adjusted. With this method, we wanted
to ensure that any problems associated with involuntary facial muscle activity were minimized. All in
all, the object selection system was easily responsive but required voluntary activation of facial muscle
corrugator supercilii.
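The following sketch illustrates such a click-detection pipeline. It is not the authors' implementation: only the sampling rate (1,000 Hz) and the rectify/low-pass/threshold structure come from the description above; the filter order, cutoff frequency, and threshold value are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, lfilter

FS = 1000.0  # EMG sampling rate (Hz), as in the setup described above

def detect_clicks(raw_emg, threshold, cutoff_hz=10.0):
    """Return sample indices where the smoothed EMG envelope crosses threshold."""
    rectified = np.abs(raw_emg)                  # full-wave rectification
    b, a = butter(2, cutoff_hz / (FS / 2.0))     # 2nd-order low-pass filter
    envelope = lfilter(b, a, rectified)
    above = envelope > threshold
    # Fire only on rising edges, so one sustained frown yields one click.
    return np.flatnonzero(above[1:] & ~above[:-1]) + 1
```

The per-subject threshold adjustment described above would then amount to tuning `threshold` until clicks are produced reliably but never spontaneously.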
2.3 Experimental Tasks
The experiment was a within-subject 2 × 3 × 3 factorial design with two pointing techniques (mouse
versus the new technique), three pointing distances (60, 120, and 180 mm), and three target widths
(25, 30, and 40 mm). The experiment was counterbalanced and randomized so that six of the subjects
ACM Transactions on Applied Perceptions, Vol. 1, No. 1, July 2004.
46
•
V. Surakka et al.
Fig. 1. A person wearing electrodes attached above the corrugator supercilii muscle site. The upper and lower parts of the figure
represent a face without and with voluntary activity, respectively.
started with the mouse condition and eight started with the new technique condition. The pointing
tasks were similar to those of Douglas and Mithal [1994] with minor differences. In both conditions,
two objects, a home square, and a target circle were presented to the subjects simultaneously. The object
became highlighted when the subject’s gaze or the mouse cursor was inside it. Circles were chosen as
targets to avoid complications due to the angle of approach [Accot and Zhai 2003; MacKenzie 1995]. The
width of the home square was kept constant at 30 mm. In order to make the longer pointing
distances possible, the position of the home square was varied so that it always appeared symmetrically
(measured from the center of the screen) in the opposite direction from the target circle. For example,
for a vertical upwards pointing distance of 180 mm the home square appeared 90 mm downwards
from the center of the screen, and the target appeared 90 mm upwards from the center of the screen.
The targets appeared in one of eight different angles (four orthogonal and four diagonal directions)
around the home square. As there were three different distances, three different widths, and eight
different angles, there were in all 72 different trial conditions. Each condition was used twice, resulting in a total of
144 trials. All 144 trials were presented in a randomized order for every subject with both interaction
techniques.
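For concreteness, trial generation along these lines could look as follows (our sketch, not the original software; positions are in millimeters relative to the screen center):

```python
import math
import random

DISTANCES_MM = (60, 120, 180)
WIDTHS_MM = (25, 30, 40)
ANGLES_DEG = range(0, 360, 45)  # four orthogonal and four diagonal directions

def make_trials():
    """Build the 3 x 3 x 8 = 72 conditions, duplicate to 144 trials, shuffle."""
    conditions = []
    for d in DISTANCES_MM:
        for w in WIDTHS_MM:
            for angle in ANGLES_DEG:
                rad = math.radians(angle)
                # Home and target sit d/2 from the center in opposite
                # directions, so the home-to-target distance is exactly d.
                tx, ty = (d / 2) * math.cos(rad), (d / 2) * math.sin(rad)
                conditions.append({"width": w, "distance": d,
                                   "home": (-tx, -ty), "target": (tx, ty)})
    trials = [dict(c) for c in conditions * 2]  # each condition used twice
    random.shuffle(trials)                      # randomized order per subject
    return trials
```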
2.4 Procedure
When the subject arrived in the laboratory, the equipment and the test room were presented to her/him.
The subject was told that the purpose of the experiment was to investigate two interaction techniques.
The subject was informed that the test would last about an hour and two different interaction techniques
would be tested separately. If the subject started with the new technique condition, the use of the new
technique was first explained to her/him in detail. We explained to the subject that in the new technique
condition, voluntary gaze direction was used for object pointing (objects are pointed simply by looking at
them), and voluntary facial muscle activity was used for object selection. Then the electrodes were placed
on the subject’s face. The experimenter showed examples of how to voluntarily activate the corrugator
supercilii muscle. To ensure that the subjects were able to voluntarily activate their facial muscle the
subjects were shown their EMG graph on the screen while producing voluntary activity. A vertical blue
line appeared in the graph to indicate that the click was successful. The use of visual feedback proved
to be a valid method for training. The subjects learned quickly and easily to use voluntary facial muscle
activity as a counterpart to the mouse button press. Next we explained the eye-tracker calibration
procedure and trained the subject in its use. When the subject was able to produce a click without extra
effort and was familiar with the eye-tracker calibration procedure, we explained the task to her/him. If
the subject started the experiment with the mouse technique condition, the use of the new technique
was explained to her/him after finishing this condition.
There was a practice block of 48 trials before the actual experiment, and it proceeded as follows. First, the
home square and the target circle appeared simultaneously. The subject was instructed to point inside
the home square and then click. The home square disappeared after a successful click. Then the subject
was instructed to point inside the target circle and then click. The target circle disappeared after a
successful click. Both clicks were to be made as fast and as accurately as possible. There was a pause of
2 s before the home square and the target circle appeared again. The time between the two clicks was
measured as the task time. The eye tracker was recalibrated during the practice when necessary.
After finishing the practice the subject was asked if she/he had understood the task. There was a
short relaxation period before the actual experiment during which the subject was told that the actual
experiment would be similar to the practice, but it would be longer. When the relaxation period was
over, the eye tracker was calibrated and the experiment was started. The eye tracker was recalibrated
during the experiment when necessary (i.e., 2 times per subject on average).
2.5 Subjective Ratings
Right after finishing with either of the techniques the subjects rated their experiences of the individual
technique. The ratings were given with six nine-point bipolar adjective scales. The scales varied from
negative (the lower end) to positive (the upper end) experience. The six bipolar adjective pairs used
were: bad–good, difficult–easy, slow–fast, inaccurate–accurate, unpleasant–pleasant, and inefficient–
efficient. The adjective pairs were respectively named as general evaluation, difficulty, speed, accuracy,
enjoyableness, and efficiency scale.
2.6 Artifact Rejection and Data Analyses
The data analyses were performed both with error-free data and with data that included trials classified
as errors (i.e., effective target width). For error-free data, trials with pointing time deviating from the
mean by more than two standard deviations were excluded. More detailed data analyses were done to
the error-free data. This was because we wanted to make a clean comparison for the case that both
techniques would function ideally. It is noteworthy that in fact both techniques were error-free in the
sense that eventually all the subjects were able to point and click all the targets successfully with both
techniques. However, analyzing this data would not have produced results for the ideal comparison.
Table I. Mean Pointing Task Times (MT) ± SEMs (ms) and Mean Error Percentages ± SEMs
for Both Techniques at Different Target Widths and Distances

                         New Technique                    Mouse
                     MT (ms)      Error (%)       MT (ms)      Error (%)
Width (mm)
  25                 692 ± 36     26.3 ± 2.3      649 ± 24     4.9 ± 1.2
  30                 677 ± 34     16.7 ± 2.4      625 ± 25     3.0 ± 1.1
  40                 656 ± 33      9.5 ± 1.6      576 ± 26     1.9 ± 0.6
Distance (mm)
  60                 626 ± 36     14.9 ± 1.6      483 ± 24     1.8 ± 0.5
  120                676 ± 33     16.2 ± 1.9      631 ± 24     4.8 ± 1.3
  180                725 ± 36     21.4 ± 2.8      736 ± 29     3.3 ± 1.0
The definition of an error was that if the first click on the target circle was not successful the trial
was classified as an erroneous one, and excluded from task time analyses. This happened in 17.5% of
cases for the new technique trials and in 3.3% of cases for the mouse trials. Also trials with a task time
more than two standard deviations above or below the subject’s mean task time were excluded. In all,
20.7% of the new technique trials and 6.5% of the mouse trials were excluded from the pointing task
time analyses. This data was used for the first Fitts’ law analysis. The second Fitts’ law analysis that
used the error rate normalization included those first clicks that happened within 1.5 target widths
from the center of the target regardless of whether the target was hit or not. This criterion excluded
4.07% of the trials with the new technique and 0.05% of the trials with the mouse. The task times
were averaged over the different pointing directions. Then the mean task times and error percentages
were calculated separately for all width and distance combinations (three widths and three distances).
Statistical analyses were performed for error-free data only. Repeated measures analyses of variance
with F-values based on Greenhouse–Geisser corrected degrees of freedom were used. Pairwise Bonferroni corrected t-tests were used for post hoc tests. For pairwise
comparisons of subjective ratings the Mann-Whitney U test was used.
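A sketch of the exclusion rules above, applied per subject (the field names `first_click_hit`, `time`, and `miss_distance` are hypothetical; `miss_distance` is the first click's distance from the target center, in target widths):

```python
import numpy as np

def error_free_trials(trials):
    """Keep trials whose first click hit the target and whose task time lies
    within two standard deviations of the subject's mean task time."""
    hits = [t for t in trials if t["first_click_hit"]]
    times = np.array([t["time"] for t in hits])
    mean, sd = times.mean(), times.std()
    return [t for t in hits if abs(t["time"] - mean) <= 2 * sd]

def normalization_trials(trials):
    """Keep trials whose first click landed within 1.5 target widths of the
    target center, hit or miss (used for the effective-width analysis)."""
    return [t for t in trials if t["miss_distance"] <= 1.5]
```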
3. RESULTS

3.1 Pointing Task Time Analyses
Mean pointing task times ± standard error of the means (SEMs) using error-free data at different target
widths and distances are presented in Table I. The pointing task times (mean ± SEM) were 675 ± 34 ms
for the new technique and 617 ± 25 ms for the mouse. A 2 × 3 × 3 (interaction technique × distance ×
width) three-way ANOVA showed a significant main effect of distance F (2, 26) = 147.6, p < 0.001,
and a significant main effect of width F (2, 26) = 43.2, p < 0.001. The interaction of the main effects
of the technique and the distance was also significant F (2, 26) = 23.9, p < 0.001. There were no other
significant main or interaction effects.
Post hoc comparisons showed that at short distances the mouse was significantly faster to use than
the new technique t = 4.3, df = 13, p < 0.01. However, there were no significant differences between
the two techniques at medium and long distances. Because of the significant interaction of the main
effects of the technique and the distance, one-way repeated measures ANOVAs were performed for
both interaction techniques separately. ANOVA showed a significant effect of distance for the new
technique F (2, 26) = 15.6, p < 0.001, and for the mouse F (2, 26) = 215.9, p < 0.001. In the new
technique condition, post hoc tests showed that the pointing task times were significantly shorter for
short distances than medium distances t = 3.4, df = 13, p < 0.05, and significantly shorter for medium
distances than long distances t = 2.8, df = 13, p < 0.05. In the mouse condition, post hoc tests showed
that the pointing task times were significantly shorter for short distances than medium distances
t = 13.9, df = 13, p < 0.01, and significantly shorter for medium distances than long distances t = 10.4,
df = 13, p < 0.01. Even though the interaction of the main effects of the technique and the width was
not significant, one-way repeated measures ANOVAs were performed separately for both techniques.
ANOVA showed a significant effect of width for the new technique F (2, 26) = 4.7, p < 0.05, and for the
mouse F (2, 26) = 69.9, p < 0.001. In the new technique condition, post hoc tests showed that the time
between clicks was significantly longer for small than large target widths t = 3.2, df = 13, p < 0.05.
In the mouse condition, post hoc tests showed that the time between clicks was significantly longer for
small than medium target widths t = 4.6, df = 13, p < 0.01, and significantly longer for medium than
large widths t = 10.3, df = 13, p < 0.01.
3.2 Error Percentage Analyses
A 2 × 3 × 3 (interaction technique × distance × width) three-way repeated measures ANOVA showed
a significant main effect of technique F (1, 13) = 41.0, p < 0.001, a significant main effect of width
F (2, 26) = 29.8, p < 0.001 and a significant main effect of distance F (2, 26) = 5.4, p < 0.05. The interaction
of the main effects of the technique and the width F (2, 26) = 23.7, p < 0.001 and the interaction of the
main effects of the technique and the distance F (2, 26) = 5.4, p < 0.05 were also significant (see Table I).
Post hoc tests showed that significantly more errors were made for all 60, 120, and 180 mm distances
when using the new technique as compared to the mouse t = 7.0, df = 13, p < 0.01; t = 4.6, df = 13,
p < 0.01; t = 5.8, df = 13, p < 0.01, respectively. Post hoc tests showed that there were also significantly
more errors for all three 25, 30, and 40 mm target widths in the new technique condition as compared
to the mouse t = 8.5, df = 13, p < 0.01; t = 4.5, df = 13, p < 0.01, t = 4.3, df = 13, p < 0.01, respectively.
Because of the significant interaction of the main effects of technique and distance, one-way repeated
measures ANOVAs were performed separately for both techniques. The ANOVAs showed a significant
effect of distance for the new technique F (2, 26) = 5.5, p < 0.05 and for the mouse F (2, 26) = 4.8, p <
0.05. Post hoc tests showed that in the new technique condition significantly fewer errors were made for
short than long distances t = 2.9, df = 13, p < 0.05. In the mouse condition, post hoc tests showed that
fewer errors were made for short than medium distances t = 3.0, df = 13, p < 0.05. One-way repeated
measures ANOVAs were also performed separately for both techniques to test the effect of width. There
was a significant effect of width for the new technique F (2, 26) = 33.8, p < 0.001 and for the mouse
F (2, 26) = 4.0, p < 0.05. Post hoc tests showed that in the new technique condition there were significantly more errors for small than medium widths t = 4.1, df = 13, p < 0.01, and for medium than large widths
t = 3.3, df = 13, p < 0.05. In the mouse condition, there were no significant differences between widths.
3.3 Fitts’ Law Analyses
The mean pointing task times were used to analyze how well Fitts’ law applied to both techniques
(Figure 2). The linear regression equation for the new technique was MT = 501 + 79 ID, R = 0.988, p <
0.001. For the mouse technique the equation was MT = 180 + 198 ID, R = 0.990, p < 0.001.
Because both lines were clearly ascending and the correlation coefficients were high, Fitts’ law applied
for both techniques. The IP values were calculated through linear regression (IP = 1/b, from MT =
a + b ID). The results showed that for the new technique the IP was 12.7 bits/s and for the mouse it
was 5.1 bits/s. Figure 2 shows that there is a clear difference between the new technique and the mouse
in the points where they intersect the y-axis. The y-intercept of the mouse is clearly at a lower level (i.e.,
about 180 ms) than the y-intercept of the new technique (i.e., about 500 ms). For this reason, the new
technique is less efficient than the mouse at short distances. As the distances get longer the difference
seems to balance and eventually the new technique may outperform the mouse at longer distances (i.e.,
beyond 800 pixels). The most important reason for the new technique being slower than the mouse was
that the subject had to wait to make sure that the eye tracker measured the point of gaze correctly
before frowning.

Fig. 2. Fitts’ law regression lines for both techniques.
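A quick check of where the two fitted lines cross (our arithmetic, using the coefficients reported above):

\[
501 + 79\,\mathrm{ID} = 180 + 198\,\mathrm{ID}
\;\Rightarrow\;
\mathrm{ID} = \frac{321}{119} \approx 2.7 \text{ bits},
\quad\text{i.e.,}\quad
A = \left(2^{2.7} - 1\right) W \approx 5.5\,W.
\]

For the target sizes used here, a movement of about 5.5 target widths is of the same order as the 800-pixel estimate quoted above.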
3.4 Fitts’ Law Analysis with Error Rate Normalization
Figure 3 shows the regression slopes with error rate normalization. The linear regression equation
for the mouse was MT = 80.6 + 267 IDe , R = 0.991, p < 0.001. For the new technique it was
MT = 419.7 + 185.3 IDe, R = 0.776, p < 0.01. The IP was now 5.4 bits/s for the new technique and
3.8 bits/s for the mouse.
Figure 3 supports the results of Figure 2. The y-intercepts were somewhat lower for both techniques,
but the difference between the techniques remained about the same (i.e., around 300 ms).

Fig. 3. Fitts’ law regression lines for both techniques with error rate normalization.

Table II. Mean Ranks of Ratings of Both Techniques

                        New Technique    Mouse    Significance
General evaluation          11.8          17.2        n.s.
Difficulty                   9.9          19.1        p < 0.01
Speed                       17.7          11.1        p < 0.05
Accuracy                    11.2          17.8        p < 0.05
Enjoyableness               13.0          16.0        n.s.
Efficiency                  14.6          14.4        n.s.
3.5 Subjective Ratings
Mean ranks of subjective ratings are presented in Table II. Pairwise comparisons (Mann-Whitney U
test) of subjective ratings showed that the new technique was rated as significantly more difficult to
use U = 33.0, p < 0.01, and less accurate U = 52.0, p < 0.05 as compared to the mouse. However,
the new technique was rated as faster to use than the mouse U = 51.0, p < 0.05. There were no other
significant differences in the ratings between the two techniques.
4. DISCUSSION
Our results showed that the new technique worked relatively well in real time. The mouse was significantly faster to use than the new technique at short target distances. However, at medium and long
distances there were no significant differences. There were no significant differences in pointing task
times between the two techniques at any target widths. Because Fitts’ law correlation coefficients were
high and slopes were ascending for both techniques, Fitts’ law applied to both techniques. Although
Fitts’ law IP value was higher for the new technique than for the mouse, the trends suggested that the
new technique may outperform the mouse only at very long distances (i.e., beyond 800 pixels). This
was also supported by the effective target width analysis that suggested that the longer the pointing
distance the smaller the difference between techniques. The findings from rating data supported the
above in showing that the mouse was rated to be more accurate and less difficult to use than the new
technique. However, the new technique was rated as faster than the mouse, which supports the notion
from regression slopes.
Both Fitts’ law regression analyses showed that the regression line for the new technique intersected the y-axis at a higher level than that of the mouse. In both analyses, the difference was about
300 ms. Probably there are several factors contributing to this time lag for the new technique. First, it
is clear that eye tracking technology is not as accurate as mouse technology [e.g., Jacob and Karn 2003;
Ware and Mikaelian 1987]. This means that targeting the pointer was slower with the new technique.
Second, as pointed out by, for example, Zhai [2003] the eye is primarily a sensory organ. Probably, giving
eyes a new responsibility also results in some time delays. Third, as for the eyes, using facial muscle
activity in a new functional context may cause extra time delay for this technique. However, at this
point there are no studies that have compared the speed of clicking with facial muscles versus clicking
a mouse button. Fourth, as an interaction method the new technique was multimodal in contrast to the
unimodal mouse. This may create some delay for the new technique because it requires integration of
motor activities in a new way. Although our subjects learned to use the new technique with little practice, it is reasonable to assume that extra practice might have improved the subjects’ performance for the
new technique. According to Salvucci [1999] user performance can be improved in purely gaze-based
user interfaces with small amounts of extra practice. In sum there are several factors that differentiate
between these two techniques. Many of these factors probably work against the new technique, each of
them causing a bit of extra time delay. On the other hand, this represents a new approach that offers
the promise of hands-free interaction for disabled users or for able-bodied users whose hands might be
encumbered or already otherwise involved in a manual task.
Participants in our experiment had long experience in the use of a mouse but none with the new
technique. This type of comparison may make the new technique look awkward with respect to established techniques. Fitts’ law analysis is one way to try to overcome the difficulties resulting from the
unbalanced usage experience with the different techniques. In the present study, the slopes from the linear regression analyses were clearly ascending for both the new technique and for the mouse. For the
mouse the slope was clearly steeper. We found that Fitts’ law applied for both the mouse and for the new
technique. This finding is in line with the earlier findings of Miniotas [2000] and Ware and Mikaelian
[1987]. The Fitts’ law IP in our study was higher for the new technique than for the mouse indicating
that the new technique is a promising interaction technique in some usage contexts. The reason for the
higher IP value is likely the speed of eye movements [Rayner 1998]. Due to the inherent speed of eye
movements, the gaze becomes more efficient in comparison to the mouse as the pointing distances become
longer.
Pointing errors were defined quite strictly in our study. If the first click on the target circle was not
successful the trial was classified as an erroneous one, and excluded from task time analyses. Also trials
with a task time more than two standard deviations above or below the subject’s mean task time were
excluded. This was done in order to get a clean comparison between the techniques. We note that in
fact both techniques were error-free in that eventually all the subjects were able to point and click all
the targets successfully with both techniques. Statistical analyses of pointing errors showed that more
pointing errors were made with the new technique than with the mouse in all target distances and
widths. Within technique analysis showed, however, that significantly fewer errors were made in the
new technique condition when the target width increased. Our results support the conclusion of Ware
and Mikaelian [1987] that the targets should be large enough to reduce errors in gaze-based selection.
In our experiment, the error percentage was below 10% for the new technique, when the target width
was 40 mm. At small target widths the error percentage was as high as 26.3%. The reason for this high
error percentage was mostly due to the inaccuracy of the eye tracker, which made it difficult to point
at small objects. Another reason, of course, is our strict definition of an error. Of course, in practical
applications these types of problems in pointing at objects are more or less unacceptable. It is likely
that future eye trackers will be more accurate, so that targeting objects with the eyes will be
easier and faster.
When comparing the new technique to other alternative techniques for HCI there are basically two
types of techniques that relate to our new technique: gaze-based methods and electrophysiology-based
methods. In comparison to purely gaze-based methods [e.g., Jacob 1991; Ware and Mikaelian 1987], the
new technique avoids difficulties that are related to the use of dwell time. In comparison to methods
that combine the use of eye tracking with use of hands (such as MAGIC pointing [e.g., Zhai et al. 1999;
Zhai 2003]), the new technique preserves the hands free advantage, which may be especially important
for disabled users. When comparing to methods that use electrophysiological activity extracted from
the brain [e.g., Wolpaw et al. 2002], from facial muscles and the brain [Barreto et al. 2000], or horizontal
and vertical eye movements [Tecce et al. 1998] one needs to be cautious. Only some of them [e.g., Tecce
et al. 1998] are intuitive in the sense that they are directly associated with the functioning of the
sensing organ. Thus, the techniques differ radically from each other and because of this comparisons
may be somewhat misleading. Keeping that caution in mind some obvious comparisons can be made.
Tecce et al. [1998] registered horizontal and vertical eye movements by measuring changes in levels of
direct current (DC). These changes were converted into cursor movements used for selecting letters
from the computer screen. A kind of dwell time was again used for object selection. They found that
their subjects (normal volunteers) were able to point and select one character per 2.6 s. Our technique is
much more efficient (i.e., the longest mean pointing task time, though with error-free data, was 0.73 s),
and it avoids the use of any kind of dwell time altogether. Barreto et al. [2000] measured electrical
activity from facial muscles (e.g., left and right temporalis muscles) that were not directly associated
with gaze direction. These activities were again converted into cursor movements, and the authors measured
how fast their subjects (healthy volunteers) could move the cursor from each corner of the computer
screen to the center of the screen. They found that, on average, moving the cursor from a corner to the
center of the screen took 16.36 s. Finally, the comparison to BCIs shows again that our technique is
faster to operate and learn. BCIs may require weeks of learning and still not all persons may be able
to use them. Another factor is that these interfaces utilize electroencephalographic signals, which are
easily contaminated by eye blinks and other artifacts generated by the normal activity of the human
body. Current BCIs can reach at best a 25 bits/min data transfer rate (see Wolpaw et al. [2002] for a
review on BCIs). Our new technique was not sensitive to artifacts, such as disturbance from other facial
muscles or eye blinks. The system was easy to learn: it took only a few minutes for every subject to
learn it. And finally, the data transfer rate of the new technique can reach a level of 140–750 bits/min
depending on the Fitts’ law variant used in the calculation.
The most obvious users of the new technique would be people who have motor disabilities. Although
the present study did not involve subjects with any kind of motor disabilities we think that the new
technique could be helpful for certain groups with such functional limitations. For people who have
preserved some of their motor functions including eye movement and at least one facial (or other) muscle
control this method could significantly enhance their possibilities for communication. Some persons who
have been classified as complete quadriplegics preserve full facial muscle and eye movement control.
If full facial muscle activity is preserved, the basic idea of the new technique is easily adapted to
much richer means of communication and environmental control. The corrugator supercilii muscle
site was chosen for the present study because it is located near the eyes, and was considered to be
appropriate to be used in conjunction with the gaze direction [see also Surakka et al. 2003]. In addition,
there is evidence that increased frowning activity is naturally related to the tasks that require more
cognitive processing [van Boxtel and Jessurum 1993; Hietanen et al. 1998]. Selecting objects is clearly
a task that requires more cognitive processing when compared to just pointing at objects. Although
there are arguments in favor of using the corrugator supercilii muscle site, the use of zygomaticus major
(the muscle that draws the lip corners upwards, producing a smile), or some other muscle, would also
have been possible. Our current system is relatively easily extendable for combining the use of two or
more muscles to enable a more extensive command system. Thus, many people suffering from even
severe spinal cord injury could benefit from this hands-free user interface.
Central drawbacks of the new technique are that it requires technology that is relatively expensive
(especially the eye tracking technology) and that a person with severe disabilities cannot set it up unaided.
This is of course a drawback that relates to every alternative interface reviewed above. The cost of eye
trackers is coming down all the time. At the time of purchase (i.e., about 5 years ago) the price of the
ASL tracker used in this study was more than 80,000 €. At present there are systems (head mounted)
available at less than 10,000 € that have accuracy comparable to the system used in the present study
(e.g., the system sold by Cambridge Research Systems Ltd.). It would also be impossible for a person
to attach electrodes if she/he cannot use her/his hands. In many cases disabled people have a personal
assistant who can take care of system calibration, electrodes, and so on. It is easy to learn to do these
operations after which the actual user is ready to communicate. One possibility to avoid the use of eye
tracking technology would be the development of gaze-direction monitoring in the way that was done for
example by Tecce et al. [1998], who used changes of DC level in order to monitor horizontal and vertical
eye movements. The other direction would be to implant the electrodes for certain facial muscles that
would be far less invasive than implanting an intracortical recording system. It is likely also that this
system would not be so sensitive to the changes in cortical functions (i.e., due to the cortical plasticity)
after brain injury, than intracortical systems [e.g., Lauer et al. 2000; Moore and Kennedy 2000; Wolpaw
et al. 2002].
In general, future HCI would benefit from using different input and output channels in a more
imaginative manner. Particularly the use of such channels that are used in natural human–human
communication offer promising alternatives for the development of user-friendly interfaces. The present
research is one way of trying to follow that direction. The new technique was implemented to utilize
signals that are constantly active in spontaneous and voluntary human–human social communication
[Surakka et al. 2003; Surakka and Hietanen 1998]. In most cases, gaze direction and the facial expressions
subserved by the underlying facial muscle activity are used frequently and fully automatically
in human interaction. People usually look at the object of interest, and for that reason it is natural to
use gaze for object pointing [Sibert and Jacob 2000]. In this sense the new technique parallels natural
human communication.
In sum, the combination of voluntary gaze direction and voluntary facial muscle activity seems to be a
promising new interaction technique for real-time HCI. The voluntary use of the facial muscle corrugator
supercilii worked well for clicking as a counterpart for the mouse button press. With the new technique
the Midas touch problem and the use of hardware button press were totally avoided. This means that
the user’s hands were left free for other purposes. This also means that the new technique could be a
genuinely useful interaction method for physically challenged persons.
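The core of that argument can be summarized in a few lines of code. The following sketch, which uses hypothetical device stubs rather than any real tracker or EMG API, shows why the Midas touch problem disappears: the pointer follows gaze continuously, but a selection fires only when the rectified corrugator EMG level crosses a threshold, so looking at an object never clicks it by itself. The threshold value and the relax-before-rearm rule are assumptions for illustration.

```python
# A minimal sketch (with hypothetical device stubs) of the combined
# technique's control loop: gaze points, a frown clicks. Dwelling on an
# object never selects it by itself, avoiding the Midas touch problem.

class FakeTracker:
    """Stub eye tracker replaying a fixed gaze path."""
    def __init__(self, path): self.path = iter(path)
    def gaze_point(self): return next(self.path)


class FakeEmg:
    """Stub EMG channel replaying rectified amplitudes in microvolts."""
    def __init__(self, levels): self.levels = iter(levels)
    def rectified_level_uv(self): return next(self.levels)


def run_loop(tracker, emg, steps, threshold_uv=30.0):
    armed = True  # the muscle must relax below threshold between clicks
    for _ in range(steps):
        x, y = tracker.gaze_point()       # pointing: voluntary gaze
        level = emg.rectified_level_uv()  # selection: voluntary frown
        if armed and level > threshold_uv:
            print(f"click at ({x}, {y})")
            armed = False                 # one click per contraction
        elif level < threshold_uv:
            armed = True


# The user looks along three points and frowns on the second one.
run_loop(FakeTracker([(100, 100), (400, 300), (700, 500)]),
         FakeEmg([5.0, 45.0, 6.0]), steps=3)
```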
ACKNOWLEDGMENTS
The authors would like to thank four anonymous reviewers for their comments and suggestions. Mrs.
Katja Karila is especially thanked for her invaluable help in registering EMG.
REFERENCES
ACCOT, J. AND ZHAI, S. 2003. Refining Fitts’ law models for bivariate pointing. In Proceedings of the ACM SIGCHI Conference
on Human Factors in Computing Systems (CHI 2003). ACM Press, New York, 193–200.
ALLANSON, J., RODDEN, T., AND MARIANI, J. 1999. A toolkit for exploring electro-physiological human–computer interaction. In
Proceedings of Human–Computer Interaction—INTERACT ’99. IOS Press, Amsterdam, 231–237.
BARRETO, A. B., SCARGLE, S. D., AND ADJOUADI, M. 2000. A practical EMG-based human–computer interface for users with motor
disabilities. J. Rehabil. Res. Dev. 37, 53–63.
CARD, S. K., ENGLISH, W. K., AND BURR, B. J. 1978. Evaluation of mouse, rate-controlled isometric joystick, step keys, and text
keys for text selection on a CRT. Ergonomics 21, 601–613.
DIMBERG, U. 1990. Facial electromyography and emotional reactions. Psychophysiology 27, 481–494.
DOHERTY, E., BLOOR, C., AND COCKTON, G. 1999. The “cyberlink” brain-body interface as an assistive technology for persons with
traumatic brain injury: Longitudinal results from a group of case studies. CyberPsychology Behav. 2, 249–259.
DOUGLAS, S. A. AND MITHAL, A. K. 1994. The effect of reducing homing time on the speed of a finger-controlled isometric pointing
device. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’94). ACM Press, New
York, 411–416.
DOUGLAS, S. A., KIRKPATRICK, A. E., AND MACKENZIE, I. S. 1999. Testing pointing device performance and user assessment with
the ISO 9241, Part 9 standard. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI
’99). ACM Press, New York, 215–222.
EKMAN, P. AND FRIESEN, W. V. 1978. Facial Action Coding System (FACS): A Technique for the Measurement of Facial Action.
Consulting Psychologists Press, Palo Alto, CA.
FITTS, P. M. 1954. The information capacity of the human motor system in controlling the amplitude of movement. J. Exp.
Psychol. 47, 381–391.
FITTS, P. M. AND PETERSON, J. R. 1964. Information capacity of discrete motor responses. J. Exp. Psychol. 67, 103–112.
FRIDLUND, A. J. 1991. Evolution and facial action in reflex, social motive, and paralanguage. Biol. Psychol. 32, 3–100.
FRIDLUND, A. J. AND CACIOPPO, J. T. 1986. Guidelines for human electromyographic research. Psychophysiology 23, 567–589.
GILLAN, D. J., HOLDEN, K., ADAM, S., RUDISILL, M., AND MAGEE, L. 1990. How does Fitts’ law fit pointing and dragging? In
Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’90). ACM Press, New York,
227–234.
HIETANEN, J. K., SURAKKA, V., AND LINNANKOSKI, I. 1998. Facial electromyographic responses to vocal affect expressions. Psychophysiology 35, 530–536.
JACOB, R. J. K. 1991. The use of eye movements in human computer interaction techniques: What you look at is what you get.
ACM Trans. Inf. Syst. 9, 152–169.
JACOB, R. J. K. AND KARN, K. S. 2003. Eye tracking in human–computer interaction and usability research: Ready to deliver
the promises. In The Mind’s Eyes: Cognitive and Applied Aspects of Oculomotor Research, J. Hyönä, R. Radach, and H. Deubel,
Eds. Elsevier Science, Oxford, 573–605.
KÜBLER, A., KOTCHOUBEY, B., HINTERBERGER, T., GHANAYIM, N., PERELMOUTER, J., SCHAUER, M., FRITSCH, C., TAUB, E., AND BIRBAUMER,
N. 1999. The thought translation device: A neurophysiological approach to communication in total motor paralysis. Exp.
Brain Res. 124, 223–232.
LAAKSO, J., JUHOLA, M., AND SURAKKA, V. 2002. Neural network recognition of electromyographic signals of two facial muscle
sites. In Proceedings of Medical Informatics Europe 2002. 83–87.
LAAKSO, J., JUHOLA, M., SURAKKA, V., AULA, A., AND PARTALA, T. 2001. Neural network and wavelet recognition of facial electromyographic signals. In Proceedings of 10th World Congress on Health and Medical Informatics, V. Patel, R. Rogers, and R. Haux,
Eds. IOS Press, Amsterdam, 489–492.
LAUER, R. T., PECKHAM, H., KILGORE, K. L., AND HEETDERKS, W. J. 2000. Applications of cortical signals to neuroprosthetic control:
A critical review. IEEE Trans. Rehabil. Engng 8, 205–208.
LUSTED, H. S. AND KNAPP, B. 1996. Controlling computers with neural signals. Sci. Am. 275, 82–87.
MACKENZIE, I. S. 1992. Movement time prediction in human–computer interfaces. In Proceedings of Graphics Interface ’92,
Canadian Information Processing Society. 140–150.
MACKENZIE, I. S. 1995. Movement time prediction in human–computer interfaces. In Readings in Human–Computer Interaction (2nd ed.), R. M. Baecker, W. A. S. Buxton, J. Grudin, and S. Greenberg, Eds. Morgan Kaufmann, Los Altos, CA, 483–493.
MACKENZIE, I. S., KAUPPINEN, T., AND SILFVERBERG, M. 2001. Accuracy measures for evaluating computer pointing techniques.
In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI 2001). ACM Press, New York,
9–16.
MINIOTAS, D. 2000. Application of Fitts’ law to eye gaze interaction. In Extended Abstracts of the ACM SIGCHI Conference on
Human Factors in Computing Systems (CHI 2000). ACM Press, New York, 339–340.
MOORE, M. AND KENNEDY, P. R. 2000. Human factors issues in the neural signals direct brain–computer interface. In Proceedings
of the Fourth International ACM Conference on Assistive Technologies. ACM Press, New York, 114–120.
PARTALA, T., AULA, A., AND SURAKKA, V. 2001. Combined voluntary gaze direction and facial muscle activity as a new pointing
technique. In Proceedings of INTERACT 2001. IOS Press, Amsterdam, 100–107.
PARTALA, T. AND SURAKKA, V. 2003. Pupil size variation as an indication of affective processing. Int. J. Human Comput. Stud.
59, 1–2, 185–198.
RAYNER, K. 1998. Eye movements in reading and information processing: 20 years of research. Psychol. Bull. 124, 372–422.
SALVUCCI, D. D. 1999. Inferring intent in eye-based interfaces: Tracing eye movements with process models. In Proceedings of
the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI ’99). ACM Press, New York, 254–261.
SALVUCCI, D. D. AND ANDERSON, J. R. 2000. Intelligent gaze-added interfaces. In Proceedings of the ACM SIGCHI Conference on
Human Factors in Computing Systems (CHI 2000). ACM Press, New York, 273–280.
SIBERT, L. E. AND JACOB, R. J. K. 2000. Evaluation of eye gaze interaction. In Proceedings of the ACM SIGCHI Conference on
Human Factors in Computing Systems (CHI 2000). ACM Press, New York, 281–288.
STAMPE, D. M. AND REINGOLD, E. M. 1995. Selection by looking: A novel computer interface and its application to psychological
research. In Eye Movement Research: Mechanisms, Processes and Applications, J. M. Findlay, R. Walker, and R. W. Kentridge,
Eds. Elsevier Science, Amsterdam, 467–478.
SURAKKA, V. AND HIETANEN, J. K. 1998. Facial and emotional reactions to Duchenne and non-Duchenne smiles. Int. J. Psychophysiol. 29, 23–33.
SURAKKA, V., ILLI, M., AND ISOKOSKI, P. 2003. Voluntary eye movements in human–computer interaction. In The Mind’s Eyes:
Cognitive and Applied Aspects of Oculomotor Research, J. Hyönä, R. Radach, and H. Deubel, Eds. Elsevier Science, Oxford,
473–491.
TANRIVERDI, V. AND JACOB, R. J. K. 2000. Interacting with eye movements in virtual environments. In Proceedings of the ACM
SIGCHI Conference on Human Factors in Computing Systems (CHI 2000). ACM Press, New York, 265–272.
TECCE, J. T., GIPS, J., OLIVIERI, C. P., POK, L. J., AND CONSIGLIO, M. R. 1998. Eye movement control of computer functions. Int. J.
Psychophysiol. 29, 319–325.
VAN BOXTEL, A. AND JESSURUM, M. 1993. Amplitude and bilateral coherency of facial and jaw-elevator EMG activity as an index
of effort during a two-choice serial reaction task. Psychophysiology 30, 1065–1079.
VIDAL, J. J. 1973. Toward direct brain–computer communication. Ann. Rev. Biophys. Engng 2, 157–180.
WARE, C. AND MIKAELIAN, H. H. 1987. An evaluation of an eyetracker as a device for computer input. In Proceedings of the ACM
SIGCHI Conference on Human Factors in Computing Systems (CHI ’87). ACM Press, New York, 183–188.
WOLPAW, J. R., BIRBAUMER, N., MCFARLAND, D. J., PFURTSCHELLER, G., AND VAUGHAN, T. M. 2002. Brain–computer interfaces for
communication and control. Clin. Neurophysiol. 113, 767–791.
ZHAI, S. 2003. What’s in the eyes for attentive input. Commun. ACM 46, 34–39.
ZHAI, S., MORIMOTO, C., AND IHDE, S. 1999. Manual and gaze input cascaded (MAGIC) pointing. In Proceedings of the ACM
SIGCHI Conference on Human Factors in Computing Systems (CHI ’99). ACM Press, New York, 246–253.
Received May 2003; revised December 2003, March 2004; accepted March 2004