Lesson 05

Lecture 5
Overview:
• Last lecture
• Presentations
• Usability evaluation:
  - Techniques for usability testing
  - Heuristic inspection
  - Think-aloud versus heuristic inspection
  - Learning to Find Usability Problems in Internet Time
Last lecture
• Interaction design:
  - Paradigms
  - Principles
• Usability evaluation:
  - Execution
  - Interpretation
Presentations
Each group (approx. 5 minutes):
• Briefly describe the product/system and the test (the test procedure)
• Show excerpts from the test (VHS)
• Assess the usability of the product/system (was the system usable, and what usability problems were there)
• Assess the test method: what was easy and what was hard about planning and conducting the test.
Experiences
Easy:
• Easy inside the control room once the test is running

Hard:
• What to do when a test subject has gotten stuck or is heading in the wrong direction
• Finding a suitable user
• The user does not think aloud, but just says what he/she is doing
• The technology caused a bit of trouble
• Difficult to see the screen on a mobile system (placement of cameras)
• Keeping track of the roles (receiver, logger, operator)
Techniques for usability testing
The techniques can be placed along two dimensions:
• Laboratory ↔ User organization (field)
• User controls ↔ Developer controls

Techniques:
• Think-aloud
• Constructive interaction
• Heuristic inspection
• Cognitive inspection
• Focus group
• Observation
• Usage statistics
• Feedback
• Interview
• Questionnaires

Other dimensions:
• Rigor (planned and controlled process) ↔ Realism
• Qualitative ↔ Quantitative
Heuristic inspection
Laboratory + developer control
• The participants walk through the system using a checklist
• A scenario with relevant tasks can structure the process
• The system is walked through twice:
  1. Focus on the whole and on immediate impressions
  2. Focus on details, such as functions in relation to tasks
• The participants work individually and note problems
• A combined list is drawn up jointly
• The problems may be categorized (critical, serious, cosmetic)
• Proposed fixes are drawn up, prioritized, and handed over to the developers (see the sketch below)

Example of a checklist:
• Simple and natural dialogue
• Speak the user's language
• Minimize the load on the user's memory
• Ensure consistency
• Provide feedback
• Provide clearly marked exits
• Provide shortcuts
• Provide constructive error messages
• Prevent errors
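The merge-and-prioritize steps above can be made concrete with a small sketch. The following Python snippet is an illustration only (the problem texts and the ranking rule are made-up assumptions, not from the slides): it combines the inspectors' individual notes into one list, keeps the most severe rating given to each problem, and ranks by severity and by how many inspectors reported the problem.

```python
from collections import Counter

# Severity categories as on the slide; lower rank = more severe.
SEVERITY_ORDER = {"critical": 0, "serious": 1, "cosmetic": 2}

def merge_findings(per_inspector: list[list[tuple[str, str]]]) -> list[tuple[str, str, int]]:
    """Combine (problem, severity) notes from all inspectors into one
    prioritized list: most severe first, then most frequently reported."""
    counts = Counter()
    severity = {}
    for findings in per_inspector:
        for problem, sev in findings:
            counts[problem] += 1
            # Keep the most severe rating any inspector gave the problem.
            if problem not in severity or SEVERITY_ORDER[sev] < SEVERITY_ORDER[severity[problem]]:
                severity[problem] = sev
    merged = [(p, severity[p], counts[p]) for p in counts]
    merged.sort(key=lambda x: (SEVERITY_ORDER[x[1]], -x[2]))
    return merged

# Made-up example notes from two inspectors:
inspectors = [
    [("No feedback after save", "serious"), ("Inconsistent labels", "cosmetic")],
    [("No feedback after save", "critical")],
]
for problem, sev, n in merge_findings(inspectors):
    print(f"{sev:>8}  x{n}  {problem}")
```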
Exercise (1)
Number of inspections – Molich & Nielsen's results:
1 evaluator: approx. 35% of all problems are found
3-5 evaluators: approx. 70% of all problems are found
This claim is much debated.
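These percentages are consistent with the well-known Nielsen & Landauer problem-discovery model, found(n) = 1 - (1 - λ)^n. The sketch below is an illustration under that assumption (λ = 0.35 is taken from the 35% figure above), not a computation from the slides:

```python
# Nielsen & Landauer-style model: each evaluator independently finds a
# fixed share lam of all problems.
def proportion_found(n_evaluators: int, lam: float = 0.35) -> float:
    """Expected share of all problems found by n independent evaluators."""
    return 1 - (1 - lam) ** n_evaluators

for n in (1, 3, 5):
    print(f"{n} evaluator(s): {proportion_found(n):.0%}")
# Prints 35%, 73%, 88% -- roughly the "35% for 1, ~70% for 3-5" quoted above.
```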
Exercise (from Molich & Nielsen):
Functionality: A service from Manhattan Telephone (MANTEL) to home users. Typical users have little knowledge of data processing. They can dial into the system, which will provide the name and address of a telephone subscriber in the United States, given the telephone number of the subscriber.
Assumptions: For each telephone number there is at most one subscriber. All telephone numbers consist of exactly ten digits (3-digit area code and 7-digit local number). The user's computer has a traditional alphanumeric, monochrome display with 24 lines of 80 characters each and a typewriter-like keyboard with the usual extra keys found on most keyboards, including 10 function keys marked PF1-PF10.
Exercise (2)
Dialogue: Enter by selecting "Computer Telephone Index" from the main MANTEL menu. The system then issues the prompt:
ENTER DESIRED TELEPHONE NO. AND RETURN
If the user enters anything other than exactly 10 digits at this prompt, the system answers:
ILLEGAL NUMBER: TRY AGAIN
If the user enters a telephone number which is not in use, the system answers:
UNKNOWN TELEPHONE NUMBER
If the area code is 212 (Manhattan), the system will normally display the screen shown within 5 seconds. For other area codes, the system must retrieve the necessary information from external databases. This may take up to 30 seconds.
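The validation rules in this dialogue are simple enough to pin down in code. Here is a minimal Python sketch of the described behavior; the DIRECTORY dict and its entry are made-up stand-ins for the Manhattan directory and the external databases:

```python
# Made-up stand-in for the subscriber databases described in the exercise.
DIRECTORY = {
    "2125551234": ("J. Smith", "1 Main St, New York, NY"),
}

def lookup(number: str) -> str:
    # Anything other than exactly 10 digits is rejected.
    if len(number) != 10 or not number.isdigit():
        return "ILLEGAL NUMBER: TRY AGAIN"
    # A well-formed but unused number gets its own message.
    if number not in DIRECTORY:
        return "UNKNOWN TELEPHONE NUMBER"
    name, address = DIRECTORY[number]
    return f"{name}\n{address}"

if __name__ == "__main__":
    # Per the spec: area code 212 answers within ~5 s; others may take up to 30 s.
    number = input("ENTER DESIRED TELEPHONE NO. AND RETURN\n")
    print(lookup(number))
```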
Exercises (3)-(7)
[These slides contain only screen images for the MANTEL exercise]
Think-aloud versus heuristic inspection
www.hotmail.com
• 8 laboratories tested the website:
  - professional firms
  - research groups
  - students
• The test had to cover a number of specified functions
• The conduct of the test itself could be organized freely
• The purpose was to examine the quality of usability tests
• 1 of the laboratories did not take the study seriously
• 6 of the laboratories based their evaluation on tests with users:
  - They found between 17 and 75 problems of various categories
• 1 of the laboratories based its evaluation on a combination of heuristic inspection and tests with users:
  - They found 150 problems
  - The problems are often described with the phrase "might be a problem"
  - 107 of their problems were not found by any of the other laboratories
  - They find 19 of the 26 "core problems", but it is unclear how
Results
Think-aloud versus inspection

                                        Think-aloud   Group        Individual
                                        test          inspection   inspection
Problem types (count)                   159           68           49
Problem categories:
  Category 1 (count)                    19            9            8
  Category 2 (count)                    18            13           9
  Category 3 (count)                    3             1            1
No action (count)                       7             24           29
Problems unique to each method (count)  13            1            0
Total time (hours)                      160           118          94
Time/problem (hours)                    1.0           1.7          1.9
Category problems (SPAs, count)         40            23           18
Time/SPA (hours)                        4.0           5.1          5.2

(Karat, Campbell, and Fiegel, "Comparison of Empirical Testing and Walkthrough Methods in User Interface Evaluation", 1992)
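The derived rows follow directly from the totals; a quick Python check (numbers transcribed from the table above):

```python
# Recompute time per problem and time per SPA from the table's totals.
methods = {
    "Think-aloud test":      (160, 159, 40),  # (hours, problem types, SPAs)
    "Group inspection":      (118, 68, 23),
    "Individual inspection": (94, 49, 18),
}
for name, (hours, problems, spas) in methods.items():
    print(f"{name}: {hours / problems:.1f} h/problem, {hours / spas:.1f} h/SPA")
# Matches the table: 1.0/4.0, 1.7/5.1, and 1.9/5.2 respectively.
```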
Learning to Find Usability Problems in
Internet Time
Mikael B. Skov & Jan Stage
Motivation
• Information technology: available to anyone, anywhere, anytime
• Strength: WWW is a significant move in that direction
• Weakness: Many web sites are designed and implemented in fast-paced projects by multidisciplinary teams
• Teams involve such diverse professions as information architects, Web developers, graphic designers, brand and content strategists, etc.
• Teams are usually not familiar with established knowledge on human-computer interaction
• The strong limitations on price and development time effectively prohibit usability testing in the classical sense, conducted by experienced testers in sophisticated laboratories
• Methods tend to focus on analysis, design, and implementation
• The implied lack of focus on usability issues and practical skills with usability testing is a potential barrier to universal access to information on the Web
Empirical study (1)
Research questions:
• What is the potential for supporting universal access through dissemination of fundamental usability engineering skills?
• Can we teach a simple approach to usability testing to people with an interest in information technology but without formal education in software development or usability engineering, and can we do it in less than a week?

Overall design:
• A course for first-year students at Aalborg University, Denmark. Subject: fundamentals of computerized systems, with particular emphasis on usability issues.
• Ten class meetings, each with two hours of class lecture and two hours of exercises in smaller teams.
• Two primary techniques:
  - Think-aloud protocol (Nielsen 1993)
  - Questionnaires filled in after each task and after the entire test (Spool et al.)
• The exercises after the first four class meetings made the students conduct small usability pilot tests in order to train and practice their practical skills.
• The last six exercises were devoted to conducting a more realistic usability test of a web site: Hotmail.com.
Empirical study (2)
• 36 teams of first-year university students used the simple approach to conduct a usability evaluation of the email services at Hotmail.com.
• The 36 teams consisted of 234 students in total, of which 129 acted as test subjects.
• Educations: architecture and design, informatics, planning and environment, and chartered surveyor.
• All were part of a natural science or engineering program at Aalborg University.
• Each team should apply at least one of the two primary techniques, and could supplement this with other techniques.
• Each team should among themselves choose a test monitor and a number of loggers; the rest of the team acted as test subjects.
• Each team was given a very specific two-page scenario stating that they should conduct a usability test of the Hotmail web site (www.hotmail.com).
• The entire team worked together on the analysis and identification of usability problems and produced the usability report.

                          Average   Min / Max
Team size                 6.5       4 / 8
Number of test subjects   3.6       2 / 5
Age of test subjects      21.2      19 / 30
Data collection and analysis
• The usability reports were the primary source of data for our empirical study
• All reports were analyzed, evaluated, and marked by both authors:
  1. We worked individually and marked each report in terms of 16 different factors
  2. The markings were compared, a new factor was added, and the characteristics of each factor were specified explicitly
  3. We worked individually to re-mark all reports according to the 17 factors
  4. All reports and evaluations were compared and a final evaluation on each factor was negotiated (illustrated in the sketch below)
• The markings were made on a scale of 1 to 5, with 5 being the best

Five of the 17 factors:
1. The planning and conduct of the evaluation
2. The quality of the task assignments
3. The clarity and quality of the problems listed in the report
4. The practical relevance of these problems
5. The number and relevance of the usability problems identified

Comparison:
• Comparison with usability reports produced by eight professional laboratories (Molich 1999)
• The laboratories evaluated the same web site according to the scenario used by the students
• Their reports were analyzed, evaluated, and marked through the same procedure as the student reports
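As an illustration of the compare-and-negotiate steps above, here is a small Python sketch; the factor names and marks are made up for illustration, and this is not the authors' actual analysis code:

```python
# Compare two raters' 1-5 marks per factor and flag factors to negotiate.
def compare_marks(rater_a: dict[str, int], rater_b: dict[str, int]):
    """Return the share of factors with identical marks, plus disagreements."""
    disagreements = {factor: (rater_a[factor], rater_b[factor])
                     for factor in rater_a if rater_a[factor] != rater_b[factor]}
    agreement = 1 - len(disagreements) / len(rater_a)
    return agreement, disagreements

# Made-up marks for three of the factors:
a = {"task quality": 4, "problem clarity": 3, "practical relevance": 2}
b = {"task quality": 4, "problem clarity": 2, "practical relevance": 2}
share, diffs = compare_marks(a, b)
print(f"agreement: {share:.0%}, to negotiate: {diffs}")
# agreement: 67%, to negotiate: {'problem clarity': (3, 2)}
```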
Similar distribution
Tasks:
• The relevance of the tasks, the number of tasks, and the extent to which they cover the areas specified in the scenario
• The student teams cover all five marks of the scale, with an average of 3.3
• The professional laboratories score almost the same, with an average of 3.5
• This is by no means impressive for the professionals: a generally low quality of the tasks

Clarity of the problem list:
• How well each problem is described, explained, and illustrated, and how easy it is to gain an overview of the complete list of problems
• The student teams are distributed mainly around the middle of the scale, with an average of 2.9
• The professional laboratories are distributed from 2 to 5, with an average of 3.5
• Again, not impressive for the professional laboratories.
[Charts: distribution of marks (1-5) for "Tasks" and "Clarity of Problem List", professional vs. student teams, in %]
Different distribution (1)
Test conduct:
• How well the tests were planned, organized, and carried out
• The student teams have an average of 3.7, and the majority score 4, indicating well-conducted tests with a couple of problematic characteristics
• The professional laboratories score an average of 4.6 on this factor, and 6 out of 8 score the top mark.
• This is as should be expected, because experience tends to raise this factor.

Practical relevance of the problem list:
• The student teams are almost evenly distributed across the five marks of the scale, and their average is 3.2
• The professional laboratories score an average of 4.6, and 6 out of 8 laboratories score the top mark
• The reason may be the professionals' experience in expressing problems in a way that makes them relevant to their customers
• The course has focused too little on discussing what constitutes a problem
[Charts: distribution of marks (1-5) for "Test Conduction" and "Practical Relevance of Problem List", professional vs. student teams, in %]
Different distribution (2)
• A key aim in usability testing: to uncover and identify usability problems
• The student teams find 7.9 problems on average. They find between 1 and 19 problems, with half of the teams finding between 6 and 10 problems
• Thus the distribution seems reasonable
• The average for the professional laboratories is 23.0 problems identified
• Only one of the laboratories scores in the same group as a considerable number of student teams, that is, between 11 and 15 problems
• Only one student team identified a number of problems comparable to the professional laboratories
[Charts: histograms of the number of problems found, in %; professional laboratories binned 1-3, 4-7, 8-12, 13-17, >17, and student teams binned 1-5 through 41-45]