Application Domain Knowledge
and Programmers’ Mental
Representations
Teresa M. Shaft
University of Oklahoma
Iris Vessey
Indiana University
Application Domain Knowledge

Application Domain refers to the
context of the problem to be
addressed by the computer
software – in contrast to the
solution or computing domain (Blum
1989)
The Software Development
Process
[Figure: boxes labeled Application Domain, Conceptual Models, Formal Models, and Implementation Domain; adapted from Blum, 1989]
Importance of Application Domain
Knowledge in Software Development


Those who can map deep application
domain knowledge into a computational
solution often serve as project gurus
“The thin spread of application domain
knowledge” was one of the most salient
problems in large-scale development


Curtis, Krasner & Iscoe 1988
Significant effort during software
development is associated with acquiring
needed application domain knowledge

Walz, Elam & Curtis 1993
Application Domain Knowledge



Acknowledged as critical to the ability to
develop, comprehend, and maintain
software (Brooks, 1990, Pennington &
Grabowski 1990, Vans, von Mayrhauser &
Somlo 1999)
Infrequently studied (Glass & Vessey
1998).
Most studies of software comprehension
or maintenance either do not consider the
application domain or use materials the
researchers argue do not require
application domain knowledge.
Application Domain Knowledge in
Software Comprehension


Programmers use a more top-down
(hypothesis driven) comprehension
process when they have relevant
application domain knowledge (Shaft &
Vessey 1995)
During software comprehension,
programmers use application domain
knowledge in addition to (rather than to
replace) programming knowledge (Shaft
& Vessey 1998)
Current Study


Examines the influence of application
domain knowledge on programmers’
mental representations during the
software comprehension and
enhancement process
Examines the influence of different types
of enhancement tasks on the
development of programmers’ mental
representations
Conceptual Model of the Software Comprehension and Enhancement Process
[Figure: conceptual model with the following elements: Program Code and Documentation; Comprehension Process; Mental Representation of Computer Program; Enhancement Specification; Enhancement Process; Mental Representation of Enhanced Program; Enhanced Computer Program; Programmer’s Knowledge Base (Application Domain Knowledge, Programming Domain Knowledge); Program/Programmer Specific Knowledge Base]
Programmers’ Mental Representations




Internal knowledge structure of the
contents of the computer program
Numerous types of information are
embedded in a computer program (Brooks
1987, Pennington 1987, Green 1977,
Corritore & Wiedenbeck 1999)
Comprehension is the process of extracting
that information from the computer
program
Programmers’ mental representations
reflect their comprehension of the different
types of information embedded in the
programs
Knowledge Categories




Function
Data flow
Control flow
State


Pennington 1987a, 1987b
These categories are relevant to
procedural programs
Initial Comprehension of a
Computer Program

Driven by the knowledge
programmers bring to the
comprehension task
Hypothesis 1: Initial Comprehension


H1a: Programmers have higher levels of
initial comprehension when they are
familiar with the application domain of the
program than when they are unfamiliar
with the application domain.
H1b: When programmers are familiar with
the application domain, they have better
comprehension of function and data flow
information than of control flow and state
information.
Comprehension After Conducting an
Enhancement



Influenced by the nature of the
enhancement task as well as the
knowledge programmers possess
Different types of enhancement tasks
rarely considered
Consider two types of enhancement
tasks:
  Function – high need for application
  domain knowledge
  Control flow – low need for application
  domain knowledge
Hypothesis 2: Comprehension After
Conducting An Enhancement


H2a: Comprehension after conducting an
enhancement will be greater when
programmers are familiar with the
application domain (than when they are
unfamiliar with the application domain).
H2b: Programmers have greater
comprehension of the type of knowledge
emphasized in the enhancement task
than the other types of knowledge.


Control flow task -> control flow knowledge
Function task -> function knowledge
Changes in Programmers’
Comprehension


Influenced by programmers’
knowledge
Influenced by the nature of the
enhancement task
Hypothesis 3: Changes in
Comprehension


H3a: Programmers experience
greater changes in comprehension
when they are familiar with the
application domain (than when they
are unfamiliar)
H3b: Programmers who conduct a
control flow task have greater
increases in comprehension than
those who conduct a function task
Methodology

24 professionals studied and
enhanced two computer programs


Minimum of 2 years professional
experience
Computer programs from two
application domains




Familiar application domain: accounting
Unfamiliar application domain: hydrology
COBOL programs
Equivalent size, data density, decision
density
Methodology (continued)

Assessed programmers’
comprehension at two points:



After an initial study period
After conducting an enhancement task
Comprehension assessed via
questions



Two questionnaires per program
20 questions per questionnaire
5 questions for each knowledge
category
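
The scoring scheme described above (20 questions per questionnaire, 5 per knowledge category) can be sketched as follows. This is a minimal illustration, not the authors' instrument: the function name, the answer-sheet layout, and the sample answers are all hypothetical.

```python
from collections import defaultdict

# The four knowledge categories named in the slides.
CATEGORIES = ("function", "data flow", "control flow", "state")

def percent_correct_by_category(answers):
    """Return percent correct per category.

    answers: iterable of (category, is_correct) pairs, one per question.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in answers:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {c: 100.0 * correct[c] / totals[c] for c in CATEGORIES if totals[c]}

# Hypothetical answer sheet: 5 questions per category, 20 in total.
sheet = (
    [("function", True)] * 4 + [("function", False)]
    + [("data flow", True)] * 2 + [("data flow", False)] * 3
    + [("control flow", True)] * 3 + [("control flow", False)] * 2
    + [("state", True)] * 5
)
scores = percent_correct_by_category(sheet)
print(scores)
```

With 5 questions per category, an individual score can only take values in steps of 20 percentage points; the finer-grained percentages in the results are averages over participants.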
Methodology (continued)




Half of the programmers conducted
control flow enhancements on both
programs; half conducted function
enhancements
Control flow tasks: insert a new level of
control break in an existing control break
report
Function tasks: create a new capability
Enhanced programs of equivalent size,
data density & decision density
Analysis of Hypothesis 1

Source                     DF          SS   Mean Square   F Value   Pr > F
Application Domain (AD)     1    14700.00      14700.00     16.74     .001
Error                      23    20200.00        878.26
Knowledge Category (KC)     3     8625.00       2875.00      7.61      .01
Error                      69    26075.00        377.90
AD * KC                     3      466.67        155.56       .44      .73
Error                      69    24633.33        357.00
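
The F values reported for Hypothesis 1 can be checked from the sums of squares and degrees of freedom. The sketch below assumes each effect is tested against the error term listed with it in the analysis; the names `ROWS` and `f_ratio` are illustrative and do not come from the original study.

```python
# (source, df_effect, ss_effect, df_error, ss_error), values as reported
# in the analysis of Hypothesis 1.
ROWS = [
    ("Application Domain (AD)", 1, 14700.00, 23, 20200.00),
    ("Knowledge Category (KC)", 3, 8625.00, 69, 26075.00),
    ("AD * KC", 3, 466.67, 69, 24633.33),
]

def f_ratio(df_effect, ss_effect, df_error, ss_error):
    """Return (mean square of effect, mean square of error, F ratio)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect, ms_error, ms_effect / ms_error

results = {name: f_ratio(*rest) for name, *rest in ROWS}
for name, (ms_effect, ms_error, f) in results.items():
    print(f"{name}: MS = {ms_effect:.2f}, MS error = {ms_error:.2f}, F = {f:.2f}")
```

Each mean square is a sum of squares divided by its degrees of freedom, and each F value is the effect mean square divided by the error mean square. The recomputed ratios agree with the reported F values to two decimal places.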
Hypothesis 1 – Initial
Comprehension

H1a: supported
  Programmers have higher levels of
  comprehension in the familiar application
  domain
H1b: not supported
  No interaction between knowledge category
  and application domain familiarity
Programmers’ comprehension of state
knowledge was significantly greater than
data flow, regardless of knowledge of the
application domain
[Figure: Initial Comprehension by Knowledge Category. Percent correct (y-axis) by knowledge category (x-axis: Function, Data Flow, Control Flow, State), with one series each for the familiar (accounting) and unfamiliar (hydrology) application domains. Data labels recovered from the chart: 73.33, 70.84, 69.17, 59.17, 53.33, 50.00, 38.33.]
Analysis of Hypothesis 2

Source                              DF          SS   Mean Square   F Value   Pr > F
Type of Enhancement Task (TET)       1     1752.08       1752.08      1.44      .24
Error                               22    26845.83       1220.27
Application Domain (AD)              1    10502.08      10502.08     31.21    <.001
AD * TET                             1     4680.75        468.75      1.40      .25
Error                               22     7389.58        335.42
Knowledge Category (KC)              3     7289.58       2429.86     10.81    <.001
KC * TET                             3      522.92        174.31       .78      .51
Error                               66    14837.50        224.81
AD * KC                              3     5972.92       1990.97      6.65     .001
AD * KC * TET                        3     2106.25        702.08      2.34      .08
Error                               66    19770.83        299.56
Hypothesis 2 – Comprehension
After Conducting the Enhancement

H2a: supported
  Comprehension in the familiar application domain
  is greater than in the unfamiliar application domain
H2b: not supported
  Type of enhancement task did not influence which
  knowledge categories are better comprehended
Application domain knowledge influenced
comprehension of the different knowledge
categories:
  Data flow, control flow and state knowledge more
  accurately understood in the familiar application
  domain
  Within familiar application domain: state
  knowledge more accurately understood than other
  categories
  Within unfamiliar application domain: state and
  function more accurately understood than data
  flow knowledge
[Figure: Application Domain Familiarity by Knowledge Category After Enhancement Task. Percent correct (y-axis) by knowledge category (x-axis: Function, Data Flow, Control Flow, State), with one series each for the familiar (accounting) and unfamiliar (hydrology) application domains. Data labels recovered from the chart: 86.67, 70.00, 71.67, 72.50, 66.67, 65.00, 57.50, 45.83.]
Analysis of Hypothesis 3

Source                              DF          SS   Mean Square   F Value   Pr > F
Type of Enhancement Task (TET)       1      176.04        176.04      0.06      .81
Phase                                1     5551.04       5551.04     16.14    <.001
TET * Phase                          1     2109.38       2109.38      6.13      .02
Error                               22     7564.58        343.84
Application Domain (AD)              1    25026.04      25026.04     38.85    <.001
AD * TET                             1       26.04         26.04      0.04      .84
Error                               22    14172.92        644.22
Knowledge Category (KC)              3    15586.46       5195.49     15.51    <.001
KC * TET                             3      786.46        262.15      0.78      .51
Error                               66    22102.08        334.88
Phase * AD                           1      176.04        176.04      0.29      .59
Phase * AD * TET                     1      651.04        651.04      1.09      .31
Error                               22    13197.92        599.91
Phase * KC                           3      328.13        109.38       .41      .75
Phase * KC * TET                     3      969.79        323.26      1.21      .31
Error                               66    17577.08        266.32
AD * KC                              3     3253.13       1084.38      4.05      .01
AD * KC * TET                        3     1553.13        517.71      1.93      .13
Error                               66    17668.75        267.31
Phase * AD * KC                      3     3186.46       1062.15      2.65      .06
Phase * AD * KC * TET                3      844.79        281.60       .71      .55
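
As a worked check on the two significant Phase effects, the F values follow from the reported sums of squares. This assumes both effects are tested against the error term with 22 degrees of freedom (SS = 7564.58) that is listed with them; the notation is illustrative.

```latex
\begin{align*}
MS_{\text{error}} &= \frac{SS_{\text{error}}}{DF_{\text{error}}}
                   = \frac{7564.58}{22} \approx 343.84 \\
F_{\text{Phase}} &= \frac{MS_{\text{Phase}}}{MS_{\text{error}}}
                  = \frac{5551.04 / 1}{343.84} \approx 16.14 \\
F_{\text{TET} \times \text{Phase}} &= \frac{MS_{\text{TET} \times \text{Phase}}}{MS_{\text{error}}}
                  = \frac{2109.38 / 1}{343.84} \approx 6.13
\end{align*}
```

Both ratios match the reported F values for Phase and for the TET by Phase interaction.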
Hypothesis 3 – Changes in
Comprehension



H3a: not supported; application domain
knowledge did not allow programmers to
gain more knowledge
H3b: supported; those who conducted a
control flow enhancement showed greater
gains in comprehension
Application domain influenced the types of
knowledge programmers understood:



Higher levels of comprehension of all knowledge
categories in the familiar application domain
Within familiar application domain: state
knowledge more accurately understood than
function and data flow
Within unfamiliar application domain: function,
control flow and state knowledge more accurately
understood than data flow
[Figure: Application Domain Familiarity by Knowledge Category Interaction Across Phases. Percent correct (y-axis) by knowledge category (x-axis: Function, Data Flow, Control Flow, State), with one series each for the familiar (accounting) and unfamiliar (hydrology) application domains. Data labels recovered from the chart: 80, 70.83, 68.75, 65.42, 62.92, 61.67, 53.75, 42.08.]
[Figure: Phase by Type of Enhancement Task Interaction. Percent correct (y-axis) by phase (x-axis: Prior to Enhancement, After Enhancement Task), with one series per type of enhancement task (Control Flow, Function). Data labels recovered from the chart: 70 (Control Flow), 63.96 (Function), 61.04, 57.71.]
Summary of Results

Application Domain Familiarity
  Research Question 1: Initial Comprehension
    Familiar > unfamiliar.
  Research Question 2: Comprehension After Conducting the Enhancement
    Familiar > unfamiliar.
  Research Question 3: Changes in Comprehension
    Application domain knowledge did not result in greater gains in
    comprehension.

Type of Knowledge
  Research Question 1: Initial Comprehension
    State > data flow.
  Research Question 2: Comprehension After Conducting the Enhancement
    Across Application Domains: data flow, control flow & state
    knowledge better understood in the familiar than unfamiliar
    application domain.
    Within Familiar Application Domain: state > function, data flow &
    control flow.
    Within Unfamiliar Application Domain: function & state > data flow.
  Research Question 3: Changes in Comprehension
    Across Application Domains: all knowledge categories better
    understood in the familiar application domain.
    Within Familiar Application Domain: state > function & data flow.
    Within Unfamiliar Application Domain: function, control flow &
    state > data flow.

Type of Enhancement Task
  Research Question 1: Initial Comprehension
    Not applicable.
  Research Question 2: Comprehension After Conducting the Enhancement
    Type of enhancement task did not influence programmers’
    comprehension or their understanding of different types of
    knowledge.
  Research Question 3: Changes in Comprehension
    Conducting a control flow enhancement increased overall
    comprehension.
Discussion

When programmers possess application
domain knowledge they have higher
levels of comprehension initially and after
enhancing a program


Consistent for two types of enhancements
Even in the context of program
maintenance, a software development
task that is seen as relatively far from the
original domain, relevant application
domain knowledge benefits software
comprehension
Discussion

Conducting a control flow
enhancement task led to gains in
comprehension


Not all types of tasks will give
programmers the same opportunity to
develop their mental representation of
a computer program
A more complete taxonomy of tasks
would be helpful to future researchers
Implications – Types of Enhancement
Tasks

Long-term vs. short-term
employees


Programmers’ mental representations
develop if they engage in tasks that
require changes to multiple locations of
the program
Programmers who conducted
enhancements that did not require
interacting with much of the original
program did not experience significant
increases in comprehension
Implications – Application Domain
Knowledge



Lengthy start-up time for new IT
employees
Training and selection
Outsourcing and off-shoring


Activities that require less application
domain knowledge would seem to be
better choices
Develop mechanisms to provide
application domain knowledge
Conclusions



Not all enhancement tasks encourage
development of programmers’ mental
representation of a computer program.
Traditionally, software comprehension and
enhancement were seen as residing in the
programming domain and requiring little,
if any, application domain knowledge.
Application domain knowledge is
beneficial to software comprehension, and
to the understanding of all knowledge
categories.