
CALIFORNIA STATE UNIVERSITY, NORTHRIDGE
A COMPARATIVE ANALYSIS OF SOFTWARE
VERIFICATION AND VALIDATION METHODOLOGIES
A thesis submitted in partial satisfaction of the
requirements for the degree of Master of Science in
Computer Science
by
Francis Nolan Wiltz
May 1982
The Thesis of Francis Nolan Wiltz is approved:
Dr. Ray Gumb (Committee Chairman)
California State University, Northridge
ACKNOWLEDGEMENTS
I would like to express my sincere appreciation to the members of my master's committee, namely, Dr. Diane Schwartz, Dr. Ray Gumb and John Johl, for their assistance in making this thesis a reality. A special debt of gratitude goes to my master's chairman, Dr. Ray Gumb, for his constructive criticisms, encouragements and overall support in writing this thesis. In addition, to the many professors who helped me along the way, in particular to Drs. Russ Abbott, Jack Alanen, Phil Gilbert, George Lazik and Nancy Levenson, go many thanks.

A special thanks to Sue Nissen for an outstanding job done in typing, editing and assistance in the preparation of this thesis.
DEDICATIONS
I would like to dedicate this thesis to members of my family. In particular, to my wife, Cecilia, for the love and understanding she has shown and the many sacrifices she has endured during the many hours required to complete my thesis. To my children, David, Kenneth, Alicia and Anthony, who could still love me as a father when my time with them was very limited. To my mother and father who first instilled the motivation in me to attend college. A special dedication to my sister, Linda, to show, if only in a small way, my love for her.
TABLE OF CONTENTS
TITLE PAGE ........................................ i
APPROVAL PAGE ..................................... ii
ACKNOWLEDGEMENTS .................................. iii
DEDICATIONS ....................................... iv
LIST OF TABLES .................................... viii
LIST OF FIGURES ................................... ix
ABSTRACT .......................................... x

CHAPTER I     INTRODUCTION ........................ 1
     1.1  Introduction to Chapter I ............... 2
     1.2  Scope of Research ....................... 7

CHAPTER II    DISCUSSION OF SOFTWARE V&V
              METHODOLOGIES ....................... 8
     2.1  Introduction to Chapter II .............. 9
     2.2  Static V&V Methodologies ................ 9
          2.2.1  Requirement Review ............... 10
          2.2.2  Design Review .................... 13
          2.2.3  Structured Walkthrough ........... 15
          2.2.4  Checklist Review ................. 18
          2.2.5  Desk Checking .................... 20
          2.2.6  Structural Analyzer .............. 22
     2.3  Dynamic V&V Methodologies ............... 27
          2.3.1  Simulation ....................... 27
          2.3.2  Assertion Checker ................ 33
          2.3.3  Symbolic Execution ............... 35
          2.3.4  Proof of Correctness ............. 39
          2.3.5  Mathematical Checker ............. 45
          2.3.6  Test Data Generation ............. 46
          2.3.7  Instrumentation .................. 49
          2.3.8  Test Driver ...................... 51
          2.3.9  Test Coverage Analyzer ........... 54

CHAPTER III   COMPARATIVE ANALYSIS AND RANKING OF
              SOFTWARE V&V METHODOLOGIES .......... 57
     3.1  Introduction to Chapter III ............. 58
     3.2  Functional Requirements Phase ........... 61
     3.3  Design Specifications Phase ............. 62
     3.4  Implementation Phase .................... 63
     3.5  Maintenance Phase ....................... 63
     3.6  Ranking Process ......................... 64
     3.7  Analysis of Methodologies with Respect
          to Ranking Criterions ................... 68

CHAPTER IV    ANALYSIS OF ERROR OCCURRENCE AND
              DETECTION ........................... 88
     4.1  Introduction to Chapter IV .............. 89
     4.2  Error Types and Classification .......... 90
     4.3  Error Occurrence/Detection during
          Software Development Phases ............. 103
     4.4  Summary of Error Analysis ............... 108

CHAPTER V     RESULTS OF RESEARCH AND ANALYSIS .... 110

CHAPTER VI    RECOMMENDATIONS ..................... 113

BIBLIOGRAPHY ...................................... 116

APPENDICES
     Appendix A  Glossary of Terms ................ 129
     Appendix B  Ranking of Software V&V
                 Methodologies .................... 133
     Appendix C  Categorization of V&V Methodologies
                 as Related to a Software Development
                 Phase ............................ 139
     Appendix D  Summary of Advantages and Disadvantages
                 of each Methodology .............. 141
     Appendix E  Brief Discussion of Various Automatic
                 V&V Tools ........................ 147
LIST OF TABLES
Table 3.6.1   Ranking Keys ........................ 65
Table 4.3.1   List of Errors Detected with Respect
              to a Software Development Phase for
              Project X ........................... 104
Table B1      Functional Requirements Phase-Ranking
              of Methodologies .................... 134
Table B2      Design Specifications Phase-Ranking
              of Methodologies .................... 135
Table B3      Implementation Phase-Ranking of
              Methodologies ....................... 136
Table B4      Estimated Cost with Respect to a
              Software Development Phase .......... 137
Table C1      Categorization of V&V Methodologies as
              Related to a Software Development Phase 140
Table D1      Summary of Advantages and Disadvantages
              of Each Methodology ................. 142
LIST OF FIGURES
Figure 1.1.1    Hardware/Software Cost Trend ...... 4
Figure 2.3.3.1  Example Used in Symbolic Execution  36
Figure 3.1.1    Software Development Phases ....... 59
Figure 3.1.2    Software Development Cost per Phase 60
Figure 3.6.1    Software V&V Tools Survey ......... 67
Figure 4.3.1    Errors Detected with Respect to a
                Software Development Phase During
                Project X ......................... 105
Figure 4.3.2    Errors Attributed to Other Phases . 109
ABSTRACT
A COMPARATIVE ANALYSIS OF SOFTWARE
VERIFICATION AND VALIDATION METHODOLOGIES
by
Francis Nolan Wiltz
Master of Science in Computer Science
Typically, more than 50 percent of software development cost goes into software testing. Thus, there is a critical need for cost effective, easily used and reliable software verification and validation methodologies.

This thesis contains a comparative analysis of fifteen methodologies selected with respect to a software development phase (i.e., functional requirements, design specification, implementation and maintenance). Each methodology is discussed emphasizing its weak and strong points, categorized mainly through a subjective selection process, and then ranked based on criteria such as cost, ease of use and reliability of the methodology. In addition, a discussion of each software development phase is presented.
In conjunction with the above comparative analysis, a study of error occurrence and detection was conducted by reviewing data compiled for several software programs (various sizes) in order to gain some insight into where a particular software verification and validation methodology should be concentrated. This study was accomplished within the framework of the four software development phases stated earlier.

As part of the findings in this analysis, it was determined that cost effectiveness and an appreciable early detection of errors were apparent if the methodologies were used judiciously and as early as possible in the software development process.
CHAPTER I
INTRODUCTION
1.1  Introduction to Chapter I

The primary objective of this thesis is to present some insight into current software verification and validation (V&V) methodologies, a subject very seldom mentioned or researched as compared to other areas of the rapidly growing software technology. To do so, a comparative analysis and ranking of selected V&V methodologies as related to a particular software development phase (i.e., functional requirements, design specification, implementation, and maintenance) was conducted. Further, to reinforce the judicious utilization or selection of these methodologies, a study of error occurrence and detection was also accomplished.
The usefulness of a V&V methodology is based primarily on its inherent ability to detect errors in software. Typically, testing procedures used currently are not well-planned or devised (ad hoc techniques). To assure reliable software, one normally becomes involved in an exhaustive testing process. We do not know how much testing is necessary to obtain the desired results (reliable software); we just continue testing until the software schedule has run its course. Therefore, the objective of a "good" testing methodology should be to reduce the potentially infinite exhaustive testing process to a finite one, at least one that we can feasibly handle. The judicious use of the testing methodologies discussed later in this thesis could provide just that. However, since there is a wide range of software V&V methods or tools, the user must be cognizant of his needs and what the method will provide for him.
It is generally agreed upon throughout the software community (Alberts [2], Branstad [10], Brown [11], Deutsch [20], Fairley [23], Glass [28], Jensen [39], Myers [47], Ramamoorthy [53], Sorkowitz [58], Tratner [60], et al.) that testing in all its forms requires approximately 50 percent or more of the software development cost and time. Software cost is outstripping hardware cost at a high rate (Alberts [2], Boehm [7]). Figure 1.1.1 (which is still somewhat accurate today) depicts this hardware/software cost trend. Therefore, it seems logical that to significantly reduce the cost of a software/hardware product, we must reduce the cost of the V&V process. This is accomplished by using the correct V&V methodology for the job if the methodology exists. However, even with the high cost of software, software testing methods and the development thereof are still taking a backseat to the development of other software methods and tools. There is a definite need for an earnest effort in this area.
Figure 1.1.1  Hardware/Software Cost Trends (extracted from Boehm [7]).
[Figure: percent of total cost plotted against year, 1955 to 1985, with the hardware share of cost declining and the software share rising.]
To promote cost savings, it is suggested that the software V&V process should occur in each phase of software development (discussed in Chapter III) rather than in a single isolated stage following implementation (which normally happens), because errors discovered that late usually exact a more exorbitant price than if found earlier.
When we speak about reliable software
(Bruggere
[12]), we are basically saying that the software satisfies or is consistent with its specifications.
Sometimes
the specification is not correct or needs some modification (this is the current view in program verification earlier work tended to overlook the possibility that the
specification could be wrong).
To obtain reliable soft-
ware, we must have testing methods to tell us that it is
reliable to begin with or methods to help us get to that
point.
However, software reliability technology is still in its infancy, which leaves one still without specific guidelines to obtain reliability in its entirety. In addition, some of the methods that are discussed in Chapter II are either still being researched or only experimental (used only in small scale programs).
In my opinion, this thesis would not be complete without, even if on a small scale, an investigation as to where these V&V methodologies would attain their greatest usefulness. By investigating where certain types of errors occur and where they are usually detected (which software development phase), we can then select with some degree of accuracy the most "suitable" method for the job. Although there is some overlap in detection efficiency, each methodology has some unique capabilities.

Chapter II presents a discussion of fifteen (15) V&V methodologies emphasizing their strong and weak points along with, in some cases, typical use(s) of a methodology (also see Appendix D). Each methodology is classified as belonging to either a static or dynamic V&V category.
Chapter III provides a comparative analysis and
ranking of each V&V methodology discussed in Chapter II
with respect to a software development phase.
The rank-
ing criterion is based on cost, ease of use and reliability of the methodology.
The rankings of the methodolo-
gies are shown in Appendix B.
Chapter IV deals with an analysis of error occurrence and detection in software. Several software studies/projects (various sizes) are reviewed for the purpose of acquiring data on error types and to determine where a V&V method would be best utilized.
Chapter V summarizes the findings during the research.
Chapter VI gives recommendations of other software techniques to be used along with the V&V methodologies presented to enhance the chances of obtaining reliable software.
1.2  Scope of Research

In undertaking this analysis, I have limited the number (fifteen) of methodologies discussed and analyzed to those currently or most widely used in the software community. In addition, only a brief discussion of various automated V&V tools (see Appendix E) is presented. Also, the time allotted to complete this paper suggests that I use some brevity in my selections for analysis. However, the extensive annotated bibliography will no doubt lead the interested party to further investigation of other useful V&V methodologies.
CHAPTER II
DISCUSSION OF SOFTWARE
V&V METHODOLOGIES
2.1  Introduction to Chapter II

In this chapter various software V&V methodologies are discussed. An explanation of each methodology is detailed, emphasizing both its strong and weak points (also see Appendix D). In addition, some major differences between the methodologies are noted when applicable.
The software V&V process can be divided into two
modes of testing approaches, namely, static and dynamic
(Bate
[6], Fairley [23]).
Hence,
the V&V methodologies
are discussed with respect to this partitioning.
2.2  Static V&V Methodologies

Utilizing static V&V, software programs are tested without considering their behavior in a run-time environment. The software usually exists in the form of a human-readable representation such as a listing, N-S charts (structured form of flowcharts) or functional requirements/design specification documentation.

Since static V&V is concerned primarily with analyzing the functional or structural aspects of the software, it is very useful in detecting errors in functional requirements, design specification and, in particular, algorithmic processes.
2.2.1  Requirement Review

A requirement review (Alberts [2], Branstad [10], Jensen [39]) is a methodology tailored for review of the functional requirements. It is very similar to the design review methodology which is described later in Section 2.2.2. Sometimes in the literature, this methodology is not differentiated from the design review methodology. However, I believe there are sufficient differences between the two methodologies to warrant a separate discussion.
According to certain defined procedures, the requirements documentation is reviewed by a software team for consistency, correctness and completeness with respect to the intended user specifications. The level of abstraction of the functional requirements is substantially more than that of the design specification. The greater level of abstraction of the former places more of a demand on the analysis ability of the persons participating in the review.
As an initial prelude to the requirement review, the required material (requirements documentation, analysis results, hardware/software partitioning charts, user specifications, interface documentation and any other items helpful to the presentation) is disseminated in advance to the software participants. The team members are expected to be knowledgeable about the requirements to be reviewed. The members of the software team normally consist of the person who wrote the functional requirements, the review moderator, the person who will write the design specification, and any other persons concerned or affected by the final outcome of the functional requirements. It is very critical that the members of the team be carefully selected for maximum benefit to be derived from the review.
As in the design review, discussed in the next section, the functional requirements author verbally presents the software requirements to members of the review team. As the meeting proceeds, each member appropriately makes comments on or assists in making clarifications to areas of disagreement. Some or all of the following, but not limited to, should be topics for discussion:
o  Functional software partitioning
o  Interface between functions
o  Identification of input and output quantities
o  Special requirements or constraints on the software or hardware
o  Alternatives to complex solutions or algorithms
o  Timing requirements
o  Storage allocation considerations
o  Consistency of functional requirements with respect to user specifications
More so than in the design review, an apparent difficulty is maintaining the requirements review discussions on course. This is due in part to the somewhat abstract nature of the functional requirements and the open forum-like environment where problems other than those intended are aired. Hence, a favorable outcome of the requirement review demands well-written functional requirements and strict adherence to the purpose of the review.

Normally, there is a tendency for the participants of the review to get bogged down on details which should be left for the design specification process. Here is where the review moderator plays a most important part in the review.
2.2.2  Design Review

A design review (Alberts [2], Branstad [10], Deutsch [20], Glass [28]) is a methodology specifically utilized in the transmission and constructive criticism of the software design prior to the implementation phase. The main objective of this methodology is to determine if the design is consistent and satisfies the intent of the functional requirements by reviewing the software design documentation.
There are several different ways to conduct a design review.
The most widely acceptable way is for the
designer to submit the design documentation far enough in
advance so that reviewers can study and comment on anything germane to the review.
The designer verbally pre-
sents the software design with the help of charts, graphic
aids or anything else which will give additional information that could help to convey the specific intent of the
design.
The number of people in attendance should be kept
to a minimum.
It is difficult to prescribe an exact number, but the number should be based on the complexity of the design and the number of people necessary to obtain a good cross section for an effective review.
One difficulty in conducting a review is keeping the review under control. It is very easy to lose sight of the intent of the review. Someone, other than the designer, should be there to monitor the flow of ideas between designer and reviewers to maintain the review on course. Often another difficulty is that too much time is wasted trying to find solutions to all problems detected. Difficult problems should be deferred to a later date and to a group consisting of only those people directly involved with the problem.
To make the methodology work and to obtain full benefit of the design review, the following should be some of the main ingredients of the design review process:

o  Capable personnel willing to earnestly devote their time to the review.
o  Willingness on the part of the designer to accept and implement constructive criticisms.
o  Someone to maintain the review on the right track.
o  Follow-up on all recommendations and problems detected.
Finally, a good design review should result in most of the following:

o  Solutions to most or all of the problems detected.
o  Action items given on problems not resolved.
o  Design deemed consistent with requirements and ready for the next phase (implementation).
o  Decision on whether further reviews are required.
2.2.3  Structured Walkthrough

The more formal structured walkthrough (Branstad [10], Glass [28], Myers [48], Yourdon [62]) process is based on the desk checking methodology discussed in Section 2.2.5. One main difference is that the non-computer-based V&V process is conducted by a team of software personnel rather than one person. The primary intent of this methodology is to formalize the desk checking process to make it a more viable methodology for the detection of errors in software.
The structured walkthrough process consists of guidelines or procedures specifically tailored for the purpose of error detection. The process convenes with a meeting (1-3 hours duration) of a software team (3-5 people) wherein each team member is given a specific role. Myers [47] suggests that one person plays the role of moderator, another the recorder of errors detected, and another the tester who "plays computer." He suggests that along with the programmer of the software, the participants of the meeting should be chosen from the following list:
o  A highly experienced programmer.
o  A programming-language expert.
o  A new programmer (to give a fresh, unbiased outlook).
o  The person who will eventually maintain the program.
o  Someone from a different project.
o  Someone from the same software team as the programmer.
Normally, the members of the software team are given the structured walkthrough material (listings, functional requirements/design specification documentation and charts, etc.) far enough in advance for initial reading. The team members are expected to be familiar with the material at review time.
At the meeting, test cases with
known results are manually executed through the code.
Any
errors or deviations from the intended design are recorded
(action items) for later fixing by the responsible programmer.
The structured walkthrough methodology finds its most useful application at the implementation phase of software development (specifically after all syntax errors and, if possible, type errors have been removed).

The success of the methodology lies in the fact that the method stipulates the requirement for a team effort. A programmer is not normally very critical of his own program, yet such criticism is necessary for developing reliable software. He should never be the sole critical reviewer of his software. The eventuality of a structured walkthrough tends to motivate a programmer to develop the most reliable software possible.
The structured walkthrough process is very taxing on the team members. As suggested earlier, the duration of the meeting should be held to a minimum (1-3 hours). When the meeting is extended beyond this limit, people will tend to become bored, resulting in an unfavorable outcome. In addition, the tendency to waste time on picayune or unrelated items is always present. The moderator plays an important role in maintaining the meeting on the right course.
2.2.4  Checklist Review

Similar to the structured walkthrough, the checklist review (Branstad [10], Myers [47,48]) is a non-computer-based V&V process conducted by a team of software personnel. However, the procedures or guidelines followed in the meeting are somewhat different. A main difference is that the members of the software team do not actually "play computer" to detect errors. Instead, an error checklist (similar to the one in Myers [47], see also Section 2.2.6, but designed for the specific task intended) and procedures or guidelines are used to aid in the error detection process.
The software team is usually made up of three to five members. As in the structured walkthrough process, one member is given the role of moderator. His primary duties are to disseminate the necessary material (listings, checklist, requirements and design documentation, charts, analysis data, etc.), schedule and direct (i.e., keep it on the right course) the meeting, and maintain a record (action items) of all errors detected. The other members of the team are the designer, the programmer (if different from the designer and code is being inspected), and any other software personnel involved with the software.
Unlike the structured walkthrough, the checklist review tends to be conducted on a higher level of investigation rather than a detailed one. In other words, the logic, equations or algorithms in general are analyzed for correctness without resorting to using inputs to manually simulate the execution of the program. As the software is being narrated, each statement of the software is analyzed with respect to the checklist for consistency of intent, whether it be code or specifications. The software can be at the level of code or at a higher level such as functional requirements or design specification. Hence, the checklist review is useful in all phases of the software development (requirement, design, implementation and maintenance). As with the structured walkthrough methodology, the main emphasis is on team effort in detecting errors.
Also, the general attitude during the process should be that no one is to be blamed for errors detected in the software.

The checklist review should be kept to a minimum (1-3 hours). It becomes very difficult to detect errors when one is tired or just simply bored. Also, as with the structured walkthrough, an effort should be made to avoid wasting time on items not related to the software at hand.

A major difficulty is creating a checklist that will assist in detecting the largest number of errors. Some standard checklists can be devised but normally must be modified to fit the software being checked.
2.2.5  Desk Checking
Desk checking
(Branstad
[10], Glass [28]), nor-
mally done by the programmer alone, is one of the earliest verification methods and even today is still widely
used in the software development process.
Desk checking
is used during the implementation phase to review a program listing
for
the purpose of detecting and eventual
correction of errors.
Algorithms, mathematical calcula-
tion, and logic in general are verified by "playing computer".
The desk checking process is accomplished by initializing input parameters to manually simulate execution of the program. All intermediate computation, logic and output parameters are verified against known results. Various combinations of input parameters (test cases) are used to attempt to execute as many paths through the program as possible.

One of the main problems associated with this methodology is the inability of the programmer to be sufficiently motivated to effectively test his own program. The programmer tends to put on blinders when it comes to verifying a program he knows (or at least believes) has been coded correctly. Very few people will readily admit that they have made a mistake. We are a long way from having total "egoless programming".
The primary advantage of desk checking is detecting errors early in the V&V process. Valuable computer systems or other hardware equipment utilization is avoided. Sometimes the hardware is not available prior to implementing the software.

The desk checking process has become almost totally automated as compared to the inception of the methodology. We can now let the computer "play computer" with the advent of the interactive debug capability. However, the results still have to be checked at the desk by some human means.
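As a small present-day illustration of "playing computer" (the routine, its inputs and the expected values are hypothetical, not drawn from the projects studied in this thesis), the following Python listing pairs a short function with the hand trace that a desk check of it would produce:

    def average_of_positives(values):
        """Return the average of the positive entries in 'values'."""
        total, count = 0, 0
        for v in values:
            if v > 0:
                total += v
                count += 1
        return total / count if count else 0.0

    # Hand trace ("playing computer") for the test case [3, -1, 5]:
    #   v = 3   -> total = 3,  count = 1
    #   v = -1  -> skipped (not positive)
    #   v = 5   -> total = 8,  count = 2
    #   return 8 / 2 = 4.0   (matches the result computed by hand at the desk)
    print(average_of_positives([3, -1, 5]))

Each intermediate value computed by hand is compared against what the listing should produce, which is the essence of the desk check.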
2.2.6  Structural Analyzer

This methodology (Miller [45], Myers [47], Reifer [54]) provides the capability to analyze possible errors resulting from problems in the structure of a software program (i.e., logic and data structure). Typically, these errors are detected by using such static methodologies as structured walkthrough, checklist review and desk checking, as discussed earlier. Hence, using an analyzer automates the above mentioned manual techniques.
When invoked, the structural analyzer performs an examination (i.e., using techniques such as path analysis, parsing algorithms and directed-graph modeling) of the structure of the program undergoing analysis, recording any discrepancies in the data or logic structure. There is not any requirement for input data or test cases since the software is never executed. Some of the potential (only non-executable items) areas (Myers [47]) the analyzer attempts to search out are:

DATA REFERENCE
1.  Unset variables used?
2.  Subscripts within bounds?
3.  Noninteger subscripts?
4.  Dangling references?
5.  Correct attributes when aliasing?
6.  Record and structure attributes match?
7.  Computing addresses of bit strings?  Passing bit-string arguments?
8.  Based storage attributes correct?
9.  Structure definitions match across procedures?
10. String limits exceeded?
11. Off-by-one errors in indexing or subscripting operations?

DATA DECLARATION
1.  All variables declared?
2.  Default attributes understood?
3.  Arrays and strings initialized properly?
4.  Correct lengths, types, and storage classes assigned?
5.  Initialization consistent with storage class?
6.  Any variables with similar names?

COMPUTATION
1.  Computations on nonarithmetic variables?
2.  Mixed-mode computations?
3.  Computations on variables of different lengths?
4.  Target size less than size of assigned value?
5.  Intermediate result overflow or underflow?
6.  Division by zero?
7.  Base-2 inaccuracies?
8.  Variable's value outside of meaningful range?
9.  Operator precedence understood?
10. Integer divisions correct?

COMPARISON
1.  Comparisons between inconsistent variables?
2.  Mixed-mode comparisons?
3.  Comparison relationships correct?
4.  Boolean expressions correct?
5.  Comparison and Boolean expressions mixed?
6.  Comparisons of base-2 fractional values?
7.  Operator precedence understood?
8.  Compiler evaluation of Boolean expressions understood?

CONTROL FLOW
1.  Multiway branches exceeded?
2.  Will each loop terminate?
3.  Will program terminate?
4.  Any loop bypasses because of entry conditions?
5.  Are possible loop fallthroughs correct?
6.  Off-by-one iteration errors?
7.  DO/END statements match?
8.  Any nonexhaustive decisions?

INTERFACES
1.  Number of input parameters equal to number of arguments?
2.  Parameter and argument attributes match?
3.  Parameter and argument units system match?
4.  Number of arguments transmitted to called modules equal to number of parameters?
5.  Attributes of arguments transmitted to called modules equal to attributes of parameters?
6.  Units system of arguments transmitted to called modules equal to units system of parameters?
7.  Number, attributes, and order of arguments to built-in functions correct?
8.  Any references to parameters not associated with current point of entry?
9.  Input-only arguments altered?
10. Global variable definitions consistent across modules?
11. Constants passed as arguments?

INPUT/OUTPUT
1.  File attributes correct?
2.  OPEN statements correct?
3.  Format specification matches I/O statement?
4.  Buffer size matches record size?
5.  Files opened before use?
6.  End-of-file conditions handled?
7.  I/O errors handled?
8.  Any textual errors in output information?

OTHER CHECKS
1.  Any unreferenced variables in cross-reference listing?
2.  Attribute list what was expected?
3.  Any warning or informational messages?
4.  Input checked for validity?
5.  Missing function?
The above checklists could be used in many of the methodologies discussed earlier. Additional problem areas not listed above could be classified as standard installation directives not detected by a compiler, assembler or the particular language used. The following are not necessarily potential errors but violations of some directives or standard programming guidelines:

o  Usage of unstructured programming language forms.
o  Naming conventions violations.
o  Constructs that are excessively complex.
Structural analyzers are not readily available in the software community, mainly because most analyzers are language dependent and designed with specific needs of the user in mind.
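As a present-day sketch of what such an automated structural check does (it is not one of the analyzers cited above, and the single "unset variable" rule is only one item from the data reference checklist given earlier), the following Python fragment parses a source text and flags names that are read before any assignment to them:

    import ast
    import builtins

    def report_possibly_unset(source):
        """Toy structural check: flag names read in a function body
        before any assignment to them (parameters count as already set)."""
        tree = ast.parse(source)
        for func in ast.walk(tree):
            if not isinstance(func, ast.FunctionDef):
                continue
            assigned = {arg.arg for arg in func.args.args}
            for stmt in func.body:                      # statements in source order
                reads = [n.id for n in ast.walk(stmt)
                         if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)]
                for name in reads:
                    if name not in assigned and not hasattr(builtins, name):
                        print(f"{func.name}: '{name}' may be used before it is set")
                # names written by this statement become 'set' for later statements
                assigned |= {n.id for n in ast.walk(stmt)
                             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}

    report_possibly_unset("def f(a):\n    b = a + c\n    return b")
    # prints: f: 'c' may be used before it is set

A production analyzer would of course cover many more of the checklist items and work on the installation's own language, which is exactly why such tools tend to be language dependent.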
2.3  Dynamic V&V Methodologies

In dynamic V&V, software programs are executed and their responses evaluated. Monitors (normally applied differently for each method) are placed appropriately to ascertain whether the software, as implemented, will satisfy the intent of the functional requirement or design specification during execution. The success of a dynamic V&V is largely dependent upon the selection of input test data and the ability of someone to interpret the results correctly.
2.3.1  Simulation
Simulation (Adkins [1], Glass [28], Myers [47],
Naylor [49], Reifer [54]) is a methodology for modeling a
system for
the purpose of studying or observing behav-
ioral characteristics of a system in a simulated environment.
Adkins [1] describes a "system" as a collection of
objects with a well-defined set of interactions between
the objects.
Simulation enables the designer to verify if the intent of the software system requirements/specifications is satisfied. The need for the existence of the actual system is thus avoided. Also, the complexity of the actual system may warrant the development of a simulation model.
After a thorough analysis of the system functions, and once the particular areas of a system have been designated as requiring simulation, the model is then developed to accomplish the simulation task. A language designed specifically for simulation (such as Simscript, GPSS (General Purpose System Simulator) and Simula, only to name a few) is normally used in the development of the software simulation model. The selection of the best simulation language suited for the application is a criterion for optimum results.
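For illustration only, the short Python sketch below models a single-server queue, a far simpler "system" than those built with the simulation languages named above; the arrival and service rates are hypothetical, and the observed behavior is the average waiting time:

    import random

    def simulate_single_server(n_jobs=1000, arrival_rate=0.9, service_rate=1.0, seed=1):
        """Toy simulation of a single-server queue: jobs arrive at random,
        wait for the server, are serviced, and leave.  The 'system' is the
        queue plus server; the observed behavior is the average wait."""
        random.seed(seed)
        clock = 0.0            # simulated time of the next arrival
        server_free_at = 0.0   # simulated time at which the server becomes idle
        total_wait = 0.0
        for _ in range(n_jobs):
            clock += random.expovariate(arrival_rate)           # next arrival
            start = max(clock, server_free_at)                  # service begins when server is free
            total_wait += start - clock                         # time spent waiting in the queue
            server_free_at = start + random.expovariate(service_rate)  # service completion
        return total_wait / n_jobs

    # The average wait grows sharply as the arrival rate approaches the service rate.
    print(simulate_single_server(arrival_rate=0.5))
    print(simulate_single_server(arrival_rate=0.95))

Running the model under different parameters is the controlled experimentation that the advantages listed below refer to.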
Simulation can be and is applied to almost anything you can imagine.
Naylor
[49] gives a few reasons
why he believes simulation serves a useful purpose in the
verification process as follows:
o  Simulation makes it possible to study and experiment with the complex internal interactions of a given system, whether it be a firm, an industry, an economy, or some subsystem of one of the above.

o  Through simulation, one can study the effects of certain informational, organizational, and environmental changes on the operation of a system by making alterations in the model of the system and by observing the effects of these alterations on the system behavior.

o  A detailed observation of the system being simulated may lead to a better understanding of the system and to suggestions for improvements which otherwise would be unobtainable.

o  Simulation can be used as a pedagogical device for teaching both students and practitioners basic skills in theoretical analysis, statistical analysis, and decision-making.

o  The experience of designing a computer simulation model may be more valuable than the actual simulation itself. The knowledge obtained in designing a simulation study frequently suggests changes in the system being simulated. The effects of these changes can then be tested via simulation before implementing them on the actual system.

o  Simulation of complex systems can yield valuable insight into which variables are more important than others in the system and how these variables interact.

o  Simulation can be used to experiment with new situations about which little or no information is available, so as to prepare for what may happen.

o  Simulation can serve as a "preservice test" to try out new policies and decision rules for operating a system, before running the risk of experimenting on the real system.

o  For certain types of stochastic problems the sequence of events may be of particular importance. Information about expected values and moments may not be sufficient to describe the process. In these cases, simulation methods may be the only satisfactory way of providing the required information.

o  Monte Carlo simulations can be performed to verify analytical solutions.

o  Simulation enables one to study dynamic systems in either real time, compressed time, or expanded time.

o  When new elements are introduced into a system, simulation can be used to anticipate bottlenecks and other problems that may arise in the behavior of the system.
Considering the above list, Adkins [1] cited the
following advantages of simulation:
o  It permits controlled experimentation.
o  It permits time compression.
o  It permits sensitivity analysis by manipulation of input variables.
o  It causes no disturbance of the real system.
o  It constitutes an effective training tool.

Also, just as important, the simulation methodology can be applied throughout the software development process for verification or validation of the software.
He also pointed out the following disadvantages:
o  A simulation model may become expensive in terms of manpower and computer time.
o  Extensive development time may be encountered.
o  Hidden critical assumptions may be present causing the model to diverge from reality.
o  Model parameters may be difficult to initialize. These may require extensive time in collection, analysis, and interpretation of data.
2.3.2  Assertion Checker

This methodology (Chen [14], Glass [28]) provides for the evaluation of assertions strategically placed throughout the software being verified. Although assertion checking is more complex by nature than most other V&V methodologies, it is by far a more formidable alternative for approaching program proof of correctness than any methodology discussed so far. Typically, it is used in conjunction with other debugging aids.
Assertion statements (normally language-dependent) must be added to the software, which is in turn inputted for interpretation to an assertion preprocessor for output of the modified software. When the software is added to the total software system for its normal usage, these assertion statements are reconciled as comments. Hence, the assertion statements provide not only V&V capability but also documentation of the software. The modified software is passed through a compiler for output of object code ready for the link-loading process. This object code has to be linked also with special assertion library functions (invoked at execution time) and an associated data base. At this point, the programmer executes the software with a sufficient number of test cases to verify the validity (in some cases the invalidity) of the assertions made about the software. The test cases used should exercise all assertions. As with other methodologies discussed, there exists the problem of arriving at a sufficient and reliable number of test cases to exercise all assertions or any other constraints placed on the software.
During execution of the software, the assertion output is printed, along with other debugging output, for analysis. Typically, unlike the debugging output, the assertion output is obtained (printout) at the point of assertion invocation. Hence, errors are detected early in the test run, which makes the V&V process more controllable by the programmer, especially if interactive debug capabilities are utilized.
Assertion design should be preplanned prior to or in conjunction with the software design phase. Although designing of assertions requires a considerable effort on the part of the designer, this is more than made up when the V&V process is initiated. In particular, the designer needs to be very knowledgeable about the software to design useful (sufficient) and reliable assertions.

By inserting assertions at strategic points in the software, we are designating what constitutes correctness of the software at that point. However, assertion checking does not guarantee total program correctness. This is due in part to the not so trivial matter of assertion creation, selection and placement. As a minimum, each unique segment of code (i.e., a specific function) and in particular any branching statements must be assertion processed.
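A minimal modern sketch of the idea, using executable assertions built into Python rather than the preprocessor and assertion library described above (the routine and its assertion predicates are illustrative), is given below; a violated assertion stops the test run at the point of invocation, as discussed earlier:

    def integer_sqrt(n):
        """Largest r with r*r <= n, annotated with input, intermediate
        and output assertions in the assertion-checking style."""
        assert isinstance(n, int) and n >= 0, "input assertion: n must be a non-negative integer"
        r = 0
        while (r + 1) * (r + 1) <= n:
            r += 1
            assert r * r <= n, "intermediate assertion: r*r <= n must hold after each step"
        assert r * r <= n < (r + 1) * (r + 1), "output assertion: r is the integer square root"
        return r

    print(integer_sqrt(10))   # 3; any violated assertion halts the run where it occurs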
2.3.3  Symbolic Execution

The symbolic execution methodology (Cheatham [13], Clarke [15,16], Darringer [17], Deutsch [20], Howden [34,35], Huang [37], King [40,41]), although a viable method in its own right, is generally agreed upon to be an alternative to the proof of correctness methodology. It has the favorable attributes of both conventional testing and proof of correctness methodologies but lies somewhere in between with respect to complexity and level of difficulty of implementation.
During the symbolic execution process, the software being tested is "executed" (not in the traditional sense) using symbolic values or numeric constants as derived from the symbolic representation of the software (i.e., either symbolically assigned or inputted). Each variable on the left side of a program statement takes on the resulting symbolic representation of the right side of the statement (i.e., after substitution into each variable on the right with its appropriate symbolically assigned or inputted value for this instance or path through the software).

For example, if we symbolically execute the program segment in Figure 2.3.3.1, the resulting symbolic representations of I, J, and K are 2X-Z+5, X-Z+5 and 2(Y+Z)-2X-5, respectively. Note that, in the example, the symbolic representations of X, Y, and Z were used and not any previously assigned values of X, Y, and Z (i.e., the symbolic representation is used and not the content of the variable).
CALL COMPUTE (X, Y, Z);

COMPUTE: PROCEDURE (A, B, C);
BEGIN
   I = 2*A - C + 5;
   J = I - A;
   K = 2*B - I + C;
END;

Figure 2.3.3.1  Example Used in Symbolic Execution
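The substitutions in Figure 2.3.3.1 can be reproduced mechanically. The following Python sketch assumes the availability of the sympy symbolic algebra library (a present-day tool, not one referenced in this thesis) and simply carries the symbols X, Y and Z through the three assignments:

    from sympy import symbols, expand

    X, Y, Z = symbols("X Y Z")

    # Symbolically execute COMPUTE(X, Y, Z): each assignment substitutes the
    # current symbolic value of its right-hand side.
    A, B, C = X, Y, Z
    I = 2*A - C + 5
    J = I - A
    K = 2*B - I + C

    print(expand(I))   # 2*X - Z + 5
    print(expand(J))   # X - Z + 5
    print(expand(K))   # -2*X + 2*Y + 2*Z - 5, i.e., 2(Y+Z) - 2X - 5

The printed expressions agree with the symbolic representations given above.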
One obvious difficulty is how to handle a conditional statement (e.g., a statement of the type IF A > B ...). During the execution, the determination of A > B as true or false is difficult to resolve since A and B would be symbolically represented (not necessarily numeric data). Therefore, both paths must be symbolically executed for separate investigation. The program state (i.e., the state of all symbolic variables) is retained for one path while the other is being executed. Also, previous program states are preserved throughout the symbolic execution of a path to help clear up subsequent unresolved conditional statements. Hence, correct path selection is definitely one of the keys to successful symbolic execution.

A person using symbolic execution to test software must be knowledgeable enough about the software to eliminate unrealistic (not according to specifications) paths. Symbolic representation of all possible paths would approach infinity for even small programs. Therefore, we need some human intervention to eliminate these unwanted paths from the possible paths generated (path domain) for the given software. Using some interactive system like EFFIGY [see 41], the user is given the option to select the path(s) of his choice. Other well-known automatic verification systems which use symbolic execution in some form are SELECT [see 15] and DISSECT [see 34,35].
Another difficulty is expressing all of the software symbolically without ambiguities. In addition, index and pointer-type variables and loops present, in most cases, an unresolvable situation for the symbolic execution process. Again, human intervention becomes necessary.

As a consequence of the symbolic execution, symbolic or algebraic formulas are obtained. The tester must evaluate the formulas produced to determine if the desired derivations are obtained (i.e., consistent with the design specification) with respect to either implied or defined assertions.

Clarke [15] makes a good point in favor of symbolic execution. Using a trivial example, suppose a program path computes A*A instead of 2*A. If A=0 always results from the test data, then an error would not necessarily be detected. Symbolic representation would most likely show this error.

Also, a favorable attribute is that one symbolic execution through the software represents many test cases from the possible test case domain for the software being tested. Hence, symbolic execution can be used to generate test data for a particular path through the software. Accordingly, the symbolic formulas derived represent symbolically the set of all input data necessary to execute that path.

Although there still remain certain basic problems, as stated earlier, many in the software community expect that symbolic execution will lead the methods of software testing in the years to come.
2.3.4  Proof of Correctness

What is necessary in the software V&V process is a methodology (Anderson [3], Basu [5], Clarke [15], Darringer [17], DeMillo [19], Deutsch [20], King [40], Linden [42], Morgan [46], Reifer [54], Tannebaum [59]) that is an alternative approach to testing where programs are determined to be error free. The proof of correctness methodology is an attempt to fulfill this objective.
During the process of this methodology, to obtain verification that a software program is consistent with and satisfies the specification intended, formal proofs similar to mathematical proofs are used. Like the assertion checker process, assertions are placed strategically throughout the software (in the form of a design or implemented code). Each assertion characterizes the data attributes defined by predicates over a particular program segment. From these assertions, mathematical-like theorems expressed symbolically are derived and proven. In conjunction with these proofs, the software must also be shown to terminate (halt).

Typically, the software is divided into segments (design or code), each of which characterizes a unique function. Assertions are defined for each segment along with invariant conditions for any loops. When theorems corresponding to the implied assertion (input) are deemed true and the associated derived assertions (output) from the functioning of the software are also true, then the software is said to have been formally proven correct (i.e., if the software also terminates).
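As a small illustration of the kind of annotation from which such theorems would be derived (the program, its invariant and its pre- and postconditions are illustrative, and the assertions here are merely executed, not formally proven), consider the following Python routine:

    def sum_first_n(n):
        """Sum 0 + 1 + ... + n, annotated in the proof-of-correctness style.
        Proving the program means showing the invariant holds on entry, is
        preserved by the loop body, implies the postcondition on exit, and
        that the loop terminates (i increases toward n)."""
        assert n >= 0                       # precondition
        total, i = 0, 0
        # invariant: total == 0 + 1 + ... + i  and  0 <= i <= n
        while i < n:
            assert total == i * (i + 1) // 2 and 0 <= i <= n   # invariant before the body
            i += 1
            total += i
            assert total == i * (i + 1) // 2 and 0 <= i <= n   # invariant preserved
        assert total == n * (n + 1) // 2    # postcondition follows from invariant and i == n
        return total

    print(sum_first_n(5))   # 15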
Designing assertions is by no means a trivial task. The designer must be very knowledgeable about the software to derive reliable and sufficient assertions with respect to the software. Incorrect assertions are worse than no assertions at all. The task is most often tedious, requiring conceivably more work than the software it alleges to verify.

Proofs to verify a detailed design specification are much simpler to derive than proofs to verify code at the implementation phase because we are not concerned with the programming language, control directives, optimized version of the design or any other representation constraints.

It has been suggested, throughout the literature, that a very high level language should be used to assist in the formal representation of the software proof. Also, to enhance the chances for success during the formal V&V process, there should be adherence to good programming practices (i.e., structured programming and modularization of the software).
Formal proof of correctness finds its most useful application when the task is costly or the risks are high (e.g., missile systems, aircraft control and energy systems). The end (V&V of the software) justifies the means (costly proofs). Large scale programs as noted above would require enormous program verification derivations (formal proofs), a very costly endeavor if this type of software V&V is not warranted. Only those portions of the software deemed critical should be verified using formal proof of correctness.

It can be agreed upon that exhaustive testing is the only way to be 100 percent sure that the software will satisfy the specifications for all possible data, but this takes an inordinate amount of time and resources in all but the most trivial cases. An alternative to testing, as stated earlier, is to analyze the software and prove mathematically that it does what it is supposed to do. Unfortunately, the state-of-the-art in the area of mathematical proofs of programs has not reached the stage where they are written as the everyday tool of the programmer. Proof of correctness of software is most certainly before its time. The everyday use of this methodology is at least a decade away.
Glass [28] considers some of the advantages and
disadvantages of proof of correctness as a verification
tool for software.
The advantages of proof of correctness are:
1.  Provides a rigorous, formalized process.
2.  Forces analysis. The proof of correctness process forces the programmer to consider sections of his program which might otherwise only get a cursory analysis.
3.  Clarifies computation states. Writing out the assertions makes the programmer explicitly state his heretofore implicit assumptions, which define the state of the computation for specific points within the program.
4.  Clarifies dependencies. When executing the proof, the programmer becomes aware of what assumptions about the input data are implicitly used by the code in various sections of the system.

The disadvantages are:

1.  Complexity. Even for small, simple programs, the symbolic manipulations can be overly complex. This can lead to ...
2.  Errors. Because of the complexity it is easy to introduce errors into the computation of the statements to be proven as well as the proof of those statements.
3.  Arrays are difficult to handle.
4.  Lack of powerful-enough theorem provers. The proof process could be automated to reduce errors, except that there are no theorem provers powerful enough for most practical problems.
5.  Too much work. It often requires several times the amount of work to prove a program than was required to write the program.
6.  Lack of expressive power. It is often very difficult to create the output assertion for what is an intuitively simple computation.
7.  Nonintuitive. The procedure tends to obscure the true nature of the computation being analyzed rather than providing insight into the computation.
8.  Requires training. Like programming, the user of proof of correctness requires many hours of training as well as practice in order to use the technique well.
Also nonadvantageous is that not all aspects (e.g., arrays, control directives and pointer variables) of a software program can be easily characterized by assertions; formal verification systems are for the most part nontransferrable (i.e., the system is normally designed for a particular (unique) piece of software); the formal proof software could cost more than the software to be verified; and there exists the difficulty of proving that the program terminates for all possible inputs (the halting problem).
2.3.5  Mathematical Checker

This methodology (Glass [28]) finds its most valuable usage in the V&V of software which is predominately mathematical. It can be used throughout the software development life-cycle to analyze correct transition into code of equations or mathematically oriented algorithms from the intended specifications. Possible or probable problem areas which could affect the reliability of the software, such as improper scaling of data, division by zero, overflow (positive and negative), inaccuracies due to word length (truncation of data) and loss of significance, are only some of a variety of areas that the mathematical checker can be applied to.
The mathematical checker accepts input test data (possibly computed by hand or derived during analysis of equations or algorithms) and the coded software to be verified. The values computed by hand, prior to coding, are checked against values computed using the coded version of the equations. Printouts of the input data and all results (specifically intermediate and final computations) are obtained after execution, through postprocessing by the mathematical checker, for later manual analysis.
The mathematical checker will (i.e., if instructed to) generate an enormous amount of data for analysis. The tester has to use some discretion in selecting the proper amount of resulting output and also in zeroing in on the suspected area containing errors, to eliminate unnecessary analysis after the mathematical checker has done its job.
mathematical
checker has done its job.
2.3.6
Test Data Generation
Generating
test
data
which
are
then
appropri-
ately grouped together to produce hopefully reliable and
sufficient test cases is by no means a trivial undertaking.
Normally a programmer or other software personnel
tests
a
program by generating
satisfy
himself
that
the
a
few
program
test
is
data
sets
without
to
errors.
However, what is a sufficient number of test data sets?
The
test
Clarke
data
[15, 16],
generation
Darringer
methodology
[17],
Miller
(Branstad
[10],
[44, 45],
Myers
[47], Reifer [54]) seeks to resolve this dilemma by formali zing some systematic approach to
subset
of
infinite).
test
data
from
and
the
input
selection of
domain
a
(normally
The judicious use of a test data generation
methodology can help eliminate the haphazard approach to
generating test cases for software verification and validation.
47
The use of an exhaustive (infinite) number of test data sets to verify the software would surely verify the absence of errors. Since we know this goal is impossible to reach, we have to be more judicious in our selection of test data.
Hence, we resort to using a subset
of the infinite test data domain. Branstad [10] suggests
the following criteria for selection of test data:
o  The test data should reflect special properties of the domain [all possible set of test data], such as external or ordering properties or singularities.
o  The test data should reflect special properties of the function that the program is supposed to implement, such as domain values leading to external function values.
o  The test data should exercise the program in a specific manner, e.g., cause all branches or all statements to be executed.
Test data generation is still in its infancy. Random number generators were one of the early attempts at arriving at test cases. Today, the test data generation methodology is much more complex. Path analysis is conducted on the software; test predicates are derived; input data are generated and selected by some criterion; resulting in the objective - test cases.
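As a toy illustration of deriving test data from test predicates (a crude random selection, not the systematic criteria discussed next), the following Python sketch draws candidates from the input domain until each branch predicate of a simple two-way branch is satisfied a few times; the predicates, bounds and counts are hypothetical:

    import random

    # Test predicates derived from a two-way branch of the form  IF x > y THEN ... ELSE ...
    predicates = {
        "then-branch": lambda x, y: x > y,
        "else-branch": lambda x, y: x <= y,
    }

    def generate_test_data(per_branch=3, seed=7):
        """Draw random candidates from the input domain and keep enough of
        them to satisfy each branch predicate."""
        random.seed(seed)
        selected = {name: [] for name in predicates}
        while any(len(cases) < per_branch for cases in selected.values()):
            x, y = random.randint(-100, 100), random.randint(-100, 100)
            for name, pred in predicates.items():
                if pred(x, y) and len(selected[name]) < per_branch:
                    selected[name].append((x, y))
        return selected

    print(generate_test_data())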
Goodenough and Gerhart [31] have made important strides in the development of the test data generation methodology. They [31] discuss in their approach to the methodology a way of developing effective and reliable program test cases. I support their contention that the sources of data to generate reliable test data should come from all of the following:

o  The general requirement a program is to specify.
o  The program's [design] specifications.
o  General characteristics of the implementation method used, including special conditions relevant to the data structure and how internal data are represented, and the specific structure of an actual implementation.

Test predicates or constraints are generated from the above list. By a test data selection criterion (out of scope for this discussion), Goodenough and Gerhart [31] determine the validity and reliability of test cases derived from the data necessary to satisfy these test predicates. However, a weakness in this approach is that in some cases too many test data sets can be generated, thus presenting the tester with the problem of exhaustive testing.
Although the test data generation methodology definitely has a future as a V&V methodology, there exist some problem areas when attempting to generate reliable test data (leading to test cases), namely:

o  Determination of data paths.
o  Generation of too much test data.
o  Selection criterion for test data.
o  Non-reachable or missing data paths.
o  Conditional constraints.

2.3.7  Instrumentation

Instrumentation (Arthur [4], Branstad [10], Brown [11], Huang [38], Miller [45]) is a V&V methodology where statements (instruments) are inserted into a software program. The reason is to determine the validity of the software by analyzing the output resulting from these instruments after execution of the software. These instruments are usually removed from the software prior to the release of the final version to the user.
The two primary ways instrumentation is used in
the software V&V process are:
o
During
initial
debugging
of
the
software
prior to integration with other modules.
o
During integration or environmental testing
to gather system data prior to delivery of
the software.
During
debugging
of
a
module,
instruments
are
inserted in the software to provide verification that for
a given input the program output is correct.
If the out-
put is incorrect, additional instruments are inserted in
suspected areas of the software to isolate the error(s).
The
instrumentation
flags
output
is
normally
to vary the amount of output.
controlled
This procedure
by
is
repeated until all known or suspected errors are detected
and then fixed.
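The following Python sketch illustrates the flavor of such instruments (the probe routine, its flag and the instrumented function are hypothetical, not taken from any tool cited in this thesis): each probe counts how often a point in the code is reached and, when tracing is enabled, prints the values of interest there.

    TRACE = True                     # instrumentation flag: set False to silence the probes
    probe_counts = {}                # how many times each probe point was reached

    def probe(label, **values):
        """An 'instrument': record that a point in the code was reached and,
        when tracing is on, print the values of interest at that point."""
        probe_counts[label] = probe_counts.get(label, 0) + 1
        if TRACE:
            print(f"[{label}]", values)

    def find_max(items):
        probe("entry", n=len(items))
        best = items[0]
        for x in items[1:]:
            if x > best:
                probe("new-max", old=best, new=x)
                best = x
        probe("exit", result=best)
        return best

    find_max([3, 9, 4, 17, 2])
    print(probe_counts)              # e.g. {'entry': 1, 'new-max': 2, 'exit': 1}

Removing the probes (or turning the flag off) before release corresponds to stripping the instruments from the delivered software, at the price of the extra memory and execution time noted below.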
During integration or environmental testing, instrumentation takes on a different role. The primary task is to gather data for evaluation of the software as well as the hardware. The data is recorded via some device (magnetic tape, disk, paper tape or cards) for later analysis by software personnel.

The instrumentation methodology is not without its unfavorable side effects. Instruments require additional memory space in the computer memory. Also, because of the extra code, execution time is lengthened. This could be of some significance in real-time software where for the most part execution time is paramount.

We can find a variety of uses of instrumentation in the verification of software. A few of the many areas are listed below:
o    Measurement of execution time
o    Path traversal analysis
o    Loop control (single or nested)
o    Data flow analysis
o    Boundaries on variables
o    Module calling sequence
o    Parameter passing
o    Module interface
2.3.8  Test Driver
This methodology (Panzl [52]), which has existed since the inception of programming, still plays an important role in the V&V of software (more specifically during the implementation phase). Normally, each piece (module) of the program is tested individually before it takes its place as part of the total software program (here the bottom-up testing approach is used). The software modules that will eventually provide input, call this software module, or otherwise use the output from this software module normally do not exist or have not been verified at this stage in the software development process.
Since, for the most part, these modules are not independent programs, there exists a need for some other program to execute ("drive") these individual modules through test cases. The test driver serves this purpose by inputting a number of data sets through the software module to be tested and then recording the resulting outputs.
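A minimal test driver of the kind just described might look like the following sketch (present-day Python used only for illustration; the module under test, its test data sets, and the tolerance are hypothetical):

    # Illustrative sketch only: a minimal test driver for a single module.
    def flight_time(distance, velocity):
        """Hypothetical module under test."""
        return distance / velocity

    # Each test case: (input data set, expected output)
    test_cases = [
        ((1000.0, 250.0), 4.0),
        ((0.0, 250.0), 0.0),
        ((500.0, 125.0), 4.0),
    ]

    def drive(module, cases, tolerance=1e-6):
        """Feed each data set through the module and record the outcome."""
        results = []
        for inputs, expected in cases:
            actual = module(*inputs)
            results.append((inputs, expected, actual, abs(actual - expected) <= tolerance))
        return results

    for inputs, expected, actual, passed in drive(flight_time, test_cases):
        print(inputs, "->", actual, "expected", expected, "PASS" if passed else "FAIL")

Note that the driver code is entirely separate from the module it exercises, which is the point made in the next paragraph.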
Every attempt should be made to keep the test driver separate from the code that it will help to verify. When the test driver is removed from the system, the verified software should not have to be altered. Unnecessary retesting is thus avoided.
Test drivers are designed solely for testing of the software and then, as a matter of course, discarded when testing is completed. This could be expensive if too much time and effort is spent developing a reliable test driver. Typically, test drivers are coded using a "quick and dirty" approach with little complex structure development. Since the code is a throwaway, it does not have to be optimum code. However, good programming practices should be adhered to whenever possible. You just might have to spend more time testing the test driver than originally intended.
Some studies, as seen in the literature (Panzl [52]), have been made in enhancing the capabilities of the test driver methodology. Emphasis has been placed on developing test drivers that can be used to test the software initially and thereafter when modifications are made to the software (regression testing). We are a long way from the technology that would allow repeated use of the original test driver without any modifications to it.
Rarely does the ideal situation arise when the original test driver can be used throughout the life of the software for which it was designed. Basically, most test drivers are designed in such a manner that they apply only to a specific piece of software at a specific stage of software development. When this problem is solved, a definite effort will undoubtedly be put forth in developing better and more reliable test drivers. Test drivers will then no longer be expensive throwaways.
2.3.9  Test Coverage Analyzer
This methodology (Branstad [10], Bolthouse [32], Miller [44,45]) provides the capability of analyzing the test coverage of software during execution of test cases designed to verify the software. This analysis is based on the counting of each occurrence of logic branches and associated code (program segments) during execution of the software.
The primary objective is to find out if all program segments (total coverage) have been exercised. To provide this information for analysis, the software in question must be instrumented (code added to provide a program segment occurrence count) at each program segment. The occurrence count is stored during execution of the software in a table for later output by some post-processor.
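The occurrence-count idea can be sketched as follows (a hypothetical present-day Python example; the segment names, the instrumented routine, and the single test case are mine and are not drawn from any analyzer cited above):

    # Illustrative sketch only: counting program-segment occurrences during test runs.
    segment_count = {"entry": 0, "loop body": 0, "then branch": 0, "else branch": 0}

    def hit(segment):
        segment_count[segment] += 1      # instrumentation added by the analyzer

    def count_high(values, limit):
        hit("entry")
        high = 0
        for v in values:
            hit("loop body")
            if v > limit:
                hit("then branch")
                high += 1
            else:
                hit("else branch")
        return high

    count_high([1, 2], limit=4)          # a single test case
    # post-processing: segments with a zero count were never exercised
    unexercised = [s for s, n in segment_count.items() if n == 0]
    print(segment_count)
    print("segments with zero occurrence count:", unexercised)

With this one test case the "then branch" segment retains a zero count, which is exactly the condition the analyzer flags for further attention.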
Typically, a test coverage analyzer receives the instrumented source code as input data. The analyzer further modifies the software to ready it for the test coverage process. However, the original source code is kept intact. After each test case is run, the occurrence count of each program segment is analyzed, and special attention is paid to program segments which have a zero occurrence count.
When all test case runs have been completed, program segments with still a zero occurrence count require further analysis. Some of the reasons for a zero occurrence count are:
o    Test cases not sufficient to exercise program segments.
o    Program segment is unreachable code.
o    Software not instrumented correctly by the analyzer.
The test coverage process is repeated with new test cases or modified software or both until total test coverage has been accomplished. A side benefit of this process is the detection of program segments with high occurrence counts. This typically suggests areas that are prime candidates for optimization, i.e., if execution time is a significant consideration.
Although a methodology to verify that every program segment has been exercised does not guarantee correctness of the software, it does provide information about unexercised code which could possibly contain errors. In the final analysis, all other program segments that cannot be exercised must be unreachable code. In such a case, we really do not care, because they do not have any bearing on the outcome of the V&V process unless it was code that should have been reachable. Not considering the waste of memory and good programming practices, we normally ignore this code.
Also, we are provided with information as to how good the test cases are. This helps in designing better and more reliable test cases. According to Glass [28], what you are really doing is measuring the effectiveness of the test cases by analyzing the program.
The test coverage analyzer is a viable V&V methodology, but there are some disadvantages or unfavorable aspects of the methodology:
o    Time consuming in real-time applications.
o    All program segment combinations are not taken into consideration (too many).
o    Additional memory necessary to host the analyzer and associated software.
o    Results to analyze could be enormous (e.g., with large scale programs).
o    No means of identifying missing program segments.
CHAPTER III
COMPARATIVE COST ANALYSIS AND RANKING
OF SOFTWARE V&V METHODOLOGIES
57
58
3.1  Introduction to Chapter III
The software development cycle consists of four phases in which the software exists in varying degrees of detail or levels of abstraction (Figure 3.1.1). Further, the software development cycle begins with the Functional Requirements phase, where the needs of the user are formalized into a specification document (interfaces between hardware and software functions are emphasized), followed by the Design Specification phase, which consists of design analysis and algorithm development. Next is the Implementation phase, where the design is transformed into a software language (assembly or high level) or other understandable computer system input. Finally, the Maintenance phase, where the primary task is maintenance of the software in an operational mode.

As mentioned previously (Chapter I), the cost of software is literally leaving hardware in its tracks. Each phase of the software development cycle with respect to testing exacts a certain toll on this overall cost of producing the final software product (Figure 3.1.2 - not exactly drawn to scale). Hence, in evaluating where a methodology could be best utilized, we have to ascertain where costs could be minimized.
[Figure 3.1.1  Software Development Phases: Functional Requirements, Design Specification, Implementation, Maintenance.]
[Figure 3.1.2  Software Development Testing Cost Per Phase: A - Maintenance Phase (50%); B - Functional Requirements Phase; C - Design Specification Phase (15%); D - Implementation Phase (30%).]
Within this framework of the software development process, we will shortly turn to the analysis and ranking of each software V&V methodology after a further discussion of each phase. The V&V methodologies have been selected and categorized for each software development phase (Appendix C) mainly through a subjective selection process based on my own experience with these methodologies, work associates, information gleaned from the literature, and the inherent attributes of each V&V methodology. Within each category (phase), the primary task is to rank each methodology (Appendix B1-B4) with respect to criterions such as cost, ease of use, and reliability. These criterions are based within the framework of real-time large scale programs with which I have some familiarity. Also, no particular effort was made to delineate any differences between manual and automatic implementation of these methods.
3.2  Functional Requirements Phase
More specifically, during the Functional Requirements phase an analysis of user needs results in a functional representation of the required system. The primary task of this phase is to functionally characterize the system in terms of inputs and outputs and to also determine hardware/software functional partitioning. During the functional requirements phase one is usually providing answers to the question "What has to be done?", as opposed to "How will it be done?", which is a question asked during the design specification phase.
In particular, the functional requirements documentation consists of software requirements concerning the
functions to be performed by the software, special tolerances on the software, and test requirements for ascertaining that the functional requirements have been satisfied.
3.3  Design Specification Phase
The Design Specification phase bridges the gap between a user concept formulated into functional requirements and an implemented system. The primary task of this phase is to describe in some detail the specifics of the software to be implemented. The following information, though not limited to it, should appear in the design specification documentation:
o    Functions, algorithms, special tolerances or processing techniques.
o    Specification of known data bases.
o    System structure (modularization of software).
o    Special test requirements.
o    External/internal interfaces.

3.4  Implementation Phase
During the Implementation phase, the software exists as a symbolic representation of the detailed design specification. The software is coded in some high level or assembly language targeted for input into a computer system.
The primary task in this phase is to correctly
interpret the design specification and translate it into
some software language understandable by the host computer.
At this stage, misconceptions or incorrect inter-
pretation of the design specification can result in very
costly software errors.
3.5  Maintenance Phase
Finally, the Maintenance phase (Glass [29]) encompasses all of the activities of the previously discussed phases in varying degrees. Minor changes are made during this phase either to correct a problem, to add a new function, or to make a modification to an existing function.
For any major changes or upgrades to the software, the entire software development cycle is put into cyclic motion. The V&V task during this phase is rarely understood and for the most part thought of as only a minor task in the overall software development process.
As mentioned above, the personnel maintaining the software system have to essentially accomplish the same tasks of the other phases discussed. However, the overall cost to maintain a system is much more than that incurred during the other phases. This is mainly due to the fact that errors detected this late in the software development process normally require a lot of backtracking to determine if any adverse side-effect has propagated to other parts (i.e., whether it be code, design or requirements) of the software.
3.6  Ranking Process
Each V&V methodology selected for a particular software development phase is ranked using ranking keys (Table 3.6.1) with respect to the criterions as stated earlier. Each methodology is discussed and ranked with respect to the total cost. The total cost is a combination of two costs, namely, procurement (obtainment of the software package) and usage (incurred either for personnel or computer time). Secondly, each is discussed and ranked with respect to ease of use. The levels of ease of use are specified to be in the range from easy to difficult, each level representing a difficulty factor 20 percent greater than the preceding level. Finally, each is discussed and ranked with respect to reliability. A determination is made of the efficiency of a methodology in the detection of errors. Each level of reliability is classified to be within a certain percentage of being totally reliable.
TABLE 3.6.1  RANKING KEYS

Total Cost                           Ease of Use       Reliability
(C1) very high  (> $200,000)         (E1) easy         (R1) excellent  (within 5%)
(C2) high       ($75,000-$200,000)   (E2)              (R2) very good  (within 10%)
(C3) medium     ($10,000-$75,000)    (E3)              (R3) good       (within 20%)
(C4) low        ($1,000-$10,000)     (E4)              (R4) fair       (within 30%)
(C5) very low   (< $1,000)           (E5) difficult    (R5) poor       (within 50%)
66
These ranking keys are based primarily on my own experience and also on data accumulated during a survey taken in November of 1979. The software personnel surveyed consisted of people in my software engineering laboratory and also people who worked with me on Project X (discussed later in Chapter IV). A total of 123 software personnel were surveyed, with 118 responding to the survey. A sample questionnaire is shown in Figure 3.6.1 with the results (averages) of that survey. The cost factor was not considered since many of the people surveyed were not cognizant of this factor. In addition, each participant was told to leave the rating blank if he was inexperienced with the V&V tool. Further, the ratings were not necessarily linked to a software development phase and not all tools were rated as in this thesis.
SOFTWARE V&V TOOLS SURVEY                                DATE: NOV 1979
NAME: JOCK SOFTWARE                  # YRS. SOFTWARE EXPERIENCE: 5.72

Rate each tool from 1 to 5 (1 being the highest). This rating should be based on your experience with the tool. More specifically, rate each with respect to ease of use and error detection capability (reliability).

Tool                        Ease of Use    Reliability
Checklist Review               2.27           3.61
Design Review                  1.19           2.79
Desk Checking                  1.52           3.01
Instrumentation                2.36           3.27
Mathematical Checker           2.18           2.11
Requirement Review             1.44           2.46
Simulation                     2.89           3.53
Structural Analyzer            2.16           2.41
Structured Walkthrough         1.65           2.38
Test Coverage Analyzer         3.23           3.76
Test Data Generation           4.57           3.80
Test Driver                    2.83           2.13

Figure 3.6.1  Software V&V Tools Survey
68
Some interesting observations that can be made about this survey are:
o    The design review was the most preferred tool with respect to ease of use. However, it did not rate as well with respect to reliability.
o    The test data generation tool was considered the most difficult to use and most unreliable.
o    Most tools received at least a moderate rating.
o    The average years of software experience was at least 5 years.
3.7  Analysis of Methodologies with Respect to Ranking Criterions
In this section, we now turn to the analysis and ranking of each methodology according to the criterions specified earlier. The ranking is accomplished using the ranking keys (Table 3.6.1) derived in Section 3.6 with respect to the particular software development phase(s) the methodology has been selected for.

A separate analysis and ranking of the V&V methodologies with respect to the maintenance phase will not be presented. As stated previously in Section 3.5, the maintenance phase is an iterative software process of the other phases.
Assertion Checker
The assertion checker (Section 2.3.2), used during the implementation phase, is relatively new and somewhat more complex than many of the other methodologies discussed (Section 2.3). Hence, it would be quite understandable why only a few assertion checker systems have been built.
The reliability of this checker depends heavily on the application of a sufficient number of assertions strategically placed throughout the software. The checker has the potential of being reliable, but designing assertions is no trivial task; neither is the placement of them in the software. This is one of the main difficulties in using this method; another is the analyzing of the results. Someday we will get to the stage where assertions are almost automatically generated (from requirements/design specification) for us.
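As an illustration of what such embedded assertions might look like (a hypothetical sketch in present-day Python; the routine, its limits, and the assertions are my own examples and are not taken from any assertion checker system discussed above):

    # Illustrative sketch only: assertions checked at run time on hypothetical limits.
    def check(condition, message):
        """A minimal assertion checker: report a violation instead of halting."""
        if not condition:
            print("ASSERTION VIOLATED:", message)

    def update_velocity(velocity, acceleration, dt):
        check(dt > 0.0, "time step must be positive")
        check(abs(acceleration) <= 50.0, "acceleration outside expected range")
        new_velocity = velocity + acceleration * dt
        check(abs(new_velocity) <= 1200.0, "velocity exceeds design limit")  # output assertion
        return new_velocity

    update_velocity(300.0, 20.0, 0.5)    # satisfies every assertion
    update_velocity(300.0, 90.0, 0.5)    # violates the input assertion on acceleration

Deciding which conditions to assert, and where, is exactly the design effort referred to above.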
It is difficult to estimate the procurement cost
since not many systems are currently available. This cost
would certainly depend on the amount of checking capabilities required by the user.
In most cases,
the system
(normally language-dependent) would have to be built from
scratch.
The procurement cost could range from $50,000
to $200,000 based on information gleaned from the literature.
The usage cost is directly dependent on the salary
of the person designing the input assertions and conducting the analysis after processing by the checker.
Checklist Review
The success or failure (reliability) of the checklist review (Section 2.2.4) depends heavily on the checklist devised and the people using it. In other words, the review is as reliable as the checklist devised and as the knowledge of the people using it.
Further, the checklist can vary in form from being detailed to almost abstract depending on the complexity of the functional requirements being verified. Typically, the checklist devised for the functional requirements phase would not be as detailed as the one devised for the design specification or implementation phases. However, the checklist review does take a more detailed look at the functional requirements than the requirement review (Section 2.2.1).
Basically, the checklist is easy to use once obtained. The difficulty with this methodology lies in the effort that has to be put forth in creating the checklist.

The usage cost of the checklist review varies somewhat but depends mainly on the number of people (i.e., their salary) participating in the review. There is no procurement cost since the checklist and associated guidelines or procedures are normally constructed in-house.
One of the main differences between the functional requirements and design specification phase, with respect to this methodology, is the construction of the checklist. The checklist should be more detailed than previously stated for the functional requirements phase. Being more detailed, the reliability of the review is increased for the design specification phase. However, again this depends highly on the participants involved making use of the checklist.

The checklist now becomes easier to use during the design specification phase by nature of being more detailed. In other words, we are now dealing with less abstract ideas. However, the task of creating the checklist is much more involved since we cannot be as general as before. The total cost is basically the same as previously discussed for the functional requirements phase. Some additional cost might be involved due to a longer review duration.
As compared to the functional requirements and design specification phases, the appearance of the checklist reflects the much more detailed implementation phase. We are at the stage where the review of such items as variable types, range boundaries of variables and mixed-mode operations, to name only a few (see Section 2.2.6 for possible items for a checklist), are all a part of the review process. The reliability of the review process will tend to increase beyond that seen in previous phases.
The ease of use and cost of the
review during
the implementation phase compares to that of the design
specification phase.
73
Design Review
This review process is one of the primary techniques used during the design specification phase. Its reliability depends considerably on the participants involved in the review. If the participants are knowledgeable and follow the specified guidelines and procedures, the review can be a great success. The reliability of the review tends to decrease when the optimum (1-3 hours) review time is extended.

Like the requirement review, there is no procurement cost. The usage cost is salary dependent (designer/reviewers preparation time and hours spent during the review).
Typically, the design review is an easy process
when the specified guidelines are adhered to by all participants.
The review moderator has to be in control of
the review at all times.
Desk Checking
The undisciplined nature of this verification process during the implementation phase has a definite adverse effect on its reliability. For the most part, the lone programmer will not be very critical of his work. Typically, obvious mistakes but not many subtle ones will be detected.

It is about the easiest to use of all the methodologies discussed. Working alone does have some advantages. Some of the complexities involved in working with other people are avoided.

No procurement cost is involved. The user has his own methods (normally undisciplined) which may or may not work for him. The usage cost is the salary of the lone programmer.
Instrumentation
This methodology is used at the implementation phase where actual code has been generated. The placement of the instruments in the software is critical for this instrumentation process to work. Sometimes final outputs can give a false sense of security. Intermediate results could possibly have offsetting effects on each other, hiding subtle errors. Hence, the reliability of this process depends heavily on how well the software has been instrumented and, secondly, on the analysis of the results.
Software is easily instrumented to a certain extent. Basically, it is just additional code added to the software. However, correct analysis of where and what type of instruments are to be inserted in the software is not a trivial task.
There is normally no procurement cost. Typically, instrumentation library functions are developed in-house when necessary. The usage cost involves the programmer's time to analyze the software for the insertion of the instruments and to accomplish numerous computer runs. Each computer run is analyzed to determine if additional instruments are necessary or if an error has been detected.

Mathematical Checker

This methodology is used primarily in mathematically oriented software. During the functional requirements phase it is mainly used for high level analysis or feasibility study.
It is relatively easy to use, depending on the complexity of the equations being evaluated. The person using the checker must be willing to spend a considerable amount of time analyzing the results to take full advantage of the checker. This involves analyzing computer runs (typically many) and fixing any errors detected, then repeating this process until the desired results are obtained.

The procurement cost could vary between $100,000 and $300,000. Mathematical checkers are normally purchased or leased but sometimes developed in-house. However, once purchased or developed, the cost per computer run will obviously decrease. The usage cost depends primarily on the billing rate of the computer installation used.
The mathematical checker is very reliable. It gives you everything you want to know in the form of computer printouts. However, quite understandably, the results are only as good as the inputs you provide to the checker.

During the design specification phase the checker becomes very useful for detailed mathematical analysis. This is due mostly to the fact that we are dealing with less abstract concepts as compared to the functional requirements phase. Both usage and procurement costs are about the same for this phase as the preceding phase. Some additional cost could be incurred from computer runs.

During the implementation phase, the checker is very useful and reliable in analyzing or detecting errors in such problem areas as round-off, scaling, and overflow. Both usage and procurement costs are approximately the same as indicated for the two preceding phases discussed.
Proof of Correctness
This methodology is primarily used at the functional requirements and design specification phases but is still very much at the research stage. It has been found, as stated throughout the literature, to be reliable mostly when verifying small programs, program segments or algorithms in general. This is due mainly to the fact that derivation of the assertions (Section 2.3.4) used in the formal proof process is difficult (in most cases not easy; sometimes close to impossible).

Since most formal proof of correctness tools designed so far are at best experimental, it is difficult to arrive at a reasonable quantitative estimate of the cost of procuring a working system. The usage cost would be a combination of the salary of the person using the system and the computer time required.

No major differences can be indicated between the functional requirements and design specification phases except to note that the methodology is easier to use in the latter because of the more detailed nature of that phase (i.e., the proofs are more easily generated). In other words, we are dealing with less abstract concepts.
Requirements Review
The review process is a most valuable asset in the verification of functional requirements. Its reliability depends highly on the capabilities of the personnel participating in the review. In other words, the review will be only as reliable as the participants are capable.

The review is not normally difficult for the participants. If the procedures set forth are followed, the review can be conducted with the utmost of ease. However, difficulties arise when the moderator loses control of the review.

A procurement cost does not exist. However, there is a usage cost (the participants' salaries). The cost normally varies between $10 and $25 per hour depending on the professional level of each participant.
Simulation
This process is recommended for complex or unavailable systems. Although a very powerful tool, benefits would be reaped only if it is designed for systems as indicated previously.

Off-the-shelf simulators for the functional requirements phase range in procurement cost from $10,000 to $50,000. In addition, usage cost is incurred in designing the simulation model for the system being verified. Also, costs are incurred for simulation computer runs and analysis of the data.

The reliability of the simulation process depends on how well the intended system is modelled and also on the capability of the personnel conducting the simulation and analysis.

The simulation process is easily implemented. It does require, however, knowledge of the simulation hardware and procedures and the intended system to be modelled.
With respect to the design specification phase, the reliability depends on how well the system being verified was modelled together with the required capabilities of the person conducting the simulation. During this phase, since the concepts are less abstract, the simulation task is somewhat easier and reliability tends to increase.

The procurement cost of the simulation system is in the range of $50,000 to $100,000, an increase over the previous phase since more simulation options are usually requested. The usage cost remains essentially the same.

Modelling of the system under verification tends to be easier during the implementation phase because we have (or at least should have) all the details of the software. Since our modelling task is easier, this implies that fewer mistakes should be made. Further, an increase in reliability is probable.

The procurement costs range from $100,000 to $300,000. The overall usage cost normally increases from previous phases since more simulation runs are made. This is due in part to the detailed nature of the software at this phase.
Structural Analyzer
The reliability of this analyzer is essentially the same as some of the methodologies (desk checking, checklist review and structured walkthrough) that it has automated. However, since the process is automated, there is a probable chance that some subtle errors might be detected.

Since structural analyzers are relatively new, not many off-the-shelf systems can be purchased. The system is normally language dependent, hence decreasing its portability. The procurement cost to design such a system could range from $25,000 to $100,000 depending on the options required by the user. The usage cost incurred is dependent on the salary of the personnel using the analyzer.
82
The analyzer is used with very little difficulty.
Possibly the most difficult and time consuming part
of the process is analyzing the resulting output of the
analyzer.
Structured Walkthrough
The structured walkthrough (sometimes referred to as a code walkthrough) has been used almost since the inception of software programming. The reliability of the walkthrough depends on the members of the software team conducting the walkthrough. Having aggressive and inspective type team members typically makes for favorable results. However, it must be stressed that subtle errors are difficult to detect.

The usage cost of this walkthrough is based primarily on the salary of the participants. This of course would depend on the professional level of the participants. No procurement cost is incurred for the walkthrough.

Like the checklist review discussed earlier, the walkthrough is easily implemented. There is some preparation time spent by the participants prior to the walkthrough. The better the participants are prepared, the easier and less painful the walkthrough process will be. Adherence to the guidelines or procedures is vital to a successful walkthrough during the implementation phase.
Symbolic Execution
The symbolic execution process is still mainly at the research stage and not widely used as an everyday tool of the programmer. It is, however, considered to be an alternative to the proof of correctness methodology discussed earlier.

Similar to the proof of correctness methodology, this method is most reliable when dealing with small program segments, typically mathematically oriented. The process sometimes becomes difficult and unmanageable with large program segments or complex algorithms. Most certainly, the reliability decreases as the size or complexity of the program increases.
Symbolic execution is not easily used. Difficulties arise when attempting to express the software symbolically. Not all software constructs can be easily expressed symbolically. Also, the analysis or proof of the formulas derived is by no means a trivial task.
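The flavor of the difficulty can be seen even in a toy sketch such as the one below (present-day Python, purely hypothetical; it is not one of the systems discussed in Section 2.3.3): the output of a two-branch routine is expressed symbolically, and each path carries a path condition that must then be analyzed or proved.

    # Illustrative sketch only: symbolic output expressions and path conditions
    # for the routine "if x > limit: x = limit; return x * 2", kept as text.
    def clip_symbolically(x_sym, limit_sym, take_true_branch):
        path_condition = []
        if take_true_branch:
            path_condition.append("(%s) > (%s)" % (x_sym, limit_sym))
            result = "(%s) * 2" % limit_sym
        else:
            path_condition.append("not ((%s) > (%s))" % (x_sym, limit_sym))
            result = "(%s) * 2" % x_sym
        return result, path_condition

    # Symbolic inputs X and L; explore both paths of the conditional.
    for branch in (True, False):
        expr, cond = clip_symbolically("X", "L", branch)
        print("path condition:", " and ".join(cond), "   output:", expr)

Real constructs such as loops, arrays and procedure calls are far harder to express in this fashion, which is precisely the difficulty referred to above.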
There are some systems (discussed in Section 2.3.3) already built that are still being improved, which provide symbolic execution as a V&V method. Since these systems are so complex and require many man-hours to build, the procurement cost is relatively high (i.e., $100,000 to $500,000). The usage cost is mainly salary dependent. Computer time normally is a factor in arriving at the total usage cost.
Test Coverage Analyzer
This analyzer basically provides a path traversal analysis of the software being verified. It is highly dependent on a sufficient number of test cases to exercise every path of the software (not an easy task for large-scale programs).

If we can devise a sufficient and reliable number of test cases that execute every path, then the reliability of the software would be what we want it to be. However, we normally have to settle for something much less.

The test coverage process is relatively easy. However, building test cases is both time consuming and difficult. Once the necessary test cases are built, the only remaining task is analyzing the output. This could be the largest task in the total test coverage process.

Sometimes test coverage analyzers are available as part of a system which provides additional functions. Depending on whether the system is separate or part of another system, the procurement cost could range from $50,000 to $200,000. The usage cost is a combination of computer time and analysis of the output.
Test Data Generation
A systematic approach to test data generation does avoid some of the problems that a haphazard approach to generating test data presents. With the latter approach, it is mostly a trial and error process. Also, a sufficient and reliable set of test data is not normally attained.

The reliability of this methodology is directly related to the inputs or conditional constraints derived for the test data generation process. In addition, since a large number of test data are obtained as output, we still have to have some means of selecting only a sufficient amount to reliably test our software.
86
This methodology is generally not easy to use. The inputs or initial constraints required to seed the test data generation process are for the most part difficult to construct. Also, the selection of test data from the test data domain (i.e., all test data generated) is no trivial task either.

The procurement cost normally ranges from $50,000 to $200,000. Also, not many systems are presently available, and they are usually language dependent. The major factors involved in usage cost are computer time and analysis of the output data.
Test Driver
The test driver is utilized as a V&V tool more than any other methodology discussed during the implementation phase. It is normally easy to implement since there are not many restrictions imposed. It does not have to have the niceties of the software that it will be used to test since it is normally just throwaway code.

The reliability of the test driver depends solely on the person constructing the driver. In other words, how exhaustive the test cases contained in the driver are determines the reliability of the test driver.

There is no procurement cost for this methodology. The test driver is constructed specifically for the software at hand. The usage cost involves construction of the test driver and the associated analysis before and after the process is invoked.
CHAPTER IV
ANALYSIS OF ERROR OCCURRENCE
AND DETECTION
88
89
4.1  Introduction to Chapter IV
In this chapter, some aspects of error occurrence and detection with respect to a software development phase are discussed. In particular, to better understand where V&V methodologies should be concentrated, an attempt has been made to provide some answers to the following questions:
o    What kinds of errors occur most frequently?
o    In what phase of software development do they occur?
o    In what phase of software development are they detected?
o    What factors correlate with error occurrence and detection?
Although the above list of questions could probably be expanded, the list should be sufficient for our discussion. If answers could be found to some or all of these questions, we would be well on our way to achieving reliable software.
Some error classification schemes and error types are discussed. Also, various studies (Alberts [2], Bowen [8, 9], Endres [21], Gannon [24], Glass [25], Howden [33], Rubey [56], Schneidewind [57], et al.) conducted on software errors, in addition to information on error analysis gained through my own experience, are used as a basis for arriving at answers/conclusions concerning error occurrence and detection.
4.2  Error Types and Classifications
To begin with, it should be stated what is meant by a software error. Basically, it is any deviation that occurs in the translation (software development process) from the user requirements into executable software. This definition is sufficient as long as we assume that the user requirements have been specified correctly. If the requirements have to be changed, so does the implemented software.

The error classification schemes that will be discussed shortly differ somewhat in general content. However, hopefully some minimal classification scheme can be devised for a given user. The intent is to present some of the various classification schemes currently in use today so that we can begin to find answers to some of the questions posed earlier in the introduction of this chapter. These classification schemes are a means by which data can be collected for error analysis.
Alberts [2], in his error study, classifies errors into three groups, namely, design, logic and syntax. He defines these as follows:
o    Design errors - require a change in the specification used by the programmer.
o    Logic errors - translation from system design to a programmable form.
o    Syntax errors - compiler or assembler detected.
The frequency distribution of these errors (data drawn from a variety of sources) was found to be 46 to 64 percent for design errors, 15 percent for syntax errors and 21 to 38 percent for logic errors.
Bowen [8] describes a few classification schemes, of which the TRW reliability study (sponsored by Rome Air Development Center) is the most interesting and most widely adopted for use. TRW-Redondo Beach devised (during a 2.5 year study) a software error classification scheme with twelve major categories (their results from Project 5 data are shown below in parentheses) as follows:
o    Computational               (12.1%)
o    Logic                       (24.5%)
o    Data Input                  ( 7.8%)
o    Data Handling               (11.0%)
o    Data Output (included as part of the 7.8% above)
o    Interface                   ( 7.0%)
o    Data Definition             ( 8.9%)
o    Data Base                   (16.2%)
o    Operation                   (N/A)
o    Documentation               (N/A)
o    Problem Report Rejection    (N/A)
o    Other                       (12.5%)
Bowen [8] also discusses a Hughes Aircraft Army project which used the following classification scheme based
on the TRW scheme:
o    Computational               ( 4.0%)
o    Logic                       (38.5%)
o    Data Definition             (20.5%)
o    Data Handling               (14.0%)
o    Data Base                   ( 3.0%)
o    Interface                   ( 4.5%)
o    Operation                   ( 1.0%)
o    Documentation               ( 0.5%)
o    Problem Report Rejection    (N/A)
o    Data Input                  (N/A)
o    Data Output                 (N/A)
o    Other                       (13.5%)
Further, Bowen [8] discussed another Hughes classification scheme which attempts to provide more feedback to management or other concerned personnel. This minimal error classification scheme is as follows:
o    Source   - Phase in which the error of omission/commission was made (e.g. Requirement, Design, Coding, Test, Maintenance, and Corrective Maintenance).
o    Cause    - The causal description of the error, rather than symptomatic.
o    Severity - The resulting effect of the error on mission performance (e.g. Critical, Major, and Minor).
The causal categories are defined as follows:
o Design
     Nonresponsive to requirements
     Inconsistent or incomplete data base
     Incorrect or incomplete interface
     Incorrect or incomplete program structure
     Extreme conditions neglected
o Interface
     Wrong or nonexistent subroutine called
     Subroutine call arguments not consistent
     Improper use or setting of data base by a routine
Improper handling of interrupts
o
Data Definition
Data not initialized properly
Incorrect data units or scaling
Incorrect variable type
o
Logic
Incorrect relational operator
Logic activities out of sequence
Wrong variable being checked
Missing logic or condition
     Loop iterated incorrect number of times (including endless loop)
     Duplicate logic
o Data Handling
     Data accessed or stored improperly
     Variable used as a flag or index not set properly
Bit manipulation done incorrectly
Incorrect variable type
Data packing/unpacking error
Units or data conversion error
Subscripting error
o
Computational
Incorrect operator/operand in equation
Sign convention error
Incorrect/inaccurate equation used
Precision loss due to mixed mode
Missing computation
Rounding or truncation error
o
Other
Not applicable to software reliability analysis
     Not compatible with project standards
Unacceptable listing prologue/comments
Code or design inefficient/not necessary
Clerical
Endres' [21] analysis deals primarily with errors in system programs. The analysis was done on errors discovered during internal tests of the components of the IBM operating system DOS/VS. He classified the errors in this study into the following groups:
o    machine error
o    user or operator error
o    suggestion for improvement
o    duplicate (of a previously identified program error)
o    documentation error
o    program error (not previously identified)
In addition, Endres [21] did a study of error distribution by types of errors. He categorized the errors into three major groups. He then broke them down into subgroups. The subgroups are too numerous to list here. The major groups are as follows:
o Group I
     Machine configuration and architecture                  (10%)
     Dynamic behaviour and communication between processes   (17%)
     Functions offered                                       (12%)
     Output listings and formats                             ( 3%)
     Diagnostics                                             ( 3%)
     Performance                                             ( 1%)
o Group II
     Initialization (of fields and areas)                    ( 8%)
     Addressability (in the sense of the assembler)          ( 7%)
     Reference to names                                      ( 7%)
     Counting and calculating                                ( 8%)
     Masks and comparisons                                   ( 2%)
     Estimation of range limits (for addresses and parameters) ( 1%)
     Placing of instructions within a module, bad fixes      ( 5%)
o Group III
     Spelling errors in messages and commentaries            ( 4%)
     Missing commentaries or flowcharts (standards)          ( 5%)
     Incompatible status of macros or modules (integration errors) ( 5%)
     Not classifiable                                        ( 2%)
An error frequency distribution (discussed in DeMillo [18]) was also conducted by E. A. Youngs. He, in his PhD dissertation, analyzed 1258 errors in Fortran, Cobol, PL/I and Basic programs. He found that the frequency of occurrence of errors was distributed as shown below:
Error Type                                 Relative Frequency of Occurrence
Error in assignment or computation                        27%
Allocation error                                          15%
Other, unknown, or multiple errors                        11%
Unsuccessful iteration                                     9%
Other I/O error                                            7%
I/O formatting error                                       6%
Error in branching - unconditional                         1%
                   - conditional                           5%
Parameter or subscript violation                           5%
Subprogram invocation error                                5%
Misplaced delimiter                                        4%
Data error                                                 2%
Error in location or marker                                2%
Nonterminating subprogram                                  1%
Schneidewind [57], in his error analysis, defined error categories and types as follows:

1.  Design Errors

    The following types of errors apply to both the categories "System Design Errors" and "Program Design Errors":
99
Communication Error
Design Negligence
Forgotten Cases or Steps
Timing Problems
Errors in I/O Concepts
Data Design Error
Initialization Error
Inadequate Checking
Extreme Conditions Neglected
Sequencing Error
Indexing Error
Loop Control Errors
Misuse of Boolean Expression
Mathematical Error
Representation Error
Misunderstanding of Problem Specifications
Other Design Errors
2.
Coding Errors
Misunderstanding of Design
Negligence
I/O Format Error
Misplaced Data Declaration
Multiple Data Declarations
Missing Data Declaration
100
Inadequate Data
Initialization Error
Error in Parameter Passing
Inadequate or Forgotten Checking
Level Problems
Missing Declarations of Block Limits
Case Selection Error
GO TO Problems
Comment Error
Forgotten Delimiter
Inconsistency in Naming
Wrong Use of Nested IF Statements
Indexing Error
Inconsistent Use of Variables or Data
Sequencing Error
Flag Usage Problems
Syntax Error
Loop Control Error
Incorrect Exit for Subroutines
Language Usage Problems
Forgotten Statements
Representation Error
Control Sequence Error
Incorrect Subroutine Usage
Other Coding Errors
101
3.
Clerical Errors
Manual Error
Mental Error
Procedural Error
Other Clerical Errors
4.
Debugging Errors
Inappropriate Use of Debugging Tools
Insufficient or Inappropriate Selection of Test Cases or Test Data
Misinterpretation of Debugging Results
Misinterpretation of Error Source
Negligence
Other Debugging Errors
5.
Testing Errors
Inadequate Test Case(s) or Test Data
Misinterpretation of Test Results
Misinterpretation of Program Specification
Negligence
Other Testing Errors
In the Schneidewind [57] experiment with data from four projects, the error distribution was found to be as follows:
Error Type        Number of Errors        Percentage
Design                   35                  20.2
Coding                   97                  56.1
Clerical                 37                  21.4
Debugging                 4                   2.3
This is in contrast to the findings of Alberts [2], Bowen [8], Endres [21], Jensen [39] and Rubey [56], who found the majority of errors to occur in the design specification phase.
The classification scheme used in my error study is somewhat similar to the last scheme I discussed from Bowen [8]. To facilitate error reporting, a classification scheme was chosen which could be easily implemented. This scheme was developed specifically for Project X (discussed in Section 4.3). The four software error classifications used are defined as follows:
o    Source         - software development phase where the error occurred.
o    Destination    - software development phase where the error was detected.
o    Error Type     - computation (28.8%), logic (33.4%), data associated (16.8%), documentation (5.2%), interface (7.7%) and miscellaneous (8.1%).
o    Severity Level - extent of software degradation:
                        1. serious
                        2. moderate
                        3. minor
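A minimal sketch of how a single software problem report could be recorded under these four classifications is given below (present-day Python used only for illustration; the field values shown are hypothetical and are not actual Project X data):

    # Illustrative sketch only: one problem report recorded under the scheme above.
    from dataclasses import dataclass

    PHASES = ("functional requirements", "design specification", "implementation")
    ERROR_TYPES = ("computation", "logic", "data associated",
                   "documentation", "interface", "miscellaneous")
    SEVERITY = {1: "serious", 2: "moderate", 3: "minor"}

    @dataclass
    class ProblemReport:
        source: str        # phase in which the error occurred
        destination: str   # phase in which the error was detected
        error_type: str
        severity: int

    report = ProblemReport(source="functional requirements",
                           destination="implementation",
                           error_type="logic",
                           severity=2)
    assert report.source in PHASES and report.destination in PHASES
    assert report.error_type in ERROR_TYPES and report.severity in SEVERITY
    print(report, "->", SEVERITY[report.severity])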
4.3  Error Occurrence/Detection During Software Development Phases
During the years 1969 through 1980, I participated as
a programmer and then later as a senior analyst in the development of a weapons control system embedded in an airborne computer.
Briefly, this project concerns the real-
time computation of missile parameters for the purpose of
directing a missile at a designated target.
I will refer
to this project in following discussions as Project X since
the name and any association to it is classified.
After approximately a year into Project X, a record of the errors made was required to be maintained upon request of the user. I have compiled (from software problem reports) a list for each year, as shown in Table 4.3.1, of the errors detected during each software development phase (functional requirements, design specification and implementation). To give a more pictorial view of the error study that follows, the errors detected are graphed as shown in Figure 4.3.1.
TABLE 4.3.1  List of Errors Detected with Respect to a Software Development Phase for Project X.

Year     Functional      Design           Implementation    Total
         Requirements    Specification
1970          63              97                119           279
1971          13             193                133           339
1972          19              89                 29           137
1973          37             125                 76           238
1974          19              47                 31            97
1975          13              92                 57           162
1976          23             176                 83           282
1977          11             133                 87           231
1978          31              63                 59           153
1979           7              19                 37            63
1980           3              17                 12            32
Total        239            1051                723          2013
%           11.9            52.2               35.9
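The bottom two rows of Table 4.3.1 can be reproduced with a few lines of arithmetic (a sketch in present-day Python; the counts are copied directly from the table and no new data is introduced):

    # Check of the per-phase totals and percentages reported in Table 4.3.1.
    detected = {"functional requirements": 239,
                "design specification": 1051,
                "implementation": 723}

    total = sum(detected.values())       # 2013 errors over 1970-1980
    for phase, count in detected.items():
        print("%-25s %4d  (%.1f%%)" % (phase, count, 100.0 * count / total))
    # prints roughly 11.9%, 52.2% and 35.9%, matching the table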
[Figure 4.3.1  Errors Detected with Respect to a Software Development Phase During Project X - number of verification/validation errors detected per year, 1969-1980, for the functional requirements, design specification and implementation phases, across development periods Phase I through Phase IV.]
There were three major software development periods (as indicated by the dotted lines). During these periods, additional requirements were imposed on the system by the user. Therefore, the errors are not only errors detected since inception of Project X but also the accumulation of errors during these periods.
As shown in Table 4.3.1, a total of 2013 errors were detected during that eleven year time span. Further, the largest percentage (52.2 percent) of errors was detected during the design specification phase. This agrees, in general, with the findings by Alberts [2], Bowen [8], Endres [21], Jensen [39], and Rubey [56], but is in contrast to Schneidewind [57], which gives implementation the higher percentage.
No additional user requirements were imposed after 1978. However, a total of 95 errors were detected during these two remaining years when Project X was going through a validation/acceptance phase (Phase IV). These errors could be labelled persistent (errors found late in the software development process) as described by Glass [25].
The V&V methods used during Project X were checklist review, design review, desk checking, instrumentation, requirements review, simulation, structured walkthrough, and test driver. These methods were typically used as related to a software development phase as shown in Appendix C. The relatively new methods like symbolic execution, assertion checking and proof of correctness were not and still are not at the state of the art required for everyday usage by the software personnel assigned to Project X.
Changes to the documentation (requirements or design), as a result of errors, were indicated on the software problem report. From this information one could tell where the error actually occurred. For instance, if the error was detected in the implementation phase and the requirements document had to be changed, then it is most likely that the error occurred during the requirements phase. It is difficult to assess with a high degree of precision how many errors detected were due to what I will call the "rippling effect" (i.e., errors which occurred in an earlier software development phase and have propagated down through succeeding phases). To obtain meaningful data, it is required that the documentation be detailed and not a vague representation of the software.
From the information accumulated from software problem reports (from Project X), a compilation of error occurrence, as shown in Figure 4.3.2, was made. It was found that 189 and 31 more errors occurred in the requirements and design phases, respectively, than originally detected.
4.4  Summary of Error Analysis
The following conclusions can be drawn from the analysis conducted:
o    Some type of error classification scheme is necessary to provide feedback.
o    Most errors occur during the design specification phase.
o    Logic and computational errors were the most prevalent error types.
o    Errors were not necessarily detected where they originally occurred.
o    The software typically still contains some persistent errors at the acceptance phase.
[Figure 4.3.2  Errors Attributed to Other Phases. Total errors detected: Functional Requirements phase 239; Design Specification phase 1051 (152 of which originated in the Functional Requirements phase); Implementation phase 723 (37 of which originated in Functional Requirements and 183 in Design Specification). Total error occurrence: Functional Requirements 428; Design Specification 1082; Implementation 503. Errors that occurred in a phase but were not detected there: Functional Requirements 189; Design Specification 31.]
CHAPTER V
RESULTS OF RESEARCH AND ANALYSIS
110
111
RESULTS OF RESEARCH AND ANALYSIS
First, a brief summary of what transpired during this research and analysis of selected V&V methodologies is in order. Fifteen V&V methodologies were discussed, emphasizing the advantages and disadvantages of each. Also, each was categorized as belonging to a static or dynamic grouping. Further, each was categorized according to usage in a particular software development phase (each phase was also discussed). The analysis and eventual ranking of each methodology was accomplished with respect to cost, ease of use and reliability. This ranking was based primarily on my own experience. However, supporting data was also obtained from a survey of 118 software personnel with diversified software backgrounds and numerous years of experience. Data gleaned from the literature was also used. Lastly, an error analysis was conducted to provide some insight into where these V&V methodologies could be best utilized.
Supporting data was obtained from a variety of studies whose theme was error analysis. In addition, error data accumulated during Project X was also analyzed.

As a result of the research and analysis conducted, the findings are listed as follows:
o    Most software errors occur during the design specification phase (more than 50 percent).
o    The design review methodology received the overall most favorable rating with respect to cost, ease of use and reliability.
o    Software errors still remain at the acceptance/validation phase.
o    Most V&V methodologies are geared toward usage in large scale software projects.
o    Some errors are not detected where they actually occurred.
o    A variety of V&V methodologies exist for use in each software development phase.
o    Logic and computational type errors are the most common ones.
o    Some of the most potentially powerful V&V methodologies (symbolic execution, proof of correctness and assertion checking) have yet to be acclaimed as a panacea for the V&V of software.
CHAPTER VI
RECOMMENDATIONS
113
114
RECOMMENDATIONS
To even approach 100 percent reliability in software,
we must be cognizant of the V&V techniques and tools currently available.
These techniques and tools are invaluable in the V&V of software conducted during the software development process. The following items are recommended to enhance the chances of realizing reliable software:
o    Adherence to "good" programming practices. Standards and conventions should be applied at the onset of any software project. Use of structured programming is only one of the many practices that should be implemented.
o    More time should be spent testing at an early stage in the software development process (in particular the design specification phase) to avoid costly errors later.
o    Select the right V&V methodology for the software being verified (i.e., consider factors such as cost, ease of use and reliability). For example, cost might not be as important in a large scale project as compared to a small one.
o    Apply the technique of modularization (software partitioned into its smallest functional units). Small software segments are easier to verify and inherently produce fewer errors.
o    Avoid, practically at all cost, coding the software too soon. Do more up-front work and it will pay off in the end. Managers have a tendency to push for code too soon, typically before the design has had time to solidify.
o    "Good" documentation is essential in the transmittal of information from one software phase to the next. Many misconceptions or misunderstandings can be avoided with proper documentation.
o    Top-down or bottom-up software development is a choice that will have to be made. Most software managers (and I include myself) prefer the relatively new (about ten years old) top-down approach. Myers [47] presents an informative discussion of the two approaches.
o
Prior to any testing, a test plan and associated
procedures should be devised and strictly adhered
to.
o
Testing should be accomplished throughout the
software development process.
BIBLIOGRAPHY
116
117
BIBLIOGRAPHY
[1]
Adkins, G., and U. W. Pooch: "Computer Simulation:
A Tutorial," IEEE Computer, April 1977. pp. 12-17.
Presents an introduction to simulation.
Addresses the advantages, disadvantages
and applications of the simulation
methodology.
[2] Alberts, D. S.: "The Economics of Software Quality
Assurance," AFIPS Conference Proceedings, Vol. 45,
New York, 1976.
An analysis of the software life-cycle
is performed to determine where in the
cycle the application of quality assurance techniques would be most beneficial.
[3] Anderson, R. B.: Proving Programs Correct, Wiley &
Sons, New York, N.Y., 1979.
Illustrates several informal correctness
proof techniques; which provides a systematic means of desk checking programs;
and offer additional insight into some
basic programming constructs, looping
and recursion.
[4] Arthur J., and J. Ramanthan: "Design of Analyzers
for Selective Program Analysis," IEEE Transactions
on Software Engineering, Vol. SE-7, No. 1, January
1981, pp. 39-51.
This paper presents a method for developing automatic analyzers which analyze
programs and provide programmers with a
variety of messages for the purpose of
validating these programs in the early
stages of program development.
[5] Basu, S. K., and R. T. Yeh: "Strong Verification
of
Programs,"
IEEE Transactions on Software
Engineering, Vol. SE-1, No. 1, March 1975, pp. 76-86.
Investigates the strong verification of
programs using the concept of predicate
118
transformer introduced by Dijkstra.
[6] Bate, R. R., and G. T. Ligler: "An Approach to
Software Testing: Methodology and Tools," IEEE
Proceedings COMP-SAC, November 1978, pp. 476-480.
software testing tools with respect to
software development.
[ 7]
Boehm I B. w.: "The High Cost of Software I n Practical Strategies for Developing Large Software
Systems, Addison Wesley, 1975.
Surveys the factors which influence the
high cost of software; gives a breakdown
of software costs.
[8] Bowen,
J. B.: "Standard Error Classification to
Support Software Reliability Assessment," AFIPS
Conference Proceedings, 1980, pp. 697-705.
Proposes a standard error classification
that can be applied to all phases of the
software development cycle.
[9] Bowen, J. B.: "A Survey of Standards and Proposed
Metrics for Software Quality Testing,"
Computer, August 1979, pp. 37-42.
IEEE
This article addresses the integration
and test phase by surveying military
standards for software quality control.
[10] Branstad, M. A.: "Validation, Verification, and
Testing for the Individual Programmer," IEEE
Computer, December 1980.
Inexpensive verification and testing
techniques are discussed for the individual programmer.
Testing performed
throughout the software life-cycle is
stressed.
[11] Brown, J. R., and R. H. Hoffman: "Evaluating the
Effectiveness of Software Verification - Practical
Experience with an Automated Tool," AFIPS Fall
Joint Computer Conference, 1972.
Discusses the use of an automated tool, the
FLOW program.  FLOW monitors statement usage
during test execution; provides statement
usage frequencies; indicates unexercised code.
[12] Bruggere, T. H.: "On-Schedule, Reliable Software
Depends on Sound Methodology," EDN, January 1981,
pp. 152-155.
Discusses the key areas for a successful
project: management, personnel, and engineering
methodology.
[13] Cheatham, T. E., G. H. Holloway, and J. A. Townley: "Symbolic Evaluation and the Analysis of
Programs," IEEE Transactions on Software Engineering, Vol. SE-5, No. 4, July 1979, pp. 402-417.
Describes a symbolic evaluator for part
of the EL1 language with particular
emphasis on techniques for handling
conditional data sharing patterns, the
behavior of array variables, and the
behavior of variables in loops and
during procedure calls.
[14] Chen, W. T., J. P. Ho, and C. H. Wen: "Dynamic
Validation of Programs Using Assertion Checking
Facilities," IEEE Proceedings COMP-SAC, November
1978, pp. 533-538.
Provides a theoretical basis for assertion
checking with regard to validation of program
correctness.  Gives some guidelines for
inserting assertions within the program.
[15] Clarke, L. A.: "A System to Generate Test Data and
Symbolically Execute Programs," IEEE Transactions
on Software Engineering, Vol. SE-2, No. 3, September
1976, pp. 215-222.
Describes a system that attempts to generate test data for programs written in
ANSI Fortran using symbolic representation of output variables as a function
of input variables to detect errors.
[16] Clarke, L. A.: "Testing: Achievements and
Frustrations," IEEE Proceedings COMP-SAC, November
1978, pp. 310-314.
Discusses some of the current program
validation methods.  Testing, symbolic
execution and test data generation methods
are presented.
[17] Darringer, J. A., and J. C. King: "Application of
Symbolic Execution to Program Testing," IEEE
Computer, Vol. 11, No. 4, April 1978, pp. 51-60.
Discusses symbolic execution techniques
as developed in the EFFIGY project.
Also describes how symbolic execution
can be used to solve a variety of program testing problems.
[18] DeMillo, R. A.: "Hints on Test Data Selection:
Help for the Practicing Programmer," IEEE
Computer, 1978, pp. 34-41.
Suggests how to selectively choose test
data to minimize testing time.
[19] DeMillo, R. A., R. J. Lipton, and A. J. Perlis:
"Social Processes and Proofs of Theorems and
Programs," Communications of the ACM, Vol. 22, No. 5, May
1979, pp. 271-280.
Argues that formal verification of programs will not play the same key role in
software development as proofs do in
mathematics.
[20] Deutsch, M. S.: "Software Project Verification and
Validation:" IEEE Computer, April 1981, pp. 54-70.
Stresses the concept that verification
and validation techiques should be used
over the entire software life-cycle.
Discusses some automated testing techniques and cost estimates of these
techniques.
[21] Endres, A.: "An Analysis of Errors and Their
Causes in System Programs," IEEE Transactions on
Software Engineering, Vol. SE-1, No. 2, June 1975,
pp. 140-149.
Investigates error types and distribution in system programs; suggests ideas
for prevention of errors; discusses detection methods.
[22] Fairley, R. E.: "An Experimental Program Testing
Facility," IEEE Transactions on Software Engineering, December 1975, pp. 350-357.
Describes a program testing facility
called the Interactive Semantic Modeling
System (ISMS).
The ISMS is designed to
allow experimentation with a wide variety of tools for collecting, analyzing,
and displaying testing information.
[23] Fairley, R. E.: "Tutorial: Static Analysis and
Dynamic Testing of Computer Software," IEEE
Computer, April 1978, pp. 14-23.
This paper discusses static analysis and
dynamic testing. Both the conceptual
aspects and automated tools in these
areas are described.
[24] Gannon, C.: "Error Detection Using Path Testing
and Static Analysis," IEEE Computer, August 1979,
pp. 26-31.
Demonstrates in an empirical manner the
types of errors one can detect and the
cost of testing using static analysis
and dynamic testing.
[25] Glass, R. L.: "Persistent Software Errors," IEEE
Transactions on Software Engineering, Vol. SE-7,
No. 2, March 1981, pp. 162-168.
Discusses the high cost of persistent software errors - those which are
not discovered until late in development.
Categorizes persistent errors
from two projects.
[26] Glass, R. L.: "Real-Time: The Lost World of
Software Debugging and Testing," Communications of the
ACM, Vol. 23, No. 5, May 1980, pp. 264-271.
From a survey of current practices
across several projects and companies,
the problems involved in real-time testing are discussed and suggestions for
improvements are made.
[27] Glass, R. L.: "Real-Time Checkout: The 'Source
Error First' Approach," Software-Practice and
Experience, Vol. 12, 1982, pp. 77-83.
This paper proposes some improvements to
methods currently used for checkout of
real-time software.
It is suggested
that errors be removed at the host computer environment rather than the target
computer environment.
[28] Glass, R. L.: Software Reliability Guidebook,
Prentice-Hall, Englewood Cliffs, N.J., 1979.
Surveys the available technological and
managerial software reliability techniques;
makes value judgments about each technique;
illustrates each by example.
[29] Glass, R. L., and R. A. Noiseux: Software
Maintenance Guidebook, Prentice-Hall, New Jersey,
1981.
Discusses the maintenance life-cycle
from a technological and management
standpoint.
[30]
Goodenough, J. B., and C. L. McGowan: "Software
Quality Assurance: Testing and Validation,"
Proceedings of the IEEE, Vol. 68, No. 9, September
1980, pp. 1093-1098.
The purpose of this paper is to help
hardware-oriented engineers apply some
of the quality assurance techniques
learned by software engineers.
[31] Goodenough, J. B., and S. L. Gerhart: "Toward a
Theory of Test Data Selection," IEEE Transactions
on Software Engineering, Vol. SE-1, No. 2, June
1975.
This paper outlines a possible approach
to developing valid and reliable test
data.
[32] Holthouse, M. A., and M. J. Hatch: "Experience
with Automated Testing Analysis," IEEE Computer,
August 1979, pp. 33-36.
This article describes some experience
with using an automated testing analysis
tool to measure testing coverage.
[33] Howden, W. E.: "Life-Cycle Software Validation,"
IEEE Computer, February 1982, pp. 71-78.
Suggests that validation be a part of
each phase of the life-cycle. Two validation activities - analysis and test
data generation - should take place during each phase.
[34] Howden, W. E.: "A Symbolic Evaluation and Program
Testing System," IEEE Transactions on Software
Engineering, Vol. SE-4, No. 1, January 1978, pp.
70-73.
The basic features of the DISSECT symbolic testing tool are described. Usage
procedures are outlined and the special
advantages of the tool are summarized.
[35] Howden, W. E.: "Symbolic Testing and the DISSECT
Symbolic Evaluation System," IEEE Transactions on
Software Engineering, Vol. SE-3, No. 4, July 1977,
pp. 266-278.
Results of two classes of experiments in
the use of symbolic testing and evaluation are summarized using the DISSECT
system.
[36] Howden, W. E.: "Functional Program Testing," IEEE
Transactions on Software Engineering, Vol. SE-6,
No. 2, March 1980, pp. 162-169.
An approach to functional testing is described in which the design of a
program is viewed as an integrated collection of functions.
The selection of
test data depends on the functions used
in the design and on the value spaces
over which the functions are defined.
[37] Huang, J. C.: "Instrumenting Programs for
Symbolic-Trace Generation," IEEE Computer,
December 1980, pp. 17-23.
Describes how to instrument programs using
symbolic trace techniques.  Fortran programs
are used as examples to describe the method
of automatic symbolic trace.
[38]
Huang, J. C.: "Program Instrumentation and Software Testing," IEEE Computer, Vol. 11, No. 4,
April 1978, pp. 25-32.
Discusses the merits of program instrumentation and how software probes are
used in testing.
[39] Jensen, R. W., and C. C. Tonies: Software Engineering, Prentice-Hall, 1979, pp. 329-407.
Discusses software verification and validation methodologies.
Specifically,
automatic techniques are described as
used during the software life-cycle.
[40] King, J. C.: "Proving Programs to be Correct,"
IEEE Transactions on Computing, Vol. C-20, No. 11,
November 1971.
Describes a technique for proving that
computer programs will always execute
correctly. Proofs of correctness of programs are defined with respect to a given abstract model of the program and its
execution.
[41] King, J. C.: "Symbolic Execution and Program Testing," Communications of the ACM, Vol. 19, No. 7, July
1976, pp. 385-394.
Describes the symbolic execution of programs.
A system called EFFIGY which
provides symbolic execution for program
testing and debugging is also described.
[42] Linden, T. A.: "A Summary of Progress Toward Proving Program Correctness," AFIPS-Fall Joint Computer Conference, 1972.
This paper summarizes recent progress in
developing rigorous techniques for proving that programs satisfy formally defined specifications.
[43] Liskov, B. H.: "A Design Methodology for Reliable
Software Systems," AFIPS-Fall Joint Computer
Conference, 1972.
Presents a guideline for good system design.
Discusses modularization and
structured programming as prerequisites
for obtaining reliable software.
[44] Miller, E. F.: "Program Testing: Art Meets
Theory," IEEE Computer, July 1977.
Describes some recent efforts to build a
bridge linking the theory of program
testing with its practice.
[45] Miller, E. F.: "Toward Automated Software Testing:
Problems and Payoffs," Proceedings of the Eighth
Annual Symposium on Computer Science and Statistics, February 1975.
Discusses various fundamental technical
problems which must be surmounted before
testing can become the operational equivalent of a formal program proof and
certification activity.
[46] Morgan, D. E., and D. J. Taylor: "A Survey of
Methods for Achieving Reliable Software," IEEE
Computer, February 1977, pp. 44-53.
Discusses some preventive methods for
reliable software.
Also discusses some methods
of detecting errors.
[47] Myers, G. J.: The Art of Software Testing, John
Wiley & Sons, New York, 1979.
Discusses some basic approaches to systematic
testing.  Describes many software testing
techniques.
[48] Myers, G. J.: "A Controlled Experiment in Program
Testing and Code Walkthroughs/Inspections," Communications of the ACM, Vol. 21, No. 9, September 1978,
pp. 760-768.
Describes an experiment in program testing, employing software professionals
using seven methods to test a small PL/I
program.
[49] Naylor, T. H.: Computer Simulation Experiments
with Models of Economic Systems, John Wiley and
Sons, New York, 1971.
Discusses simulation with respect to economic
systems.  Details experiments conducted with
various simulation methodologies.
[50] Ogdin, C. A.: "Software Aids for Debugging," Mini-Micro Systems, July 1980, pp. 115-122.
Discusses functional testers (hardware)
and debug monitors (software).
[51] Ottenstein, L. M.: "Quantitative Estimates of Debugging Requirements," IEEE Transactions on Software Engineering, Vol. SE-5, No. 5, September
1979, pp. 504-513.
This paper presents a model to estimate
the number of bugs remaining in a system
at the beginning of the testing and integration phases of development.
[52] Panzl, D. J.: "Automatic Software Test Drivers,"
IEEE Computer, Vol. 11, 1978, pp. 44-50.
This article describes three types of
automatic software test drivers, namely,
AUT (Automatic Unit Test), TPL/F (Fortran Test Procedure Language), and
TPL/2.0 (Second Generation of TPL/F).
[53] Ramamoorthy, C. V., and S. F. Ho: "Testing Large
Software with Automated Software Evaluation Systems," IEEE Transactions on Software Engineering,
Vol. SE-1, No. 1, March 1975, pp. 46-58.
Calls most software development projects
"unsuccessful in terms of specification,
time and cost," and recommends automated
software tools to solve this problem.
[54] Reifer, D. J.: "A Glossary of Software Tools and
Techniques," IEEE Computer, July 1977, pp. 52-62.
Provides a glossary of software tools
and techniques as they relate to the
software life-cycle.
[55] Riddle, W. E., and R. E. Fairley: Software Development Tools, Springer-Verlag, 1980.
Proceedings of a workshop on software
development tools.
Contains papers by
various authors on software tools and
techniques.
[56] Rubey, R. J., J. A. Dana, and P. W. Biche:
"Quantitative Aspects of Software Validation,"
IEEE Transactions on Software Engineering, Vol.
SE-1, No. 2, June 1975, pp. 150-155.
This paper discusses the need for quantitative descriptions of software errors
and methods for gathering such data.
[57] Schneidewind, N. F., and H. Hoffman: "An Experiment in Software Error Data Collection and Analysis,"
IEEE Transactions on Software Engineering,
Vol. SE-5, No. 3, May 1979, pp. 276-286.
Experiments with the hypothesis that
program structure has
a significant
effect on error making, detection, and
correction as measured by various software error characteristics.
[58] Sorkowitz, A. R.: "Certification Testing: A
Procedure to Improve the Quality of Software
Testing," IEEE Computer, August 1979, pp. 20-24.
An independent quality control staff
uses automated tools to certify that
minimum testing criteria have been met.
[59] Tanenbaum, A. S.: "In Defense of Program Testing
or Correctness Proofs Considered Harmful," SIGPLAN
Notices, Vol. 11, No. 5, May 1976.
Argues that correctness proofs can supplement, but cannot replace comprehensive testing.
[60] Tratner, M.: "A Fundamental Approach to Debugging," Software Practice and Experience, Vol. 9,
February 1979, pp. 97-99.
The psychology of debugging is discussed.
A fundamental approach to debugging
is proposed.
[61] Van Tassel, D. L.: Program Style, Design, Efficiency, Debugging and Testing, Prentice-Hall, Englewood
Cliffs, N.J., 1978 (second edition).
For the beginning programmer, this book
provides a basic foundation for good
programming practices.
[62] Yourdon, E.: Structured Walkthroughs, Prentice-Hall,
Englewood Cliffs, N.J., 1979.
Presents a methodology on walkthroughs.
Discusses how to make walkthroughs a reliable testing technique.
APPENDIX A
GLOSSARY OF TERMS
Algorithm
A procedure consisting of a finite number of steps which
is guaranteed to terminate for all inputs and which
accomplishes a specific function.
Assertion
Statement of what is presumed to be fact.
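For example (a hypothetical fragment, not taken from any tool surveyed in this thesis), an assertion embedded in code states a presumed fact and reports its violation at run time:

    #include <assert.h>
    #include <stdio.h>

    /* The assertion states the presumed fact that the divisor is
       nonzero; execution halts with a diagnostic if it is violated. */
    int quotient(int numerator, int denominator)
    {
        assert(denominator != 0);
        return numerator / denominator;
    }

    int main(void)
    {
        printf("%d\n", quotient(10, 2));
        return 0;
    }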
Bottom-Up Testing
A systematic testing philosophy which seeks to test
those modules at the bottom of the software system
structure first.
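A hypothetical sketch of the idea (module and driver names are invented): the lowest module is exercised first by a throwaway test driver that supplies known inputs and checks expected results.

    #include <stdio.h>

    /* Module at the bottom of the system structure (hypothetical). */
    static int lowest_module(int a, int b)
    {
        return a * b + 1;
    }

    /* Throwaway driver: tests the bottom module before any of the
       modules that will eventually call it exist.                  */
    int main(void)
    {
        struct { int a, b, expected; } cases[] = {
            { 0, 0, 1 }, { 2, 3, 7 }, { -1, 4, -3 }
        };
        int failures = 0;

        for (int i = 0; i < 3; i++) {
            int got = lowest_module(cases[i].a, cases[i].b);
            if (got != cases[i].expected) {
                printf("case %d failed: got %d, expected %d\n",
                       i, got, cases[i].expected);
                failures++;
            }
        }
        printf("%d failure(s)\n", failures);
        return failures;
    }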
Checkout
A process of improving implemented computer software
for delivery and customer usability.
Debugging
A subset of testing where the activity is diagnosing
the nature of errors and correcting them.
Dynamic V&V
Verification of software program behavior by execution
in a controlled environment.
Methodology
A systematic approach to methods, principles and rules.
Modularization
Process of dividing a program into subprograms
(modules) which can be compiled separately, but which
have connections with other modules.
Module
A separately invokable element of a software system.
N-S Charts
Structured program charts developed by I. Nassi and B.
Shneiderman.
Persistent Error
Error which eludes early detection and does not
surface until the software becomes operational.
Playing Computer
The act of manually executing a program without the
use of a computer.
Proof of Correctness
Use of techniques of mathematical logic to infer that
a relation between program variables assumed true at
program entry implies that another relation between
program variables holds at program exit.
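A small hypothetical sketch of the idea (invented for illustration; the methodology itself is discussed in Section 2.3.4): the entry relation, exit relation and a loop invariant are written as comments around a simple summation routine.

    #include <stdio.h>

    /* Entry relation (precondition):  n >= 0                         */
    /* Exit relation (postcondition):  result == n*(n+1)/2            */
    static int sum_to(int n)
    {
        int i = 0, result = 0;
        /* Loop invariant: result == i*(i+1)/2  and  0 <= i <= n      */
        while (i < n) {
            i = i + 1;
            result = result + i;
        }
        /* At exit i == n, so the invariant yields the exit relation. */
        return result;
    }

    int main(void)
    {
        printf("%d\n", sum_to(5));   /* prints 15 */
        return 0;
    }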
Software Life-Cycle
Software system development process composed of four
hierarchical phases, namely, functional requirements,
design specifications, implementation and maintenance.
Software System
A collection of modules, possibly organized into
components and subsystems, which solves some problem.
Static V&V
Verification of software programs without regard for
their run-time behavior.
Symbolic Execution
Assignment of symbols or expressions instead of actual
values to variables while following a program path.
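As a brief hypothetical illustration (not taken from any of the systems discussed in Appendix E), following one path of a small routine with the symbolic inputs A and B in place of actual values gives:

    /* Executed symbolically with x = A and y = B (A, B symbolic). */
    int f(int x, int y)
    {
        int z = x + y;     /* z holds the expression A + B            */
        if (z > 10)        /* path condition on this path: A + B > 10 */
            z = z * 2;     /* z becomes 2*(A + B)                     */
        return z;          /* result 2*(A + B), given A + B > 10      */
    }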
Test Data
The actual values from a program's input domain that
collectively satisfy some test data selection criterion (Goodenough [31]).
Testing
A process of executing a program with the intent of
finding errors.
Test Predicates
A description of conditions and combinations of
conditions relevant to the program's correct
operation (Goodenough [31]).
Top-Down Testing
A systematic testing philosophy which seeks to test
those modules at the top of the software system
structure first.
Validation
An attempt to find errors by executing a program in a
given real environment.
Verification
An attempt to find errors by executing a program in a
test or simulated environment.
APPENDIX B
RANKING OF SOFTWARE V&V METHODOLOGIES
TABLE B1.  Functional Requirements Phase - Ranking of Methodologies.

                              Ranking with respect to
Methodology               Total Cost   Ease of Use   Reliability

Checklist Review             (C5)          (E2)          (R3)
Mathematical Checker         (C2)          (E2)          (R2)
Requirement Review           (C5)          (E1)          (R2)
Simulation                   (C3)          (E3)          (R3)
TABLE B2.  Design Specification Phase - Ranking of Methodologies.

                              Ranking with respect to
Methodology               Total Cost   Ease of Use   Reliability

Checklist Review             (C5)          (E2)          (R3)
Design Review                (C5)          (E1)          (R2)
Mathematical Checker         (C3)          (E2)          (R1)
Proof of Correctness         (C1)          (E4)          (R5)
Simulation                   (C3)          (E2)          (R2)
TABLE B3.  Implementation Phase - Ranking of Methodologies.

                              Ranking with respect to
Methodology               Total Cost   Ease of Use   Reliability

Assertion Checker            (C2)          (E4)          (R4)
Checklist Review             (C5)          (E2)          (R2)
Desk Checking                (C5)          (E1)          (R3)
Instrumentation              (C3)          (E2)          (R3)
Mathematical Checker         (C2)          (E2)          (R1)
Proof of Correctness         (C1)          (E5)          (R4)
Simulation                   (C2)          (E3)          (R3)
Structural Analyzer          (C2)          (E2)          (R2)
Symbolic Execution           (C1)          (E4)          (R4)
Structured Walkthrough       (C4)          (E1)          (R2)
Test Coverage Analyzer       (C2)          (E3)          (R3)
Test Data Generation         (C3)          (E4)          (R3)
Test Driver                  (C4)          (E2)          (R2)
TABLE B4.  Estimated Cost with Respect to a Software Development Phase.

Methodology              Functional        Design          Implementation
                         Requirements      Specs.

Assertion Checker        *N/A              N/A             **($50,000-$200,000) (1)
Checklist Review         N/A (1)           N/A (1)         N/A (1)
Design Review            N/A               N/A (1)         N/A
Desk Checking            N/A               N/A             N/A (1)
Instrumentation          N/A               N/A             (2) (1)
Mathematical Checker     $100,000-         $100,000-       $100,000-
                         $300,000 (5)      $300,000 (5)    $300,000 (5)
Proof of Correctness     N/A               (4) (1)         (4) (1)
Requirement Review       N/A (1)           N/A             N/A
Simulation               $10,000 (3)       $50,000 (3)     $100,000 (3)
Structural Analyzer      N/A               N/A             $25,000-$100,000 (1)
Structured Walkthrough   N/A               N/A             N/A (1)
Symbolic Execution       N/A               N/A             $100,000-$500,000 (1)
Test Coverage Analyzer   N/A               N/A             $50,000-$200,000 (1)
Test Data Generation     N/A               N/A             $1,000-$50,000 (1)
Test Driver              N/A               N/A             N/A (1)

 *  - N/A - Not applicable
 ** - Procurement and usage cost, respectively

(1) - Salary (estimate of $10-25 per hour) dependent; cost varies
      according to the number of software personnel involved.
(2) - Usually not applicable since done normally in-house; an
      instrumentation package purchased could cost in the range of
      $5,000-50,000.
(3) - Includes system modelling, computer time and manpower.
(4) - Very difficult to give a cost estimate; still being researched;
      can be done manually with only the personnel salary (see item 1
      above) as cost.
(5) - Minimal usage cost (e.g., personnel salary, computer time and
      associated computer hardware) as compared to procurement cost.
APPENDIX C
CATEGORIZATION OF V&V METHODOLOGIES AS RELATED
TO A SOFTWARE DEVELOPMENT PHASE
TABLE C1.  Categorization of Methodologies as Related to a Software Development Phase.

Functional Requirements    Design Specification     Implementation
Phase                      Phase                    Phase

Checklist Review           Checklist Review         Assertion Checker
Mathematical Checker       Design Review            Checklist Review
Requirement Review         Mathematical Checker     Desk Checking
Simulation                 Proof of Correctness     Instrumentation
                           Simulation               Mathematical Checker
                                                    Proof of Correctness
                                                    Simulation
                                                    Structural Analyzer
                                                    Symbolic Execution
                                                    Structured Walkthrough
                                                    Test Coverage Analyzer
                                                    Test Data Generator
                                                    Test Driver
APPENDIX D
SUMMARY OF ADVANTAGES AND DISADVANTAGES
OF EACH METHODOLOGY

TABLE D1.  Summary of Advantages and Disadvantages of each Methodology.

Assertion Checker
  Advantages:
  o  potential of being reliable
  o  early detection of errors during the testing process
  Disadvantages:
  o  difficult to create assertions and test cases
  o  not easy to use
  o  costly to use
  o  software package is not readily available
  o  normally language dependent

Checklist Review
  Advantages:
  o  cost is relatively low for large projects
  o  many reviewers of the code (unlike the desk checking methodology)
  o  process is easy to implement once checklist is created
  Disadvantages:
  o  gross errors but only some subtle errors are normally detected
  o  some difficulty in creating checklist
  o  tendency to spend too much time trying to find solutions to
     problems during the review

Desk Checking
  Advantages:
  o  cost is very low
  o  methodology is easily used
  o  gets rid of most gross errors prior to integration on the actual
     system
  Disadvantages:
  o  subtle errors are seldom detected
  o  lone programmer
  o  work is not normally double checked

Design Review
  Advantages:
  o  cost is relatively low (not so for small projects)
  o  gross errors are normally detected
  o  not difficult to use
  Disadvantages:
  o  subtle errors are not normally detected
  o  difficult to keep review on intended course
  o  tendency to spend too much time trying to find solutions during
     the review

Instrumentation
  Advantages:
  o  easy to implement
  o  cost is in medium range
  o  potential of being highly reliable
  Disadvantages:
  o  code removal prior to delivery to the user
  o  requires additional program memory space
  o  execution time of the software is increased

Mathematical Checker
  Advantages:
  o  software package is readily available
  o  highly reliable for checking mathematically oriented software
  o  easy to use
  Disadvantages:
  o  costly to use
  o  good input test data needed to sufficiently test the software

Proof of Correctness
  Advantages:
  o  high reliability on small software segments
  o  proves mathematically that the software as written does what it
     is supposed to do
  o  vital for critical software
  o  Section 2.3.4 lists additional advantages
  Disadvantages:
  o  not so reliable on large software segments or programs
  o  not easy to put into practice
  o  proof verifier not easily obtained (only a few being researched)
  o  time consuming to analyze results
  o  estimated cost is high
  o  creating assertions is not a trivial task
  o  Section 2.3.4 lists additional disadvantages

Requirements Review
  Advantages:
  o  cost is relatively low
  o  gross errors are normally detected
  o  not difficult to put into practice
  Disadvantages:
  o  subtle errors not normally detected
  o  difficult to keep review on intended course
  o  tendency to spend too much time trying to find solutions during
     the review

Simulation
  Advantages:
  o  tests the software in an environment similar to the actual system
  o  provides insight into the complexities of the actual system which
     would otherwise be unobtainable
  o  process is easy to implement once the simulation model is
     developed
  Disadvantages:
  o  simulation packages are normally expensive
  o  some difficulty and extensive time spent developing the desired
     simulation model
  o  model may not represent the actual system completely
  o  analysis of the simulation data may be time consuming

Structural Analyzer
  Advantages:
  o  potential for detecting most subtle errors
  o  process is easy to implement
  o  provides analysis of logic and data structure
  Disadvantages:
  o  cost is normally high
  o  software package is not readily available
  o  usually language dependent
  o  time consuming to analyze results

Structured Walkthrough
  Advantages:
  o  cost is relatively low for large projects
  o  many reviewers of the code (unlike the desk checking methodology)
  Disadvantages:
  o  a long and taxing process
  o  difficult to find qualified and interested participants
  o  tendency to spend too much time trying to find solutions during
     the walkthrough

Symbolic Execution
  Advantages:
  o  test case domain is increased
  o  potential of being very reliable when dealing with small program
     segments
  Disadvantages:
  o  costly to use
  o  doesn't handle conditional statements, loops and index operations
     easily
  o  only a few systems available
  o  difficulty in expressing software symbolically without ambiguities

Test Coverage Analyzer
  Advantages:
  o  high potential for detecting errors
  o  gives information on program segments with high usage
  o  helps to design reliable test cases
  o  easy to implement process
  Disadvantages:
  o  cost for software package is normally high
  o  reliability depends on how well the software is instrumented
  o  Section 2.3.9 lists additional disadvantages

Test Data Generation
  Advantages:
  o  test data domain is increased
  o  invaluable when extensive test data required
  Disadvantages:
  o  not easy to implement (creation of test predicates or constraints)
  o  software package could be expensive
  o  too many test data sets are generated

Test Driver
  Advantages:
  o  easy to implement
  o  software doesn't have to be very sophisticated (structured) since
     it is throwaway code
  o  cost is low
  o  high success rate in detecting subtle errors
  o  vital for bottom-up testing
  Disadvantages:
  o  not very portable for use with different programs
  o  driver hooks (links between the driver and software being
     verified) have to be removed prior to delivery
  o  driver is only as reliable a tool as designed
APPENDIX E
BRIEF DISCUSSION OF VARIOUS
AUTOMATIC V&V TOOLS
ATTEST
ATTEST is an automatic test enhancement system. It
is composed of three major components: path selection, symbolic execution, and test data generation.
ATTEST analyzes programs written in ANSI Fortran.
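The flavor of such a system can be suggested with a small hypothetical sketch (purely illustrative; it is not the ATTEST implementation or its interface): for a selected path, test data generation reduces to finding inputs that satisfy the path's symbolic condition.

    #include <stdio.h>

    /* Hypothetical routine under test.  The path through the "then"
       branch has the path condition  x > 5  and  x + y < 20.          */
    static int under_test(int x, int y)
    {
        if (x > 5 && x + y < 20)
            return x - y;
        return 0;
    }

    /* Illustrative generator: search a small input domain for values
       satisfying the path condition, yielding one test case.          */
    int main(void)
    {
        for (int x = 0; x < 30; x++)
            for (int y = -10; y < 30; y++)
                if (x > 5 && x + y < 20) {
                    printf("test case: x=%d, y=%d -> %d\n",
                           x, y, under_test(x, y));
                    return 0;
                }
        printf("no test case found\n");
        return 1;
    }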
DAVE
DAVE is a validation and error detection system for
programs written in ANSI Fortran. The DAVE system,
developed at the University of Colorado by L. J.
Osterweil and L. D. Fosdick, is a data flow analyzer which does test data generation and symbolic
path evaluation.
DISSECT
DISSECT is a symbolic evaluation system.
It is implemented in Lisp and can be used to symbolically
evaluate Fortran programs. The DISSECT project was
funded by the National Bureau of Standards and completed in 1976.
EFFIGY
EFFIGY is a symbolic execution system that interpretively executes programs written in a subset of
PL/I and includes several standard debugging features.
EFFIGY is interactive and permits the user
to trace the program execution at varying levels of
detail.
FACES
FACES is a Fortran automated code evaluation system.
It is a structural analysis tool which verifies correctness of the software by means of assertion checking.
The FACES system is implemented in
and analyzes ANSI standard Fortran programs.
SELECT
The SELECT system, developed at the Stanford Research Institute, generates test data and verifies
assertions for program paths.
Testing and debugging are accomplished through symbolic execution of
the program.
It creates a symbolic representation
of the output variables for programs written in a
subset of Lisp.