Software Testing CS 408

Lecture 5: Higher-Order Testing
1/31/17
Myers, Chapter 6
Basics
• A software error occurs when the program does not do what its
end user reasonably expects it to do
• Even if you could perform an absolutely perfect module test,
you still could not guarantee that you have found all software
errors
• To complete testing, higher-order testing is necessary
• Software development is largely a process of communicating
information about the eventual product and translating this
information from one form to another
Basics
• Translate the software user's needs into a set of written requirements (Product
Backlog in Scrum)
• Design -- partitions the system into individual programs, components, or
subsystems, and defines their interfaces
• Translate the Design into source code
Most software errors stem from breakdowns in information communication
One solution is to orient distinct testing processes toward particular classes of
errors
Need for higher-order testing increases along with the size of the program
Function Testing
• Function testing is the process of attempting to find discrepancies
between the program and the external specification
• An external specification is a precise description of the program's
behavior from the end-user point of view
• Function testing is normally a black-box activity
Equivalence partitioning, boundary value analysis, cause-effect
graphing, and error-guessing methods
Covered back in Chapter 5
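As a minimal sketch of boundary value analysis, assume a hypothetical function `is_valid_quantity` whose external specification accepts whole numbers from 1 to 100 inclusive. The function, the range, and the `expected` table are illustrative assumptions, not taken from Myers. The test inputs sit on and just outside each edge of the valid equivalence class.

```python
def boundary_values(low, high):
    """Return boundary-value candidates for an inclusive integer range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def is_valid_quantity(qty):
    """Hypothetical function under test; its spec accepts 1..100 inclusive."""
    return 1 <= qty <= 100


# Expected results are taken from the (assumed) external specification,
# one per boundary value, rather than computed from the code under test.
expected = {0: False, 1: True, 2: True, 99: True, 100: True, 101: False}

discrepancies = [
    value
    for value in boundary_values(1, 100)
    if is_valid_quantity(value) != expected[value]
]
print("discrepancies:", discrepancies)
```

Any value that lands in `discrepancies` is a mismatch between the program and its specification, which is exactly what function testing looks for.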
System Testing
• System testing is not a process of testing the functions of the complete
system or program
• System testing has a particular purpose: to compare the system or
program to its original objectives
• Several System Test categories can be used. Many of these are related
to non-functional Software Quality Attributes
Some of these will be important in CS 40800 (e.g., Stress Testing,
Performance Testing, Usability Testing, ....) But, you are not
required to do all (or even any!) of these
Sanity and Smoke Testing
Sanity testing is a very brief run-through of the functionality of a
program to assure that the software works roughly as expected. This
is often prior to a more exhaustive round of testing
Smoke Testing
Origin -- physical tests in which smoke is pumped into a closed system of
pipes to detect cracks or breaks
A subset of test cases covering the most important functionality
of a component or system is selected and run to ascertain whether the
most crucial functions of the program work correctly
The purpose is to determine whether the application is so badly broken that further testing is unnecessary
Performance Testing
Many programs have specific performance or efficiency objectives,
stating such properties as response times and throughput rates under
certain workload and configuration conditions
Test cases should be designed to show that the program does not satisfy
its performance objectives
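A response-time objective can be checked with a small timing harness. The sketch below is one possible shape, not a prescribed method: `lookup_customer` is a hypothetical stand-in for the operation under test, and the 0.5-second limit is an assumed objective. It reports the runs that fail to meet the objective.

```python
import time


def measure_response_times(operation, runs):
    """Call operation() `runs` times and return each elapsed time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def objective_violations(timings, limit_seconds):
    """Return the timings that exceed the stated response-time objective."""
    return [t for t in timings if t > limit_seconds]


def lookup_customer():
    """Hypothetical operation under test (stand-in workload)."""
    return sum(range(1000))


timings = measure_response_times(lookup_customer, runs=20)
slow = objective_violations(timings, limit_seconds=0.5)
print(f"{len(slow)} of {len(timings)} runs exceeded the objective")
```

In a real performance test the workload and configuration would be set to the conditions named in the objective, since a timing taken under a lighter load says little.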
Volume Testing
Subject the program to heavy volumes of data
Purpose of volume testing is to show that the program cannot handle the
volume of data specified in its objectives
Example: The system is supposed to be able to store, retrieve, and modify
information concerning 1.5 million customers
Stress Testing
Stress testing subjects the program to heavy loads or stresses
A heavy stress is a peak volume of data, or activity, encountered over a short span of time
If an air traffic control system is supposed to keep track of up to 200 planes in its sector, you
could stress-test it by simulating the presence of 200 planes ... or more. If an operating system is
supposed to support a maximum of 150 concurrent jobs, the system could be stressed by
attempting to run 150 jobs simultaneously ... or more.
Web-based applications are common subjects of stress testing (Chapter 10)
You could stress a mobile device application -- a mobile phone operating system, for example -- by launching multiple applications that run and stay resident, and then making or receiving one or
more telephone calls
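The "150 concurrent jobs ... or more" idea can be sketched as follows. `JobTable` is a hypothetical stand-in for the system under test, and the thread-pool driver is just one assumed way to generate a burst of simultaneous requests. The test pushes 200 submissions at a component whose stated maximum is 150 and checks that the limit holds under contention.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class JobTable:
    """Hypothetical system under test: admits at most `capacity` jobs."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = 0
        self._lock = threading.Lock()

    def submit(self):
        """Admit a job if there is room; return True if admitted."""
        with self._lock:
            if self.active >= self.capacity:
                return False
            self.active += 1
            return True


def stress(table, attempts, workers=32):
    """Fire `attempts` submissions from many threads; count admissions."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: table.submit(), range(attempts)))
    return sum(results)


table = JobTable(capacity=150)
admitted = stress(table, attempts=200)  # deliberately above the stated maximum
print(admitted, table.active)
```

Removing the lock from `submit` is a quick way to see what this kind of test is meant to expose: the admitted count may then drift past the capacity.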
Usability Testing
Tasking the ultimate end user of an application with testing the software in a real-world
environment
Covered in next chapter (Chapter 7)
Security Testing
Many programs now have specific security objectives
Security testing is the process of attempting to devise test cases that subvert the
program's security checks
One way to devise such test cases is to study known security problems in similar
systems and generate test cases that attempt to demonstrate comparable problems in the
system you are testing
Web-based applications often need a higher level of security testing than do most
applications (Chapter 10)
Storage Testing
Programs occasionally have storage objectives that state, for
example, the amount of system memory the program uses and the
size of temporary or log files
Configuration Testing
Some software must support a variety of hardware configurations,
including various types and numbers of I/O devices and communications
lines, or different memory sizes
Often, the number of possible configurations is too large to test each one
Test a representative subset
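One simple way to pick a representative subset is each-choice coverage: every value of every parameter appears in at least one tested configuration. This is an illustrative choice, not the slide's or Myers's prescribed method, and the parameter names and values below are hypothetical.

```python
def each_choice_configs(parameters):
    """Build a small set of configurations in which every value of every
    parameter appears at least once (each-choice coverage)."""
    rounds = max(len(values) for values in parameters.values())
    return [
        {name: values[i % len(values)] for name, values in parameters.items()}
        for i in range(rounds)
    ]


# Hypothetical configuration space: 3 * 2 * 3 = 18 full combinations.
parameters = {
    "memory_mb": [512, 2048, 8192],
    "io_devices": [1, 4],
    "link": ["serial", "ethernet", "wifi"],
}

subset = each_choice_configs(parameters)
for config in subset:
    print(config)
```

Here 3 configurations stand in for 18. Each-choice coverage is weak against faults that depend on a specific combination of values; pairwise selection is a stronger but larger alternative.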
Configuration Testing
[Figure: four panels (Storage-A, MySQL, Apache, Hadoop: MapReduce and HDFS) plotting number of parameters against release time]
Figure 1: The increasing number of configuration parameters with software evolution. Storage-A is a commercial storage system from a major storage company in the U.S. (Xu et al., FSE'15)
Configuration Testing (Examples from Xu et al., OSDI'16)

Squid
Configuration error: diskd_program = a non-existent path
• Initialization: parse config files; store the settings in program vars
• Serving requests: use the setting of diskd_program for log rotation
• "Hogging the CPU for 7+ hrs"
• Diagnosis (48 hrs): 26 rounds of diagnostic conversations; 5 collections of logs & runtime traces; 2 incorrect patches
• [Patch] Check existence of diskd_program during initialization
Figure 1: A real-world LC error from Squid [37]. The error caused system hanging for 7+ hours, and resulted in 48 hours of diagnosis efforts. Later, a patch was added to check the existence of the configured path during initialization. Unfortunately, the patched check is still subject to LC errors such as incorrect file types and permissions.

MapReduce
1. Configuration error: mapred.local.dir = directory path w/ wrong owner (mapred.local.dir is not used until exec. of MapReduce jobs)
2. Impact: The TaskTrackers were trapped into infinite loops ("When I ran jobs on a big cluster, some map tasks never got started.")
3. Code snippets:
    /* TaskTracker.java */
    // no check at initialization
    while (running) {
      try {
        ...
        access mapred.local.dir   // throws Exception
        ...
      } catch (Exception e) {
        LOG.log("Retrying!");     // infinite loops: too late to avoid the failure!
      }
    }
User requests: "TaskTracker should check whether it can access to the local dir at the initialization time, before taking any tasks."
Figure 2: A real-world LC error from MapReduce [12]. When the exception handler caught the runtime exception induced by the LC error ...
Examples

(a) Missing initial checking
IOException (when reading the key file)
5. Consequence:
HDFS auto-failover fails, and the entire HDFS service becomes unavailable.

(b) Incomplete initial checking
Error-handling configuration parameter: CoreDumpDirectory (Apache httpd-2.4.10)
1. LC Errors:
The running program has no permission to access the coredump directory.
2. Initial checks: Check if the path points to an existent directory.
    if (apr_stat(&finfo, fname, APR_FINFO_TYPE) != APR_SUCCESS)
        return "CoreDumpDirectory does not exist";
    if (finfo.filetype != APR_DIR)
        return "CoreDumpDirectory is not a directory";
3. Late execution: Change working directory (chdir) to the path.
    /* server/mpm_unix.c */
    static void sig_coredump(int sig) {
        ...
        apr_filepath_set(ap_coredump_dir, ...);   /* "CoreDumpDirectory" */
        ...
    }

    if (chdir(rootpath) != 0)
        return errno;
4. Manifestation:
Error code returned by the chdir call
5. Consequence:
Apache httpd cannot switch to the configured directory, and thus fails to
generate the coredump file upon server crashing.

Figure 3: New LC errors discovered in the latest versions of the ...
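The CoreDumpDirectory case suggests what a more complete initial check might look like: test existence, file type, and permissions at startup rather than at first use. The Python sketch below illustrates that idea only and is not the checker from Xu et al.; `check_directory_setting` and the setting name `core_dump_dir` are hypothetical.

```python
import os
import tempfile


def check_directory_setting(name, path):
    """Return a list of problems with a directory-valued setting.

    Intended to run at initialization, before the setting is first used.
    """
    if not os.path.exists(path):
        return [f"{name}: {path!r} does not exist"]
    if not os.path.isdir(path):
        return [f"{name}: {path!r} is not a directory"]
    problems = []
    # A later chdir and file creation need search and write permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{name}: no permission to enter and write {path!r}")
    return problems


with tempfile.TemporaryDirectory() as good_dir:
    print(check_directory_setting("core_dump_dir", good_dir))
    missing = os.path.join(good_dir, "missing")
    print(check_directory_setting("core_dump_dir", missing))
```

Note that `os.access` answers for the real user ID of the calling process, so the check is only meaningful if it runs as the same user that will later use the directory.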
Configuration Testing

Software   Description        Lang.    # Parameters (Total)   # Parameters (RAS)
HDFS       Dist. filesystem   Java     164                    44
YARN       Data processing    Java     116                    35
HBase      Distributed DB     Java     125                    25
Apache     Web server         C        97                     14
Squid      Proxy server       C/C++    216                    21
MySQL      DB server          C++      462                    43
Table 3: The systems and the RAS parameters studied in §2.

           Deficiency of initial checking
Software   Missing       Incomplete    Studied param.
HDFS       41 (93.2%)    3 (6.8%)      44
YARN       29 (82.9%)    5 (14.3%)     35
HBase      18 (72.0%)    5 (20.0%)     25
Apache     4 (28.6%)     2 (14.3%)     14
Squid      9 (42.9%)     4 (19.0%)     21
MySQL      6 (14.0%)     6 (14.0%)     43
Table 4: Number of configuration parameters that do not have any initial checking code ("missing") and that only have partial checking and thus cannot detect all potential errors ("incomplete").
Reliability Testing
If the program's objectives contain specific statements about reliability,
specific reliability tests might be devised
For example, medical monitoring software must perform for 100 days
without need to restart
This is often difficult to test in the short run
May be possible to simulate long periods of use
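One assumed way to simulate a long period of use is to drive the component with synthetic events instead of waiting on the wall clock. In this sketch the `Monitor` class is a hypothetical stand-in for the monitoring software, and one call to `tick` represents one simulated minute, so 100 days become 144,000 calls.

```python
from collections import deque


class Monitor:
    """Hypothetical monitoring component that keeps the last hour of readings."""

    def __init__(self):
        self.readings = deque(maxlen=60)  # bounded, so memory cannot grow
        self.ticks = 0

    def tick(self, reading):
        """Process one simulated minute of input."""
        self.readings.append(reading)
        self.ticks += 1

    def average(self):
        """Mean of the readings currently held."""
        return sum(self.readings) / len(self.readings)


MINUTES_PER_DAY = 24 * 60
monitor = Monitor()
for minute in range(100 * MINUTES_PER_DAY):  # 100 simulated days
    monitor.tick(60 + minute % 5)            # synthetic readings 60..64
print(monitor.ticks, len(monitor.readings), monitor.average())
```

A simulation like this can expose state that grows or drifts with the number of events, but not problems tied to real elapsed time, such as timer wraparound or hardware wear.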
Therac 25
Then I got stuck!
Recovery Testing
Recovery objectives state how the system is to recover from
programming errors, hardware failures, and data errors
Programming errors can be purposely injected into a system to determine
whether it can recover from them
Hardware failures can be simulated
Data errors such as noise on a communications line or an invalid pointer
in a database can be created purposely or simulated to analyze the
system's reaction
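Purposely created data errors can be sketched with a test double that injects failures. Both names below are hypothetical: `FlakyLine` simulates noise on a communications line by failing a fixed number of reads, and `read_with_recovery` stands in for the recovery logic under test.

```python
class FlakyLine:
    """Test double that injects a fixed number of read failures."""

    def __init__(self, failures, payload):
        self.failures_left = failures
        self.payload = payload

    def read(self):
        """Fail until the injected failures are used up, then return data."""
        if self.failures_left > 0:
            self.failures_left -= 1
            raise OSError("injected line noise")
        return self.payload


def read_with_recovery(line, max_attempts):
    """Hypothetical recovery logic under test: retry a bounded number of times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return line.read()
        except OSError as error:
            last_error = error
    raise RuntimeError("could not recover") from last_error


print(read_with_recovery(FlakyLine(failures=2, payload="ok"), max_attempts=3))
```

Raising the injected failure count to the attempt limit checks the other half of the recovery objective: the system should give up with a clear error rather than loop forever.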
Installation Testing
Some types of software systems have complicated installation procedures
Testing the installation procedure is an important part of the system testing process
Acceptance Testing
Usually is performed by the program's customer or end user and normally is not considered the
responsibility of the development organization
But, development organization should simulate this!
Alpha testing is simulated or actual operational testing by potential users/customers or an
independent test team at the developers' site
Beta testing comes after alpha testing
Versions of the software, known as beta versions, are released to a limited audience outside of the
programming team
This audience can include selected end users
System Tests
One of the most vital considerations in implementing the system test is
determining who should do it
1. Programmers should not perform a system test on their own software
2. Of all the testing phases, this is the one that the organization
responsible for developing the programs definitely should not perform
An ideal system test team might be composed of a few professional
system test experts (people who spend their lives performing system
tests) and a representative end user or two
Test Planning and Control
Immense project management challenge in planning, monitoring, and controlling the
testing process
Major mistake most often made in planning a testing process is the tacit assumption that
no errors will be found
Obvious result of this mistake is that the planned resources (people, calendar time, and
computer time) will be grossly underestimated -- a notorious problem in the computing
industry
People who will design, write, execute, and verify test cases, and the people who will
repair discovered errors, should be identified
Define mechanisms for reporting detected errors, tracking the progress of corrections,
and adding the corrections to the system
Test Completion Criteria
Criteria must be designed to specify when each testing phase will be judged to be complete. It is unreasonable
to expect that all errors will eventually be detected. The two most common criteria are:
1. Stop when the scheduled time for testing expires
Can satisfy this by doing absolutely nothing!
2. Stop when all the test cases execute without detecting errors -- that is, stop when the test cases are
unsuccessful
Subconsciously encourages you to write test cases that have a low probability of detecting errors
Since the goal of testing is to find errors, why not make the completion criterion the detection of some
predefined number of errors?
— You might state that a module test of a particular module is not complete until three errors have been
discovered
— Number of errors that exist in typical programs at the time that coding is completed (before a code
walkthrough or inspection is employed) is approximately 4 to 8 errors per 100 program statements
* This would say that a 2500-line program would contain 100-200 defects
Best practice is to continue testing until discovery of new defects drops significantly
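The arithmetic behind the 100-200 estimate, and one possible reading of "drops significantly", can be written out as a short sketch. It assumes one statement per line. The two-week window and the 25%-of-peak threshold are illustrative assumptions, not values from Myers.

```python
def estimated_defects(statements, low_rate=4, high_rate=8):
    """Estimate the defect range from a rate given per 100 statements."""
    return statements * low_rate // 100, statements * high_rate // 100


def discovery_has_dropped(found_per_week, window=2, fraction=0.25):
    """True when the average of the last `window` weeks is below `fraction`
    of the peak weekly count (an assumed, illustrative threshold)."""
    if len(found_per_week) <= window:
        return False
    recent = sum(found_per_week[-window:]) / window
    return recent < fraction * max(found_per_week)


print(estimated_defects(2500))                    # (100, 200)
print(discovery_has_dropped([12, 20, 15, 6, 3]))  # True: 4.5 < 0.25 * 20
```

The estimate gives a target to compare the running defect count against, while the discovery-rate rule guards against stopping just because a fixed number was reached.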
Independent Testing
Advantages usually noted are...
1. Increased motivation in the testing process
2. Healthy competition with the development organization
3. Removal of the testing process from under the management control of
the development organization
4. Advantages of specialized knowledge that independent testers bring to
bear on the problem