
Controlled testing for computer performance evaluation
by A. C. SHETLER
The Rand Corporation
Santa Monica, California
INTRODUCTION
Because computer systems are still major investments, it is appropriate that effort be directed toward verifying and improving their performance. Previous documents, such as R-549,* recommend an overall framework for addressing computer performance improvements through the use of the scientific method. This paper expands on that base, describing a specific procedure for implementing particular investigations through the technique of controlled testing.**
Though current computer system documentation describes
how the different systems function, performance aspects are
usually described only in the sales literature and in very
general terms. The complexity of these systems makes selection, sizing, tuning, and workload characterization a task that is exceedingly, and unnecessarily, risky. Controlled testing can expose information about the execution characteristics of a computer system, permitting the analyst to proceed with assurance that performance prediction and improvement are not random processes.
Controlled computer performance testing is the process of
selectively limiting the inputs and operating conditions of a
computer system to assist an analyst in discovering and
verifying execution characteristics of that system. Controlled
testing is necessary because present operating systems contain complex performance-determining variables and relationships that confound the isolation of these relationships
when tests are executed in an uncontrolled (normal) environment. The profusion of variables also makes statistical
analysis difficult; because their quantity is unknown, unidentified variables can invalidate the conclusions of a testing
effort. Controlled testing for computer performance analysis
can reduce the variables by explicitly defining the test jobs
and the environment in which these tests are executed. By
limiting the activities on a computer system, the analyst
can isolate and test the performance relationships on that
system.
Controlling the environment involves identifying, monitoring, and isolating system activities that are concurrent with test jobs being executed, while ensuring system hardware and software are consistent among tests. The execution characteristics of the computer system being investigated must be identified: hardware, software, interactions, and the restrictions on user and system activity.
Controlling the workload includes specifying the execution characteristics of the test jobs in detail. These jobs are often designed with specific execution characteristics, written for examining specific performance relationships.
WHY CONTROLLED TESTING?
The selection of controlled testing over other performance
measurement techniques should be based on the appropriateness of the method to accomplish the purpose of the investigation with the resources available. Because performance analysis usually involves conclusions about millions of dollars
worth of equipment and labor, errors can be expensive and
the most appropriate cost-effective method should be
selected. While controlled testing may not always be the
appropriate data collection mechanism, some of the reasons
for selecting controlled testing are:
• The reduced side effects relative to normal system execution.
• The reduced statistical variability relative to normal system execution.
• The reduced cost, in terms of training, for an analyst.
Controlled testing should be orderly, proceeding within an
established framework. Figure 1 depicts the activities of
controlled testing.
The activities identified in Figure 1 need expansion for
clarification. While many of the activities are relevant to all well-designed testing efforts, some are particularly pertinent to efforts that use controlled testing as the data collection vehicle; the discussion here should be read in that context.
ESTABLISH OBJECTIVES/EXPECTED VALUE
OF TESTING
The objectives of a controlled test must be appropriate to
the scope of the problem(s) being examined and must specify
the boundaries of the effort. In addition, some estimate of
the value to be received should be documented. The definition
of the objectives and expected value is critical; otherwise testing efforts, although interesting, become expensive and futile.

* See Bibliography for references.
** This controlled testing procedure is addressed in more detail in R-1436. The concepts described in this paper are elaborated in that report.

Figure 1-Controlled testing procedure
Since the objectives of a controlled testing effort determine
the direction for the investigation, they must be established
early and with management concurrence. If an investigation
has multiple objectives, they should be assigned priorities.
Enthusiastic analysts often begin data collection before objectives are clearly defined, only to discover that the data
are not appropriate for the investigation.
When defining the objectives, the expected cost and value
received must be clearly identified so the cost of the investigation can be balanced against the expected results.
Critical resources must be established: the personnel; the
machine time; the investigative tools (such as simulators,
hardware and software monitors, statistical analysis packages, etc.); and the disruption to operations.
One performance investigation had an objective of establishing the feasibility of removing a core module in terms
of its effect on on-line system response time and batch turnaround time. The expected savings in rental for the core
module was approximately $90,000/year if the removal
proved feasible in terms of maintaining an acceptable level
of user service. Simply removing the module to see what
happened was not an acceptable solution to the management
(in terms of the lost capacity, user dissatisfaction and the
$1,000 reinstallation fee). But the expected value of the investigation was sufficient to allocate a performance analyst,
systems programmer, and some blocks of computer time to
investigate the problem (costing less than one month's rental
of the core module). Once the economic benefits of the situation were established, the resources could be intelligently
allocated.
GENERATE HYPOTHESES
Once the objectives of a testing effort have been defined
and determined to be economically worthwhile, specific hypotheses should be developed as indicated in Figure 1. These
hypotheses will, when tested, become the medium for achieving the objectives.
The complexity of an objective usually determines the
extent of hypothesis generation. The important point is not
that some specific number of hypotheses be generated, but
that the hypotheses that address the objectives be testable.
Without testable hypotheses, an effort may simply continue
until funding is exhausted. Without testable hypotheses,
analysis will likely be nebulous, and conclusions cannot be
supported without a resort to cries for faith in the analyst's
integrity.
The analyst should strive for simple hypotheses that are
easily tested. Where possible, multiple simple hypotheses
should substitute for a single complex one. The relationship
between each hypothesis and the objective should be stated
and documented, even when it may appear obvious. Otherwise, hypotheses may be investigated that are irrelevant
to the objective. Explicit assumptions should be examined
to be sure they are not, in reality, hypotheses for testing.
The following information is necessary for documenting
hypotheses (a minimal documentation template is sketched after the list):
• Hypothesis: What the analyst proposes to test.
• Relationship: The relationship between what is being
tested and the objectives of the performance evaluation.
• Assumptions: The explicit and implicit assumptions
about the hypothesis being tested.
• Analysis: The information (data) and processing to
validate the hypothesis.
• Data: Description of the collection mechanism.
• Alternate Hypothesis: What the analyst believes is true
if the hypothesis is rejected.
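The checklist above lends itself to a simple machine-readable record that can be filed with the rest of the test documentation. The sketch below is a minimal illustration only, written in Python for concreteness; the field names and the sample entry (loosely based on the core-module example above) are assumptions, not part of any actual investigation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HypothesisRecord:
    """One documented hypothesis for a controlled-testing effort."""
    hypothesis: str            # what the analyst proposes to test
    relationship: str          # link to the objectives of the evaluation
    assumptions: List[str]     # explicit and implicit assumptions
    analysis: str              # data and processing needed to validate it
    data: str                  # description of the collection mechanism
    alternate_hypothesis: str  # what is believed true if this is rejected

# Illustrative entry, loosely patterned on the core-module example.
core_module = HypothesisRecord(
    hypothesis="Removing one core module does not degrade on-line "
               "response time beyond the agreed service level.",
    relationship="Supports the objective of deciding whether the module "
                 "rental can be saved without unacceptable user impact.",
    assumptions=["Workload during the test is typical of a normal shift."],
    analysis="Compare response-time distributions with and without the module.",
    data="System accounting records plus a software monitor.",
    alternate_hypothesis="Response time degrades past the service level.",
)
print(core_module.hypothesis)
```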
TEST DESIGN
The types of analysis to be employed in test evaluation
must be defined and documented along with a detailed
description of data required for validating hypotheses. Then
the analyst can define the tests for gathering the data. The
analysis should include tests for an invalid hypothesis; without considering this, a series of tests could be executed which
do not generate conclusive evidence that a hypothesis is
either valid or invalid.
Generating a test design calls for ingenuity if testing is to
be performed efficiently. Properly designed tests may overlap and collect data for evaluating several hypotheses. The
environment-both system and workload-must be specified
during the test design. The details of both hardware and
software must be documented before the analyst moves on
to specifying assumptions, variables, and a detailed operating
procedure.
A configuration diagram of the hardware most clearly
identifies the system hardware. The devices not required for
testing should be shaded to aid visual identification of the
test equipment. Obviously, the operating system and its
version must be identified. In addition, descriptions of installation variations from the unmodified version of the
operating system are appropriate. These may include:
• A description of resident vs non-resident portions of the
monitor.
• A description of installation modifications to the standard system, and how these may affect the testing.
• A list of subsystems, special security monitors, bulk
media conversion routines, etc., that will affect the testing.
• The dimensions of resident systems (such as time-sharing) that will be active during the test.
If possible, a mechanical description of the system state at
initialization of the testing should be included (i.e., an
operator's log display that lists the supported and existing
devices).
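Where the installation permits, the same base-state information can be captured as a small structured record saved with the test output, so that the system base can be compared across sessions. The sketch below is one possible form, written in Python purely for illustration; the field names and sample values are assumptions.

```python
import json
import platform
from datetime import datetime, timezone

def capture_system_state(devices_online, os_modifications, resident_subsystems):
    """Record the system base state at test initialization for later comparison."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "host": platform.node(),
        "operating_system": f"{platform.system()} {platform.release()}",
        "devices_online": sorted(devices_online),       # from the operator's log display
        "os_modifications": os_modifications,           # installation changes to the standard system
        "resident_subsystems": resident_subsystems,     # e.g., spooling, time-sharing
    }

# Save the record alongside the test output so sessions can be compared.
snapshot = capture_system_state(
    devices_online=["disk0", "disk1", "tape0", "printer0"],
    os_modifications=["local scheduler patch"],
    resident_subsystems=["spooler"],
)
print(json.dumps(snapshot, indent=2))
```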
All assumptions relating to the testing effort should be
listed explicitly. This list will aid in evaluating the testing
procedure. Assumptions might be identified that relate to
the hypotheses, the analyses, the hardware, the software,
the tests, or the testing environment.
The analyst should explicitly identify the variables that
are expected to change during execution of the testing procedure. The relative significance of each variable may not
be easily determined and many of them may seem trivial.
However, they must be spelled out because a presumably
trivial variable can, in particular situations, have a nontrivial impact on testing results or on subsequent analysis.
Without initially listing the possible variables, the analyst
could waste time during the test execution and analysis
detecting these variables after side effects have been encountered.
Develop detailed operational procedure
An operational procedure sets forth in a document the
planned activities that operators and analysts will execute.
This "cook book" for executing the tests is included as part
of the test design. Without such a document, a test can
deviate in subtle but significant ways from the analyst's
intended course. Time is always limited during test execution
and haste can result in skipping a critical step in system setup or test execution if the operational procedure is not
carefully specified.
Before a test, the order of operations can be reviewed to validate the procedure. Where appropriate, the operational
procedure can be modified to reflect corrected assumptions.
The detailed operational procedure can take the form of a
checklist to be used during test execution to verify the
planned activities. Using this detailed procedure reduces the
time used to mentally verify that generally stated conditions
have been met.
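As one illustration of such a "cook book," the operational procedure can be reduced to a checklist that the analyst steps through at the console. The following sketch, in Python and with illustrative step names only, simply records whether each planned activity was actually completed.

```python
# A minimal sketch of an operational procedure expressed as a checklist.
# The step names are illustrative; a real procedure would come from the
# test design and be reviewed before every session.
PROCEDURE = [
    "Verify hardware configuration matches the configuration diagram",
    "Verify operating system version and installation modifications",
    "Establish the constant system base state",
    "Mount test volumes and load the test job deck",
    "Start monitors, then start the test job stream",
    "Capture accounting data and monitor output at completion",
]

def run_checklist(steps):
    """Walk the operator/analyst through each step and record completion."""
    log = []
    for number, step in enumerate(steps, start=1):
        answer = input(f"Step {number}: {step} -- done? (y/n) ")
        log.append((number, step, answer.strip().lower() == "y"))
    return log

if __name__ == "__main__":
    for number, step, done in run_checklist(PROCEDURE):
        status = "completed" if done else "SKIPPED"
        print(f"{number:2d}. {status}: {step}")
```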
Variability
When designing tests, the analyst should assume that
statistics will usually vary in test results. The variability is
currently unavoidable, but is not an insuperable handicap.
Variability can be identified and quantified in the following
way (a brief summary computation is sketched after the list):
• Execute the tests several times to identify the extent to which standard system variability affects the data collected by the test set.
• Examine the test environment for variability factors
(i.e., scheduling, spooling, resource allocation). Once
such factors are detected, the test procedure can be
examined for ways to reduce variability.
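For the first point, a brief computation over the repeated runs is usually sufficient to quantify the variability. The sketch below, using hypothetical elapsed times, reports the mean, standard deviation, and coefficient of variation; the figures are illustrative only.

```python
from statistics import mean, stdev

def summarize_variability(runs):
    """Summarize repeated executions of the same test (e.g., elapsed seconds)."""
    avg = mean(runs)
    spread = stdev(runs)                     # sample standard deviation
    coefficient_of_variation = spread / avg  # relative variability, unitless
    return avg, spread, coefficient_of_variation

# Hypothetical elapsed times (seconds) for five identical executions.
elapsed = [312.0, 305.5, 318.2, 309.9, 314.6]
avg, spread, cv = summarize_variability(elapsed)
print(f"mean = {avg:.1f}s, std dev = {spread:.1f}s, cv = {cv:.1%}")
```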
PILOT TESTING
As Figure 1 indicates, once the analyst completes his test
design, he should conduct a pilot test. This permits a validation of the design before he attempts an actual test. The objective of pilot testing is to discover the problems with the
procedure; the analyst is seldom disappointed by finding the
procedure perfect.
If possible, the analyst should do all debugging prior to
stand-alone pilot test time. Programs, control cards, and
input parameters should not be debugged during stand-alone
time; these tasks are not time-critical and can usually be
performed more easily in the working environment. All jobs
should be completely tested, ready for use prior to pilot
testing.
The analyst should verify that the system state (environment) is constant. This is particularly important when the
test will be executed in more than one session; without it,
unexpected conditions may add new variables and confound
any conclusions. This constant system base should be established prior to actual test execution (that is, during pilot
testing). The operational procedure should include steps to
verify the base state.
Redesign experiment
Pilot testing of the experiments usually results in redesign
of some of those experiments or the procedures. Because
several pilot test sessions may be necessary, all output should
be kept until the experiment redesign is complete. In this
way, all information is available for reviewing the revised
design. As indicated in Figure 1, pilot testing should be
repeated as often as necessary to refine the total experiment.
Minor modifications of the assumptions stated in the initial
test design occasionally result from pilot testing, but the
analyst should expect the revisions to the detailed procedures
to be extensive. After executing a pilot test, the analyst
should document the revised detailed functional and operational procedures. Included in this documentation should be
modifications to the assumptions (based now on experience
with the environment) and an identification of the critical
variables with a projection of their expected impact.
TEST EXECUTION
When an analyst completes the pilot tests and arrives at
the final design, a reasonably precise description of this test
procedure is already in hand. To ensure that the operational
procedure is closely followed, the analyst must be present
during the test execution as an active participant rather
than an observer. Ideally, the analyst performs all operations-mounting tapes, operating the console, feeding cards,
etc. An inability to perform these functions may indicate a
critical lack of understanding about the system. (Experience
indicates that a few hours with a senior operator is usually
sufficient to acquaint an analyst with the essentials of operating any system for controlled testing.) For example, lack of
familiarity with console commands might indicate the analyst
is unaware of which system parameters are subject to modification and are important to performance determination.
Some analysis of performance data must parallel the test
execution. This allows the detection of "unexpected" events.
In addition, the test results can be monitored for indications
that the tests are proceeding as expected.
Time should be reserved for reruns and correcting errors
in procedures. Problems are unavoidable in even the most
carefully prepared efforts and the analyst must plan for
them. The amount of time for rerun and error correction
should be increased as (1) the elapsed time between pilot
testing and actual test execution increases, and (2) the number of uncontrolled variables increases. For example, using
one system to prepare for the test effort and another to
execute the final tests will dramatically increase the probability of errors (e.g., where all the preparation is done on a
system at the user's installation, and another system-a new
one-in a vendor machine room is used to run a test stream).
ANALYZE TEST RESULTS
The data produced during testing should be analyzed on
an informal level as the test is being executed. Thorough
analysis of the test results should begin, if possible, within
24 hours of the test period. This leaves the testing experience clear in the mind of the analyst; although notes are
generated during the experiment, the details will fade with
time. The test results should be examined with particular
emphasis on the following:
• A desired level of control was specified in the testing
procedure; the test results should indicate the level of
control attained and the status of the critical variables.
• The system components supposedly being held constant
should be examined to determine whether they have
been changed.
• The system components being compared, tested, or both,
should be examined to determine the test's applicability.
The validity of the original hypotheses can be determined
only by a careful analysis of the actual test results. This
analysis may be extensive and part of it may take place long
after the test. However, each hypothesis should be examined
in some detail immediately, because a long analysis may be
required for one or more of the other hypotheses. More
thorough analysis can be done after the immediate analysis
where the special conditions requiring further work are identified.
The results of the analysis of the test data should be
documented in terms of the hypotheses being tested. The
implications of these results for the objectives of the performance investigation should also be documented. As indicated in Figure 1, results may be incomplete because some hypotheses are invalid or because the test execution revealed new questions; further analysis is then required to determine the additional testing needed to complete the investigation.
DETERMINE INCREMENTAL TESTING
Analysts should expect some hypotheses to be untrue;
when this is so, it may be appropriate to generate new
hypotheses. In addition, the test procedure may turn out to
have been inadequate (regardless of the most exhaustive
preparation). The test procedure, or experiments, may require redesign, but the analyst should only attempt to redesign the tests if the expected value exceeds the expected cost
with respect to the original-or a revised-objective.
An evaluation of new hypotheses must include the collection of new data. However, a hypothesis generated to match
a particular set of data can be proved only with new data.*
Otherwise, the "hypotheses" are merely descriptions of existing data.
NOTE
While Figure 1 describes the procedure for implementing controlled testing, two related special topics require expansion: the testing processing environment and test stimulation. These topics are addressed in Appendix A and Appendix B, respectively. The explanations in these appendices should add clarity to the test design phase of the controlled testing procedure.
BIBLIOGRAPHY
Bell, T. E., B. W. Boehm, and R. A. Watson, Computer Performance
Analysis: Framework and Initial Phases for a Performance Improvement Effort, The Rand Corp., R-549-1-PR, November, 1972.
Bell, T. E., Computer Measurement and Evaluation-Artistry or Science?
The Rand Corp., P-4888, August, 1972.
Bell, T. E., Computer Performance Analysis: Measurement Objectives and
Tools, The Rand Corp., R-584-NASA/PR, February, 1971.
Boehm, B. W., Computer System Analysis Methodology: Studies in Measuring, Evaluating, and Simulating Computer Systems, The Rand Corp.,
R-520-NASA, September, 1970.
Bookman, P. G., B. A. Brotman, and K. L. Schmitt, "Measurement
Engineering Tunes Systems," Computer Decisions, Vol. 4, No. 14,
April, 1972, pp. 28-30.
Kolence, K. W., "A Software View of Measurement Tools," Datamation,
Vol. 17, No.1, January 1, 1971, pp. 32-38.
Lockett, J. A., and A. R. White, Controlled Tests for Performance Evaluation,
The Rand Corp., P-5028, June 1973.
Mayo, E., The Human Problems of an Industrialized Civilization, Macmillan, New York, 1933.
Roethlisberger, F. J., and W. J. Dickson, Management and the Worker,
Cambridge, Mass., Harvard University Press, 1939.
Seven, M. J., B. W. Boehm, and R. A. Watson, A Study of User Behavior in Problem Solving with an Interactive Computer, The Rand
Corp., R-513-NASA, April, 1971.
Sharpe, William F., Economics of Computing, Columbia University Press, 1969.
Shetler, A. C., and T. E. Bell, Computer Performance Analysis: Controlled Testing, The Rand Corp., R-1436-DCA, 1974.
Shetler, A. C., Human Factors in Computer Performance Analysis, The Rand Corp., P-5128, 1974.
Warner, D. W., "Monitoring: A Key to Cost Efficiency," Datamation,
Vol. 17, No.1, January 1, 1971, pp. 40-42ff.
Watson, R. A., Computer Performance Analysis: Applications of Accounting Data, The Rand Corp., R-573-PR, May, 1971.
Watson, R. A., The Use of Computer System Accounting Data to Measure
the Effects of a System Modification, The Rand Corp., P-4536-1, March,
1971.
APPENDIX A-TESTING PROCESSING ENVIRONMENT
While controlled testing can reduce the number of variables
that confuse analysis, total control divorces an experiment
from the workload characteristics of the real job stream.
Experiments can be designed to combine tests in a totally
controlled environment with tests in a partially controlled
environment to assure relevance of the results to the current
workload on a system, when this relevance is a requirement.
Performance investigations that combine partially and
totally controlled testing environments can provide assurance
of the relevance of test results for objectives that are difficult
to prove in other ways. An effective procedure for testing an
objective requiring applicability to a specific system has
been to apply two varieties of specific hypotheses: those
postulating abstract relationships between hardware and
software and those postulating the effects of these relationships on processing the usual workload. The results of testing
the former under a controlled environment can direct testing
the latter hypotheses under the system environment to ensure
applicability of the results to the current environment.
Tests with both partial and total control are pursued in
the same basic framework, using the same techniques, and
observing the same caveats. However, some special considerations apply to tests in each environment.
Partially controlled workload environment
When the analyst must use the normal workload environment for an experiment, unpredictable elements must be
addressed, including the operators, users, and shifting workload. If these are ignored, the results may be more indicative
of unpredictable human reactions than of the effects being
investigated.
If the users or operators are aware of experiments in
progress, they may respond by changing their behavior and
create an uncharacteristic environment (often referred to as
the Hawthorne effect). Spurious workload changes can also
invalidate experimental results. If a testing period coincides
with a particular event, such as annual accounting and reports, the results can be deceiving when applied to the standard working environment. Normal fluctuations in workload
over short periods (one or two days) can create the same
effect. The analyst must remember that a computer system
workload seldom reaches a "steady state." The human behavior and workload characteristics of a system during a
short test period must be compared with characteristics over
a longer period. A formal comparison period with appropriate
controls should be used to reduce the probability of invalid
results caused by autonomous changes.
Totally controlled environment
Specifying and generating the desired environment is critical in an experiment with total control. If the objective of
an experiment is to examine specific system interactions, a
relationship between the artificial and real workload need not
be established.
A cooperative effort between performance analysts and
system maintenance personnel is more than desirable; it
can be imperative for controlled tests to be productive.
When a hypothesis involves system-related interaction, implicit and explicit assumptions and conclusions that are
consistent with the system's operational characteristics can
be assured. However, the analyst must view all untested
statements about performance characteristics with skepticism; systems are subjected to continual change, so opinions
about performance characteristics may be no more than
strongly-stated computer folklore.
Selection of a "background" workload may be necessary
for some controlled workload experiments. A background
workload consists of jobs designed to create some degree of
system activity. The activity created by these jobs is important to the controlled test, but the specific job characteristics are considered unimportant. Background alternatives include (a simple load-generation sketch follows the list):
• No background. Only the tests are executed.
• Background spooling.
• Activating an on-line system through artificial stimulation.
• Activating background jobs with specified execution
characteristics.
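The last alternative can be approximated with very little code. The sketch below, offered only as an illustration, starts a specified number of background jobs, each alternating a fixed amount of compute with idle time; the particular parameter values are assumptions.

```python
import multiprocessing
import time

def background_job(cpu_seconds, sleep_seconds, cycles):
    """Create background activity; the specific work is deliberately unimportant."""
    for _ in range(cycles):
        end = time.process_time() + cpu_seconds
        while time.process_time() < end:   # burn CPU for the requested interval
            pass
        time.sleep(sleep_seconds)          # then go idle, yielding the processor

def start_background(level, cpu_seconds=0.5, sleep_seconds=0.5, cycles=120):
    """Start `level` concurrent background jobs and return their handles."""
    jobs = [multiprocessing.Process(target=background_job,
                                    args=(cpu_seconds, sleep_seconds, cycles))
            for _ in range(level)]
    for job in jobs:
        job.start()
    return jobs

if __name__ == "__main__":
    background = start_background(level=3)   # three concurrent background jobs
    # ... the controlled test jobs would be executed here ...
    for job in background:
        job.join()
```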
Detailed considerations
After selecting the general approach for the degree of
workload control, the analyst must choose the details of the
processing environment. The following must be included in
the considerations for both partial and total control:
• The hardware and software configuration.
• The system state (including the activity level of system
functions and sub-systems such as spooling concurrent
with test execution).
• File placement on peripheral devices.
In addition, the availability of computer time, the work
schedule, the purpose of the experiment, and the quality of
the data being collected to validate hypotheses will influence
the environment selection.
APPENDIX B-TEST STIMULATION
Test stimulation involves loading a computer by imposing
programmed activity that has specific resource utilization
characteristics. When an analyst is performing controlled
testing, the loading stimulates well-defined system functions
in specific ways. The total spectrum of test stimulation includes not only the conventional batch benchmark job streams
that have been used for equipment procurement, but also scripts
used for stimulating on-line systems.
BATCH STIMULATION
Batch stimulation can consist of (1) a job stream selected
from an actual installation workload to represent that workload (usually called benchmarks); (2) a job stream selected
to generate a sample of work, not intended to represent the
workload realistically; or (3) a synthetic job stream that
generates desired execution characteristics but does not attempt to replicate an installation workload. Each type of job
stream has a different, but valid, purpose.
Representative job stream (benchmark)
When a subset of the normal workload (with the same
average component utilizations) is desired, the accounting
data generated by the benchmark should be examined to
determine that these data are approximately equivalent to
the normal job stream accounting data. Once the objectives,
hypotheses, and procedures for an experiment are clearly
defined, many of the accounting statistics may be irrelevant
and can be ignored, or additional statistics may be added.
However, reports resulting from an investigation using a
benchmark-type job stream must include adequate caveats
about not extending the results beyond the workload of the
system being tested although they may have applicability to
other systems processing similar workloads.
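One way to make the comparison with the normal job stream concrete is to compute the average of each accounting metric for the benchmark and for the normal workload and flag large relative differences. The sketch below is illustrative only; the metric names, tolerance, and figures are assumptions.

```python
from statistics import mean

def compare_accounting(benchmark_jobs, normal_jobs, metrics, tolerance=0.10):
    """Flag accounting metrics where the benchmark differs from the normal
    job stream by more than `tolerance` (relative difference of the means)."""
    findings = {}
    for metric in metrics:
        bench_avg = mean(job[metric] for job in benchmark_jobs)
        normal_avg = mean(job[metric] for job in normal_jobs)
        relative = abs(bench_avg - normal_avg) / normal_avg
        findings[metric] = (bench_avg, normal_avg, relative <= tolerance)
    return findings

# Hypothetical per-job accounting records; the metric names are illustrative.
benchmark = [{"cpu_sec": 40, "io_count": 1200}, {"cpu_sec": 55, "io_count": 900}]
normal    = [{"cpu_sec": 45, "io_count": 1100}, {"cpu_sec": 50, "io_count": 1000},
             {"cpu_sec": 48, "io_count": 950}]

for metric, (b, n, ok) in compare_accounting(benchmark, normal,
                                             ["cpu_sec", "io_count"]).items():
    print(f"{metric}: benchmark={b:.1f}, normal={n:.1f}, "
          f"{'within' if ok else 'outside'} tolerance")
```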
Specialized job stream

Specialized job streams are often used to investigate unusual phenomena that are observed. A job stream selected to generate a sample of the workload does not permit the same analysis, and the results cannot apply specifically to any computer system workload. A nonproportional job stream should not be an excuse for analysts to do sloppy work; valuable results can be produced if these job streams are correctly used. A specialized job stream is used to establish system behavior patterns when a nonrepresentative, abnormal subset of jobs is run (for analysis of very specialized relationships). For example, an investigation to determine the effectiveness of an internal operating system modification required a specialized job stream to validate the implementation of the modification.

Synthetic job streams
Synthetic jobs are designed to stimulate the system to
exercise specific components in well-defined ways. Some advantages of using a synthetic job stream (one such job is sketched after the list) are:
• The synthetic job can be designed with execution characteristics that are totally specified and understood.
• Internal data collection and data reduction can be designed into a synthetic job; it can serve as both the
stimulator and the monitor.
• Compatibility between machines and vendors can be
ensured for the bulk of the job, and the analyst can
perform comparison studies.
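A minimal synthetic job, then, performs a fully specified amount of compute and output and times each phase itself. The sketch below is one such job, written in Python purely for illustration; the loop count, record sizes, and file name are assumptions.

```python
import os
import time

def synthetic_job(cpu_loops, records, record_length, work_file="synthetic.dat"):
    """A synthetic job with fully specified behavior: a fixed amount of compute
    followed by a fixed amount of sequential output, timing itself as it runs."""
    timings = {}

    start = time.perf_counter()
    total = 0
    for i in range(cpu_loops):            # compute phase: exactly cpu_loops iterations
        total += (i * i) % 97
    timings["cpu_phase_sec"] = time.perf_counter() - start

    start = time.perf_counter()
    with open(work_file, "wb") as f:      # output phase: records * record_length bytes
        for _ in range(records):
            f.write(b"x" * record_length)
    timings["io_phase_sec"] = time.perf_counter() - start

    os.remove(work_file)                  # leave the system in its base state
    return timings

if __name__ == "__main__":
    print(synthetic_job(cpu_loops=2_000_000, records=500, record_length=4096))
```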
ON-LINE STIMULATION
Generating test stimulation for on-line systems is difficult,
and the development of procedures for this activity has been
minimal. One of the primary causes for this situation is lack
of information about normal input (command sequences from on-line sessions with associated "think-time" gaps) to facilitate duplication for on-line stimulation. This lack means that on-line test stimuli are usually generated by scripts specifically designed to test whether certain functional characteristics are present, rather than to test a richer variety of performance-related activities of on-line systems. Such scripts are usually created without even the crudest measured
data and stored on punched paper tape or printed for manual
transcription during experimentation. They are costly in
terms of the time to produce and execute.
In addition to these crude techniques for script input,
sometimes a system is tested with another computer system
as a stimulator. The stimulator generates input from one or
more scripts and inputs this to the computer system under
test, replicating multiple terminal activity. The initial appeal
of such schemes is placed into perspective once the design,
implementation, and application costs have been determined.
Script consideration

Crude techniques should be used initially in an investigation. The analyst can usually devise a minimum set of on-line commands, manually use them, and then expand the
simple scripts as experience dictates. For a context-dependent
system, the analyst should initially determine the minimum
number of steps that can be meaningful, based on the independence of the script's performance from potential commands before and after it. In context-independent portions
of a system, single-line scripts are desirable because they
make analysis easy.
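A script replay driver can likewise start out very simple. The sketch below plays a short command script with fixed think times; the send_command routine is a hypothetical stand-in for the installation's terminal interface, and the commands and timings are illustrative only.

```python
import time

# A minimal script: (command, think_time_seconds) pairs, starting from the
# simplest sequence the analyst can fully explain, then expanded with experience.
SCRIPT = [
    ("LOGIN TESTUSER", 2.0),
    ("LIST",           5.0),
    ("EDIT SAMPLE",    8.0),
    ("LOGOUT",         1.0),
]

def send_command(command):
    """Placeholder for the installation's terminal interface (hypothetical)."""
    print(f"> {command}")

def replay(script):
    """Replay a script against the on-line system, recording submission times."""
    log = []
    for command, think_time in script:
        time.sleep(think_time)            # simulated user think time
        log.append((time.time(), command))
        send_command(command)
    return log

if __name__ == "__main__":
    replay(SCRIPT)
```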
Some simple data collection for existing systems can be
implemented without resorting to sophisticated tests. Since
users of on-line systems are accustomed to people in the
immediate vicinity of the terminals, they can be observed
without disrupting their activities or altering the situation
being observed. Some of the items to observe (a brief tabulation sketch follows the list) are:
• The incidence of using each subsystem (command mode),
• The relative incidence of each command,
• The rate of command submission,
• The rate of typing,
• The evenness of input (whether activity occurs in bursts
or is relatively constant through time), and
• The incidence of active but idle terminals.
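Observations of this kind can be summarized with a short tabulation. The sketch below, using an invented observation log, reports the submission rate and the relative incidence of each command; the log format and command names are assumptions.

```python
from collections import Counter

# Hypothetical observation log: (seconds_since_start, command) pairs noted
# by an observer or taken from an available collection facility.
observations = [
    (5, "EDIT"), (40, "LIST"), (75, "EDIT"), (130, "RUN"),
    (190, "EDIT"), (250, "LIST"), (300, "RUN"), (355, "EDIT"),
]

counts = Counter(command for _, command in observations)
session_minutes = (observations[-1][0] - observations[0][0]) / 60.0
submission_rate = len(observations) / session_minutes   # commands per minute

print(f"Submission rate: {submission_rate:.1f} commands/minute")
for command, count in counts.most_common():
    print(f"  {command}: {count / len(observations):.0%} of commands")
```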
When better information is needed, automatic data collection facilities are often available for aggregate information that reveals loading. When an available collection facility cannot provide sufficiently detailed data, combining its output with personal observation of the users may be necessary.
A very basic rule has proved important in on-line stimulation: always begin with a very simple stimulation, even when
a complex type is clearly required. The initial stimulation
must be simple enough that the analyst can explain each
response of the system to it. On-line systems are complex
and understanding them is difficult; the danger of complex
scripts is that they lead to incorrect conclusions.