Intermediate Modeling and Steady-State Simulation

Verification and Validation
What we will do…
• Verification and Validation
• Statistical analysis of steady-state simulations
 Warm-up and run length
 Truncated replications
 Batching
V & V Introduction
• In a simulation, the real-world system is abstracted by a conceptual model: a series of mathematical and logical relationships concerning the components and structure of the system.
• The conceptual model is then coded into a computer-recognizable form (i.e., an operational model), which we hope is an accurate imitation of the real-world system.
• The accuracy of the simulation must be checked before we can make valid conclusions based on the results from a number of runs.
[Diagram: Real World → Conceptual Model (quantitative/analytical models) → Operational Model ("Code")]
V & V Introduction (cont'd.)
This checking process consists of two main components:
• Verification: Is "Code" = Model? (debugging)
 Determine if the computer implementation of the conceptual model is correct. Does the computer code represent the model that has been formulated?
• Validation: Is Model = System?
 Determine if the conceptual model is a reasonable representation of the real-world system.
V & V is an iterative process: correct the "Code" errors and modify the conceptual model to better represent the real-world system.
The truth: we can probably never completely verify, especially for large models.
Common Errors While Developing Models
• Incorrect data
 Mixed units of measure (hours vs. minutes)
• Blockages and deadlocks
 Seizing a resource but forgetting to release it
 Forgetting to dispose of the entity at the end
• Incorrectly overwriting attributes and variables
 Names
• Incorrect indexing
 Indexing beyond the available queues and resources
Verification
Verification is debugging the code so that the conceptual model is accurately reflected by the operational model.
Various common-sense suggestions can be used in the verification process:
• Write the simulation program in a logical, well-ordered manner. Make use of detailed flowcharts when writing the code.
• Make the code as self-documenting as possible. Define all variables and state the purpose of each section of the program.
• Have the computer code checked by more than one person.
Verification (cont'd.)
• Check that the values of the input parameters have not been changed inadvertently during the course of a simulation run.
• For a variety of input parameter values, examine the output of simulation runs for reasonableness.
• Use traces to check that the program performs as intended (see the sketch after this list).
 Break point: stop at a particular block
 Watch point: stop when a condition is true
  – NQ(1) > 10 (if the queue length exceeds 10, stop and check)
 Intercept: stop whenever a particular entity moves
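The same watch-point idea carries over to hand-coded simulations. Below is a minimal sketch in Python; the toy queue, event list, and threshold of 10 are hypothetical stand-ins for Arena's NQ(1) condition, not part of any real model.

```python
# A minimal sketch of a "watch point" in a hand-coded simulation loop;
# the toy queue and events are hypothetical, standing in for Arena's NQ(1).
import collections

queue = collections.deque()
events = [("arrive", t) for t in range(25)]  # toy event list: arrivals only

for kind, t in events:
    if kind == "arrive":
        queue.append(t)
    # Watch point: stop and inspect when the condition becomes true,
    # analogous to NQ(1) > 10 in Arena.
    if len(queue) > 10:
        print(f"Watch point hit at t={t}: queue length = {len(queue)}")
        break
```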
Verification (cont'd.)
• Some techniques to attempt verification:
 Eliminate error messages (obviously)
 Single-entity release; step through the logic
  – Set Batch Size = 1 in Arrive
  – Replace distributions with a constant
 "Stress" the model under extreme conditions
 Performance estimation
 Look at the generated SIMAN .mod and .exp files
  – Run > SIMAN > View
Validation
• The process of developing confidence that inferences drawn from the model tell us something about the real system
• Conceptual validity
 Does the model, as structured, adequately represent the system?
  – Rationalism
• Operational validity
 Is the behavior of the model characteristic of the real-world system?
  – Empiricism
• Believability
 Do the ultimate users have confidence in this model?
Validation
A variety of subjective and objective techniques can be used to validate the conceptual model:
• Face Validity
• Validation of Model Assumptions
• Validating Input-Output Transformations
Face Validity
A conceptual model must be reasonable "on its face" to those who are knowledgeable about the real-world system.
• Have experts examine the assumptions and the mathematical relationships of the conceptual model for correctness.
 Such a critique by experts helps identify deficiencies or errors in the conceptual model (Turing test: compare the simulation vs. the actual system).
 The credibility of the conceptual model is enhanced as these deficiencies are corrected during the iterative verification and validation process.
If the conceptual model is not overly complicated, additional methods can be used to check face validity:
• Conduct a manual trace of the conceptual model.
• Perform elementary sensitivity analysis by varying selected "critical" input parameters and observing whether the model behaves as expected (see the sketch after this list).
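As a concrete illustration of such a sensitivity check, the sketch below assumes an M/M/1 queue so that the closed-form expected waiting time $W_q = \lambda / (\mu(\mu - \lambda))$ is available, and confirms that waiting time grows with the arrival rate. The numeric rates are illustrative only.

```python
# A minimal sketch of an elementary sensitivity check. We assume an M/M/1
# queue so the closed form Wq = lambda / (mu * (mu - lambda)) is available;
# the rates are illustrative.
mu = 1.0  # service rate

prev = 0.0
for lam in (0.1, 0.3, 0.5, 0.7, 0.9):  # arrival rates kept below mu
    wq = lam / (mu * (mu - lam))       # expected wait in queue
    assert wq >= prev, "waiting time should grow with the arrival rate"
    print(f"lambda = {lam:.1f}  ->  Wq = {wq:.2f}")
    prev = wq
```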
Validation of Model Assumptions
We consider two types of model assumptions:
• Structural assumptions, i.e., assumptions concerning the operation of the real-world system
• Data assumptions
Structural assumptions can be validated by observing the real-world system and by discussing the system with the appropriate personnel.
Validation of Model Assumptions – Examples
We could make the following structural assumptions about the queues that form in the customer service area at a bank:
• Patrons form one long line, with the person at the front of the line receiving service as soon as one of the tellers becomes idle.
• A customer might leave the line if the others in line are moving too slowly.
• A customer seeing 10 or more patrons in the system may decide not to join the line.
Validation of Model Assumptions – Examples
Assumptions concerning the data that are collected may also be necessary.
Consider the interarrival times at the above bank during peak banking periods. We could assume these interarrivals are i.i.d. exponential random variables. To validate these assumptions, we should proceed as follows (a sketch of the last two steps appears after the list):
• Consult with bank personnel to determine when peak banking periods occur.
• Collect interarrival data from these periods.
• Conduct a statistical test to check that the assumption of independent interarrivals is reasonable.
• Estimate the parameter of the (supposedly) exponential distribution.
• Conduct a statistical goodness-of-fit test to check that the assumption of exponential interarrivals is reasonable.
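A minimal sketch of the last two steps, assuming SciPy and made-up interarrival data; real data would come from the bank.

```python
# A minimal sketch of parameter estimation and a goodness-of-fit test,
# with made-up interarrival data (minutes).
import numpy as np
from scipy import stats

interarrivals = np.array([0.8, 1.9, 0.4, 2.7, 1.1, 0.6, 3.2, 1.4, 0.9, 2.0])

# Step 4: the MLE of the exponential rate is 1 / (sample mean).
mean_ia = interarrivals.mean()
print(f"Estimated rate: {1.0 / mean_ia:.3f} arrivals per minute")

# Step 5: Kolmogorov-Smirnov goodness-of-fit test against Expo(mean_ia).
# (With an estimated parameter the plain KS p-value is only approximate;
# a Lilliefors-type correction is stricter.)
ks_stat, p_value = stats.kstest(interarrivals, "expon", args=(0, mean_ia))
print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")
# A large p-value means we fail to reject the exponential assumption.
```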
Validating Input-Output Transformations
We can treat the conceptual model as a function that transforms certain input parameters into output performance measures.
In the banking example, input parameters could include:
• The distributional forms of the patron interarrival times and teller service times.
• The number of tellers present.
• The customer queuing discipline.
The average customer waiting time and server utilization might be the output performance measures of interest.
The basic principle of input-output validation is the comparison of output from the verified operational model to data from the real-world system.
Input-output validation requires that the real-world system currently exist.
Example
One method of comparison uses the familiar t test.
• Suppose we collected data from the bank under study, and the average customer service time during a particular peak banking period was 2.50 minutes.
• Further suppose that five independent simulation runs of this banking period were conducted (and that the simulations were all initialized under the same conditions).
• The average customer service times from the five simulations were 1.60, 1.75, 2.12, 1.94, and 1.89 minutes.
Example (cont'd.)
We would expect the simulated average service times to be consistent with the observed average service time.
• Therefore, the hypothesis to be tested is:
 $H_0: E[X_i] = 2.50$ min versus $H_1: E[X_i] \neq 2.50$ min
where $X_i$ is the random variable corresponding to the average customer service time from the ith simulation run.
Example (cont'd.)
Define
$\mu_0 = 2.50$ ($= E[X_i]$ under $H_0$),
$n = 5$ (the number of independent simulation runs),
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ (sample mean of runs),
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ (sample variance of runs).
Example (cont'd.)
By design and the central limit theorem, the $X_i$'s are approximately i.i.d. normal random variables. So
$t_0 = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$
is approximately a t random variable with $n - 1$ degrees of freedom if $H_0$ is true.
For this example,
$\bar{X} = 1.86$, $S^2 = 0.0387$, and $t_0 = -7.28$.
Taking $\alpha = 0.05$, the t table gives $t_{4,\,0.025} = 2.78$. Since $|t_0| = 7.28 > 2.78$, $H_0$ is rejected.
This suggests that our operational model does not produce realistic customer service times. Changes in the conceptual model or computer code may be necessary, leading to another iteration of the verification and validation process.
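For reference, a minimal sketch of this one-sample t test in SciPy, using the five service times above:

```python
# A minimal sketch of the one-sample t test, using the service times above.
from scipy import stats

sim_means = [1.60, 1.75, 2.12, 1.94, 1.89]  # minutes, from the five runs
mu0 = 2.50                                   # observed real-world average

t0, p_value = stats.ttest_1samp(sim_means, mu0)
print(f"t0 = {t0:.2f}, p-value = {p_value:.4f}")
# t0 is about -7.28, and the p-value is well below alpha = 0.05,
# so H0 is rejected, matching the hand calculation above.
```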
Robustness
• Suppose we have validated the conceptual model (and verified the associated simulation code) of the existing real-world system.
• Then we can say that the simulation adequately mimics the real-world system, and we can assume that some non-existing system of interest and our conceptual model have only minor differences.
• If we wish to compare the real-world system to non-existing systems with alternative designs or with different input parameters, the conceptual model (and associated code) should be robust.
• We should be able to make small modifications in our operational model and then use this new version of the code to generate valid output performance values for the non-existing system.
• Such minor changes might involve certain numerical input parameters (e.g., the customer interarrival rate) or the form of a certain statistical distribution (e.g., the service time distribution).
• But it may be difficult to validate the model of a non-existing system if it differs substantially from the conceptual model of the real-world system.
Historical Data Validation
Instead of running the operational model with
artificial input data, we could drive the model with
the actual historical record.
Then it’s reasonable to expect the simulation to
yield output results very close to those observed
from the real-world system.
Example Outline
Suppose we have collected interarrival and service time data from the bank during n independent peak periods.
• Let $W_j$ denote the observed average customer waiting time from the jth peak period, $j = 1, \dots, n$.
• For fixed j, we can drive the operational model with the actual interarrival and service times to get the (simulated) average customer waiting time $Y_j$.
• We hope that $D_j \equiv W_j - Y_j \approx 0$ for all j.
• We could do a paired t test of $H_0: E[D_j] = 0$ (see the sketch after this list).
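A minimal sketch of that paired t test; the waiting-time values below are hypothetical.

```python
# A minimal sketch of the paired t test; the waiting times are hypothetical.
from scipy import stats

W = [4.2, 5.1, 3.8, 6.0, 4.5]  # observed average waits, periods 1..5
Y = [4.0, 5.4, 3.5, 5.7, 4.9]  # simulated waits driven by the same inputs

t0, p_value = stats.ttest_rel(W, Y)  # tests H0: E[Dj] = E[Wj - Yj] = 0
print(f"t0 = {t0:.2f}, p-value = {p_value:.3f}")
# A large p-value is consistent with the model reproducing history well.
```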
Steady-State Simulation
Time Frame of Simulations
• Terminating: specific starting and stopping conditions
 Run length is well-defined and finite (known starting and stopping conditions)
• Steady-state: long-run (technically forever)
 Theoretically, initial conditions don't matter (but practically they usually do)
 Not clear how to terminate a simulation run (theoretically infinite)
 Interested in system response over a long period of time
• This is really a question of the intent of the study
• It has a major impact on how output analysis is done
• Sometimes it's not clear which is appropriate
Techniques for Steady-State Simulation
• The main difficulty is obtaining independent simulation runs that exclude the transient period.
• If the model warms up very slowly, truncated replications can be costly
 You have to "pay" the warm-up on each replication
• Two techniques commonly used for steady-state simulation are:
 the method of batch means, and
 independent replications.
• Neither of these two methods is superior to the other in all cases.
Warm Up and Run Length
• Most models start empty and idle
 Empty: no entities are present at time 0
 Idle: all resources are idle at time 0
 In a terminating simulation this is OK if realistic
 In a steady-state simulation, though, this can bias the output for a while after startup
  – Usually downward (results are biased low) in queueing-type models that eventually get congested
  – Depending on the model, parameters, and run length, the bias can be very severe
Warm Up and Run Length (cont'd.)
• Remedies for initialization bias
 Better starting state, more typical of steady state
  – Throw some entities around the model
  – But how do you know how many to throw and where? This is what you're trying to estimate in the first place!
 Make the run so long that the bias is overwhelmed
  – Might work if the initial bias is weak or dissipates quickly
 Let the model warm up, still starting empty and idle
  – Run > Setup > Replication Parameters: Warm-up Period
    » Time units!
  – "Clears" all statistics at that point for the summary report and for any data saved via the Outputs of the Statistic module across replications
Method of Independent Replications
Suppose you have n equal batches (replications) of m observations each.
The mean of batch i is:
$\text{mean}_i = \frac{1}{m} \sum_{j=1}^{m} X_{ij}$
The overall estimate is:
$\text{Estimate} = \frac{1}{n} \sum_{i=1}^{n} \text{mean}_i$
The $100(1 - \alpha)\%$ confidence interval using the t table is:
$\text{Estimate} \pm t_{n-1,\,\alpha/2} \, \frac{S}{\sqrt{n}}$
where the variance is
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (\text{mean}_i - \text{Estimate})^2$
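A minimal sketch of these formulas in Python; the five replication means are hypothetical.

```python
# A minimal sketch of the formulas above: a t-based confidence interval
# from n replication means. The example numbers are hypothetical.
import numpy as np
from scipy import stats

def ci_from_means(means, alpha=0.05):
    """CI from i.i.d. replication means, per the formulas above."""
    means = np.asarray(means, dtype=float)
    n = len(means)
    estimate = means.mean()                     # Estimate
    s = means.std(ddof=1)                       # S (uses n-1 divisor)
    half_width = stats.t.ppf(1 - alpha / 2, n - 1) * s / np.sqrt(n)
    return estimate, half_width

est, hw = ci_from_means([4.1, 3.8, 4.5, 4.0, 4.3])  # hypothetical means
print(f"95% CI: {est:.2f} ± {hw:.2f}")
```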
Warm Up and Run Length (cont'd.)
• Warm-up and run-length times?
 Most practical idea: preliminary runs and plots
 Simply "eyeball" them (see the sketch after this list)
 Be careful about variability: make multiple replications and superimpose the plots
 Also, be careful to note "explosions"
• Possibility: different warm-up periods for different output processes
 To be conservative, take the max
 You must specify a single Warm-up Period for the whole model
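One standard way to implement the "preliminary runs, plots" idea is Welch's graphical procedure; the source does not name it, so this is an assumed choice, and the synthetic data below stand in for real replication output.

```python
# A minimal sketch of eyeballing the warm-up via Welch's graphical
# procedure (an assumed method); synthetic data stand in for real output.
import numpy as np
import matplotlib.pyplot as plt

def welch_plot(runs, window=5):
    """runs: 2-D array, one row per replication, one column per time point."""
    avg = runs.mean(axis=0)                   # average across replications
    kernel = np.ones(window) / window         # moving-average smoother
    smooth = np.convolve(avg, kernel, mode="valid")
    plt.plot(smooth)
    plt.xlabel("time point")
    plt.ylabel("smoothed average output")
    plt.title("Choose the warm-up where the curve levels off")
    plt.show()

# 10 hypothetical replications of 200 points that start biased low:
rng = np.random.default_rng(7)
bias = 10 - 8 * np.exp(-np.arange(200) / 30)  # rises toward steady state
runs = bias + rng.normal(0.0, 1.0, size=(10, 200))
welch_plot(runs)
```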
Warm Up and Run Length (cont’d.)
Example:
Lengthen Replications to 5 days (7200 min), do 10 Replications
Truncated Replications
• If you can identify appropriate warm-up and run-length times, just make replications as for terminating simulations
 Only difference: specify the Warm-up Period in Run > Setup > Replication Parameters
 Proceed with confidence intervals, comparisons, and all statistical analysis as in the terminating case
• So… what should the length of the warm-up period be?
 Abate, J., and W. Whitt, "Transient behavior of regulated Brownian motion," Advances in Applied Probability, 19, 560-631, 1987
Batching in a Single Run
• Alternative: just one R E A L L Y long run
 Only have to "pay" the warm-up once
 Problem: you have only one "replication," and you need more than that to form a variance estimate (the basic quantity needed for statistical analysis)
  – Big no-no: using the individual points within the run as "data" for the variance estimate
  – They are usually correlated (not independent), so the variance estimate is biased
Batching in a Single Run (cont'd.)
• Break each output record from the run into a few large batches
 Tally (discrete-time) outputs: observation-based
 Time-persistent (continuous-time) outputs: time-based
• Take averages over batches as the "basic" statistics for estimation: batch means
 Tally outputs: simple arithmetic averages
 Time-persistent: continuous-time averages
• Treat the batch means as i.i.d.
 Key: the batch size must be big enough for low correlation between successive batch means (see the sketch after this list)
 Still might want to truncate (once, time-based)
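One simple diagnostic for "big enough" batches is the lag-1 correlation between successive batch means; the sketch below illustrates this on a synthetic autocorrelated series, and is an assumed diagnostic, not Arena's internal test.

```python
# A minimal sketch of a batch-size diagnostic: the lag-1 correlation between
# successive batch means should be near zero. The AR(1) series below is a
# synthetic stand-in for correlated simulation output.
import numpy as np

def lag1_corr(x):
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.sum(x[:-1] * x[1:]) / np.sum(x * x))

def batch_means(data, n_batches):
    m = len(data) // n_batches
    return np.asarray(data[: m * n_batches]).reshape(n_batches, m).mean(axis=1)

rng = np.random.default_rng(1)
x = np.zeros(100_000)
for t in range(1, len(x)):           # AR(1): heavily autocorrelated
    x[t] = 0.9 * x[t - 1] + rng.normal()

for n in (1000, 100, 20):            # fewer batches = bigger batches
    print(f"{n:4d} batches: lag-1 corr = {lag1_corr(batch_means(x, n)):.3f}")
```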
Batching in a Single Run (cont'd.)
Suppose you have n equal batches of m observations each.
The mean of batch i is:
$\text{mean}_i = \frac{1}{m} \sum_{j=1}^{m} X_{ij}$
The overall estimate is:
$\text{Estimate} = \frac{1}{n} \sum_{i=1}^{n} \text{mean}_i$
The $100(1 - \alpha)\%$ confidence interval using the t table is:
$\text{Estimate} \pm t_{n-1,\,\alpha/2} \, \frac{S}{\sqrt{n}}$
where the variance is
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (\text{mean}_i - \text{Estimate})^2$
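A minimal sketch of the same CI machinery applied to one long run, with the extra step of forming the batches; variable names and defaults are illustrative.

```python
# A minimal sketch of the batch-means computation above, applied to a single
# long (already warmed-up) output series. Variable names are illustrative.
import numpy as np
from scipy import stats

def batch_means_ci(data, n_batches=20, alpha=0.05):
    """Form batch means from one long run and build a t-based CI."""
    data = np.asarray(data, dtype=float)
    m = len(data) // n_batches                 # observations per batch
    means = data[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    estimate = means.mean()
    s = means.std(ddof=1)                      # std dev of the batch means
    hw = stats.t.ppf(1 - alpha / 2, n_batches - 1) * s / np.sqrt(n_batches)
    return estimate, hw
```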
Batching in a Single Run (cont'd.)
• One replication of 50 days (about the same effort as 10 replications of 5 days each)
• How do you choose the batch size?
 Equivalently, how do you choose the number of batches for a fixed run length?
 You want batches big enough that the batch means appear uncorrelated.
Batching in a Single Run (cont'd.)
• Arena automatically attempts to form 95% confidence intervals on steady-state output measures via batch means from within each single replication
 "Half Width" column in reports from one replication
  – In the Category Overview report if you have just one replication
  – In the Category by Replication report if you have multiple replications
 Ignore these if you're doing a terminating simulation
 Arena won't report anything if your run is not long enough
  – "(Insufficient)" if you don't have the minimum amount of data Arena requires even to form a CI
  – "(Correlated)" if you don't have enough data to form the nearly uncorrelated batch means required to be safe