Network Modelling

Network Modelling & Simulation
Content:
Network Topology
Network Traffic characteristics
Distribution patterns
Probabilistic modelling
Random numbers & Pseudo-Random numbers
Queuing Theory (introductory)
What does it mean?
Modelling:
• A mathematical representation of the target system
• Building a model of a computer network
• Using that model to analyse and predict possible behaviour of a real network
Simulation:
• Having the appearance of / to behave like / to copy.
• Reproducing the conditions of (a situation, etc.), as in carrying out an experiment.
• To reproduce the behaviour of a system by providing realistic configuration parameters to a simulation.
The need for network models
Network designers & managers need to ask a wide variety of
"What if" questions, for example, what if:
• the number of users were doubled?
• the buffer size at routers were increased?
• a key device failed at peak traffic conditions?
• the 10 Mbps links were upgraded to 100 Mbps?
• a hub were replaced with a switch?
• What is the cheapest way to increase the bandwidth
available to users in the accounts department?
• Why does the throughput between Building-A and
Building-B fall significantly between 2pm and 3pm every
day?
Why models?
Problems with experimenting on a live/test network:
• Scale and cost
• Risk (expensive devices may be purchased and NOT solve the original problem)
• Interruption (to existing users and services)
• Time consuming
Hence the use of models of networks for:
• Planning a network implementation,
• Planning an expansion of an existing network,
• Problem diagnostics for an existing network,
• As a learning tool.
Network Topology
The topology of a network is represented by a model.
• The number and type of devices:
  – Computers: clients, servers
  – Network devices: hubs, switches, routers
• Connectivity: links and bandwidth
An example of a physical layout
JANET geographical view
JANET – topology
http://www.janet.ac.uk/services/network-services/netsight/index.html
What is not shown as part of topology
• The actual shape of the network.
• The physical location of devices.
• The distance between devices.
• Behavioural characteristics such as:
– what are the chances of any individual frame being
lost or corrupted?
– The reliability of each link or device i.e. what are the
chances of a link or device failing?
• The traffic patterns within the network:
– The amount of traffic at any point, at any time?
– The source and sink characteristics of traffic?
– Routes taken by traffic.
Traffic generation
A static (idle) network is not very interesting
- This just amounts to some devices and some
connections between them.
Applications run on networks and generate
traffic or load on a network.
It is the load on the network that causes the
network to exhibit certain behaviours, potentially
causing problems.
It is these behaviours that we are interested in.
Traffic generation (continued)
• Network traffic: packets/sec? packet size?
• Not constant except on rare occasions
• Cannot be predicted or modelled exactly
• Need to use statistical techniques
• Probability distributions with the right parameters, for example:
  – Inter-arrival time: exponential (mean 3 sec)
  – Packet size: exponential (mean 1200 bytes)
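The two example distributions can be sampled with Python's standard library. This is a minimal sketch rather than the method of any particular simulator: the function name `generate_traffic`, the seed value and the packet count are assumptions made for illustration. Note that `random.expovariate` takes the rate (1 / mean), not the mean.

```python
import random

def generate_traffic(n_packets, mean_interarrival=3.0, mean_size=1200.0, seed=1):
    """Return a list of (arrival_time_sec, size_bytes) tuples.

    Interarrival times and packet sizes are both drawn from exponential
    distributions, using the example means given above.
    """
    rng = random.Random(seed)   # fixed seed gives a repeatable run
    now = 0.0
    packets = []
    for _ in range(n_packets):
        # expovariate() takes the rate lambda = 1 / mean
        now += rng.expovariate(1.0 / mean_interarrival)
        size = rng.expovariate(1.0 / mean_size)   # fractional bytes, not rounded
        packets.append((now, size))
    return packets

packets = generate_traffic(10000)
mean_gap = packets[-1][0] / len(packets)
mean_size = sum(size for _, size in packets) / len(packets)
print(f"mean interarrival = {mean_gap:.2f} s, mean size = {mean_size:.0f} bytes")
```

With enough packets the sample means should settle close to 3 sec and 1200 bytes; individual values vary widely, which is the point of using a distribution.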
Probability Distributions
Several probability distribution functions are available –
each characterised by certain parameters – mean, standard
deviation etc. In each case the actual function that describes
the distribution is the Probability Density Function (PDF).
No distribution can represent the random events in a
computer network with complete accuracy.
It is important to use appropriate distributions to give a
‘good approximation’ of the behaviour of typical systems.
This is known from analysis of live network traffic, typical
applications etc.
Important distributions
Name        | Parameters                        | Use to represent
Constant    | None (fixed value)                | Packet/frame size
Binomial    | P(success)                        | Server up/down
Exponential | λ or mean                         | Interarrival time, packet size
Poisson     | Mean                              | Packet arrival rate
Uniform     | Min & max (constant probability)  | Application start time
Normal      | Mean, SD                          | Errors in experiments
Exponential: 'mean' is interpreted as 'expected value', where E(x) = 1 / λ
Exponential PDF: λe^(−λx) (e = Euler's number ≈ 2.71828)
Normal PDF: (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²)), with mean μ and SD σ
Poisson distribution
(Used e.g. for packet arrival rate)
This distribution expresses the probability of a given number of events
occurring in a fixed interval of time if these events occur with a known average
rate (the mean) and independently of the time since the last event.
In the example shown, the mean is 6, so in the given time period (not
indicated on the graph) the most likely number of events is 6 (for a whole-number
mean the value one below it, here 5, is equally likely). The cumulative
probability C(k) always increases towards 1.0, but this is not reached until all
possible values of k are accounted for.
[Figure: Poisson distribution, mean = 6. P(k) and cumulative C(k) plotted against k.]
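The values behind a plot like this can be tabulated directly from the Poisson formula P(k) = e^(−mean) × mean^k / k!. The sketch below is illustrative only; the function name `poisson_pmf` and the range of k are assumptions.

```python
import math

def poisson_pmf(k, mean):
    """P(k): probability of exactly k events in the interval."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mean = 6
cumulative = 0.0
for k in range(14):                 # k = 0 .. 13
    p = poisson_pmf(k, mean)
    cumulative += p                 # C(k): probability of k or fewer events
    print(f"k={k:2d}  P(k)={p:.4f}  C(k)={cumulative:.4f}")
```

The cumulative column rises towards 1.0 but has not reached it by k = 13, because larger values of k still carry a small probability.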
Exponential distribution, mean = 2.0 (λ = 0.5)
(Used e.g. for interarrival time, packet size)
[Figure: Exponential distribution, mean = 2.0 (λ = 0.5). P(x) and cumulative C(x) plotted against x.]
C(x): cumulative probability of occurrences 0 to x.
P(x): the probability of (e.g.) a particular period between packets decreases as the time interval increases.
Variable frame sizes
Ignoring the ‘illegal’ sizes, the distribution is biased towards the smaller frame sizes
Normal distribution
Popular in social sciences, e.g. height of group of people.
For networking, useful for distribution of errors (see Std Dev from mean)
[Figure: Normal distribution. Probability of event plotted against event.]
Exponential Distribution (e.g. Random choice of interarrival time)
Using the distribution in a model:
1. Use the cumulative curve C(x) (dark red); this represents the probability distribution p
2. Pick a random number y where 0 ≤ y ≤ 1; note y = p(x)
3. Using y and the distribution p, find x (track across to the curve to find x)
Note that for this curve short interarrival times are much more likely than longer times:
• If y were 0 the next packet would follow directly after the previous one
• If y were 1 the next packet would follow after infinite time
[Figure: Exponential (mean = 2). P(x) and C(x) plotted against interarrival time, 0 to 8.]
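For the exponential distribution, "tracking across to the curve" can be done algebraically: the cumulative curve is C(x) = 1 − e^(−x / mean), so solving y = C(x) gives x = −mean × ln(1 − y). The following is a small sketch of that inversion; the function name `interarrival_from_uniform` and the seed are assumptions for illustration.

```python
import math
import random

def interarrival_from_uniform(y, mean=2.0):
    """Invert C(x) = 1 - exp(-x / mean) to find x for a given y in [0, 1)."""
    return -mean * math.log(1.0 - y)

rng = random.Random(42)
# rng.random() returns values in [0, 1), so y = 1 (infinite time) never occurs
samples = [interarrival_from_uniform(rng.random()) for _ in range(100000)]
print(f"sample mean = {sum(samples) / len(samples):.3f}")
```

A y of 0 gives an interarrival time of 0, and as y approaches 1 the time grows without limit, matching the two limiting cases listed above.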
Simulation Results
Results of simulations can be presented in a number of ways:
Animation of the model
- this allows you to watch packets as they move around the
network. Gives an overall idea of what is happening.
Numerical results
- event counts (e.g. number of packets sent)
- statistical averages (e.g. average packet end-to-end delay).
Graphical results
- alternative way to view numerical results.
Identifying the time when the network behaviour ‘steadies’
This is an artefact of the modelling process – it arises because of the
use of moving averages.
This is NOT a behavioural artefact of the modelled network
→ we must ignore it when analysing results (see next slide)
Moving average
[Figure: moving average curves win1, win3, win5 and win11.]
Illustration of Moving Average
The moving average takes a window of the most recent 10 values and averages them.
This is a useful measure of average value and is often used in simulation models.
Note how the MA10 curve is inaccurate for the first 9 samples (it uses 10 samples to
determine the average, so it is inaccurate when fewer than 10 samples are available).
[Figure: raw data and MA plotted together.]
Value x | MA10(x)
60      | 6
70      | 13
60      | 19
80      | 27
50      | 32
60      | 38
70      | 45
50      | 50
30      | 53
40      | 57
50      | 56
20      | 51
60      | 51
40      | 47
50      | 47
60      | 47
50      | 45
Ignore the first set of values: in this case, because MA10 is used, ignore the first 9 values.
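The MA10 column can be reproduced with a few lines of Python. The tabulated values are consistent with always dividing the window sum by 10, even before 10 samples exist; that reading is inferred from the numbers rather than stated in the slides, and the function name `moving_average` is an assumption.

```python
def moving_average(values, window=10):
    """Sum the most recent `window` values and divide by `window`.

    Dividing by the full window size even when fewer values are available
    reproduces the start-up inaccuracy seen in the first 9 rows.
    """
    result = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1): i + 1]
        result.append(sum(recent) / window)
    return result

raw = [60, 70, 60, 80, 50, 60, 70, 50, 30, 40, 50, 20, 60, 40, 50, 60, 50]
ma10 = moving_average(raw)
print(ma10)
```

Only from the tenth value onwards does the window hold 10 real samples, which is why the earlier results are discarded.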
Average and variance
1. Average over the run (e.g. 50 values/statistic)
2. Average over several runs (diff. seeds)
[Figure: results of run-A, run-B, run-C and run-D plotted with their average.]
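Averaging over several runs with different seeds might look like the sketch below. `run_once` is a hypothetical stand-in for a real simulation run (here it just averages 50 exponentially distributed values); the seeds and the mean of 0.2 are likewise assumptions for illustration.

```python
import random
import statistics

def run_once(seed, n_values=50, mean=0.2):
    """Hypothetical stand-in for one simulation run: returns the average of
    n_values exponentially distributed samples (e.g. delays in seconds)."""
    rng = random.Random(seed)
    values = [rng.expovariate(1.0 / mean) for _ in range(n_values)]
    return statistics.mean(values)          # step 1: average over the run

seeds = [1, 2, 3, 4]                        # one seed per run (run-A .. run-D)
run_means = [run_once(s) for s in seeds]
overall = statistics.mean(run_means)        # step 2: average over several runs
spread = statistics.variance(run_means)     # sample variance between runs
print(run_means, overall, spread)
```

A large variance between runs suggests that a single run is not representative and that more or longer runs are needed.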
Model Validation
• Models are approximations of the real thing
• Lots of guesswork and simplification
• Need to make the model as realistic as possible
• Need to validate the model thoroughly
• This is important if the results are to be credible and useful
• Need sufficient amount of results, i.e. sufficient
depth and breadth of investigation, changing various
parameters, and changing the seed to ensure a wide
range of possible outcomes is investigated;
otherwise you cannot draw valid conclusions.
Verification
• Alternative calculations (queueing theory) –
difficult, often impossible
• Comparison against a real network
• Extrapolation from a smaller model which can be
verified
• Extrapolation from a smaller real network
The M/M/1 Queue
A simple queuing situation which can be solved exactly.
Given the mean packet interarrival time E(ta) and the mean service time E(ts),
we can calculate the mean queue length and the mean delay.
[Figure: queue and server.]
Utilisation ρ = E(ts) / E(ta)
• longer service time → higher ρ
• longer interarrival time → lower ρ
Delay E(tq) = E(ts) / (1 − ρ)
• 1 − ρ = proportion of time the resource is free
• Delay = service time divided by free time. The more free the resource is, the shorter the time to process jobs
Queue length E(q) = ρ / (1 − ρ)
• Queue length is the ratio of utilisation to free time. At 50% utilisation the average queue size will be 1
Note – for self-checking, Delay can be estimated as: Queue length * service time.
This value should be in the same ball-park as the value calculated using the formula.
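The three formulas translate directly into a small helper. This is a sketch only: the function name `mm1_metrics` is an assumption, and the check that ρ is below 1 is an added guard, since the formulas are only meaningful for a queue that does not grow without limit.

```python
def mm1_metrics(mean_interarrival, mean_service):
    """Return (utilisation, mean delay, mean queue length) for an M/M/1 queue."""
    rho = mean_service / mean_interarrival
    if rho >= 1.0:
        # Added guard: at rho >= 1 the queue never empties
        raise ValueError("utilisation must be below 1 for a stable queue")
    delay = mean_service / (1.0 - rho)
    queue_length = rho / (1.0 - rho)
    return rho, delay, queue_length

# 50% utilisation: the average queue size should come out as 1
rho, delay, queue_length = mm1_metrics(mean_interarrival=1.0, mean_service=0.5)
print(rho, delay, queue_length)
```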
An example calculation
An application generates packets at the average rate of 5 per
second. (Arrival rate λ = 5)
The average packet size is 1200 bytes (Exponential dist)
Packets transmitted along a 100 kbps link.
Mean Interarrival time = E(ta) = 1/5 = 0.2 sec
Mean service time = E(ts) = 1200 x 8 / 100000 = 0.096 sec.
(in this case service time is time needed to transmit onto the link)
Utilisation ρ = E(ts) / E(ta) = 0.096 / 0.2 = 0.48
Delay E(tq) = E(ts) / (1- ρ) = 0.096 / 0.52 = 0.185 sec
Queue length E(q) = ρ / (1- ρ) = 0.48 / 0.52 = 0.923
Quick check technique for ρ (= mean total traffic / link bandwidth)
= (1200 * 8 * 5) / 100000 = 48000 / 100000 = 0.48
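The calculated delay can also be cross-checked against a small simulation of the same queue. The sketch below assumes exponential interarrival and service times and a single first-come-first-served server; the function name `simulate_mm1_delay`, the packet count and the seed are assumptions, not part of the example.

```python
import random

def simulate_mm1_delay(mean_interarrival, mean_service, n_packets=200000, seed=7):
    """Estimate the mean time a packet spends queueing plus being served."""
    rng = random.Random(seed)
    wait = 0.0            # time the current packet waits before service starts
    total_delay = 0.0
    for _ in range(n_packets):
        service = rng.expovariate(1.0 / mean_service)
        total_delay += wait + service
        gap = rng.expovariate(1.0 / mean_interarrival)   # time until next arrival
        # The next packet waits for whatever work is still left when it arrives
        wait = max(0.0, wait + service - gap)
    return total_delay / n_packets

e_ta = 1 / 5                    # mean interarrival time, sec
e_ts = 1200 * 8 / 100000        # mean service time, sec
formula = e_ts / (1 - e_ts / e_ta)
simulated = simulate_mm1_delay(e_ta, e_ts)
print(f"formula = {formula:.3f} s, simulated = {simulated:.3f} s")
```

The two figures should agree closely; a large gap would point to an error in either the calculation or the model.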
One for you to try
A database server receives client requests at the average rate of
one every 2 seconds.
Arrival rate λ = 0.5, E(ta) = 2
It can process a transaction in approximately 1.5 seconds
E(ts) = 1.5
Calculate the utilisation, average transaction delay and the
transaction queue size at the server.