
Exploiting Spatial Locality to Improve
Disk Efficiency in Virtualized
Environments
Xiao Ling¹, Shadi Ibrahim², Hai Jin¹, Song Wu¹, Songqiao Tao¹
¹ Cluster and Grid Computing Lab, Services Computing Technology and System Lab,
School of Computer Science and Technology, Huazhong University of Science and Technology
² INRIA Rennes - Bretagne Atlantique, Rennes, France
Disk Efficiency in Virtualized Environments
• VMs with multiple OSs and applications run on a single physical server
• Disk I/O utilization impacts the I/O performance of applications running on VMs
• Disk efficiency depends on exploiting spatial locality
– disk scheduling exploits spatial locality
– reduces disk seek and rotational overheads
But achieving high spatial locality is a
challenging task in a virtualized environment
Why difficult?
• Complicated I/O behavior of VMs
– more than one process runs in each VM (e.g. virtual desktops, data-intensive applications): mixed applications
• Transparency of virtualization
[Figure: several guest OSes, each running mixed processes (file editing, streaming, other applications), access a shared disk through the hypervisor]
The hypervisor's block layer lacks a global view of the I/O access patterns of processes in the virtualized environment.
Shoulders of Giants
Studies on improving the I/O performance of applications precede us:
• Invasive-mode scheduling
– selects the disk scheduler pair in both the hypervisor and the VMs according to the access patterns of applications [ICPP'11, SIGOPS Oper. Syst. Rev. '10]
– introduces additional hypervisor-to-VM interference
• Non-invasive-mode scheduling
– streaming scheduling [FAST'11], Antfarm [USENIX ATC'06]
– assumes all VMs run similar read-oriented applications
– VMs grab bandwidth from one another
• Analysis of the data accesses of VMs
– assumes only one specific application is running within a VM
What do we solve?
• Consider mixed applications and the transparency of virtualization
• Explore the benefit of the spatial locality and regularity of data accesses
• How can disk scheduling exploit spatial locality to maximize disk efficiency while preserving the transparency of virtualization?
Outline
• Problem Description
• Related Work
• Observe Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Difference of Data Access
• Traditional environment vs. virtualized environment
• In a virtualized environment, VMs simultaneously access different parts of data blocks within the range of their VM image space
Experiment settings
• Physical server
– four quad-core 2.40GHz Xeon processors
– 22GB of memory and one dedicated 1TB SATA disk
– Xen 4.0.1 with kernel 2.6.18, Ext3 file system
• Configuration of VMs
– RHEL5 with kernel 2.6.18, Ext3 file system, 1GB memory, 2 VCPUs, 12GB virtual disk
– default Noop scheduler
• Workloads
– Sysbench file I/O: sequential read/write, random read/write
Access Patterns of VMs
Our observations:
• Regions across VMs
– requests from the same VM fall within the same region
• Sub-regions within a VM
– accessed with different ranges and frequencies
Access Patterns of VMs
[Figure: VM image space divided into regions; within a region, some sub-regions exhibit spatial locality (regional and sub-regional spatial locality) while other sub-regions show no spatial locality]
Observations
• Special spatial locality
– regional spatial locality → bounded by the VM image
– sub-regional spatial locality → follows the access patterns of applications
• Ignoring this spatial locality
– causes disk head seeks among VMs
– increases disk head seeks among sub-regions (e.g. CFQ, AS)
• Our goal
– take advantage of this special spatial locality to improve physical disk efficiency in the virtualized environment
How to exploit this spatial locality?
• Batch-process requests with special spatial locality using an adaptive non-work-conserving mode
– the regularity of regional spatial locality is easy to capture
– the regularity of sub-regional spatial locality is hard to perceive due to the transparency of virtualization
→ What is the distribution of sub-regions with spatial locality?
→ What is the access interval of these sub-regions?
⇒ Prediction Model
Outline
• Problem Description
• Related Work
• Zoom into Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Prediction Model
• Challenges
– the distribution of sub-regions with spatial locality changes over time and with the access patterns of applications
– interference from background processes running in a VM
– different sub-regions may have different access regularities
• Approach: analyze historical data accesses within a VM image to predict sub-regional spatial locality
Prediction Model – vNavigator
• Quantization of access frequency
– weights the contributions of historical requests to the prediction
– temporal access-density of each zone (sketched below)
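The slides reference the temporal access-density formula without reproducing it; below is a minimal Python sketch assuming each historical request to a zone contributes a weight that decays exponentially with its age. The zone size and λ come from the evaluation settings, but treating λ as the decay factor, and all names here, are assumptions.

    import math

    ZONE_SIZE = 2000   # zone size, from the evaluation settings
    LAMBDA = 2.0       # assumed here to be the exponential decay factor

    def zone_of(lba):
        # Map a logical block address within the VM image to its zone index
        return lba // ZONE_SIZE

    def temporal_access_density(access_times, now):
        # Each historical access contributes a weight that decays with its
        # age, so recent requests dominate the prediction
        return sum(math.exp(-LAMBDA * (now - t)) for t in access_times)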
Prediction Model – vNavigator
• Exploring sub-regional spatial locality
– a temporal access-density threshold per VM
– clustering zones whose density exceeds the threshold into sub-region units (sketched below)
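A sketch of the clustering step under the same assumptions: zones whose temporal access-density exceeds the per-VM threshold are merged with contiguous qualifying zones into sub-region units. The threshold choice here (the mean density) is illustrative, not the talk's actual definition.

    def density_threshold(densities):
        # Illustrative threshold: mean temporal access-density over the
        # VM's zones ({zone_index: density})
        return sum(densities.values()) / max(len(densities), 1)

    def cluster_subregions(densities):
        # Group contiguous zones above the threshold into sub-region
        # units, returned as (first_zone, last_zone) pairs
        thr = density_threshold(densities)
        hot = sorted(z for z, d in densities.items() if d > thr)
        units, start, prev = [], None, None
        for z in hot:
            if start is None:
                start = prev = z
            elif z == prev + 1:
                prev = z
            else:
                units.append((start, prev))
                start = prev = z
        if start is not None:
            units.append((start, prev))
        return units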
Prediction Model – vNavigator
• Access regularity of sub-regional spatial locality
– the range of a sub-region unit
– the future access interval of the sub-region unit, estimated from the average access interval of its historical requests (sketched below)
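A sketch of the regularity estimate, assuming the future access interval ST(Ui) is the plain average of past inter-arrival times within the unit (the slides only state that an average access interval is used).

    def avg_access_interval(timestamps):
        # Average inter-arrival time of requests hitting a sub-region unit
        if len(timestamps) < 2:
            return None  # not enough history to estimate regularity
        gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
        return sum(gaps) / len(gaps)

    def predicted_next_access(timestamps):
        # Expected time of the next request to the sub-region unit
        interval = avg_access_interval(timestamps)
        return None if interval is None else timestamps[-1] + interval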
Design of Pregather
• An adaptive non-work-conserving disk scheduler in the hypervisor
– decides whether to dispatch a pending request without starving other requests
– decides how long to wait for a future request with spatial locality
• A spatial-locality-aware heuristic (SPLA) algorithm
– uses the regional spatial locality across VMs and the prediction of sub-regional spatial locality from the vNavigator model
– guides Pregather's dispatch decisions
– waits only when the waiting time is less than the seek time
The SPLA Algorithm
• Setting a timer according to the position of the disk head (sketched below)
– Coarse waiting time for regional spatial locality: if there is no pending request from the currently served VMx and AvgD(VMx) < D(neighbor VM, LBA of completed request), set CoarseTimer = AvgT(VMx)
– Fine waiting time for sub-regional spatial locality: if there is a pending request from the currently served VMx and a sub-region SR(Ui) exists that includes the LBA of the completed request, set FineTimer = ST(Ui)
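A minimal sketch of these timer-setting rules; AvgD(VMx), AvgT(VMx), and ST(Ui) are kept as per-VM and per-unit statistics, and the data structures are illustrative.

    def set_timer(vm, completed_lba, has_pending_from_vm,
                  dist_to_neighbor_vm, subregion_of):
        # Return ('coarse', t), ('fine', t), or None, mirroring SPLA
        if not has_pending_from_vm:
            # Regional spatial locality: wait for VMx only if its requests
            # are, on average, closer than the neighboring VM's region
            if vm.avg_seek_distance < dist_to_neighbor_vm:
                return ('coarse', vm.avg_seek_time)   # CoarseTimer = AvgT(VMx)
        else:
            # Sub-regional spatial locality: wait for the sub-region unit
            # SR(Ui) that contains the LBA of the completed request
            unit = subregion_of(completed_lba)
            if unit is not None:
                return ('fine', unit.predicted_interval)  # FineTimer = ST(Ui)
        return None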
The SPLA Algorithm
• Dispatching a request or continuing to wait (sketched below)
– Seektime = seek time between the closest pending request and the completed request
– within the coarse waiting time: if Seektime < AvgT(VMx) or the request comes from VMx, dispatch the request and turn off the timer
– within the fine waiting time: if Seektime < ST(Ui) or the LBA of the request lies in SR(Ui), dispatch the request and turn off the timer
– otherwise wait until the timer expires, the deadline of a pending request is reached, or a suitable new request arrives
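A matching sketch of the dispatch check: while a timer is armed, a pending request is served immediately when it carries the locality being waited for, or when serving it costs less than the expected benefit of waiting (names mirror the sketch above).

    def should_dispatch(timer, request, vm, unit, seek_time):
        # Dispatch now (and cancel the timer) or keep waiting;
        # `unit` is the SR(Ui) around the completed request's LBA
        kind, wait = timer
        if kind == 'coarse':
            # Serve requests from VMx, or any request whose seek cost
            # is below the expected waiting benefit AvgT(VMx)
            return request.vm is vm or seek_time < wait
        if kind == 'fine':
            # Serve requests inside SR(Ui), or any request cheaper
            # than the predicted interval ST(Ui)
            return unit.contains(request.lba) or seek_time < wait
        return True  # no timer armed: work-conserving dispatch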
Implementation of Pregather
• Implemented in a Xen-hosted platform
• Pregather allocates each VM an equal serving time slice and serves VMs in a round-robin fashion (a sketch of the service loop follows)
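A minimal sketch of the round-robin service loop, assuming per-VM pending queues and a dispatch callback (both illustrative); the 200ms time slice is from the evaluation settings.

    import time
    from collections import deque

    TIME_SLICE = 0.2   # 200ms per VM, from the evaluation settings

    def serve_round_robin(vm_queues, dispatch):
        # vm_queues: objects with a .pending deque of requests;
        # dispatch: callback that actually issues a request to the disk
        vms = deque(vm_queues)
        while vms:
            vm = vms.popleft()
            deadline = time.monotonic() + TIME_SLICE
            while vm.pending and time.monotonic() < deadline:
                dispatch(vm.pending.popleft())
            if vm.pending:
                vms.append(vm)   # re-queue VMs that still have work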
Outline
• Problem Description
• Related Work
• Zoom into Disk Access Patterns of VMs
• Prediction Model
• Design of Pregather
• Performance Evaluation
• Conclusions and Future Work
Performance Evaluation
• Goals of the experiments
– verify the vNavigator model
– evaluate the overall performance of Pregather for multiple VMs
– evaluate the memory overhead
• Parameter settings
– zone size: 2000; prediction window: 20ms; λ: 2
– time slice: 200ms
• Benchmarks
– Sysbench file I/O, Hadoop, TPC-H
Verification of vNavigator Model
• The ratio of successful waiting
– a VM with sequential applications has clear sub-regional locality (e.g. success ratio 90.3%)
– a VM with only random applications has weaker sub-regional locality (e.g. success ratio 80.4%)
Pregather for Multiple VMs
• VMs with Different Access Patterns
[Chart: throughput of VMs with different access patterns; Pregather achieves 1.6x and 2.6x improvements]
Pregather for Multiple VMs
• Disk I/O efficiency for Data Intensive Applications
[Chart: Pregather improves disk I/O efficiency by 26% vs. CFQ, 28% vs. AS, and 38% vs. Deadline; at zero: Pregather 65%, CFQ 53%, AS 36%; reductions of 18% and 20%]
Pregather for Multiple VMs
• Disk I/O efficiency for data-intensive applications mixed with other applications
[Chart: compared with CFQ, Q2 ↓10%, Q19 ↓8%, Sort ↓12%; Pregather: 63%]
Pregather for Multiple VMs
• Memory overhead
– about 916KB
Conclusion and Future Work
• Contributions
– observed regional spatial locality and sub-regional spatial locality in the disk accesses of VMs
– an intelligent prediction model (vNavigator) to predict the regularity of sub-regional spatial locality
– Pregather, a disk scheduler with a spatial-locality-aware heuristic algorithm in the hypervisor, improves disk I/O efficiency without any prior knowledge of applications
• Future work
– extend Pregather to enable intelligent allocation of physical blocks
– QoS guarantees for VMs
Thanks!