CMAAS’2017
Revisiting Resource Partitioning for Multi-core Chips:
Integration of Shared Resource Partitioning on a Commercial RTOS
21 Apr. 2017
PAK, EUNJI
Senior researcher, ETRI
(Electronics and Telecommunications Research Institute)
[email protected]
Agenda
• Qplus-AIR, a commercial RTOS
• Comprehensive shared resource partitioning implementation on Qplus-AIR
Qplus-AIR
ARINC 653 compliant RTOS
Certifiable for DO-178B Level A
Introduction to Qplus-AIR
• Qplus-AIR
  – Developed by ETRI for safety-critical systems (2010~2012)
  – Main operating system for the IFCC (Integrated Flight Control Computer) of a UAV (Unmanned Aerial Vehicle), KAI
  – Integrates MC (Mission Control), FC (Flight Control), and C&C (Communications and Commands) in the IFCC
  – ARINC 653 compliant RTOS*
    – Robust partitioning among applications, both spatial and temporal
    – Prevents cross-application influence and error propagation among applications
    – Easy integration of multiple applications with different degrees of criticality
* Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006
Introduction to Qplus-AIR
• Qplus-AIR
  – Certifiable package for DO-178B Level A
  – Lightweight ARINC 653 support: kernel-level implementation
  – Support for multicore platforms (2014~)
• RTWORKS
  – A commercial version of Qplus-AIR
  – Managed by RTST (2013~), ETRI's spin-off company
  – Started with 4 developers and now has 11 OS developers
  – AUTOSAR (automotive industry standard) and ISO 26262 ASIL D are in progress
• ETRI focuses on research issues while RTST focuses on commercialization
Application Examples
• Safety-critical industrial applications
  – Integrated flight control computer of an unmanned aerial vehicle, 2010~2012
  – Tiltrotor flight control computer, 2012
  – Nuclear power plant control system, 2013
  – HUMS (Health and Usage Monitoring System) for helicopters, 2013~2016
  – Subway screen-door control system, 2016 (exported to Brazil)
  – Communication system of self-propelled guns, 2017~
  – (project) Autonomous driving car, 2015~
Comprehensive shared resource partitioning implementation on Qplus-AIR
Contents
• Introduction
• HW platform: P4080
• Comprehensive resource partitioning implementation
  – Memory bus bandwidth partitioning
  – DRAM bank partitioning
  – Shared cache partitioning: set-based / way-based
• Combining all the techniques on Qplus-AIR
• Evaluations
• Conclusions & Future Work
Introduction [1/2]
• Robust partitioning among applications (partitions)
  – Qplus-AIR supports spatial and temporal partitioning
  – Ensures independent execution of multiple applications with various safety-criticality levels
• Robust partitioning may no longer be valid on multicore
  – Multiple cores share hardware resources such as cache or memory
  – Concurrently executing applications affect each other due to contention on shared resources
    – Major source of timing variability
    – Pessimistic WCET estimation → overprovisioning of hardware resources and low system utilization
  – In safety-critical systems, we had to turn off all but one core
Introduction [2/2]
• We must deal with resource contention properly
  – The WCET of tasks stays guaranteed and tightly bounded
  – Especially for safety-critical applications that require certification
• Requirement of inter-core interference mitigation
  – “The applicant has identified the interference channels that could permit interference to affect the software applications hosted on the MCP cores, and has verified the applicant’s chosen means of mitigation of the interference.”
    - FAA CAST (Certification Authorities Software Team)-32A Position Paper*
• Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS
  – Integrate a number of resource partitioning schemes, each of which targets a different shared hardware resource, on Qplus-AIR
  – Unique challenges due to the fact that the RTOS did not support Linux-like dynamic paging
* Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.
HW platform, P4080 [1/2]
• P4080 architecture*
  – Eight PowerPC e500mc cores
  – Each core has a private 32KB-I/32KB-D L1 cache and a 128KB L2 cache
  – Two 32-way 1MB L3 caches with cache-line interleaving
  – Two memory controllers for two 2GB DDR DIMM modules (each DIMM module has 16 DRAM banks)
  – CoreNet coherency fabric: interconnects the cores and other SoC modules; a high-bandwidth switch that supports several concurrent transactions
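[Figure: P4080 block diagram. PowerPC e500mc cores, each with L1 caches and an L2 cache, attach through the CoreNet interface to the CoreNet fabric, alongside QMan, BMan, FMan, GPIO, and DUART; two L3 caches, each with a DDR controller and a DIMM module, sit behind the fabric.]
* P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.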
HW platform, P4080 [2/2]
• Partitioning support of recent PowerPC processors*
  – Overall partitioning model
  – Each core is allocated to a partition
  – The CPC is partitioned
    • Way partitioning (32KB per way)
  – Main memory is divided into several physical regions
    • Private
    • Shared between partitions; accessible at user level
    • Shared among partitions; restricted to hypervisor level
  – This mapping is enforced by the cores' MMUs
  – Restrict the coherency overhead
    • Disable coherency to prevent snoop overhead
    • Specify a group participating in coherency
  – System peripherals are not shared
    • The hypervisor is able to restrict their DMA-accessible memory range to some part of the memory region (through the MMU)
[Figure 4. Example of a Partitioned System. In this model, there are four distinct partitions, each running on two cores.]
* Hardware Support for Robust Partitioning in Freescale QorIQ Multicore SoCs (P4080 and derivatives)
Resource partitioning mechanisms
• 1. Memory bus (interconnect) bandwidth partitioning
• 2. Memory bank partitioning
• Shared cache partitioning
  – 3. Set-based cache partitioning with page coloring
  – 4. Way-based cache partitioning with the support of P4080 hardware
• Combine all the techniques and integrate them on Qplus-AIR
• Paging
  – Memory bank partitioning and set-based cache partitioning assume that the OS supports Linux-like paging
  – Paging implementation in Qplus-AIR
Resource partitioning mechanisms
Memory bus bandwidth regulator [1/2]
• Bus bandwidth regulator*
  – Limit the bandwidth usage per core
    1) Set the memory bus bandwidth budget
    2) Count the number of requests sent to the memory bus
    3) Generate an interrupt
    4) Throttle the requests from core 1
[Figure: four panels, one per step, showing Core 1 and Core 2 above the memory bus (CoreNet fabric). Each core has a budget of 10 requests; Core 1 reaches 10/10 and is throttled while Core 2 is at 3/10.]
* H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.
Resource partitioning mechanisms
Memory bus bandwidth regulator [2/2]
• Implementation
  – Set up the budget and configure an interrupt to be generated when a core exhausts its budget
    – Configure the performance monitoring control registers and performance monitoring counters
  – The OS scheduler throttles further execution on that core
    – Implement an interrupt handler for the interrupt that the PMC generates
    – The scheduler de-schedules the tasks on the core
• Period of bandwidth regulator execution
  – If too short, the overhead becomes excessive; if too long, predictability worsens
  – The default period of our implementation is 5 ms
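The budget, count, interrupt, and throttle steps of the regulator can be illustrated with a small simulation. This is a minimal sketch and not the Qplus-AIR implementation: the class name `BandwidthRegulator`, the software counter standing in for the performance monitoring counter, and the budget of 10 requests per period (taken from the figure) are assumptions for illustration.

```python
class BandwidthRegulator:
    """Per-core memory request budget, replenished every regulation period."""

    def __init__(self, budgets):
        self.budgets = dict(budgets)          # core -> requests allowed per period
        self.used = {c: 0 for c in budgets}   # stands in for the hardware counter
        self.throttled = set()                # cores de-scheduled until next period

    def request(self, core):
        """Return True if the request is issued, False if the core is throttled."""
        if core in self.throttled:
            return False
        self.used[core] += 1
        if self.used[core] >= self.budgets[core]:
            # Budget exhausted: models the counter interrupt whose handler
            # asks the scheduler to stop running tasks on this core.
            self.throttled.add(core)
        return True

    def new_period(self):
        """Called at each period boundary to replenish all budgets."""
        self.used = {c: 0 for c in self.budgets}
        self.throttled.clear()


reg = BandwidthRegulator({1: 10, 2: 10})
issued_core1 = sum(reg.request(1) for _ in range(15))  # core 1 attempts 15 requests
issued_core2 = sum(reg.request(2) for _ in range(3))   # core 2 attempts 3 requests
print(issued_core1, issued_core2, sorted(reg.throttled))  # 10 3 [1]
```

Calling `new_period()` at every period boundary restores the budgets. A shorter period reacts faster but calls the replenishment logic more often, which is the overhead versus predictability trade-off noted on the slide.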
Resource partitioning mechanisms
Bank-aware memory allocation
• DRAM bank-aware memory allocation*
  – Manages memory allocation in such a way that no application shares its memory bank with applications running on other cores
    1) The application requests memory
    2) The OS allocates physical memory
[Figure: Application 1 and Application 2, each with its own virtual memory, request memory from the OS. The page table (virtual-to-physical address translation) and the hardware MMU place the physical memory used by Core 1 and Core 2 in separate DRAM banks (Bank 1, Bank 2).]
[Figure: P4080 memory address mapping. A physical address (bits 31 to 0) with fields labeled banks, L3 cache sets, and L2 cache sets; bit positions 18, 16, 14, 12, 7, and 6 are marked.]
* H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.
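Bank-aware allocation amounts to filtering free physical pages by the bank bits of their address. The sketch below illustrates that idea only. The bit positions (`BANK_SHIFT`, `BANK_BITS`) are placeholders chosen to keep the example small and are not the P4080 address mapping; `bank_of` and `allocate_pages` are hypothetical helper names.

```python
PAGE_SIZE = 4096
BANK_SHIFT = 14   # hypothetical: lowest address bit of the bank index
BANK_BITS = 2     # hypothetical: 4 banks, to keep the example small


def bank_of(phys_addr):
    """Extract the DRAM bank index from a physical address."""
    return (phys_addr >> BANK_SHIFT) & ((1 << BANK_BITS) - 1)


def allocate_pages(free_pages, allowed_banks, count):
    """Take `count` free page addresses that fall only in `allowed_banks`."""
    chosen = [p for p in free_pages if bank_of(p) in allowed_banks][:count]
    if len(chosen) < count:
        raise MemoryError("not enough free pages in the allowed banks")
    for p in chosen:
        free_pages.remove(p)
    return chosen


free_pages = [i * PAGE_SIZE for i in range(64)]
core1_pages = allocate_pages(free_pages, {0, 1}, 4)   # core 1 owns banks 0 and 1
core2_pages = allocate_pages(free_pages, {2, 3}, 4)   # core 2 owns banks 2 and 3
```

Because the two cores are given disjoint bank sets, no page handed to core 1 can land in a bank used by core 2.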
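Resource partitioning mechanisms
Set-based cache partitioning [1/2]
• Set-based partitioning via page coloring*
  – Allocation of physical memory considering the cache set location
  – $\text{number of colors} = \dfrac{\text{cache size}}{\text{page size} \times \text{cache associativity}}$
    1) The application requests memory
    2) The RTOS allocates physical memory
[Figure: Application 1 and Application 2, running on Core 1 and Core 2, request memory from the RTOS. Bits of the physical page number (address bits 31 to 12) overlap the L3 cache set index (down to bit 7), so the colors of the pages allocated in DRAM map to disjoint cache sets.]
* R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
* M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.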
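As a worked instance of the number-of-colors formula, the sketch below plugs in the cache parameters given for the P4080 (1MB, 32-way L3 caches, 4KB pages). Treating the two cache-line-interleaved L3 caches as a single 2MB cache is an assumption made here for illustration.

```python
def number_of_colors(cache_size, page_size, associativity):
    """Colors = cache size / (page size * cache associativity)."""
    return cache_size // (page_size * associativity)


KB, MB = 1024, 1024 * 1024
per_l3 = number_of_colors(1 * MB, 4 * KB, 32)    # one 1MB, 32-way L3 cache
both_l3 = number_of_colors(2 * MB, 4 * KB, 32)   # assumption: two L3s seen as 2MB
print(per_l3, both_l3)  # 8 16
```

Sixteen colors correspond to four address bits, which would be consistent with the four color bits [15:12] used in the implementation.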
Resource partitioning mechanisms
Set-based cache partitioning [2/2]
• Implementation
  – Manipulates the virtual-to-physical address mapping
    – Allocates disjoint cache sets to each core
    – Among address bits [15:7], which form the cache set index, it exploits bits [15:12], which intersect with the physical page number on the P4080
• L2 co-partitioning & restrictions of set-based partitioning
  – Co-partitioning of the L2 cache
    – The L3 cache color is determined by bits [15:12] and the L2 cache set by bits [13:6]
    – Using bits [13:12] has the side effect of co-partitioning the L2 cache
  – Only bits [15:14] are available for L3 cache set partitioning
    – The number of cache partitions is limited to 4
    – With 8 cores, some cache sets are inevitably shared by 2 cores
[Figure: P4080 memory address mapping. A physical address (bits 31 to 0) with fields labeled banks, L3 cache sets, and L2 cache sets; bit positions 18, 16, 14, 12, 7, and 6 are marked.]
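The bit ranges on this slide translate directly into shift-and-mask operations. The sketch below shows the 16-color index from bits [15:12], the 4-partition index from bits [15:14] that avoids co-partitioning the L2 cache, and why 8 cores end up sharing partitions. The function names and the round-robin core assignment are hypothetical and not taken from Qplus-AIR.

```python
from collections import Counter


def l3_color(phys_addr):
    """Color from address bits [15:12] (16 colors)."""
    return (phys_addr >> 12) & 0xF


def l3_only_partition(phys_addr):
    """Partition index from bits [15:14] only, leaving bits [13:12] free
    so that the L2 cache is not co-partitioned (4 partitions)."""
    return (phys_addr >> 14) & 0x3


def partition_of_core(core, num_partitions=4):
    """Hypothetical round-robin assignment of cores to cache partitions."""
    return core % num_partitions


# With 8 cores and only 4 partitions, every partition is shared by 2 cores.
sharing = Counter(partition_of_core(core) for core in range(8))
print(l3_color(0xF000), l3_only_partition(0xC000), dict(sharing))
```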
Resource partitioning mechanisms
Way-based cache partitioning [1/2]
• Way-based partitioning with hardware-level support
  – Configure main memory with multiple distinct partitions
    – For each partition, register the (memory range, target, partition ID) in the LAW (Local Access Window) register
  – Partition the L3 cache and allocate disjoint cache ways to each core
    – Configure the L3 cache (CPC) related registers so that transactions from the specified partition can allocate blocks in the designated cache ways
    – E.g., transactions from 'partition 1' allocate blocks in ways 0, 1, 2, 3
[Figure: four partitions (Part. 1 to Part. 4), each with an e6500 core, MMU, and L1 cache, above a 2MB banked L2 cache and the CoreNet coherency fabric. Local Access Windows and the CPC configuration register map each partition to its own portion of the CPC (L3 cache) and of physical memory (DDR3 DRAM), plus a shared region.]
Resource partitioning mechanisms
Way-based cache partitioning [2/2]
• Relaxed restrictions on the number of cache partitions
  – With set-based cache partitioning, the number of cache partitions is limited to four
  – The P4080 supports cache partitioning at per-way granularity, with each way providing 32KB
  – The L3 cache is 32-way and can be partitioned into 32 parts
• Limitations of way-based cache partitioning
  – Way-based cache partitioning cannot be used with set-based cache partitioning or memory bank partitioning
    – Conflicting requirements for memory allocation
    – Sequential vs. interleaving
    – May be relevant to all other PowerPC chip models
  – Cache way locking allows integration
    – Most ARM processors support cache way locking
    – The PowerPC e500mc processor supports cache locking at block granularity
[Figure: memory layout for Part. 1 (core 1) and Part. 2 (core 2): sequential (all of Part. 1, then all of Part. 2) vs. interleaved (Part. 1, Part. 2, Part. 1, Part. 2).]
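Way-based partitioning can be pictured as handing each partition a disjoint bitmask over the 32 ways. The sketch below is only an illustration of that bookkeeping: the mask encoding and the `assign_ways` helper are assumptions and do not reflect the actual CPC register layout.

```python
NUM_WAYS = 32      # the L3 cache is 32-way
WAY_SIZE_KB = 32   # each way provides 32KB


def assign_ways(ways_per_partition):
    """Give each partition a disjoint, contiguous bitmask of cache ways."""
    if sum(ways_per_partition.values()) > NUM_WAYS:
        raise ValueError("requested more ways than the cache has")
    masks, next_way = {}, 0
    for part, n in ways_per_partition.items():
        masks[part] = ((1 << n) - 1) << next_way
        next_way += n
    return masks


masks = assign_ways({"part1": 4, "part2": 4, "part3": 8, "part4": 16})
capacity_kb = {p: bin(m).count("1") * WAY_SIZE_KB for p, m in masks.items()}
print(hex(masks["part1"]), capacity_kb["part1"])  # 0xf 128
```

Here `part1` receives ways 0 to 3, matching the example on the previous slide, which gives it 4 × 32KB = 128KB of L3 cache.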
Implementation issues
From the perspective of an RTOS [1/4]
• Challenges: paging
  – Page coloring assumes that the OS manages memory with fixed-size pages (normally 4KB)
  – Qplus-AIR deliberately avoids paging because timing predictability worsens when a TLB miss occurs within a paging scheme
• Memory management of Qplus-AIR
  – Managed with variable-sized pages rather than fixed 4KB pages
    – Kernel data/code, partition regions
    – Manages each region as one large page
      - 1 TLB entry for each region
    – The OS locks the entry in the TLB
      - Forces all the mapping data to stay in the TLB
  – The size of memory for each application is configured by developers
  – The MMU is used to prevent cross-application memory accesses
Memory layout    Size (example)
Kernel code      16MB
Kernel data      16MB
Partition 1      16MB
Partition 2      64MB
Partition 3      64MB
Implementation issues
From the perspective of an RTOS [2/4]
• Memory management in P4080
  – Two levels of MMU [ref. PowerPC e500mc core reference manual]
    – Hardware-managed L1 MMU
    – Software-managed L2 MMU
  – Each MMU consists of
    – A TLB for variable-sized pages (VSP), 11 different page sizes (4KB~4GB)
    – A TLB for 4KB fixed-size pages (FSP)
  – TLB locking for variable-sized pages
• Modified memory management of Qplus-AIR
  – To support page coloring, which is used to implement memory bank partitioning and set-based cache partitioning
  – Manage applications' memory regions at 4KB granularity
  – Management of kernel regions is unchanged, to keep the performance of kernel execution predictable
Implementation issues
From the perspective of an RTOS [3/4]
• Overhead of paging
  – 'Latency' benchmark with varying data size and access pattern
    – Sequential access and random access of a linked list
  – Measure the average memory access latency
[Figure: paging overhead (sequential access). Average memory latency vs. data size (0 to 10000 KB), no paging vs. paging. Up to 6% overhead when data size > 2MB.]
[Figure: paging overhead (random access). Average memory latency vs. data size (0 to 10000 KB), no paging vs. paging. Up to 197% overhead when data size > 2MB.]
[Note] TLB hit ratio = 98.43%; the L2 TLB has 512 entries
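One plausible reading of the 2MB threshold is the reach of the L2 TLB with 4KB pages. The short calculation below works that out from the 512-entry figure in the note, and compares it with the number of 4KB pages that a 64MB partition (the size used in the example memory layout) would need. The link between TLB reach and the observed threshold is an interpretation, not a statement from the slides.

```python
KB, MB = 1024, 1024 * 1024
PAGE_SIZE = 4 * KB
L2_TLB_ENTRIES = 512   # from the note on this slide

tlb_reach = L2_TLB_ENTRIES * PAGE_SIZE        # memory covered without an L2 TLB miss
pages_for_64mb = (64 * MB) // PAGE_SIZE       # 4KB pages in a 64MB partition
print(tlb_reach // MB, pages_for_64mb)        # 2 16384
```

With one variable-sized page per region, the same 64MB partition needs a single locked TLB entry, which is why the original Qplus-AIR design never missed in the TLB.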
Implementation issues
From the perspective of an RTOS [4/4]
• Analysis of overhead
  – Degradation is due to the MMU architecture of the e500mc core
    – L1 instruction and data TLBs and an L2 unified TLB
    – The L1 MMU is controlled as an inclusive cache of the L2 MMU
    – In the PowerPC e6500 core, the L1 and L2 MMUs are not inclusive
  – As the data size increases, TLB entries for data evict (replace out) instruction TLB entries from the L2 TLB, and the inclusion property invalidates the corresponding L1 I-TLB entries
  – L1 I-TLB misses occur even if the code size is within the L1 I-TLB coverage
[Figure: three snapshots of the L1 TLBs (I-TLB, D-TLB) and the L2 TLB as the data size increases, distinguishing TLB entries for code from TLB entries for data and ending in an L1 I-TLB miss.]
• Requirements for predictable paging
  – Some studies have focused on predictable paging*
  – COTS hardware provides means for implementing predictable paging: a software-managed TLB or TLB locking
* D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.
* T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.
Resource partitioning mechanisms
Integration of partitioning schemes
• Four techniques with paging
  – Memory bus partitioning (RP_BUS), memory bank partitioning (RP_BANK), set-based cache partitioning (RP_$SET), and way-based cache partitioning (RP_$WAY)
• Integration of memory bus, memory bank, and set-based and way-based cache partitioning mechanisms
  – Note that way-based cache partitioning cannot be integrated with memory bank partitioning or set-based cache partitioning
• Possible integration options
  – Integration option #1: RP_BUS, RP_BANK, and RP_$SET
    – Restrictions on the number of available cache partitions
  – Integration option #2: RP_BUS and RP_$WAY
    – Contention on memory banks is unavoidable
Evaluations [1/5]
• Evaluation setup
  – Hardware platform
    – P4080 with 4 or 8 of its 8 cores active
  – Software platform
    – Qplus-AIR
  – Synthetic benchmarks
    – Latency: traverses a linked list, performing a read/write operation on each node; memory requests are made one at a time
    – Bandwidth: accesses memory sequentially with no data dependency between consecutive accesses; the CPU generates multiple memory requests in parallel, maximizing the memory-level parallelism (MLP) available in the memory system
  – Metric
    – Average memory access latency (ns): time to read/write one block (64B)
    – Average latency is normalized to the best case without resource contention
Evaluations [2/5]
• Evaluation setup
  – Two benchmark mixes
    – 4-core MIX
      – Causes contention on all the memory resources, to evaluate each partitioning mechanism and the integrated one
    – 8-core MIX
      – To show the limitation of set-based cache partitioning
4-core MIX
Core 1        Latency (512KB)
Core 2, 3     Bandwidth (4MB)
Core 4        Bandwidth (32MB)
8-core MIX
Core 1, 2         Latency (512KB)
Core 3, 4, 5, 6   Bandwidth (4MB)
Core 7, 8         Bandwidth (32MB)
  – Data size configuration
Data size     Definition                                Example (platform: 2MB LLC on 4-core CPU)                Cache (LLC) hit rate
LLC           Size of LLC divided by number of cores    2MB / 4 cores = 512KB                                    100%
DRAM/small    Twice the size of LLC                     2MB × 2 = 4MB                                            0%
DRAM/large    Significantly larger than LLC             Much larger than 2MB (32MB in our experimental setup)    0%
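The data size configuration can be expressed as a small helper. This is an illustrative sketch: the function name `benchmark_data_sizes` is hypothetical, and the 32MB value for the large case is simply the one used in the experimental setup rather than something derived from a rule.

```python
KB, MB = 1024, 1024 * 1024


def benchmark_data_sizes(llc_bytes, num_cores, large_bytes):
    """Working-set sizes for the three benchmark configurations."""
    return {
        "LLC": llc_bytes // num_cores,   # fits in this core's share of the LLC
        "DRAM/small": 2 * llc_bytes,     # twice the LLC size
        "DRAM/large": large_bytes,       # chosen to be much larger than the LLC
    }


sizes = benchmark_data_sizes(2 * MB, 4, 32 * MB)
print({name: size // KB for name, size in sizes.items()})
# {'LLC': 512, 'DRAM/small': 4096, 'DRAM/large': 32768}
```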
Evaluations [3/5]
• 4-core MIX, integration option #1
  – RP_BANK, RP_$SET, and RP_BUS
  – (b) RP_BANK: all the cores are enabled to access banks in parallel
  – (c) Adding RP_$SET ensures 512KB of L3 cache for the Latency (LLC) app running on core 1 (56% improvement compared to the worst case)
    – Moreover, the fewer main memory accesses requested by core 1 help performance on the other cores
  – (d) Adding RP_BUS: performance when all techniques are put together
Normalized performance (1 is the performance without interference)
         (a) WORST   (b) RP_BANK   (c) RP_BANK+RP_$SET   (d) RP_BANK+RP_$SET+RP_BUS   (e) BEST
core1    0.41        0.55          0.97                  0.97                         1.00
core2    0.49        0.57          0.62                  0.78                         1.00
core3    0.50        0.57          0.62                  0.79                         1.00
core4    0.93        0.87          0.87                  0.85                         1.00
[Figure: bar chart of the values above; normalized performance (0.2 to 1.1) per configuration (a) to (e), one bar per core.]
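The 56% figure for core 1 can be reproduced from the normalized values in the table. Reading it as a difference in normalized performance (percentage points) between (c) RP_BANK+RP_$SET and (a) WORST is an interpretation of the numbers, not something the slide states explicitly.

```python
worst = {"core1": 0.41, "core2": 0.49, "core3": 0.50, "core4": 0.93}      # column (a)
bank_set = {"core1": 0.97, "core2": 0.62, "core3": 0.62, "core4": 0.87}   # column (c)

# Gain of (c) over (a), in units of normalized performance.
gain = {core: round(bank_set[core] - worst[core], 2) for core in worst}
print(gain)
```

Core 1 gains 0.56, while core 4, which runs the 32MB Bandwidth benchmark, loses 0.06 in this configuration.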
Evaluations [4/5]
• 4-core MIX, integration option #2
  – RP_$WAY and RP_BUS
  – RP_BANK is inapplicable
    – In this benchmark, memory accesses are not concentrated on one bank, since RP_$WAY allocates memory to each core sequentially
    – However, a worst case could arise depending on task behavior
  – RP_$WAY vs. RP_$SET
    – Paging overhead on the RTOS degrades performance
    – 3%, 16%, 17%, and 13% for the applications on core 1, 2, 3, and 4, respectively
Normalized performance (1 is the performance without interference)
         (a) WORST   (b) RP_$WAY   (c) RP_$WAY+RP_BUS   (d) BEST
core1    0.41        1.00          1.00                 1.00
core2    0.49        0.78          0.91                 1.00
core3    0.50        0.79          0.91                 1.00
core4    0.93        1.01          0.89                 1.00
[Figure: bar chart of the values above; normalized performance (0.2 to 1.1) per configuration (a) to (d), one bar per core, shown next to the (c) RP_BANK+RP_$SET result of integration option #1 for comparison.]
Evaluations [5/5]
• 8-core MIX, integration options #1 & #2
  – Restrictions on the number of possible cache partitions
    – RP_$SET: 4 partitions; RP_$WAY: 32 partitions on the P4080 platform
    – Performance of Latency (LLC) is about 64% and 88% with RP_$SET and RP_$WAY, respectively
  – Overhead of paging
    – Compare the performance in (b) and (d), or (c) and (e)
Normalized performance (1 is the performance without interference)
                               core1   core2   core3   core4   core5   core6   core7   core8
(a) WORST                      0.37    0.37    0.30    0.30    0.30    0.30    0.82    0.82
(b) RP_BANK+RP_$SET            0.64    0.64    0.42    0.42    0.42    0.42    0.75    0.74
(c) RP_BANK+RP_$SET+RP_BUS     0.64    0.63    0.54    0.54    0.54    0.54    0.74    0.73
(d) RP_$WAY                    0.88    0.88    0.52    0.52    0.53    0.53    0.94    0.94
(e) RP_$WAY+RP_BUS             0.87    0.86    0.71    0.71    0.71    0.71    0.79    0.79
(f) BEST                       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
[Figure: bar chart of the values above; normalized performance (0 to 1.2) per configuration (a) to (f), one bar per core.]
Conclusions & Future Work
• Conclusions
  – Qplus-AIR, an ARINC 653 compliant RTOS
  – Comprehensive shared resource partitioning implementation on an ARINC 653 compliant RTOS, Qplus-AIR
  – Implementation issues of implementing and combining multiple resource partitioning mechanisms
    – The unique challenges we encountered due to the fact that the RTOS did not support Linux-like dynamic paging
• Future Work
  – Predictable paging
  – Evaluation with real-world applications
Thank You
for the attention
[email protected]
References [1/2]
[1] Airlines Electronic Engineering Committee, Avionics Application Software Standard Interface, ARINC Specification 653 Part 1, 2006.
[2] BIOS and kernel developer's guide for AMD family 15h processors, March 2012.
[3] ARM Cortex-A53 Technical Reference Manual, 2014.
[4] P4080 QorIQ Integrated Processor Hardware Specifications, Feb 2014.
[5] Certification Authorities Software Team, Position Paper CAST-32A: Multi-core Processors, 2016.
[6] QorIQ T2080 Reference Manual, 2016.
[7] M. Chisholm, B. C. Ward, N. Kim, and J. H. Anderson. Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In RTSS, 2015.
[8] J. Flodin, K. Lampka, and W. Yi. Dynamic budgeting for settling DRAM contention of co-running hard and soft real-time tasks. In SIES, 2014.
[9] D. Hardy and I. Puaut. Predictable code and data paging for real time systems. In ECRTS, 2008.
[10] T. Ishikawa, T. Kato, S. Honda, and H. Takada. Investigation and improvement on the impact of TLB misses in real-time systems. In OSPERT, 2013.
[11] H. Kim, A. Kandhalu, and R. Rajkumar. A coordinated approach for practical OS-level cache management in multi-core real-time systems. In ECRTS, 2013.
[12] T. Kim, D. Son, C. Shin, S. Park, D. Lim, H. Lee, B. Kim, and C. Lim. Qplus-AIR: A DO-178B certifiable ARINC 653 RTOS. In The 8th ISET, 2013.
References [2/2]
[13] R. Mancuso, R. Dudko, E. Betti, M. Cesati, M. Caccamo, and R. Pellizzoni. Real-time cache management framework for multi-core architectures. In RTAS, 2013.
[14] M. D. Bennett and N. C. Audsley. Predictable and efficient virtual addressing for safety-critical real-time systems. In ECRTS, 2001.
[15] J. Nowotsch and M. Paulitsch. Leveraging multi-core computing architectures in avionics. In EDCC, 2012.
[16] J. Nowotsch, M. Paulitsch, D. Bühler, H. Theiling, S. Wegener, and M. Schmidt. Multi-core interference-sensitive WCET analysis leveraging runtime resource capacity enforcement. In ECRTS, 2014.
[17] S. A. Panchamukhi and F. Mueller. Providing task isolation via TLB coloring. In RTAS, 2015.
[18] M. K. Qureshi and Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In MICRO, 2006.
[19] R. E. Kessler and M. D. Hill. Page placement algorithms for large real-indexed caches. In ACM Trans. on Comp. Sys., 1992.
[20] L. Sha, M. Caccamo, R. Mancuso, J.-E. Kim, and M.-K. Yoon. Single core equivalent virtual machines for hard real-time computing on multicore processors, white paper. 2014.
[21] N. Suzuki, H. Kim, D. de Niz, B. Anderson, L. Wrage, M. Klein, and R. Rajkumar. Coordinated bank and cache coloring for temporal protection of memory accesses. In ICCSE, 2013.
[22] H. Yun, R. Mancuso, Z.-P. Wu, and R. Pellizzoni. PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In RTAS, 2014.
[23] H. Yun, G. Yao, R. Pellizzoni, M. Caccamo, and L. Sha. Memory bandwidth management for efficient performance isolation in multi-core platforms. IEEE Transactions on Computers, 65:562–576, 2015.