QoS-driven Lifecycle Management of Service-oriented Distributed Real-time & Embedded Systems

Aniruddha Gokhale
www.dre.vanderbilt.edu/~gokhale
Assistant Professor, ISIS, Dept. of EECS
Vanderbilt University, Nashville, Tennessee
February 16th, 2006
www.dre.vanderbilt.edu

Service-oriented Style of Distributed Real-time & Embedded Systems
– Regulating & adapting to (dis)continuous changes in runtime environments
  • e.g., online prognostics, dependable upgrades
– Satisfying tradeoffs between multiple (often conflicting) QoS demands
  • e.g., secure, real-time, reliable, etc.
– Satisfying QoS demands in the face of fluctuating and/or insufficient resources
  • e.g., mobile ad hoc networks (MANETs)

Characteristics of SOA-style DRE Systems
• Manifestation of Service-Oriented Architectures (SOA) in the distributed real-time & embedded (DRE) systems space
  – Applications are composed of one or more “operational strings” of services
  – A service is a component or an assembly of components
  – Dynamic (re)deployment of services into operational strings is necessary
  – New class of QoS (performance + survivability) requirements
• Realized using enabling component middleware technologies, e.g., CCM, .NET, and J2EE

QoS Issues for SOA-style DRE Systems
[Figure: assembly of components C1–C5 with a failover unit]
• Per-component concern – choice of implementation
  – Depends on resources and on compatibility with other components in the assembly
• Communication concern – choice of communication mechanism used
• Assembly concerns – what components to assemble dynamically? In what order? What end-to-end configurations are valid?
• Failure recovery concern – what is the unit of failover?
• Sharing concern – shared components will need proactive survivability, since their failure affects several services simultaneously
• Availability concern – what is the degree of redundancy? What replication styles to use? Does it apply to the whole assembly?
• Deployment concern – how to select resources? Risk alleviation?

Tangled Concerns in SOA-style DRE Systems
• Demonstrates numerous tangled para-functional concerns
• Significant sources of variability that affect end-to-end QoS (performance + survivability)
• Separation of Concerns & Managing Variability is the Key
[Figure: Design-time, Deployment-time, Run-time]

(1) Design-time Variability Management in SOA-style DRE Systems
• Focus on Separation of Concerns
• “What if” Analysis
  – Analytical methods
  – Simulation methods
  – Model-driven generative programming for “what if”
• Understanding the impact of individual concerns
• Students involved:
  – Krishnakumar Balasubramanian, Jaiganesh Balasubramanian, Gan Deng, Amogh Kavimandan, James Hill, Sumant Tambe, Arundhati Kogekar, Dimple Kaul
Work partly supported by DARPA PCES program (PI), DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, & NSF CSR-SMA Program, PI

Separation of Concerns using CoSMIC
[Figure: CoSMIC tool chain linking the component developer, assembler, packager, configurator, deployment planner, deployer, and system analyzer; steps include assembly, packaging (PICML), configuration (OCML, QoSML), planning, deployment (DAnCE and RACE frameworks), analysis & benchmarking (Cadena & BGML), reconfiguration & replanning, and design feedback]
• Project Lead and PI, DARPA PCES program
• CoSMIC project focuses on separation of deployment and configuration concerns
• Model-driven generative programming framework
• Complementary technology to CIAO and DAnCE middleware
• www.dre.vanderbilt.edu/cosmic

• CoSMIC tools, e.g., PICML, are used for separation of concerns in operational strings
• Captures the data model of the OMG D&C specification
• Synthesis of static deployment plans for DRE components
• New capabilities being added for static deployment planning
Work supported by DARPA PCES Program, PI

Case Study for “What if” Analysis: Virtual Router
[Figure: provider edge (PE) virtual routers (VR) connecting customer edge (CE) devices of VPN1–VPN3 through a firewall and multiple tunnels to Backbone 1 and Backbone 2, with Level 1 and Level 2 service providers]
• Network services need support for efficient (de)multiplexing, dispatching, and routing/forwarding
  – e.g., VPN service provided by a virtual router
• Provides differentiated services to customers, e.g., prioritized service
• VPN setup messages must be efficiently (de)multiplexed, serviced, and forwarded
• Implemented using middleware
• Need to estimate capacity of the system at design-time
Problem boils down to capacity planning and estimating performance of configured middleware

Performance Analysis of Reactor Pattern in VR
[Figure: customer edge (CE) devices of VPN1 sending requests to a provider edge (PE) virtual router (VR)]
• Customers send VPN setup messages to the router
• VPN setup messages manifest as events at the VR
• VR must service these events (e.g., resource allocation) and honor the prioritized service, if any
• Accepted messages are forwarded
• Events could be dropped in overload conditions
The Reactor architectural pattern allows event-driven applications to demultiplex & dispatch service requests that are delivered to an application from one or more clients.
• Reactor pattern decouples the detection, demultiplexing, & dispatching of events from the handling of events
• Participants include the Reactor, event handle, event demultiplexer, and abstract and concrete event handlers

Modeling VR Capabilities in a Reactor
[Figure: model of a single-threaded, select-based reactor implementation: incoming events from the network arrive at Poisson rates λ1 and λ2, pass through a select-based event demultiplexer into queues of capacity N1 and N2, and are handled by event handlers with exponential service rates μ1 and μ2]
• Consider VPN service for two customer classes
  – Reactor accepts and handles two types of input events
• Differentiated services for two classes
  – Events are handled in prioritized order
• Each event type has a separate queue to hold the incoming events. Buffer capacity for events of type one is N1 and of type two is N2.
• Event arrivals are Poisson for type one and type two events with rates λ1 and λ2, resp.
• Event service time is exponential for type one and type two events with rates μ1 and μ2, resp.

Performance Metrics of Interest for Reactor
• Throughput:
  – Number of events that can be processed
  – Applications such as telecommunications call processing
• Queue length:
  – Queuing for the event handler queues
  – Appropriate scheduling policies for applications with real-time requirements
• Total number of events:
  – Total number of events in the system
  – Scheduling decisions
  – Resource provisioning required to sustain system demands
• Probability of event loss:
  – Events discarded due to lack of buffer space
  – Safety-critical systems
  – Levels of resource provisioning
• Response time:
  – Time taken to service the incoming event
  – Bounded response time for real-time systems

Performance Analysis using Stochastic Reward Nets
[Figure: SRN model of the reactor, parts (a) and (b), with a legend for transition, immediate transition, place, token, and inhibitor arc]
• Stochastic Reward Nets (SRNs) are an extension of Generalized Stochastic Petri Nets (GSPNs), which are in turn an extension of Petri nets.
• Extend the modeling power of GSPNs by allowing:
  – Guard functions
  – Marking-dependent arc multiplicities
  – General transition probabilities
  – Reward rates at the net level
• Allow model specification at a level closer to intuition.
• Solved using tools such as SPNP (Stochastic Petri Net Package).

Modeling the Reactor using SRN (1/2)
[Figure: SRN part (a) annotated with event arrival, service queue, drop events on overflow, servicing the event, prioritized service, and service completion; part (b) shows the snapshot places and transitions]
• Models arrivals, queuing, and prioritized service of events.
• Transitions A1 and A2: Event arrivals.
• Places B1 and B2: Buffer/queues.
• Places S1 and S2: Service of the events.
• Transitions Sr1 and Sr2: Service completions.
• Inhibitor arcs: Place B1 and transition A1 with multiplicity N1 (likewise B2, A2, N2)
  – Prevents firing of transition A1 when there are N1 tokens in place B1.
• Inhibitor arc from place S1 to transition Sr2:
  – Offers prioritized service to an event of type one over an event of type two.
  – Prevents firing of transition Sr2 when there is a token in place S1.
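
The inhibitor-arc rules listed above can be illustrated with a small token-game sketch. This is only an assumption-laden illustration of the enabling rule, not the SPNP model from the slides: the `Transition` class and the `enabled` and `fire` helpers are hypothetical names, firing rates and immediate transitions are not modeled, and the buffer size `N1 = 2` is chosen just for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    """A Petri-net transition with ordinary and inhibitor arcs (illustrative only)."""
    name: str
    inputs: dict = field(default_factory=dict)      # place -> tokens consumed
    outputs: dict = field(default_factory=dict)     # place -> tokens produced
    inhibitors: dict = field(default_factory=dict)  # place -> multiplicity that blocks firing


def enabled(t, marking):
    """A transition is enabled if its inputs are available and no inhibitor arc blocks it."""
    has_inputs = all(marking.get(p, 0) >= n for p, n in t.inputs.items())
    not_inhibited = all(marking.get(p, 0) < n for p, n in t.inhibitors.items())
    return has_inputs and not_inhibited


def fire(t, marking):
    """Return the marking reached by firing t; raise if t is not enabled."""
    if not enabled(t, marking):
        raise ValueError(f"transition {t.name} is not enabled")
    new_marking = dict(marking)
    for p, n in t.inputs.items():
        new_marking[p] -= n
    for p, n in t.outputs.items():
        new_marking[p] = new_marking.get(p, 0) + n
    return new_marking


N1 = 2  # assumed buffer capacity for type-one events (example value)

# A1: arrival of a type-one event, inhibited once B1 holds N1 tokens.
A1 = Transition("A1", outputs={"B1": 1}, inhibitors={"B1": N1})
# Sr1 / Sr2: service completions; Sr2 is inhibited while S1 holds a token.
Sr1 = Transition("Sr1", inputs={"S1": 1})
Sr2 = Transition("Sr2", inputs={"S2": 1}, inhibitors={"S1": 1})

# Both event types are in service: type one must complete first.
marking = {"B1": 0, "S1": 1, "S2": 1}
sr2_blocked = not enabled(Sr2, marking)
after_sr1 = fire(Sr1, marking)
sr2_ready = enabled(Sr2, after_sr1)

# Filling the buffer disables further arrivals of type one.
full = fire(A1, fire(A1, after_sr1))
a1_blocked = not enabled(A1, full)
```

In this sketch `sr2_blocked`, `sr2_ready`, and `a1_blocked` all evaluate to true, which mirrors the prioritized-service and drop-on-overflow behavior described for the inhibitor arcs.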

Modeling the Reactor using SRN (2/2)
[Figure: SRN parts (a) and (b); part (b) contains StSnpSht, T_SrvSnpSht, T_EndSnpSht, and SnpShtInProg]
• Process of taking successive snapshots
• Reactor waits for new events when currently enabled events are handled
• Sn1 enabled: Token in StSnpSht & tokens in B1 & no token in S1.
• Sn2 enabled: Token in StSnpSht & tokens in B2 & no token in S2.
• T_SrvSnpSht enabled: Token in S1 and/or S2.
• T_EndSnpSht enabled: No token in S1 and S2.
• Sn1 and Sn2 have the same priority
• T_SrvSnpSht has lower priority than Sn1 and Sn2

VR SRN: Performance Estimates
• SRN model solved using Stochastic Petri Net Package (SPNP) to obtain estimates of performance metrics.
• Parameter values: λ1 = 0.5/sec, λ2 = 0.5/sec, μ1 = 2.0/sec, μ2 = 2.0/sec.
• Two cases: N1 = N2 = 1, and N1 = N2 = 5.

Perf. metric     N1 = N2 = 1         N1 = N2 = 5
                 #1       #2         #1        #2
Throughput       0.37/s   0.37/s     0.40/s    0.40/s
Queue length     0.065    0.065      0.12      0.12
Total events     0.25     0.27       0.32      0.35
Loss probab.     0.065    0.065      0.00026   0.00026

Observations:
• Probability of event loss is higher when the buffer space is 1
• Total number of events of type two is higher than that of type one.
• Events of type two stay in the system longer than events of type one.
• May degrade the response time of event requests for class 2 customers compared to requests from class 1 customers

VR SRN: Sensitivity Analysis
• Analyze the sensitivity of performance metrics to variations in input parameter values.
• Vary λ1 from 0.5/sec to 2.0/sec.
• Values of other parameters: λ2 = 0.5/sec, μ1 = 2.0/sec, μ2 = 2.0/sec, N1 = N2 = 5.
• Compute performance measures for each one of the input values.
[Figure: throughput versus Lambda1]
Observations:
• Throughput of event requests from customer class #1 increases, but the rate of increase declines.
• Throughput of event requests from customer class #2 remains unchanged.
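
The two-class reactor model above can also be explored with a short stochastic simulation. The sketch below is an assumption, not the SPNP solution: `simulate_reactor` is a hypothetical helper that treats the reactor as a single server with non-preemptive priority for type-one events, finite buffers, Poisson arrivals, and exponential service. It omits the snapshot mechanism of the SRN, so its numbers need not match the table.

```python
import random


def simulate_reactor(l1, l2, m1, m2, n1, n2, horizon=20000.0, seed=1):
    """Simulate a simplified two-class, single-threaded reactor (illustrative only).

    l1, l2: Poisson arrival rates; m1, m2: exponential service rates;
    n1, n2: buffer capacities. Type one (index 0) is served first.
    """
    rng = random.Random(seed)
    arrival_rate = (l1, l2)
    service_rate = (m1, m2)
    capacity = (n1, n2)
    waiting = [0, 0]      # events queued per type
    in_service = None     # type currently being handled, or None if idle
    arrived = [0, 0]
    dropped = [0, 0]
    completed = [0, 0]
    now = 0.0
    while True:
        busy_rate = service_rate[in_service] if in_service is not None else 0.0
        total = arrival_rate[0] + arrival_rate[1] + busy_rate
        now += rng.expovariate(total)   # time to the next event of any kind
        if now > horizon:
            break
        u = rng.random() * total        # choose which event occurred
        if u < arrival_rate[0]:
            kind = 0
        elif u < arrival_rate[0] + arrival_rate[1]:
            kind = 1
        else:
            kind = None                 # service completion
        if kind is not None:
            arrived[kind] += 1
            if waiting[kind] < capacity[kind]:
                waiting[kind] += 1
            else:
                dropped[kind] += 1      # buffer full: event is lost
        elif in_service is not None:
            completed[in_service] += 1
            in_service = None
        if in_service is None:          # idle handler picks the highest-priority event
            for k in (0, 1):
                if waiting[k] > 0:
                    waiting[k] -= 1
                    in_service = k
                    break
    return {
        "throughput": [c / horizon for c in completed],
        "loss_probability": [d / a if a else 0.0 for d, a in zip(dropped, arrived)],
    }


if __name__ == "__main__":
    # Sensitivity sweep in the spirit of the slide: vary the type-one arrival rate.
    for l1 in (0.5, 1.0, 1.5, 2.0):
        print(l1, simulate_reactor(l1, 0.5, 2.0, 2.0, 5, 5))
```

A sweep like the one in the `__main__` block gives a quick, rough view of how throughput and loss respond to the arrival rate before building a full analytical model.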

Middleware Pattern Simulations in OMNeT++
• OMNeT++ is a discrete event simulator for networked systems
• Developers write C++ code for simulation
• www.omnetpp.org
[Figure: OMNeT++ simulation structure showing .ned files, module sources (Mod_n.h/.cpp, Submod1.h/.cpp, Submod2.h/.cpp), the OMNeT++ message and initialization files, the simulation kernel and UI library, statistics output (vector and scalar files), and visualization and animation]

The Simulation Model for Reactor
[Figure: event generator, synchronous event demultiplexer, reactor, event handlers with queues, and statistics collector]

Addressing Middleware Variability Challenges
Although middleware provides reusable building blocks that capture commonalities, these blocks and their compositions incur variabilities that impact performance in significant ways.
• Compositional Variability
  – Incurred due to variations in the compositions of these building blocks
  – Need to address compatibility in the compositions and individual configurations
  – Dictated by needs of the domain
  – E.g., Leader-Follower makes no sense in a single-threaded Reactor
• Per-Block Configuration Variability
  – Incurred due to variations in implementations & configurations for a patterns-based building block
  – E.g., single-threaded versus thread-pool based reactor implementation dimension that crosscuts the event demultiplexing strategy (e.g., select, poll, WaitForMultipleObjects)
[Figure: Reactor variability with an event handling strategy (single threaded, thread pool) and an event demultiplexing strategy (select, poll, WaitForMultipleObjects, Qt, Tk)]

Automation Goals for “What if” Analysis
Applying design-time performance analysis techniques to estimate the impact of variability in middleware-based DRE systems
[Figure: variability and workload are woven into invariant and refined models of patterns, which are composed into a model of the composed system under a system workload]
• Build and validate performance models for invariant parts of middleware building blocks
• Weave variability concerns manifested in a building block into the performance models
• Compose and validate performance models of building blocks mirroring the anticipated software design of DRE systems
• Estimate end-to-end performance of composed system
• Iterate until design meets performance requirements

Automating & Scaling the “What if” Process
• Model-driven generative technologies
• Developed the SRN Modeling Language (SRNML) in GME
• Applied C-SAW framework (from Univ. of Alabama, Birmingham) for model scalability
R&D supported by NSF CSR-SMA Program in collaboration with Dr. Jeff Gray (UAB) and Dr. Swapna Gokhale (UConn)

Analyzing Impact of Individual Concerns
Engineering Mechanics – Statics & Dynamics – for analyzing impact of concerns?
• Borrow concepts from physical systems to analyze the impact of individual concerns on the end-to-end system
• Method of joints, method of sections, free body diagrams, equilibrium conditions

Engineering Mechanics for DRE Systems
[Figure: assembly of components C1–C5 with a failover unit]
A concern is viewed as a “force”
Challenges
• Directionality – are concerns vectors?
• Rigidity – are assemblies rigid or deformable?
• Force distribution – does a concern have components along Cartesian axes?
• Well-defined structures – do software components have properties like trusses?
• Second-order effects – transient effects showing up elsewhere
• Notion of friction – these are probably the capacities of resources

(2) Deployment-time Intelligence
• Near optimal deployment planning decisions
• Specialized middleware stacks
• Students involved:
  – Arvind Krishna (graduated), Jaiganesh Balasubramanian, Gan Deng, Dimple Kaul, Arundhati Kogekar, Amogh Kavimandan
Work partly supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL

Deployment Challenges
• Service workloads and resource capacity issues – service placement depends on workloads and available resources
• Component accessibility patterns – component survivability depends on its sharing degree
• Differentiated levels of service – affects resource provisioning and survivability strategies
• Service failover – different failover possibilities, e.g., as a whole, as part of an assembly, or one component at a time
• Resource sharing – increases the risk of component(s) requiring a proactive survivability strategy
• No one-size-fits-all dependability strategy – cannot dictate one FT strategy on all services

Service Placement Problem
[Figure: computation nodes C1–C4 with data access units A1–A3, S1, S3, and S4]
• A resource configuration is a tuple $RC = (C, D, HC, EC)$ where:
  – $C$ is a set of computation nodes, each attributed by:
    • PI(c): processing index (capacity)
    • MI(c): memory index
    • RI(c): reliability index
  – $D$ is a set of data access units of types in $\{A_i, S_j\}$
  – $HC: C \to \mathcal{P}(D)$ is a map associating each $c$ in $C$ with a set of data access units
  – $EC \subseteq C \times C$ is a set of communication links, each attributed by:
    • BI(e): bandwidth index
    • RI(e): reliability index
• System performance can be measured in a variety of ways. Considering a task assignment $TA: T \to C$:
  – Resource utilization: for processing it is defined as the average of all task processing utilization, where $TA(c)$ denotes the tasks assigned to node $c$, given as
    $$PU(TA) = \frac{1}{|C|} \sum_{c \in C} \frac{\sum_{t \in TA(c)} PI(t)}{PI(c)}$$
  – Memory utilization MU(TA) and link utilization LU(TA) can be defined similarly
  – System utilization factor: the weighted sum of the percentage utilization of the system resources
    $$SU(TA) = w_1\, PU(TA) + w_2\, MU(TA) + w_3\, LU(TA)$$
• Reliability is trickier to measure. In general, the reliability of a given computation string is the product of the reliability indices of the underlying nodes and communication edges.
• The reliability factor RF(TA) for a given task assignment, TA, depends on:
  – The reliability of all its computation strings.
  – The group reliability of the underlying nodes (taking into account their relative distances).
  – The resource utilization of the system. The more the system hardware is utilized, the less reliable it is.

Specializations via Generative Programming
[Figure: specialized middleware stack (demuxing & dispatching, marshaling, protocol adapter) beneath a container that hosts the component, a component lifecycle manager, and a crosscutting, configurable QoS property manager covering concurrency, security, persistence, instrumentation, and others]
• GME-based POSAML language for POSA2 pattern language
• Generative programming to synthesize FOCUS and AspectC++ rules
• Synthesize specialized middleware stacks for distributed deployment of operational strings.
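
Returning to the service placement formulation, the utilization and reliability measures can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the function names, the dictionary-based data layout, and the example numbers are all hypothetical, the weights are arbitrary, and string reliability is taken as the plain product of node and link reliability indices as described in the slide.

```python
from math import prod


def processing_utilization(assignment, task_pi, node_pi):
    """PU(TA): average over nodes of (sum of PI(t) for tasks placed on c) / PI(c).

    assignment: task -> node (the task assignment TA)
    task_pi:    task -> processing demand PI(t)
    node_pi:    node -> processing capacity PI(c)
    """
    load = {node: 0.0 for node in node_pi}
    for task, node in assignment.items():
        load[node] += task_pi[task]
    return sum(load[node] / node_pi[node] for node in node_pi) / len(node_pi)


def system_utilization(pu, mu, lu, weights):
    """SU(TA): weighted sum of processing, memory, and link utilization."""
    w1, w2, w3 = weights
    return w1 * pu + w2 * mu + w3 * lu


def string_reliability(node_ri, edge_ri):
    """Reliability of one computation string: product of node and link indices."""
    return prod(node_ri) * prod(edge_ri)


# Hypothetical example: three tasks placed on two nodes.
node_pi = {"C1": 10.0, "C2": 20.0}
task_pi = {"t1": 5.0, "t2": 5.0, "t3": 5.0}
assignment = {"t1": "C1", "t2": "C2", "t3": "C2"}

pu = processing_utilization(assignment, task_pi, node_pi)
su = system_utilization(pu, mu=0.25, lu=0.25, weights=(0.5, 0.25, 0.25))
rel = string_reliability(node_ri=[0.5, 0.5], edge_ri=[0.5])
```

A deployment planner could evaluate candidate assignments with functions like these and keep the one that best trades utilization against reliability; how the slides' planner actually searches the space is not specified here.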

Run-time QoS-aware Mechanisms
• Focus on Autonomic Mechanisms
• Survivability & Fault tolerance
• Students involved:
  – Jaiganesh Balasubramanian, Sumant Tambe, Jules White, Nishanth Shankaran
Work supported by DARPA ARMS Program, PI on subcontracts from Lockheed Martin ATL, BBN Technologies, & Telcordia

Distributed Virtual Container Approach
[Figure: a virtual container spanning primary and secondary replicas]
• Virtual Container Concept for Component M/W
  – Provides an operating context for the components/assemblies requiring QoS
  – Relieves programmer from having to configure the middleware for QoS support
  – Clients are oblivious to replication
    • Normal container programming model
    • Middleware hides the virtualization details
• Salient features
  – Based on a virtualization idea
  – Spans boundaries across all the replicas, which could be placed on different physical nodes
  – Provides a single point for resource provisioning & component programming
  – Seamless environment for configuring FT, LB, online swapping
  – Handles fine-grained checkpointing across all the replicas in the virtual container
  – Reliable multicast & state synchronization confined to a virtual container
  – Maintains information about how the replicas are connected to the external component assemblies

Run-time QoS & Survivability Mechanisms
• A configurable approach to survivability including micro- (infrastructure) & macro- (assembly & operational string) level strategies
• Micro-level strategies monitor infrastructure state to make proactive decisions at
  – Component level (swapping & migration)
  – Middleware level (configurations)
  – Component server level (process resource allocations)
  – Node level (multiple components)
• Macro-level strategies monitor assembly health to make failover decisions
  – Failover based on type of failover unit
  – Affects service placement decisions
  – May involve load balancing
  – State synchronization issues
  – Replication styles (hidden by FT strategies)
• Initial prototype developed using Component-Integrated ACE ORB (CIAO) & Deployment & Configuration Engine (DAnCE) (www.dre.vanderbilt.edu)

Research Summary
R&D in new, holistic approaches to end-to-end QoS management in services-enabled distributed real-time & embedded systems
[Figure: layered stack of Applications, Middleware, OS & Protocols, and Hardware]
• Research Challenge: Managing problem space variability
  – Research Approach: Model-driven generative approach to separation of concerns
  – Benefits: Enhance the state-of-art in MDD and AOSD technologies
• Research Challenge: Design-time “What-if” analysis using generative prog
  – Research Approach: Variety of analysis techniques including non-traditional mechanisms; application of Engineering Mechanics
  – Benefits: Generative technologies for automated analysis
• Research Challenge: Deployment-time intelligent decisions
  – Research Approach: New applications of constraints optimization theory; middleware specializations
  – Benefits: Near optimal deployment; specialized middleware stacks
• Research Challenge: Run-time Mechanisms
  – Research Approach: Multilevel, proactive QoS mgmt schemes; virtualization ideas
  – Benefits: Largely autonomic; survivable systems