Automating SLA Modelling

Tony Chau
IBM Toronto & University of Toronto
Vinod Muthusamy, Hans-Arno Jacobsen
University of Toronto
Elena Litani, Allen Chan, Phil Coulthard
IBM Toronto
October 27, 2008
Outline
• Service level agreement (SLA)
• SLA modelling & implementation
• Flexible SLA model
• Automatic generation of monitoring artifacts
SLA Example
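[Figure: business process annotated with SLA terms. Activities: Validate request, Check credit history, Ask Manager, Deposit, Notify. Decisions: Good? (Y/N), Approved? (Y/N). SLA annotations: service time < 3s, otherwise charge provider; cost < $0.02.]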
Service Level Agreement
• Contract between service providers and consumers
• Defines the level of service agreed by both parties
• Optionally defines the penalty if the level of service is not satisfied
SLA Modelling Today
• Informally expressed (e.g., in a Word document)
– Error-prone interpretation
– Time-consuming implementation
• e.g., to monitor the SLA, create dashboards, perform service selection (fabric support), automate resource provisioning
– Tightly coupled with the business process
• Formally expressed (e.g., WSLA)
– Still tightly coupled with the process
– Not designed to be reused
– Inflexible to changes in the SLA or process
Objective
• Simplify modelling and monitoring of SLAs for any given business process
– Encourage reusability and extensibility of the SLA model
– Automatic generation of monitoring artifacts
Automating SLA Modelling
• Flexible SLA model
– Based on modular, composable and
extensible SLA components
• Automatic Generation of Monitoring
Artifacts
– Based on distributed, event-driven
architecture
The Flexible SLA Model
• Component-oriented
• Composed of several kinds of SLA components
– Metrics
– Service level objectives (SLOs)
– Violation actions
• SLA components are grouped in libraries
– Libraries form the building blocks for constructing SLAs
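As a rough illustration of the component-oriented model above, the three component kinds could be represented as plain objects that reference one another. This is a hypothetical Python sketch, not the authors' implementation: the class names `Metric`, `SLO` and `Action`, the `threshold` field and the `evaluate` method are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Metric:
    """Measures some aspect of a process (hypothetical representation)."""
    name: str
    type: str
    scope: object = "EntireProcess"


@dataclass
class Action:
    """Code executed when an SLO is violated."""
    name: str
    run: Callable[[str], None]


@dataclass
class SLO:
    """Evaluates an objective over a metric value and fires actions on violation."""
    name: str
    metric: Metric
    threshold: float  # assumption: the objective has the form "value < threshold"
    actions: List[Action] = field(default_factory=list)

    def evaluate(self, value: float) -> bool:
        satisfied = value < self.threshold
        if not satisfied:
            for action in self.actions:
                action.run(f"{self.name} violated: {value} >= {self.threshold}")
        return satisfied


# Compose an SLA from library components: service time < 3 s, else notify.
log: List[str] = []
exec_time = Metric(name="ProcessTime", type="ExecTime")
email_admin = Action(name="EmailAdminAction", run=log.append)
sla = [SLO("PoorExecTimeSLO", exec_time, threshold=3.0, actions=[email_admin])]
```

Calling `sla[0].evaluate(4.0)` would run the attached action, while a value below the threshold leaves `log` untouched. The same `Metric` object could be referenced by several SLOs, which is the kind of reuse the libraries are meant to encourage.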
Reusable SLA Components
A library of SLA components can be reused, composed, configured, and extended to quickly model arbitrarily complex SLAs.

Component libraries
• Metric Library: measure some aspect of a process
• SLO Library: evaluate an SLA objective
• Action Library: code that is executed upon SLO violation

Example Metric Type
Id = ExecTime
Name = Execution Time
Schema = {Scope (type:Scope)}

Dependent Events Function
Set getDependentEvents() {
    // events this metric needs from the process
    return {e1, e2, e3, e4};
}

Event Handler
void onEvent(e) {
    static entry = {i1, i2, …, in}   // entry activities of the scope
    static start = {}                // entry time per process instance
    if (e.activity in entry) {
        start[e.instance] = e.time
    } else {
        diff = e.time - start[e.instance]
        publish(diff, e.instance)
    }
}

Metric Instances
• Name = ProcessTime, Type = ExecTime, Scope = EntireProcess
• Name = BookingTime, Type = ExecTime, Scope = {flight, train}

Component Compositions
[Figure: component graph composed of GenEventAction, EmailAdminAction, PoorExecTimeSLO, SevereExecTimeSLO, AvgExecTime, ProcExecTime, FinishProcInstance, StartProcInstance.]
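The execution-time metric type on this slide can be sketched as runnable Python. This is one interpretation, not the authors' code: the slide's pseudocode uses the single name `entry` for both the set of entry activities and the per-instance timestamps, so the sketch separates them. The class name `ExecTimeMetric`, the explicit `exit_activities` set and the event fields (`activity`, `instance`, `time`) are assumptions.

```python
from typing import Callable, Dict, Set


class ExecTimeMetric:
    """Publishes the elapsed time between entry and exit activities per instance."""

    def __init__(self, entry_activities: Set[str], exit_activities: Set[str],
                 publish: Callable[[float, str], None]) -> None:
        self.entry_activities = entry_activities
        self.exit_activities = exit_activities
        self.publish = publish
        self.start_times: Dict[str, float] = {}  # instance id -> entry time

    def get_dependent_events(self) -> Set[str]:
        # Activities whose events must be enabled in the process.
        return self.entry_activities | self.exit_activities

    def on_event(self, activity: str, instance: str, time: float) -> None:
        if activity in self.entry_activities:
            self.start_times[instance] = time
        elif activity in self.exit_activities and instance in self.start_times:
            # Exit seen for a known instance: emit the elapsed time once.
            diff = time - self.start_times.pop(instance)
            self.publish(diff, instance)


published = []
metric = ExecTimeMetric({"StartProcInstance"}, {"FinishProcInstance"},
                        lambda diff, inst: published.append((inst, diff)))
metric.on_event("StartProcInstance", "p1", 10.0)
metric.on_event("StartProcInstance", "p2", 11.0)
metric.on_event("FinishProcInstance", "p1", 12.5)
```

After these three events, only instance `p1` has a published execution time; `p2` is still waiting for its exit event. Ignoring an exit event with no recorded entry is a design choice of this sketch, made so that a late-started monitor does not fail on in-flight instances.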
Loose Coupling of Flexible Model
SLAs and processes can be modified independently. Their loose coupling reduces the possibility of invalidating the SLA.

SLA Modification
• Change SLA to consider cost of all invoked services.
• Before: Name = ServiceCost, Type = TotalServiceCost, Scope = {deposit, notify}
• After: Name = ServiceCost, Type = TotalServiceCost, Scope = EntireProcess
• No changes to the process are required.

Process Modification
• Change process to only look for a flight if the train takes too long.
• No changes to the SLA are required.

[Figure: process diagrams with the activities Validate request, Check credit history, Ask Manager, Deposit, Notify and the decisions Approved? and Good?]
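One way to picture why a symbolic scope such as `EntireProcess` survives process edits is to resolve the scope against the process only when artifacts are generated. The following is a minimal sketch under that assumption; `resolve_scope`, the lowercase activity identifiers and the added `audit` activity are hypothetical and do not come from the slides.

```python
from typing import Iterable, Set, Union

ENTIRE_PROCESS = "EntireProcess"  # symbolic scope, resolved late


def resolve_scope(scope: Union[str, Iterable[str]],
                  process_activities: Set[str]) -> Set[str]:
    """Return the concrete activities a metric instance covers in this process."""
    if scope == ENTIRE_PROCESS:
        return set(process_activities)
    # Explicit scopes keep only activities that exist; a separate
    # validation step would report the missing ones.
    return set(scope) & process_activities


process_v1 = {"validate", "check_credit", "ask_manager", "deposit", "notify"}
process_v2 = process_v1 | {"audit"}  # hypothetical activity added later

explicit = resolve_scope({"deposit", "notify"}, process_v2)
symbolic = resolve_scope(ENTIRE_PROCESS, process_v2)
```

Here `symbolic` picks up the new activity without any SLA edit, whereas `explicit` still covers only the two named activities.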
Validation
During design time, the SLA can be automatically validated against the process. This provides confidence to modify SLAs or processes independently.

[Figure: the process and the SLA (GenEventAction, EmailAdminAction, PoorExecTimeSLO, SevereExecTimeSLO, AvgExecTime, ProcExecTime, FinishProcInstance, StartProcInstance) are fed into Validation, which reports pass or fail and flags invalid SLA components.]

• Check that the required parameters in all component instances have been specified.
• Verify that parameters are valid.
• Traverse the SLA component graph and compute the union of dependent events.
• Verify that each event's activity exists in the process.
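The validation steps above can be sketched as a small graph traversal. This is an illustrative sketch, not the tool's actual algorithm: the dictionary layout (`deps`, `events`, `required`, `params`) and the function names are assumptions, and the parameter-validity check is omitted because it depends on each component type.

```python
from typing import Dict, List, Set


def dependent_events(graph: Dict[str, dict], root: str) -> Set[str]:
    """Depth-first traversal that unions the activities each component depends on."""
    seen: Set[str] = set()
    events: Set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        node = graph[name]
        events |= set(node["events"])
        stack.extend(node["deps"])
    return events


def validate(graph: Dict[str, dict], root: str,
             process_activities: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the SLA passes."""
    errors: List[str] = []
    # Required parameters must be specified on every component instance.
    for name, node in graph.items():
        for param in node["required"]:
            if param not in node["params"]:
                errors.append(f"{name}: missing parameter '{param}'")
    # Every dependent event must refer to an activity that exists in the process.
    for activity in sorted(dependent_events(graph, root) - process_activities):
        errors.append(f"activity '{activity}' not found in process")
    return errors


graph = {
    "SevereExecTimeSLO": {"deps": ["BookingTime"], "events": [],
                          "required": [], "params": {}},
    "BookingTime": {"deps": [], "events": ["flight", "train"],
                    "required": ["Scope"],
                    "params": {"Scope": ["flight", "train"]}},
}
ok = validate(graph, "SevereExecTimeSLO", {"flight", "train", "hotel"})
bad = validate(graph, "SevereExecTimeSLO", {"flight", "hotel"})
```

With the first process the SLA passes; with the second, the `train` activity has been removed, so validation reports it before any monitoring artifact is generated.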
Automatic Generation
Given an SLA, monitoring artifacts can be automatically generated for the process. When executed, the runtime artifacts monitor the process to detect whether the SLA is violated.

[Figure: the Generation Engine takes the process and the SLA (GenEventAction, EmailAdminAction, PoorExecTimeSLO, SevereExecTimeSLO, AvgExecTime, ProcExecTime, FinishProcInstance, StartProcInstance) as input and produces the process with the necessary events turned on, plus machine logic for monitoring.]
Runtime Architecture
The runtime is based on a publish/subscribe model. SLA components in the SLA model are generated as agents in the publish/subscribe system. These agents act as both publishers and subscribers.

[Figure: agents attached to the Publish/Subscribe System: GenEventAction, EmailAdminAction, PoorExecTimeSLO, SevereExecTimeSLO, AvgExecTime, ProcExecTime, StartProcInstance, FinishProcInstance, and the Process. Labelled interactions: publish start event; publish end event; subscribe start and end event; publish execTime event; subscribe execTime event; publish avgExecTime event.]
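To make the agent idea concrete, here is a minimal in-process sketch of agents that both subscribe and publish. It is an assumption-laden illustration: the `Broker` class, the topic names (`start`, `end`, `execTime`, `avgExecTime`) and the event dictionaries are invented for this example, and a real deployment would use a distributed publish/subscribe system rather than direct function calls.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class Broker:
    """Minimal topic-based publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self.subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in list(self.subscribers[topic]):
            handler(event)


class ProcExecTimeAgent:
    """Subscribes to start and end events, publishes execTime events."""

    def __init__(self, broker: Broker) -> None:
        self.broker = broker
        self.starts: Dict[str, float] = {}
        broker.subscribe("start", self.on_start)
        broker.subscribe("end", self.on_end)

    def on_start(self, e: dict) -> None:
        self.starts[e["instance"]] = e["time"]

    def on_end(self, e: dict) -> None:
        start = self.starts.pop(e["instance"], None)
        if start is not None:
            self.broker.publish("execTime", {"instance": e["instance"],
                                             "value": e["time"] - start})


class AvgExecTimeAgent:
    """Subscribes to execTime events, publishes the running average."""

    def __init__(self, broker: Broker) -> None:
        self.broker = broker
        self.total = 0.0
        self.count = 0
        broker.subscribe("execTime", self.on_exec_time)

    def on_exec_time(self, e: dict) -> None:
        self.total += e["value"]
        self.count += 1
        self.broker.publish("avgExecTime", {"value": self.total / self.count})


broker = Broker()
ProcExecTimeAgent(broker)
AvgExecTimeAgent(broker)
averages: List[float] = []
broker.subscribe("avgExecTime", lambda e: averages.append(e["value"]))

for instance, start, end in [("p1", 0.0, 2.0), ("p2", 1.0, 5.0)]:
    broker.publish("start", {"instance": instance, "time": start})
    broker.publish("end", {"instance": instance, "time": end})
```

Each end event triggers an `execTime` event, which in turn triggers an `avgExecTime` event that an SLO agent could subscribe to. Because agents only know topic names, not each other, they could be placed on different nodes without changing their logic.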
Runtime Execution
Events are consumed by agents. After processing events, agents emit new events to propagate updates. This causes a chain reaction that re-evaluates the SLA.

[Figure: agents GenEventAction, EmailAdminAction, PoorExecTimeSLO, SevereExecTimeSLO, AvgExecTime, ProcExecTime, StartProcInstance, FinishProcInstance, and the Process. Labelled interactions: publish start event; subscribe start and end event; publish execTime event; subscribe execTime event.]
Distributed Architecture
• Designed for distributed systems
– Agents can be arbitrarily deployed across the pub/sub system
– Scalable
Flexible Architecture
• Shared processing and network traffic
– Multiple SLAs that make use of the same
metric can share the generated agent
– Events are sent once even if multiple agents
are interested
• Dynamic runtime modification of SLAs
– Add, modify or remove SLAs during
monitoring
– No downtime
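Sharing agents between SLAs and changing SLAs at runtime could be handled with simple reference tracking, sketched below. This is a hypothetical design, not one described in the slides: `AgentRegistry`, its methods and the `(metric type, scope)` keys are assumptions.

```python
from typing import Dict, Hashable, List, Set, Tuple

MetricKey = Tuple[str, Hashable]  # assumed key: (metric type, scope)


class AgentRegistry:
    """Keeps one agent per metric key and tracks which SLAs use it."""

    def __init__(self) -> None:
        self.users: Dict[MetricKey, Set[str]] = {}

    def add_sla(self, sla_id: str, metrics: List[MetricKey]) -> List[MetricKey]:
        """Register an SLA; return the keys for which a new agent is needed."""
        created: List[MetricKey] = []
        for key in metrics:
            if key not in self.users:
                self.users[key] = set()
                created.append(key)  # a new agent would be deployed here
            self.users[key].add(sla_id)
        return created

    def remove_sla(self, sla_id: str) -> List[MetricKey]:
        """Unregister an SLA; return the keys whose agent is no longer needed."""
        retired: List[MetricKey] = []
        for key in list(self.users):
            self.users[key].discard(sla_id)
            if not self.users[key]:
                del self.users[key]
                retired.append(key)  # the agent would be retired here
        return retired


registry = AgentRegistry()
exec_time = ("ExecTime", "EntireProcess")
cost = ("TotalServiceCost", "EntireProcess")
first = registry.add_sla("sla-A", [exec_time])
second = registry.add_sla("sla-B", [exec_time, cost])
removed_a = registry.remove_sla("sla-A")
removed_b = registry.remove_sla("sla-B")
```

Adding `sla-B` creates only the cost agent because the execution-time agent already exists, and removing `sla-A` retires nothing while `sla-B` still uses the shared agent. Other SLAs keep being monitored throughout.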
Implementation
• Flexible SLA model
– Editor implemented for user to modify SLAs
– SLAs created for a given BPEL process
– BPEL and WebSphere Integration Developer used as
proof of concept
• Generation engine
– Generates monitoring artifacts to be executed in WebSphere Business Monitor Server
– Monitoring artifacts monitor the BPEL process to detect whether the SLA is violated
– WebSphere products used as proof of concept
Conclusion
• Simplify modelling and monitoring of an SLA for any given business process
– Maximize reusability and extensibility
– Reduce development time
– Reduce likelihood of errors
• Flexible SLA model
– SLA components grouped in libraries
– SLAs developed by assembling different SLA components
– Extensible, flexible, composable and configurable
– Validation
• Automatic generation of monitoring artifacts
– Automatic enabling of events in a process
– Runtime architecture based on pub/sub system
– Decentralized and scalable
– Better use of computing/network resources