BPEL for Scientific Workflows

Shiyong Lu
What is BPEL
• BPEL stands for Business Process
Execution Language, a de facto
standard language for specifying business
workflows composed mainly of Web
services.
• Is BPEL suitable/sufficient for scientific
workflows?
An overview of BPEL
Invoke Activity
Receive Activity
Reply Activity
Assign Activity
Sequence Activity
If Activity
While Activity
repeatUntil Activity
forEach Activity
Flow Activity
Pick Activity
Scope Activity
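
To make the activity list concrete, here is a minimal sketch that parses a small WS-BPEL 2.0 skeleton with Python's standard library and prints its activity tree. The process name, partner links, operations, and variables are hypothetical, and the skeleton omits the partnerLinks and variables declarations a deployable process would need. It only illustrates how structured activities (sequence, flow) nest basic ones (receive, invoke, reply).

```python
import xml.etree.ElementTree as ET

# Namespace of WS-BPEL 2.0 executable processes.
BPEL_NS = "http://docs.oasis-open.org/wsbpel/2.0/process/executable"

# Hypothetical skeleton: every name and attribute value is made up for illustration.
PROCESS = f"""
<process name="AlignSequences" targetNamespace="http://example.org/demo"
         xmlns="{BPEL_NS}">
  <sequence>
    <receive partnerLink="client" operation="submit"
             variable="request" createInstance="yes"/>
    <flow>
      <invoke partnerLink="toolA" operation="search"
              inputVariable="request" outputVariable="hitsA"/>
      <invoke partnerLink="toolB" operation="search"
              inputVariable="request" outputVariable="hitsB"/>
    </flow>
    <reply partnerLink="client" operation="submit" variable="hitsA"/>
  </sequence>
</process>
"""


def outline(xml_text):
    """Return the activity names, indented by nesting depth."""
    root = ET.fromstring(xml_text.strip())
    lines = []

    def walk(node, depth):
        # Strip the "{namespace}" prefix that ElementTree puts on tag names.
        lines.append("  " * depth + node.tag.split("}", 1)[-1])
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return lines


print("\n".join(outline(PROCESS)))
```

The two invoke activities inside flow are the ones an engine may run concurrently. The surrounding sequence fixes the order receive, flow, reply.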
BPEL is sufficient! (Akram et al.)
Requirements for scientific workflows:
1. Modular design
2. Exception handling
3. Compensation mechanism
4. Adaptivity to environment
5. Flexibility to select services dynamically
6. Steering and monitoring
because… (Akram et al.)
1. Modular design (sequence, flow, scope)
2. Exception handling (global, scoped, inline)
3. Compensation mechanism (compensation
handler)
4. Adaptivity to environment (XML "any" data type)
5. Flexibility to select services dynamically
(exception handler, compensation handler)
6. Steering and monitoring (pick, onMessage,
onAlarm)
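
Items 2 and 3 can be pictured with a toy model. The sketch below is not a BPEL engine: the `Scope` class, `run_workflow`, and the step names are hypothetical. It only mimics the idea that a fault caught by a scope's handler triggers the compensation handlers of the steps that already completed, in reverse order of completion.

```python
class Scope:
    """Toy stand-in for a BPEL scope with compensation handlers."""

    def __init__(self):
        self._completed = []  # (name, compensate) pairs, in completion order
        self.log = []

    def run(self, name, action, compensate):
        action()  # may raise; the step is then not recorded as completed
        self.log.append(f"done:{name}")
        self._completed.append((name, compensate))

    def compensate_all(self):
        # Undo completed steps, most recent first.
        while self._completed:
            name, undo = self._completed.pop()
            undo()
            self.log.append(f"undone:{name}")


def run_workflow(steps):
    """Run (name, action, compensate) triples; compensate on the first fault."""
    scope = Scope()
    try:
        for name, action, compensate in steps:
            scope.run(name, action, compensate)
    except Exception as exc:  # plays the role of the scope's fault handler
        scope.log.append(f"fault:{exc}")
        scope.compensate_all()
    return scope.log


def failing_publish():
    raise RuntimeError("publish failed")


print(run_workflow([
    ("stage", lambda: None, lambda: None),
    ("compute", lambda: None, lambda: None),
    ("publish", failing_publish, lambda: None),
]))
```

In a scientific setting the compensation of a staging step might delete staged files, and that of a compute step might release a reservation. Those actions are left as no-ops here.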
despite… (Akram et al.)
Some limitations, which can be overcome with
WSIF, J2EE and WS-* specifications:
1. Reusability of primitive and structured activities in BPEL is limited. It
is not possible to re-execute an activity that is defined earlier by
referring to it later.
2. It is not possible to trigger an <onMessage> or
<onAlarm> event within the workflow for the purposes of dynamically
modifying the workflow at runtime.
3. The BPEL specification does not explicitly support user
interactions, resulting in different levels of support across
workflow engines.
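
The <onMessage> and <onAlarm> branches of a pick can be pictured as waiting on an inbox with a timeout: whichever happens first, an incoming message or the alarm, decides which branch runs. The sketch below is a rough analogy in Python, not BPEL semantics in full; `pick`, the inbox, and the branch functions are hypothetical names. It also shows why the limitation matters: the message has to be put into the inbox by something outside the waiting code.

```python
import queue


def pick(inbox, timeout_s, on_message, on_alarm):
    """Run on_message for the first message, or on_alarm if none arrives in time."""
    try:
        message = inbox.get(timeout=timeout_s)
    except queue.Empty:
        return on_alarm()
    return on_message(message)


inbox = queue.Queue()
inbox.put("pause")  # an external client steering the workflow

# A message is waiting, so the onMessage-like branch runs.
print(pick(inbox, 0.01, lambda m: f"steer:{m}", lambda: "timeout"))
# The inbox is now empty, so the onAlarm-like branch runs after the timeout.
print(pick(inbox, 0.01, lambda m: f"steer:{m}", lambda: "timeout"))
```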
BPEL or Taverna? (A comparison
done by Tan et al.)
• Tan et al. compare the usability of BPEL and
Taverna across the full lifecycle of a scientific
workflow, including service discovery, service
composition, workflow execution, and workflow
result analysis.
• They conclude that BPEL offers a
comprehensive set of primitives for modeling
processes of all flavors, while Taverna
provides a more compact set of primitives
and a functional programming model that
eases data flow modeling.
The life cycle of Scientific
Workflows
The challenges of Scientific
Workflows
• Resources are highly distributed. Scientific
workflows usually use services owned by
other organizations, such as data storage and
high-performance computing.
• Dataflow oriented. Data is considered to be
the first-class citizen in scientific workflows,
because scientific workflows are mostly
pipelines of parallel data processing. In a data
flow, tasks and links represent data
processing and data transport, respectively;
parallel execution of independent tasks is
desired to be modeled for free -- tasks can
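
The data-flow view described in this bullet can be sketched as follows, assuming Python 3.9+ for `graphlib`. Tasks are plain functions, links name the upstream tasks whose outputs a task consumes, and tasks with no dependency on one another are submitted to a thread pool together. The function `run_dataflow` and the four example tasks are hypothetical. This is a toy executor, not how Taverna or any BPEL engine is implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_dataflow(tasks, links):
    """tasks: name -> function(upstream results); links: name -> upstream names."""
    results = {}
    waves = []  # each wave holds tasks that were ready at the same time
    sorter = TopologicalSorter(links)
    sorter.prepare()
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            waves.append(ready)
            # Hand each task only the outputs of its own upstream tasks.
            futures = {
                name: pool.submit(tasks[name],
                                  {dep: results[dep] for dep in links[name]})
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results, waves


tasks = {
    "load": lambda up: [3, 1, 2],
    "sort": lambda up: sorted(up["load"]),
    "total": lambda up: sum(up["load"]),
    "report": lambda up: (up["sort"], up["total"]),
}
links = {
    "load": set(),
    "sort": {"load"},
    "total": {"load"},
    "report": {"sort", "total"},
}

results, waves = run_dataflow(tasks, links)
print(waves)
print(results["report"])
```

Nothing in the task definitions says that "sort" and "total" run in parallel. That follows from the links alone, which is the sense in which parallelism comes for free.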
The challenges of Scientific
Workflows (cont’d)
• Large scale. Scientific workflows often
contain many tasks, involve large data
sets, and require intensive computation.
The modeling tool should make it easy
to model such complex workflows.
• Data analysis and provenance are an
important step, and workflow
execution can be iterative.
Comparison on service discovery
(BPEL vs Taverna)
• Services are often virtualizations of data
storage, computation capability or other
resources.
• Service endpoints are not naturally known to
users, either because users are not familiar
with the service itself, or because the service
deployment may have changed over time.
• Taverna offers a scavenger meta-service, Feta, and
the MyGrid ontology; BPEL tools are still weak
in service discovery.
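
The discovery problem can be illustrated with a toy registry in which services are annotated with concepts and looked up by what they do, not by a hard-coded endpoint. Everything here is hypothetical: the record fields, the concept labels, and the example.org endpoints are invented, and the sketch does not reflect the actual interface of Feta or any other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    name: str
    endpoint: str
    concepts: frozenset  # semantic annotations attached to the service


def discover(registry, wanted):
    """Return the records whose annotations cover every wanted concept."""
    wanted = frozenset(wanted)
    return [record for record in registry if wanted <= record.concepts]


registry = [
    ServiceRecord("store", "http://example.org/store",
                  frozenset({"data-storage"})),
    ServiceRecord("align", "http://example.org/align",
                  frozenset({"sequence-alignment", "protein"})),
    ServiceRecord("align-dna", "http://example.org/align-dna",
                  frozenset({"sequence-alignment", "dna"})),
]

for record in discover(registry, {"sequence-alignment"}):
    print(record.name, record.endpoint)
```

Because the workflow asks for a capability, a redeployed service only requires an updated registry entry, not an edited workflow.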
Comparison on workflow
modeling (BPEL vs Taverna)
References
• Asif Akram, David Meredith, Rob Allan: Evaluation of BPEL to
Scientific Workflows. CCGRID 2006: 269-274.
• Tan et al.: A Comparison of Using Taverna and BPEL in Building
Scientific Workflows: The Case of caGrid. Concurrency
and Computation: Practice and Experience, 2009.
• Wei Tan, Paolo Missier, Ravi K. Madduri, Ian T. Foster: Building
Scientific Workflow with Taverna and BPEL: A Comparative Study in
caGrid. ICSOC Workshops 2008: 118-129.
• Aleksander Slominski: Adapting BPEL to Scientific
Workflows. Book chapter in “Workflow for e-Science”,
eds. Ian Taylor et al.
• Wassermann et al.: Sedna: A BPEL-Based Environment for
Visual Scientific Workflow Modeling. Book chapter in
“Workflow for e-Science”, eds. Ian Taylor et al.