Taverna Workflows for Systems Biology

Katy Wolstencroft
School of Computer Science
University of Manchester
What is a Taverna Workflow?



Workflow management system
Sophisticated analysis pipelines
A set of services to analyse or manage data (either local or remote)
Data flow through services
Control of service invocation
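As a rough illustration of these ideas (not Taverna's engine or API), a workflow can be pictured as a chain of services where each output feeds the next input, and the same chain is iterated over a data set. The sketch below is a minimal Python analogy; the services `fetch_sequence` and `gc_content`, the toy database, and the runner functions are all hypothetical names invented for this example.

```python
# Hypothetical illustration only: none of these names are part of Taverna.

def fetch_sequence(gene_id):
    """Stand-in for a remote service: look up a sequence by identifier."""
    toy_database = {"geneA": "ATGCGC", "geneB": "ATATAT"}
    return toy_database[gene_id]

def gc_content(sequence):
    """Stand-in for a local analysis service: fraction of G/C bases."""
    return sum(base in "GC" for base in sequence) / len(sequence)

def run_workflow(services, workflow_input):
    """Pass the output of each service to the next one (data flow)."""
    data = workflow_input
    for service in services:
        data = service(data)
    return data

def run_over_list(services, inputs):
    """Run the same workflow once for every item of a data set (iteration)."""
    return [run_workflow(services, item) for item in inputs]

pipeline = [fetch_sequence, gc_content]
results = run_over_list(pipeline, ["geneA", "geneB"])
print(results)
```

In a real workflow system the services would typically be remote web services or scripts rather than local functions, and the engine would also handle branching, failures and provenance.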
Taverna Workflows







Interoperability, integration and collaboration
Access to distributed and local resources
Iteration over data sets
Automation of data flow
Agile methods development
Extensible
Experimental protocols
Workflows are ideal for…

High throughput analysis
Transcriptomics, proteomics, next-generation sequencing, etc.
Data integration, data interoperation
Data management
Model construction
Data format manipulation
Systems Biology
Taverna Workbench
List of services
Workflow engine to run workflows
Construct and visualise workflows
Web Services, e.g. KEGG
Scripts, e.g. beanshell, R
Programming libraries, e.g. libSBML
Taverna Workbench
Freely available, open source
Current version: 2.2
70,000+ downloads across versions
Part of the myGrid Toolkit
Hull D, Wolstencroft K, Stevens R, Goble C, Pocock MR, Li P, Oinn T. Taverna: a tool for building and running workflows of services. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W729-32.
myGrid Open Suite of Tools
Workflow Repository
Workflow GUI Workbench
Client User Interfaces
Third Party Tools
Service Catalogue
Provenance Store
Workflow Server
Web Portal
Activity and Service Plug-in Manager
Open Provenance Model
Secure Service Access
Programming and APIs
Examples of Workflows for Systems Biology
Escherichia coli: From cDNA Microarray Raw Data to Pathways and Published Abstracts
Using gene-expression patterns associated with two lymphoma types to predict the type of an unknown sample (Wei Tan, Univ. Chicago, caBIG)
SysMO SUMO: Systems Understanding of Microbial Oxygen responses (Afsaneh Maleki-Dizaji, University of Sheffield)
Identify differentially expressed genes using t-test with R (Peter Li, MCISB)
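The workflow named above runs its t-test in R. Purely to illustrate the underlying statistic, the sketch below computes Welch's t statistic per gene in plain Python and flags genes whose absolute value exceeds a cutoff. The expression values, gene names and the cutoff `T_CUTOFF` are invented for this example and are not taken from the workflow; a real analysis would normally derive p-values and correct for multiple testing.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error

# Invented expression values: gene -> (condition A replicates, condition B replicates).
expression = {
    "gene1": ([5.0, 5.2, 4.8], [9.0, 9.1, 8.9]),
    "gene2": ([7.0, 7.5, 6.5], [7.1, 6.9, 7.3]),
}

T_CUTOFF = 4.0  # arbitrary illustrative threshold, not a recommended value

t_values = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}
differential = [gene for gene, t in t_values.items() if abs(t) > T_CUTOFF]
print(differential)
```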
High Throughput Experiments
Workflows for Model Building



Results from experiments in systems biology are related to mathematical models in SBML
Workflows can link data and models
Workflows can create models
SBML: location of components, species, reactions
Model construction workflow
Input: list of ORFs
1. Get reaction info
Get annotations
2. Create compartments
3. Create species
4. Create reactions
Output: SBML file
Peter Li et al, MCISB, myGrid
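The four steps of this slide can be sketched roughly as follows. This is not the MCISB workflow itself: it is a minimal Python analogy that uses only the standard library rather than libSBML, and it emits a simplified, SBML-like document (no namespace, no validation). The lookup table `REACTION_INFO`, the ORF names, species and compartment are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table standing in for step 1 (reaction info per ORF).
REACTION_INFO = {
    "ORF1": {"id": "R1", "compartment": "cytoplasm",
             "reactants": ["glucose"], "products": ["g6p"]},
    "ORF2": {"id": "R2", "compartment": "cytoplasm",
             "reactants": ["g6p"], "products": ["f6p"]},
}

def build_model(orfs):
    """Build a simplified SBML-like XML string from a list of ORFs."""
    reactions = [REACTION_INFO[orf] for orf in orfs]           # 1. get reaction info
    sbml = ET.Element("sbml", level="2", version="4")
    model = ET.SubElement(sbml, "model", id="sketch_model")

    compartments = ET.SubElement(model, "listOfCompartments")  # 2. create compartments
    for name in sorted({r["compartment"] for r in reactions}):
        ET.SubElement(compartments, "compartment", id=name)

    species_list = ET.SubElement(model, "listOfSpecies")       # 3. create species
    seen = set()
    for r in reactions:
        for s in r["reactants"] + r["products"]:
            if s not in seen:
                seen.add(s)
                ET.SubElement(species_list, "species", id=s,
                              compartment=r["compartment"])

    reaction_list = ET.SubElement(model, "listOfReactions")    # 4. create reactions
    for r in reactions:
        reaction = ET.SubElement(reaction_list, "reaction", id=r["id"])
        for tag, members in (("listOfReactants", r["reactants"]),
                             ("listOfProducts", r["products"])):
            group = ET.SubElement(reaction, tag)
            for s in members:
                ET.SubElement(group, "speciesReference", species=s)
    return ET.tostring(sbml, encoding="unicode")

sbml_text = build_model(["ORF1", "ORF2"])
print(sbml_text)
```

The annotation step is left out of the sketch; in practice a library such as libSBML would also take care of identifiers, units and validation.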
Integrating libSBML into Taverna
Workflows for Data Integration
Read enzyme names from SBML
Query maxd database using enzyme names
Calculate colours based on gene expression level
Create new SBML model with new colour nodes
Mapping transcriptomic data onto SBML models
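The mapping steps above can be sketched in Python as follows, under several assumptions: the dictionary `EXPRESSION` stands in for the maxd database query, the model is a simplified SBML-like fragment, the colour scale (red for up, blue for down) is an arbitrary choice, and the `colour` attribute is hypothetical rather than part of SBML. The slides do not say how the original workflow encodes colours.

```python
import xml.etree.ElementTree as ET

# Simplified SBML-like fragment with invented enzyme names.
SBML_TEXT = """<sbml><model><listOfSpecies>
  <species id="e1" name="hexokinase"/>
  <species id="e2" name="enolase"/>
</listOfSpecies></model></sbml>"""

# Stand-in for the maxd database query: invented log2 expression ratios.
EXPRESSION = {"hexokinase": 2.0, "enolase": -1.0}

def colour_for(level, max_abs=2.0):
    """Map an expression ratio to a hex colour: red = up, blue = down, white = unchanged."""
    scaled = max(-1.0, min(1.0, level / max_abs))
    fade = int(round(255 * (1 - abs(scaled))))
    if scaled >= 0:
        return f"#ff{fade:02x}{fade:02x}"
    return f"#{fade:02x}{fade:02x}ff"

root = ET.fromstring(SBML_TEXT)
for species in root.iter("species"):            # read enzyme names from the model
    level = EXPRESSION.get(species.get("name"))  # look up the expression level
    if level is not None:
        # Hypothetical attribute; a real model would use an annotation or layout extension.
        species.set("colour", colour_for(level))

coloured = {s.get("name"): s.get("colour") for s in root.iter("species")}
print(coloured)
```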
Reuse, Recycle, Repurpose Workflows
HUMAN Microarray CEL file to candidate pathways
SUMO
From cDNA Microarray Raw Data to Pathways and Published Abstracts
Workflows through web interface
Reuse, Recycle, Replay Workflows
Metware: Workflows for metabolomics, Netherlands/Germany
Steffen Neumann, Leibniz Institute of Plant Biochemistry
Workflows in e-Laboratories
SysMO SEEK
e-Laboratory for interlinking and sharing data, models, SOPs and workflows for Systems Biology in Europe
Workflows for data analysis
Summary



Informatics in Systems Biology relies on data integration and large-scale data analysis
Taverna workflows are a mechanism for linking together resources and analyses
myExperiment allows you to reuse workflows and benefit from others' work
More information

Taverna
http://www.taverna.org.uk
myExperiment
http://www.myexperiment.org
http://wiki.myexperiment.org
BioCatalogue
http://www.biocatalogue.org
SysMO-SEEK
http://www.sysmo-db.org