The VL-E Proof of Concept Environment

Using the VL-E Proof of Concept Environment
Connecting Users to the e-Science Infrastructure
David Groep, NIKHEF
Virtual Laboratory for e-Science (NL)
• To boost e-Science by
– the creation of an e-Science environment
– and doing research on methodologies
• To carry out concerted research
– along the complete e-Science technology chain,
– ranging from applications to networking,
– focused on new methodologies and reusable components.
Virtual Laboratory for e-Science
[Figure: layered architecture of the Virtual Laboratory]
• Application domains: Medical Diagnosis & Imaging, BioDiversity, BioInformatics, Data Intensive Science, Food Informatics, Dutch Telescience
• VL-e: Application Oriented Services
• Grid Services: harness multi-domain distributed resources
VL-E in a nutshell
• Experiments become more complex
– more than just coping with the data
– the computer is an integrated part of the experiment
– support the experimental process end-to-end
• Technology (push)
– …, Grid, Resource Sharing, Web, Networks
• Application Needs (pull)
– Experiment validation
– Papers and associated data
– Provenance meta-data
– Information modeling
– Data/Resource Collection Access
– …
The Experimental Process
[Figure: stages of the experimental process and the information attached to each]
• experiment: parameter settings, calibrations, protocols, …
• acquisition: sensors, amplifiers, imaging devices, …; produces raw data
• processing: conversion, filtering, analyses, simulation, …; produces processed data
• presentation: visualization, animation, interactive exploration, …
• interpretation
• Rationalization of the experiment and processes via protocols
• Metadata: parameters/settings, algorithms, intermediate results, software packages, …
– Much of this is lost when an experiment is completed.
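One way to keep this metadata from being lost is to record it per stage while the experiment runs. The sketch below is only an illustration under assumptions: the `StepRecord` and `ExperimentLog` names, their fields, and the sample parameter values are hypothetical and do not describe any VL-e component.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class StepRecord:
    """Metadata for one stage of the experimental process (hypothetical schema)."""
    stage: str                                    # e.g. "acquisition", "processing"
    parameters: dict                              # settings, calibrations, protocol ids
    inputs: list = field(default_factory=list)    # names of data consumed
    outputs: list = field(default_factory=list)   # names of data produced


@dataclass
class ExperimentLog:
    """Accumulates step records so they outlive the experiment itself."""
    name: str
    steps: list = field(default_factory=list)

    def record(self, stage, parameters, inputs=(), outputs=()):
        self.steps.append(
            StepRecord(stage, dict(parameters), list(inputs), list(outputs))
        )

    def lineage(self, item):
        """Return, in execution order, the stages that led to `item`."""
        chain = []
        wanted = {item}
        # Walk backwards: a step is relevant if it produced something we need.
        for step in reversed(self.steps):
            if wanted & set(step.outputs):
                chain.append(step.stage)
                wanted |= set(step.inputs)
        return list(reversed(chain))

    def to_json(self):
        """Serialize the whole log so it can be stored next to the data."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only.
log = ExperimentLog("demo-experiment")
log.record("acquisition", {"sensor_gain": 2.0}, outputs=["raw data"])
log.record("processing", {"filter": "low-pass"},
           inputs=["raw data"], outputs=["processed data"])
log.record("presentation", {"view": "3d"},
           inputs=["processed data"], outputs=["figure"])
print(log.lineage("figure"))
```

Storing the JSON produced by `to_json()` alongside the processed data would let a later reader see which settings and intermediate steps produced a given result.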
Combining data sources
Key element for all users: Data Combination
• From different organisations
– data ownership preserved
– data correctness maintained by preventing ‘forks’
• Extracting common meaning
– need for workflow definition and ontologies in collaborative experiments
Combining data in Cognition Science
• Collaborative scientific research
– Information sharing
– Metadata modeling
• Allows for experiment validation
– Independent confirmation of results
• Statistical methodologies
– Access to large collections of data and metadata
• Training
– Train the next generation using peer reviewed publications and the associated data
Combining Acquisition and Simulation
• Robert: can you make a nice picture for this?
This seems like the right place to also briefly mention in-silico experiments.
Role of the Proof-of-Concept (PoC)
• Platform for user application development
• Provisioning network & grid infrastructure
– stable releases of common tools
– tested ‘external’ middleware
– stable releases of internal developments
• Support for users & dissemination
– infrastructure installations
– end-user helpdesk
– on-site aid in migration
PoC Release n
[Figure: the three VL-e environments and the release flow between them, serving the application domains Medical Diagnosis & Imaging, BioDiversity, BioInformatics, Data Intensive Science, Food Informatics and Dutch Telescience]
• VL-e Proof of Concept Environment (Release n; LCG2.x + SRB +)
– Characteristics: stable, reliable, tested; certified releases of Grid MW & VL-software
– Usage: application development
– Initial compute platform: NL-Grid production cluster, central mass-storage facilities + SURFnet
• VL-e Certification Environment (Release Candidate n+1; LCG2.x + others)
– Characteristics: flexible, test environment
– Usage: test & certification of Grid MW & VL-software; compatibility, integration and functionality tests; adventurous application people
– Initial compute platform: NL-Grid Fabric Research Cluster
• VL-e Rapid Prototyping Environment (Developers' Heaven/Haven; GT3.2 + *)
– Characteristics: flexible, ‘unstable’
– Usage: Virtual Lab. rapid prototyping (interactive simulation)
– Initial compute platform: DAS-2, local resources
• Release flow
– Inputs: Developer CVS with nightly builds and unit tests; external middleware products
– Tagged release candidates move from rapid prototyping to certification
– Stable, tested releases move from certification to the Proof of Concept Environment
– Distribution: Download Repository, PoC Installer, Cluster Tools
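The release flow named in the diagram labels can be sketched as a small promotion rule: a component moves to the next environment only when the checks for its current environment pass. This is a minimal sketch under assumptions. The stage identifiers, the assignment of checks to stages, and the `promote` function are hypothetical; only the check names come from the slide.

```python
# Hypothetical stage identifiers, ordered from least to most stable.
STAGES = ["rapid-prototyping", "certification", "proof-of-concept"]

# Checks that must pass before a component leaves each stage.
# The check names come from the slide; their grouping is an assumption.
GATES = {
    "rapid-prototyping": ["nightly build", "unit tests"],
    "certification": ["integration tests", "functionality tests"],
}


def promote(stage, results):
    """Return the next stage if every gate for `stage` passed, else `stage`.

    `results` maps a check name to True/False; a missing check counts
    as a failure, so nothing is promoted by omission.
    """
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if stage == STAGES[-1]:
        return stage  # already in the stable release environment
    if all(results.get(check, False) for check in GATES[stage]):
        return STAGES[STAGES.index(stage) + 1]
    return stage


# A build that passes its unit tests becomes a release candidate.
print(promote("rapid-prototyping", {"nightly build": True, "unit tests": True}))
# A candidate with a failing integration test stays in certification.
print(promote("certification", {"integration tests": False,
                                "functionality tests": True}))
```

Treating a missing result as a failure is a deliberate choice in this sketch: it keeps untested components out of the stable environment.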
Involving Users
• Training via tutorials on middleware
– good attendance, but slow uptake later on
• On-site support in integration
– good technology uptake, but people intensive
• User driven integration: application pull
– rapid uptake, good attendance
– requires an ICT scientist to work long-term with the domain scientists to recognize and extract generic elements
Tutorials
• Grid, LCG2 tutorials
• Hands-on event series
‘Grid Admin Nerd Group’
‘After Sales Service’
• Documentation
• User help-desk (by phone & mail)
User Experience:
nice, but information quickly ‘lost’
On-site support
• EMUTD example
Maurice to provide image & input
• Effective use of EDG/EGEE tools for job submission, SRB for data access
User experience:
problem effectively solved!
but with high manpower investment by PoC
Application Pull VL-E methodology
[Figure: application pull]
• Each application consists of an Application Specific Part, a Potential Generic Part, and Management of comm. & computing
• Potentially generic parts are candidates for the Virtual Laboratory Application Oriented Services
• Grid Services: harness multi-domain distributed resources
Can we keep our users content?
• Take care of grid & generic aspects
– collaboration, community building & security
– policy-constrained & dynamic resource sharing
• Software Integration
– there are many tools already … ‘just integrate them’
– but only wide deployment will show the weaknesses
• Make it work
– consistent software engineering practices
– hide changes in lower layers by use of standard interfaces
– easy-to-use installers (PoC Installer, Quattor)
– and teach us how to scale up to a grid service provider
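The point about hiding lower-layer changes behind standard interfaces can be illustrated with a small sketch. Everything here is an assumption made for illustration: `JobSubmitter`, `BackendA`, `BackendB` and `run_analysis` are hypothetical names, and the backends are in-memory stand-ins that call no real grid middleware.

```python
from abc import ABC, abstractmethod


class JobSubmitter(ABC):
    """Hypothetical stable interface that application code depends on."""

    @abstractmethod
    def submit(self, executable, arguments):
        """Queue a job and return an opaque job identifier (a string)."""


class BackendA(JobSubmitter):
    """Stand-in for one lower-layer implementation; keeps jobs in a dict."""

    def __init__(self):
        self.jobs = {}

    def submit(self, executable, arguments):
        job_id = f"a-{len(self.jobs) + 1}"
        self.jobs[job_id] = [executable, *arguments]
        return job_id


class BackendB(JobSubmitter):
    """Stand-in for a replacement lower layer; keeps jobs in a list."""

    def __init__(self):
        self.queue = []

    def submit(self, executable, arguments):
        self.queue.append((executable, tuple(arguments)))
        return f"b-{len(self.queue)}"


def run_analysis(submitter, data_files):
    """Application code: unchanged whichever backend is plugged in."""
    return [submitter.submit("analyse", [name]) for name in data_files]


print(run_analysis(BackendA(), ["scan1.dat", "scan2.dat"]))
print(run_analysis(BackendB(), ["scan1.dat", "scan2.dat"]))
```

Because `run_analysis` only uses the `submit` method, replacing one backend with the other requires no change to the application code, which is the property the bullet asks for.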
http://www.vl-e.nl/