K.Harrison and A.Soroko
Cosener’s House, Abingdon, UK
22 May 2002
Framework-Grid interfaces: technical survey
– Need for Framework-Grid interfaces
– Outline of required functionality
– Tools for software installation and configuration
– Production tools
– Grid interfaces currently under development
– Conclusions
Aim to give general background and a brief overview
of software products relevant to a Framework-Grid
interface for ATLAS and LHCb
Many items covered in more detail in later presentations
Need for Framework-Grid interfaces
– Resources for Grid activities are becoming available
in increasing numbers
Want to take advantage of these resources as early as
possible
– Take Cambridge as an example:
For Grid activities, have now:
32 X 400 MHz Pentium II processors
20 X 1.13 GHz Pentium III processors
About 0.5 Tbyte of disk space
Globus 2.0 installed
In near future:
will add 2 Tbyte file server
will install EDG middleware
will connect to UK eScience Grid and EU Testbed
– Physics at Cambridge using Grid resources:
ATLAS: already submitting ATLFAST simulation jobs;
plan to participate in data challenges
LHCb: participating in data challenges (initially non-Grid;
later with Grid)
NA48: preparing to simulate 10^8 events for evaluation
of backgrounds in rare kaon decays (300 days of CPU
time)
To fully exploit possibilities for physics studies, need a
tool that simplifies Grid access and job configuration:
Framework-Grid interface
– First ideas for a Grid interface with built-in knowledge
of the Gaudi/Athena framework used by ATLAS and LHCb
developed in summer 2001, in particular by P.Mato and C.Tull
Gaudi/Athena and Grid Alliance (GANGA)
– GANGA might eventually be:
a completely new Grid interface
an adaptation/evolution of an existing Grid interface
– In all cases, expect GANGA to be modular and to make use
of tools/services developed by others
This workshop should help us understand how to proceed
Outline of required functionality
– A Framework-Grid interface for ATLAS and LHCb will need to
provide access to services that can be logically divided
into two categories
– Grid services are developed in the context of many groups
and work packages:
Security services
Job submission
Job decomposition
Resource allocation and management
Data replication and cataloguing
Application-independent monitoring
Would hope to use these as they are (assume no further
development needed)
– Framework-related services (specific to ATLAS and LHCb)
will need to be developed in parallel with the interface
implementation:
Job configuration
(algorithms to run, properties, input/output requests)
Management of software environment
(executables, libraries, databases, etc)
Automatic creation of job-description files
Error recovery
Application-specific monitoring
Bookkeeping
Tools for software installation and configuration
– In general, Grid resources will not be dedicated to a
single experiment: might run jobs for ATLAS one day,
CDF the next and LHCb the day after
Framework-Grid interface will need access to a tool that
allows setting up of the user’s software environment
– Tools of interest include:
LCFG: developed in context of EU WP4, based on
rpm files
DAR: developed at FNAL, based on tarballs
pacman: developed at Boston University, fetches,
installs and manages packages based on rpm files
or tarballs, makes use of software cache
See presentation by S.Youssef
Production tools
– Production tools already in use can provide ideas for
implementing some of the services to be offered by a
Framework-Grid interface
– As an example, consider Simulation for LHCb and its
Integrated Control Environment (SLICE)
see presentation of G.Kuznetsov
– Working in a non-Grid environment:
Production requests to distributed facilities are submitted
via a web page
Java servlets create job scripts and options files
Production is monitored using a control system based on PVSS
Update of bookkeeping database, transfer of output data to
mass storage and quality checks performed automatically
– Grid-based system at experimental stage
LHCb production strategy using SLICE
– Physicist: requests production, giving number of events, channel, datatype (implies a workflow), configuration and deadline for completion
– Physics coordinator: ratifies the production request, which gets added as an outstanding request to the bookkeeping database
– Job creation/submission (via Web): identify outstanding requests, select workflow(s), give number of events, create scripts
– Production manager:
  -Create required number of jobs (500 events each)
  -Determine configuration
  -Determine/create runtime environment
  -Run executable
  -Check data
  -Copy data/logs
  -Flag production as completed, prepare updating of bookkeeping database
– Monitoring (via PVSS): submit jobs to distributed sites, see what jobs are running, how many, channel, datatype, site, current event number, configuration used by job, submit time; kill jobs
Servlet      Purpose
Maprunmc     sicbmc for rawh production
Brunelrun    Brunel for DST production
Bbinclrun    sicbmc + sicbdst for physics production
Mcbrunel     sicbmc v249 + Brunel v9r1 for data challenge tests (dbase v243r1p1, v243r3)
(From E.vanHerwijnen)
Workflow diagram (from E.vanHerwijnen): jobs are submitted remotely and executed on the farm; output data are transferred to the mass store, the bookkeeping database is updated and a data quality check is performed; farm performance is monitored via the Web
Grid interfaces currently under development
– Middleware (Globus, EDG, PPDG, other) provides an
interface to grid services via command-line instructions
given in a particular sequence
– More user-friendly interfaces are being developed by
several groups:
Alice Environment (AliEn)
see also presentations of P.Buncic and L.Goosens
EDG GUI
see also presentation of D.Colling
Grid Enabled Web Environment for Site-Independent
User Job Submission (GENIUS)
Grid Access Portal for Physics Applications (Grappa)
see also presentation of C.Tull
Others?
AliEn
• General characteristics of AliEn:
– Under development by Alice Offline Group, but not specific to Alice
– Uses iVDGL or EDG middleware, Globus toolkit, and a variety of external modules (SOAP, PAM, SWIG, etc)
– Based on Perl
– User access via machine on which AliEn is installed:
  • Command-line interface allows authentication, access to distributed catalogue, job submission, etc
  • With appropriate module installed, also have GUI interface
– Web interface is under development
Functionality of AliEn (I)
• File Catalogue:
– To access the catalogue, user types: alien
– To authenticate to the server, user must have either a globus
certificate, or ssh keys
– User can browse the catalogue using UNIX-like commands
– Catalogue entries seen by user are Logical File Names (LFN)
– Each user has a home directory, and can register files by giving
LFN, PFN, and size
Functionality of AliEn (II)
• Getting a file (from local SE)
  Diagram (from P.Saiz): (1) the client sends "get lfn" to the Proxy/Authen service; (2) Authen returns the corresponding PFN and SE; (3) the client requests that PFN from the SE at its own site and receives the file
Functionality of AliEn (III)
• Job submission:
– Jobs may be executed on any cluster of AliEn
– Output is accessible through the AliEn catalogue
– alien StartMonitor starts a daemon that forwards job requests
to a central server
– alien login gives user the AliEn prompt, which allows access
to the AliEn Catalogue and provides commands to submit jobs
– User gives job description using ClassAds (name of the
executable, possible arguments or input files, extra
requirements for the job, etc)
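For illustration only, such a ClassAds job description might look roughly as follows; the attribute names below are assumptions chosen for the example, not necessarily the exact names used by AliEn:

Executable   = "aliroot";
Arguments    = "-b -q sim.C";
InputFile    = { "LF:/alice/simulation/sim.C" };
Requirements = ( other.Type == "machine" );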
Functionality of AliEn (IV)
• Submitting jobs
Diagram (from P.Saiz): the client's submit request passes through the Proxy/Authen service to the CPUServer, with the Cluster Monitor forwarding requests and the job's stdin being registered in the Information Service (IS)
Functionality of AliEn (V)
• Executing a job
Diagram (from P.Saiz): the CPUServer, IS and Proxy (one per organization) pass the job to a Cluster Monitor (one per element), which starts it on a CE under a Process Monitor; possible local queues: LSF, PBS, BQS, Globus, CONDOR, DQS
AliEn GUI
• AliEn xfiles
– alien xfiles creates a window for browsing the catalogue
AliEn C API
• AliEn C API will provide C++ (ROOT) binding
– Proposed types
typedef unsigned long Alien_t; // opaque handle to Alien connection
// associated struct contains ALIEN connection state
typedef struct AlienResultStruct {
char **results; // array of result strings
int result_count; // number of results
int current; // current result
} AlienResult_t;
typedef struct AlienAttrStruct {
char **attribute; // array of attribute names
char **values; // array of attribute values
int attr_count; // number of attribute pairs
int current; // current attribute
} AlienAttr_t;
AliEn C API
– Some function declarations
// Connect to ALIEN server. Return handle to ALIEN instance, 0 in case of failure.
Alien_t AlienConnect(const char *alien_server, const char *user, const char *passwd);
// Close connection to ALIEN server. Returns -1 in case of error.
int AlienClose(Alien_t srv);
// Return ALIEN version string.
const char *AlienGetInfo(Alien_t srv);
// Add physical file to catalog and associate logical file name. Returns -1 on error, like
// lfn,pfn already exists, illegal handle, etc.
int AlienAddFile(Alien_t srv, const char *lfn, const char *pfn);
// Delete lfn and associated pfn's. Returns -1 on error, like illegal handle, lfn not existing, no
// perm, etc.
int AlienDeleteFile(Alien_t srv, const char *lfn);
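As a usage illustration, a minimal sketch of a client calling the proposed functions; the server name, user name and file names are invented for the example, and the header name AlienAPI.h is an assumption:

#include <stdio.h>
#include "AlienAPI.h"  /* assumed header providing the declarations above */

int main(void)
{
    /* Connect to the AliEn server; AlienConnect returns 0 on failure. */
    Alien_t srv = AlienConnect("alien.example.org", "someuser", "secret");
    if (srv == 0) {
        fprintf(stderr, "Connection to AliEn server failed\n");
        return 1;
    }

    /* Report the server version string. */
    printf("Connected: %s\n", AlienGetInfo(srv));

    /* Register a physical file in the catalogue under a logical file name. */
    if (AlienAddFile(srv, "/alice/someuser/run001.root",
                     "file:/data/run001.root") == -1)
        fprintf(stderr, "Could not register file\n");

    /* Close the connection; AlienClose returns -1 on error. */
    if (AlienClose(srv) == -1)
        fprintf(stderr, "Error closing connection\n");

    return 0;
}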
EDG GUI for Job Submission
• (screenshots of the EDG GUI for job submission; see presentation of D.Colling)
GENIUS
• GENIUS general characteristics:
– Under development by NICE s.r.l. (Italy) and INFN
– Uses EDG middleware, the Globus toolkit and the EnginFrame framework of NICE s.r.l.
– Based on Java and XML, which is translated by EnginFrame into HTML, WML, PDF and enriched XML
– Unix/NT integration makes extensive use of the available Internet standards (HTML, HTTP, Java, XML, etc.)
– User must obtain an account on an interface machine where GENIUS is installed and upload their Globus certificate
– Testbed access is provided via web page from anywhere (desktop, laptop, PDA, WAP telephone, etc)
GENIUS
• GENIUS modules:
– Service: XML representations of computing-related facilities
– Client Tier: any browser and its extensions, the layer with which users interact
– Server Tier: one or more servlet-enabled web servers, providing contents and services to the clients, and controlling resource activities in the back-end
– Resource Tier: where a number of "Agents" control the actual computing resources (clusters, stand-alone hosts, etc) and provide correctly formatted results to the servers
– Plug-ins developed for the Resource Tier: LSF, AFS, Nfuse, Globus and DataGrid
GENIUS
• GENIUS modules: the EnginFrame work-flow (diagram)
GENIUS
• GENIUS architecture (from R.Barbera): the user's Web browser on the local workstation connects via https + java/xml + rfb to GENIUS (EnginFrame running under an Apache server) on an EDG User Interface machine, which reaches the Grid via EDG + GSI
• GENIUS is built on top of the already existing DataGrid command-line interface
GENIUS functionality (I)
• GENIUS services:
– File Services
– Security Services
– Job Services
– Information Services
– Monitoring Services
– Interactive Services (Virtual Network Computing package)
– VO services
– Statistics
GENIUS functionality (II)
– File Services:
  • Create a File
  • View a File
  • Edit a File
  • Rename a File/Directory
  • Delete a File/Directory
  • Create a Directory
  • Upload a File
  • Show the Environment
GENIUS functionality (III)
– Security Services:
  • Upload Your Certificate
    – Upload .globus Tar ball
    – Upload Your .p12 Certificate
  • Information on proxy
  • Renew proxy
  • Change GENIUS Password
  • Change X.509 PEM phrase
  (the proxy operations map onto standard Globus commands; see the sketch after this list)
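The certificate and proxy actions above are web front-ends to standard Globus credential handling; as a rough guide to what happens underneath (a sketch assuming a standard Globus 2.x installation on the GENIUS interface machine):

grid-proxy-init      # create a proxy from the user's X.509 certificate (prompts for the PEM pass phrase)
grid-proxy-info      # show information on the current proxy (subject, time left)
grid-proxy-destroy   # remove the current proxy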
GENIUS functionality (IV)
– Job Services:
  • Single Job
    – Job Submission
      » The user has to provide the JDL file (a minimal JDL sketch follows this list)
      » Select one of the possible Computing Elements
      » Press the button “Submit job”
    – Job Queue
      » Job identifier, JDL file, time, Computing Element, present status, possible action
    – Job Output (the user has to press the button “Get Output”)
    – Job Data (the user can inspect personal spooler area)
    – Clean Job Queues
  • List Available Resources
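For orientation, a minimal sketch of the kind of JDL file a user might provide; these are standard EDG JDL attributes, but the executable and file names are invented for this example:

Executable    = "myjob.sh";
Arguments     = "500";
StdOutput     = "myjob.out";
StdError      = "myjob.err";
InputSandbox  = {"myjob.sh", "jobOptions.txt"};
OutputSandbox = {"myjob.out", "myjob.err"};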
GENIUS functionality (V)
– Job Services:
  • Job Submission (screenshot)
GENIUS functionality (VI)
– Job Services:
  • List Available Resources (screenshot)
GENIUS functionality (VII)
– Information Services:
  • Sites belonging to the testbed
  • Computing Elements present at each site, with the information on the local resource manager
  • Storage Elements present at each site, the connection port, the size and the mount point
Grappa
• General characteristics of Grappa:
– Under development in context of Grid Physics Network (GriPhyN) Project and ATLAS
– Prototype based on XCAT Science Portal
– Allows user to submit jobs to US-ATLAS testbed resources
– Provides file staging, remote job-option file editing, basic monitoring
– Provides a set of tools for collaborative data analysis
– Packaged with pacman
Grappa
• Grappa current architecture (from R.Gardner): Athena Notebook running in the XCAT Science Portal on a Tomcat Server
Grappa
• XCAT architecture (from S.Smallen): the user's Web browser talks to the Portal Web Server (Tomcat server + Java servlets), which uses GSI authentication, a Notebook Database and a Jython interpreter to access the Grid
• Jython gives access to Java classes:
  – Globus Java CoG kit
  – XCAT
  – XMESSAGES
Grappa functionality
• Provided via Athena Active Notebook
• Users can:
  – Submit Athena Jobs to the GRID
  – Manage resources
  – Submit a sequence of jobOptions files to the GRID
  – Monitor status of running jobs
Conclusions
– Framework-Grid interfaces will be of immediate use for
physics studies
– Various tools and services relevant to a
Framework-Grid interface are already available
– User-friendly (GUI-based) Grid interfaces are being
developed by several groups
This workshop should help us understand how to proceed with
development of a Framework-Grid interface for ATLAS
and LHCb