THE US NATIONAL VIRTUAL OBSERVATORY
Resource Registries for the
Virtual Observatory
R. Plante (NCSA), G. Greene (STScI), R. Hanisch (STScI),
T. McGlynn (NASA/GSFC), W. O’Mullane (JHU),
R. Williams (Caltech), R. Williamson (NCSA)
ADASS 2003 – Strasbourg
14 October 2003
The role of Resource Registries
• Used to discover and locate resources—data and
services—that can be used in a VO application
• Resource: anything that is describable and identifiable.
– Besides data and services: organizations, projects,
software, …
– Presently concerned with a simple set of resource types
• Registry: a list of resource descriptions
– Expressed as structured metadata
to enable automated processing and searching
Selected Requirements
• Allow user to select resources that are likely to pertain to a
scientific question
• Select resources based on characteristics…
– Type of resource: catalogs, image archives, EPO, services
– Coverage in space, time, and frequency
– Where data comes from, who curates it
• Dynamic: resources will come and go
• Distributed: Should not depend on a single point of failure
or single view of the VO.
• Preserve the data providers’ control over their data
– Curators control what gets registered, content, updates
– Allow integration with existing resource management
• Allow extension to new types of resources
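The selection requirements above can be illustrated with a minimal sketch. The `ResourceEntry` fields, the circular coverage region, and the example identifiers are assumptions made for illustration only; they are not taken from the VO resource metadata standard.

```python
import math
from dataclasses import dataclass


@dataclass
class ResourceEntry:
    """Hypothetical, simplified resource description (not the VOResource schema)."""
    identifier: str
    resource_type: str   # e.g. "catalog", "image archive", "service"
    curator: str
    ra_deg: float        # centre of an assumed circular sky-coverage region
    dec_deg: float
    radius_deg: float


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))


def select(entries, resource_type, ra_deg, dec_deg):
    """Return entries of the given type whose coverage contains the position."""
    return [
        e for e in entries
        if e.resource_type == resource_type
        and angular_separation_deg(e.ra_deg, e.dec_deg, ra_deg, dec_deg) <= e.radius_deg
    ]


ENTRIES = [
    ResourceEntry("example.org/cat-north", "catalog", "Example Data Center", 180.0, 60.0, 30.0),
    ResourceEntry("example.org/cat-south", "catalog", "Example Data Center", 180.0, -60.0, 30.0),
    ResourceEntry("example.org/images", "image archive", "Example Archive", 180.0, 60.0, 30.0),
]

hits = select(ENTRIES, "catalog", 185.0, 55.0)
print([e.identifier for e in hits])
```

Coverage in time and frequency could be filtered the same way with interval checks.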
IVOA Registry Working Group (RWG)
IVOA = International Virtual Observatory Alliance
• Common, global approach to registries
• Work packages:
– Science requirements and use cases
– Resource metadata
– Registry interfaces
– Prototyping
• Distributed model for registries
Registry Model
[Figure, built up over four slides: Full Searchable Registries (VO Projects), Local Publishing Registries (Data Centers), and a Local Searchable Registry (Specialized Portals & Services). Arrows added in successive builds are labeled "harvest (pull)", "replicate", and "selective harvesting".]
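The three relationships named in the registry model (harvest, replicate, selective harvesting) can be sketched in memory as follows. The class names, method names, and record fields are hypothetical, and the directions chosen for "replicate" and "selective harvesting" are an assumption, since the diagram's arrows are not recoverable from the text.

```python
class PublishingRegistry:
    """Holds the descriptions a data provider has chosen to register."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # identifier -> description dict

    def publish(self, identifier, description):
        self.records[identifier] = dict(description)


class SearchableRegistry:
    """Collects records by pulling them from other registries."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def harvest(self, source, keep=lambda description: True):
        """Pull records from `source`; `keep` filters for selective harvesting."""
        for identifier, description in source.records.items():
            if keep(description):
                self.records[identifier] = dict(description)

    def search(self, **criteria):
        """Return sorted identifiers whose descriptions match all criteria."""
        return sorted(
            identifier for identifier, description in self.records.items()
            if all(description.get(k) == v for k, v in criteria.items())
        )


# Two publishing registries run by data centers (placeholder names).
pub_a = PublishingRegistry("publisher-a")
pub_a.publish("a/cone", {"type": "ConeSearch"})
pub_a.publish("a/sia", {"type": "SIA"})
pub_b = PublishingRegistry("publisher-b")
pub_b.publish("b/cone", {"type": "ConeSearch"})

# A full searchable registry harvests (pulls) from every publisher.
full_1 = SearchableRegistry("full-1")
full_1.harvest(pub_a)
full_1.harvest(pub_b)

# A second full registry replicates the first.
full_2 = SearchableRegistry("full-2")
full_2.harvest(full_1)

# A local searchable registry harvests only the records it cares about.
local = SearchableRegistry("local")
local.harvest(full_1, keep=lambda d: d["type"] == "SIA")
```

Because every transfer is a pull, the publishing registries keep control of what they expose, and no single searchable registry has to exist for the others to work.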
Registry Model
[Figure: Full Searchable Registries (VO Projects), Local Publishing Registries (Data Centers), and a Local Searchable Registry (Specialized Portals & Services), with Client Applications and arrows labeled "search queries".]
NVO Prototype Registry
• To support a Data Inventory Service (DIS)
What is known about a position in the sky?
– Use a registry to locate and query standard services:
• Cone Search Services: querying catalogs
• Simple Image Access Services:
querying image archives and cutout services
(see P3.18, McGlynn et al.; http://heasarc.gsfc.nasa.gov/vo/data-inventory.html)
• Components
– Publishing Registries
– Searchable Registry
– Resource Metadata
– Harvesting Protocol
– Populated with service descriptions
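As a rough illustration of how a Data Inventory Service might turn registry results into queries for one sky position, the sketch below builds request URLs. It assumes the commonly documented parameter names (`RA`, `DEC`, `SR` for Cone Search; `POS`, `SIZE` for Simple Image Access); the base URLs and the `inventory_queries` helper are hypothetical, and this is not the DIS implementation.

```python
from urllib.parse import urlencode


def _with_params(base_url, params):
    """Append query parameters to a base URL that may already contain some."""
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    # Keep commas literal so POS reads as "ra,dec".
    return base_url + separator + urlencode(params, safe=",")


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Cone Search query: position and search radius in decimal degrees."""
    return _with_params(base_url, {"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})


def sia_url(base_url, ra_deg, dec_deg, size_deg):
    """Simple Image Access query: POS is 'ra,dec', SIZE is the region size."""
    return _with_params(base_url, {"POS": f"{ra_deg},{dec_deg}", "SIZE": size_deg})


def inventory_queries(services, ra_deg, dec_deg, radius_deg):
    """Build one query URL per registered service, given (kind, base_url) pairs."""
    builders = {"cone": cone_search_url, "sia": sia_url}
    return [builders[kind](url, ra_deg, dec_deg, radius_deg) for kind, url in services]


# Hypothetical services, as they might come back from a registry search.
SERVICES = [
    ("cone", "http://example.org/cone?"),
    ("sia", "http://example.org/sia?survey=x"),
]

for query in inventory_queries(SERVICES, 180.0, 60.0, 0.1):
    print(query)
```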
Resource Metadata
• Under development within the IVOA RWG
• The standard comes in two parts:
– Prose document that defines concepts
independent of an encoding scheme
see P3.5, Hanisch et al. “Resource Metadata for the VO”
– XML Schemas
• Draws on Dublin Core metadata
– An interdisciplinary standard for core resource
metadata http://dublincore.org
• Schema to stabilize this month
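To show what Dublin Core core metadata looks like in XML, here is a minimal sketch that builds an `oai_dc` record with Python's standard library. The two namespace URIs are the published Dublin Core and OAI ones; the input dictionary and the choice of elements are illustrative assumptions, not the mapping defined by the IVOA documents.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# A subset of the fifteen Dublin Core elements, chosen for illustration.
DC_ELEMENTS = ("title", "creator", "publisher", "subject",
               "description", "identifier", "type")


def to_dublin_core(description):
    """Build an oai_dc element from a dict of {element name: [values]}."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name in DC_ELEMENTS:
        # Dublin Core elements are repeatable, so each value gets its own element.
        for value in description.get(name, []):
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


record = to_dublin_core({
    "title": ["Example Sky Survey"],
    "publisher": ["Example Data Center"],
    "subject": ["galaxies", "quasars"],
})
print(ET.tostring(record, encoding="unicode"))
```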
Resource Metadata: XML Schema
• Classes of Resources
Organisation, DataCollection, Service, Registry
– Specific classes inherit from generic <Resource>
• Organized into separate schemas:
– Core resource metadata: VOResource
– Various extension schemas containing specific types
• Capable of describing…
– Data centers, research organizations, missions,
observatories
– Data collections, archives
– VO standard services: Cone Search, Simple Image Access
– Existing Browser/CGI-based services
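The idea of specific resource classes inheriting from a generic Resource can be mirrored in a small class hierarchy. This is only a sketch: the element names (`identifier`, `title`, `accessURL`, `maxSearchRadius`) and the `ConeSearch` class are placeholders and should not be read as the actual VOResource schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Resource:
    """Generic metadata shared by every kind of resource."""
    identifier: str
    title: str

    def to_xml(self):
        # The element is named after the most specific class.
        element = ET.Element(type(self).__name__)
        ET.SubElement(element, "identifier").text = self.identifier
        ET.SubElement(element, "title").text = self.title
        return element


@dataclass
class Service(Resource):
    """A resource that can be invoked at a URL."""
    access_url: str = ""

    def to_xml(self):
        element = super().to_xml()
        ET.SubElement(element, "accessURL").text = self.access_url
        return element


@dataclass
class ConeSearch(Service):
    """An extension type that adds service-specific metadata."""
    max_search_radius_deg: float = 0.0

    def to_xml(self):
        element = super().to_xml()
        ET.SubElement(element, "maxSearchRadius").text = str(self.max_search_radius_deg)
        return element


cone = ConeSearch("example.org/cone", "Example Cone Search",
                  "http://example.org/cone?", 10.0)
print(ET.tostring(cone.to_xml(), encoding="unicode"))
```

Software that understands only the generic core can still read the identifier and title of any extended type.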
Publishing Registries:
getting information into registries
• Two publishing registries established at Caltech and NCSA.
• Motivation:
– Register Simple Image Access Services
– Develop techniques for easy registration
• Resource descriptions stored as XML documents using VOResource schema
See P3.22, Williamson & Plante
Harvesting Interface
• Adopted Open Archives Initiative (OAI) Protocol for
Metadata Harvesting
– HTTP/CGI-based protocol for exposing metadata to
harvesters (e.g. searchable registries)
• Advantages:
– Existing, field-tested design we didn’t have to re-invent
– Fairly easy to implement
– Existing tools for emitting and harvesting metadata
– Exposes our metadata to larger digital library community
The Caltech & NCSA registries use an existing tool that implements the OAI
interface. To use it, we had to:
• Store XML descriptions in files in a directory structure
• Provide an XSL style sheet to convert to Dublin Core XML
See P3.22, Williamson & Plante
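For readers unfamiliar with the OAI protocol, the harvester's side of a `ListRecords` exchange can be sketched as below. The verb, argument names, and response namespace follow OAI-PMH 2.0; the base URL, the sample response, and the helper function names are invented for illustration and do not describe the tool used at Caltech and NCSA.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # A resumption token is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        e.text for e in
        root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=OAI_NS)
    return identifiers, (token or None)


# Invented sample of what a publishing registry might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:cone-1</identifier>
      <datestamp>2003-10-01</datestamp></header></record>
    <record><header><identifier>oai:example.org:sia-1</identifier>
      <datestamp>2003-10-02</datestamp></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_list_records(SAMPLE_RESPONSE)
print(identifiers, token)
print(list_records_url("http://example.org/oai", resumption_token=token))
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned token until none is supplied.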
Models for Registering Resources
• Curator uses another site’s registry
– Good for a few resources whose descriptions are fairly
static
e.g. @NCSA: http://nvo.ncsa.uiuc.edu/nvoregistration.html
• VORegistry-in-a-box:
– Deployable package that allows a data provider to run
its own registry “out of the box”
See P3.22, Williamson & Plante;
http://nvo.ncsa.uiuc.edu/VO/software
– Good for larger number of resources that might be
updated often
• Curator builds own OAI interface
– Good for very large number of resources
– Automate XML generation using site’s existing
information management tools
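For the third model, automating XML generation from a site's existing holdings might look like the following sketch, which writes one description file per resource into a directory. The row fields, element names, and file-naming rule are assumptions made for illustration, not the layout required by any particular OAI tool.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_descriptions(rows, directory):
    """Write one XML description file per row; return the paths written."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for row in rows:
        root = ET.Element("Resource")  # hypothetical element names throughout
        for key in ("identifier", "title", "accessURL"):
            ET.SubElement(root, key).text = row[key]
        # Derive a file name from the identifier, replacing path separators.
        path = directory / (row["identifier"].replace("/", "_") + ".xml")
        path.write_text(ET.tostring(root, encoding="unicode"), encoding="utf-8")
        paths.append(path)
    return paths


# Rows as they might come from a site's own database (invented values).
ROWS = [
    {"identifier": "example.org/cone", "title": "Example Cone Search",
     "accessURL": "http://example.org/cone?"},
    {"identifier": "example.org/sia", "title": "Example Image Service",
     "accessURL": "http://example.org/sia?"},
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_descriptions(ROWS, tmp)
    print([p.name for p in written])
```

Rerunning the generator whenever the holdings change keeps the published descriptions in step with the curator's own records.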
Searchable Registry
• Searchable Registry was set up at JHU/STScI
see P3.8 Greene et al., “Searchable Registry for the NVO”
http://skyserver.pha.jhu.edu/devel/registry
• OAI harvester collects resource descriptions
– from Publishing Registries at Caltech & NCSA
– Loads data into relational database
• SOAP Web Service interface
http://skyserver.pha.jhu.edu/devel/registry/registry.asmx
– Searching
• Currently provides specialized querying useful for DIS
– Re-harvest request
• To get updated records from publishing registries
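Loading harvested descriptions into a relational database and querying them by service type can be sketched with SQLite as a stand-in. The table layout, column names, and function names are assumptions; the actual JHU/STScI database and its SOAP interface are not described here.

```python
import sqlite3


def open_registry():
    """Create an in-memory database with a single resources table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        """CREATE TABLE resources (
               identifier   TEXT PRIMARY KEY,
               title        TEXT,
               service_type TEXT,
               access_url   TEXT,
               source       TEXT
           )"""
    )
    return connection


def load(connection, source, records):
    """Insert or refresh harvested records; a re-harvest overwrites old copies."""
    connection.executemany(
        "INSERT OR REPLACE INTO resources VALUES (?, ?, ?, ?, ?)",
        [(r["identifier"], r["title"], r["service_type"], r["access_url"], source)
         for r in records],
    )
    connection.commit()


def find_services(connection, service_type):
    """Return (identifier, access_url) pairs for one type of service."""
    rows = connection.execute(
        "SELECT identifier, access_url FROM resources "
        "WHERE service_type = ? ORDER BY identifier",
        (service_type,),
    )
    return rows.fetchall()


registry = open_registry()
load(registry, "publisher-a", [
    {"identifier": "a/cone", "title": "Catalog A", "service_type": "ConeSearch",
     "access_url": "http://example.org/a/cone?"},
    {"identifier": "a/sia", "title": "Images A", "service_type": "SIA",
     "access_url": "http://example.org/a/sia?"},
])
# Re-harvest: the publisher has moved its cone search service.
load(registry, "publisher-a", [
    {"identifier": "a/cone", "title": "Catalog A", "service_type": "ConeSearch",
     "access_url": "http://example.org/a/cone-v2?"},
])
print(find_services(registry, "ConeSearch"))
```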
Registry Model
[Figure: the NVO prototype. The Full Searchable Registry at JHU/STScI harvests (pull) from the Local Publishing Registries at Caltech and NCSA; the Data Inventory Service (DIS) searches it for services.]
Registry Model
[Figure: the NVO prototype with data providers added. The Full Searchable Registry at JHU/STScI harvests (pull) from the Local Publishing Registries at Caltech and NCSA; the Data Inventory Service (DIS) searches it for services; Data Providers offer Cone Search and Simple Image Access services.]
Summary
• We built a working prototype registry system to
support an end-user VO service
– Distributed Publishing and Searchable components
– Encoded descriptions using emerging VO XML standard
schemas
– OAI Harvesting Standard deployed easily
– Used to discover Cone Search and SIA services
• What’s next: Interoperable registries IVOA-wide
– Stabilize XML metadata standard
– Standardize registry interfaces