The Astrolabe Project: Identifying and Curating
Astronomical ‘Dark Data’ through
Development of Cyberinfrastructure Resources
Gretchen Stahlman, PhD Candidate
University of Arizona School of Information
Library and Information Services in Astronomy (LISA) VIII, June 7, 2017
Astrolabe
• Astrolabe is a new data repository and computational
environment being created at University of Arizona
(UA).
• Partners include:
– UA School of Information
– UA Department of Astronomy and Steward Observatory
– UA University Libraries
– CyVerse (formerly the iPlant Collaborative)
– American Astronomical Society
• Astrolabe has been funded by:
– UA Office for Research & Discovery (now RDI)
– National Science Foundation ACI
The “Astrolabe” Project
• Astrolabe has a mission to:
– Collect, preserve, disseminate
– Provide tools for analysis and data sharing
– Expose research data
Archive Management
Image credit: Digital Curation Centre, www.dcc.ac.uk
Image credit: http://data-archive.ac.uk/create-manage/life-cycle
Lifecycle Preservation and Access
• Curation “is the active management and
appraisal of digital information over its entire
life cycle” (Pennock, 2007).
• Curation requires insightful knowledge of data
and communities.
• Resources must be developed to support
publication of (and links between) research
AND data.
“Dark Data” in the Long Tail
• Large projects have well-planned data stores,
while large amounts of data remain uncurated
(Heidorn, 2008).
• “Like dark matter, this dark data on the basis
of volume may be more important than that
which can be easily seen” (p. 281).
• Long Tail data require institutions, practices
and policies to make these data useful to
researchers.
Long Tail Distribution in Astronomy
[Figure: density distribution of awards; x-axis: Awards (0 to 1,000,000), y-axis: Density (0 to 3.0e-06); plot label: The “Top 20%”]
Curating the Long Tail with Astrolabe
General Properties of “Dark Data”
• “Dark Data” are typically:
– Heterogeneous
– Generated through unique procedures
– Curated by individual scientists
– Not maintained
– Obscured or protected
– Seldom reused
– Currently unnoticed
Astronomical Data
• Common Data Types
– Sky images
– Light curves
– Spectroscopy
– Catalogs
• Common Data Format
– FITS (Flexible Image Transport System)
• Culture of Open Access
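To illustrate why FITS suits long-term curation, here is a minimal sketch of reading a FITS primary header using only the Python standard library. It relies on two properties of the format: header records ("cards") are 80 ASCII characters long, and headers are padded to multiples of 2880 bytes. The helper names (make_card, build_header, parse_header) and the sample keyword values are hypothetical. The parser is deliberately simplified: it keeps values as strings and would mishandle quoted string values that contain "/". A production workflow would more likely use an established FITS library.

```python
CARD_LEN = 80      # each FITS header card is 80 ASCII characters
BLOCK_LEN = 2880   # headers are padded to a multiple of 2880 bytes


def make_card(keyword, value):
    """Format one simplified 'KEYWORD = value' card (hypothetical helper)."""
    return f"{keyword:<8}= {value:>20}".ljust(CARD_LEN)


def build_header(cards):
    """Join cards, append the END card, and pad with spaces to a full block."""
    text = "".join(cards) + "END".ljust(CARD_LEN)
    padding = (-len(text)) % BLOCK_LEN
    return (text + " " * padding).encode("ascii")


def parse_header(raw):
    """Return a dict of keyword -> value string for cards before END."""
    header = {}
    for start in range(0, len(raw), CARD_LEN):
        card = raw[start:start + CARD_LEN].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # skip COMMENT, HISTORY, and blank cards
        # Drop any trailing "/ comment" part (simplification, see lead-in).
        header[keyword] = card[10:].split("/", 1)[0].strip()
    return header


# Build a small in-memory header for a hypothetical 512 x 512 image.
raw = build_header([
    make_card("SIMPLE", "T"),
    make_card("BITPIX", 16),
    make_card("NAXIS", 2),
    make_card("NAXIS1", 512),
    make_card("NAXIS2", 512),
])
header = parse_header(raw)
print(header)
```

Because the metadata travels inside the file as plain ASCII, a curator can recover image dimensions and other keywords even without the software that originally wrote the file.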
American Astronomical Society (AAS)
• Key professional society for astronomers in
the US
• Hosts two major conferences each year
• Non-profit organization
• Publishes four major journals
– The Astronomical Journal (AJ)
– The Astrophysical Journal (ApJ)
– The Astrophysical Journal Letters (ApJL)
– The Astrophysical Journal Supplement Series (ApJS)
CyVerse.org
• Discovery Environment
– Use hundreds of Apps and manage data in a simple
web interface
• Atmosphere
– Create a custom cloud-based scientific analysis platform or use
a ready-made one for your area of scientific interest
• Data Store
– Store, manage, access, and share all the data related
to research
CyVerse Services
Astrolabe Organizational Model
July 2015 Workshop Outcomes
• Identify mission and clear science use cases
• Take advantage of CyVerse cyberinfrastructure
and longevity of University of Arizona
• Obtain community buy-in and manage
expectations
• Focus on “low-hanging fruit” such as data not
curated elsewhere and data behind figures in
journals
• Develop a follow-on workshop for additional
feedback
July 2016 Workshop Outcomes
• Physical format of dark data (e.g. historical
data stored on tapes)
• Author websites archiving data (not typically
long-lived)
• LSST time domain and serendipitous data
cases (follow-up to LSST observations and
discovery through historical data)
• Searching the literature for references to dark
data (for indicative text, broken links, etc.)
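The last outcome above, searching article text for hints of dark data, could be sketched roughly as follows. This is an illustrative assumption about how such a scan might work, not the project's actual method: the phrase list, the URL pattern, and the function name find_dark_data_hints are all hypothetical. The sketch only extracts candidate links; testing whether a link is actually broken would require a separate network check.

```python
import re

# Hypothetical phrases that may signal data held privately by authors.
PHRASES = [
    "available upon request",
    "available on request",
    "available from the author",
    "data not shown",
]
PHRASE_PATTERN = re.compile(
    "|".join(re.escape(p) for p in PHRASES), re.IGNORECASE
)

# Simple URL matcher; stops at whitespace and common closing punctuation.
URL_PATTERN = re.compile(r"https?://[^\s)\]>,;]+")


def find_dark_data_hints(text):
    """Return candidate URLs and indicative phrases found in the text."""
    urls = [u.rstrip(".") for u in URL_PATTERN.findall(text)]
    phrases = [m.group(0).lower() for m in PHRASE_PATTERN.finditer(text)]
    return {"urls": urls, "phrases": phrases}


sample = (
    "The reduced spectra are available upon request. "
    "Light curves were posted at http://example.edu/~author/lightcurves."
)
hints = find_dark_data_hints(sample)
print(hints)
```

Personal author pages like the one in the sample text are the kind of link the workshop flagged as not typically long-lived, so extracted URLs would be natural candidates for follow-up.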
Astrolabe Timeline
• 2013 - AAS Strategy Meeting
• 2015 - Workshop #1 in Tucson funded by UA Start for Success seed grant
• 2016
– UA Accelerate for Success awarded for one-year pilot – collaborators
include iSchool, Steward, UA Libraries, CyVerse, AAS
– Changed name from Arizona Astronomical Data Hub (AADH) to
Astrolabe to focus beyond AZ
– Established a Board of Directors
– Workshop #2 focusing on specifying requirements for Astrolabe
system
– NSF ACI Grant awarded to develop WorldWide Telescope as Astrolabe
front end and visualization tool, with the idea that this could scale to
other repositories
WorldWide Telescope (WWT)
A screenshot from WWT HTML5 web client – worldwidetelescope.org
Current Status of Astrolabe:
2017 Activities and Objectives
• Searching for uncurated or “at-risk” data by mining the literature,
and by contacting authors individually based on our team’s review
of particular types of publications
• Recently contracted a developer to accomplish the objectives specified
in the recently awarded NSF grant (award #1642446); additional
developers will be hired
• Working on funding proposals for system development, including
project to create protocols for migrating data from obsolete media
into Astrolabe
• Collaborating with CyVerse to develop and optimize interfaces,
apps, metadata templates and indexing, cone search and VO
• Installed Montage for conversion of FITS to JPEG to TOAST
• Designing website as interface to CyVerse data store to facilitate
data deposition and reuse
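The cone search mentioned above selects every catalog entry within a given angular radius of a sky position. Below is a minimal sketch of the underlying geometry, assuming a small in-memory catalog with right ascension and declination in degrees. The catalog rows and function names are hypothetical, and this is a local filter only. It does not implement the Virtual Observatory cone search service protocol or any CyVerse interface.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(catalog, ra, dec, radius_deg):
    """Return catalog rows within radius_deg of the position (ra, dec)."""
    return [
        row for row in catalog
        if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg
    ]


# Hypothetical catalog entries (coordinates in degrees).
catalog = [
    {"name": "A", "ra": 10.00, "dec": 20.00},
    {"name": "B", "ra": 10.05, "dec": 20.02},
    {"name": "C", "ra": 190.00, "dec": -20.00},
]
matches = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
print([row["name"] for row in matches])
```

A linear scan like this is adequate for small deposits. A repository-scale service would typically rely on a spatial index, which is presumably part of the indexing work listed above.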
Our Team
• Principal Investigators
– Bryan Heidorn, PhD, UA School of Information
– Dennis Zaritsky, PhD, UA Department of Astronomy
• AAS Affiliate
– Julie Steffen, AAS Director of Publishing
• WWT Developer
– Jonathan Fay, AAS Contractor and Microsoft Software Engineer
• Postdoctoral Researcher
– Huanian Zhang, PhD, UA Department of Astronomy
• Graduate Research Associate
– Gretchen Stahlman, UA School of Information
• Astrolabe Advisory Board Members
– Robert Hanisch (NIST)
– Chris Lintott (Oxford/AAS)
– Barbara Kern (U of Chicago)
– Julie Steffen (AAS)
– Frank Timmes (AZ State/AAS)
– Benjamin Weiner (Steward/UA)
– Edwin Henneken (ADS)
– Henry “Trae” Winter, Astrolabe Advisory Board Chair (CfA)
Thank you!
http://astrolabe.arizona.edu
This material is based upon work supported by the National Science Foundation under Grant No. 1642446.