The Astrolabe Project: Identifying and Curating Astronomical ‘Dark Data’ through Development of Cyberinfrastructure Resources
Gretchen Stahlman, PhD Candidate
University of Arizona School of Information
Library and Information Services in Astronomy (LISA) VIII, June 7, 2017

Astrolabe
• Astrolabe is a new data repository and computational environment being created at University of Arizona (UA).
• Partners include:
  – UA School of Information
  – UA Department of Astronomy and Steward Observatory
  – UA University Libraries
  – CyVerse (formerly the iPlant Collaborative)
  – American Astronomical Society
• Astrolabe has been funded by:
  – UA Office for Research & Discovery (now RDI)
  – National Science Foundation ACI

The “Astrolabe” Project
• Astrolabe has a mission to:
  – Collect, preserve, disseminate
  – Provide tools for analysis and data sharing
  – Expose research data

Archive Management
[Two figures. Image credit: Digital Curation Centre, www.dcc.ac.uk. Image credit: http://data-archive.ac.uk/create-manage/life-cycle]

Lifecycle Preservation and Access
• Curation “is the active management and appraisal of digital information over its entire life cycle” (Pennock, 2007).
• Curation requires insightful knowledge of data and communities.
• Resources must be developed to support publication of (and links between) research AND data.

“Dark Data” in the Long Tail
• Large projects have well-planned data stores, while large amounts of data remain uncurated (Heidorn, 2008).
• “Like dark matter, this dark data on the basis of volume may be more important than that which can be easily seen” (p. 281).
• Long Tail data require institutions, practices and policies to make these data useful to researchers.

Long Tail Distribution in Astronomy
[Figure: density plot (x-axis: Awards; y-axis: Density), annotated “The ‘Top 20%’” and “Curating the Long Tail with Astrolabe”]

General Properties of “Dark Data”
• “Dark Data” are typically:
  – Heterogeneous
  – Generated through unique procedures
  – Curated by individual scientists
  – Not maintained
  – Obscured or protected
  – Seldom reused
  – Currently unnoticed

Astronomical Data
• Common Data Types
  – Sky images
  – Light curves
  – Spectroscopy
  – Catalogs
• Common Data Format
  – FITS (Flexible Image Transport System)
• Culture of Open Access

American Astronomical Society (AAS)
• Key professional society for astronomers in the US
• Hosts two major conferences each year
• Non-profit organization
• Publishes four major journals
  – The Astronomical Journal (AJ)
  – The Astrophysical Journal (ApJ)
  – The Astrophysical Journal Letters (ApJL)
  – The Astrophysical Journal Supplement Series (ApJS)

CyVerse.org
• Discovery Environment
  – Use hundreds of Apps and manage data in a simple web interface
• Atmosphere
  – Create a custom cloud-based scientific analysis platform, or use a ready-made one for your area of scientific interest
• Data Store
  – Store, manage, access, and share all the data related to research

CyVerse Services
[Figure]

Astrolabe Organizational Model
[Figure]

July 2015 Workshop Outcomes
• Identify mission and clear science use cases
• Take advantage of CyVerse cyberinfrastructure and longevity of University of Arizona
• Obtain community buy-in and manage expectations
• Focus on “low-hanging fruit” such as data not curated elsewhere and data behind figures in journals
• Develop a follow-on workshop for additional feedback

July 2016 Workshop Outcomes
• Physical format of dark data (e.g. historical data stored on tapes)
• Author websites archiving data (not typically long-lived)
• LSST time domain and serendipitous data cases (follow-up to LSST observations and discovery through historical data)
• Searching the literature for references to dark data (for indicative text, broken links, etc.)

Astrolabe Timeline
• 2013 - AAS Strategy Meeting
• 2015 - Workshop #1 in Tucson, funded by UA Start for Success seed grant
• 2016
  – UA Accelerate for Success awarded for one-year pilot; collaborators include iSchool, Steward, UA Libraries, CyVerse, AAS
  – Changed name from Arizona Astronomical Data Hub (AADH) to Astrolabe to focus beyond AZ
  – Established a Board of Directors
  – Workshop #2, focusing on specifying requirements for the Astrolabe system
  – NSF ACI Grant awarded to develop WorldWide Telescope as Astrolabe front end and visualization tool, with the idea that this could scale to other repositories

WorldWide Telescope (WWT)
[Figure: a screenshot from the WWT HTML5 web client, worldwidetelescope.org]

Current Status of Astrolabe: 2017 Activities and Objectives
• Searching for uncurated or “at-risk” data by mining the literature, and by contacting authors individually based on our team’s review of particular types of publications
• Recently contracted a developer to accomplish the objectives specified in the recently awarded NSF grant (award #1642446); additional developers will be hired
• Working on funding proposals for system development, including a project to create protocols for migrating data from obsolete media into Astrolabe
• Collaborating with CyVerse to develop and optimize interfaces, apps, metadata templates and indexing, cone search and VO
• Installed Montage for conversion of FITS to JPEG to TOAST
• Designing a website as the interface to the CyVerse data store to facilitate data deposition and reuse

Our Team
• Principal Investigators
  – Bryan Heidorn, PhD, UA School of Information
  – Dennis Zaritsky, PhD, UA Department of Astronomy
• AAS Affiliate
  – Julie Steffen, AAS Director of Publishing
• WWT Developer
  – Jonathan Fay, AAS Contractor and Microsoft Software Engineer
• Postdoctoral Researcher
  – Huanian Zhang, PhD, UA Department of Astronomy
• Graduate Research Associate
  – Gretchen Stahlman, UA School of Information
• Astrolabe Advisory Board Members
  – Robert Hanisch (NIST)
  – Chris Lintott (Oxford/AAS)
  – Barbara Kern (U of Chicago)
  – Julie Steffen (AAS)
  – Frank Timmes (AZ State/AAS)
  – Benjamin Weiner (Steward/UA)
  – Edwin Henneken (ADS)
  – Henry “Trae” Winter, Astrolabe Advisory Board Chair (CfA)

Thank you!
http://astrolabe.arizona.edu
This material is based upon work supported by the National Science Foundation under Grant No. 1642446.