Section: Local Data Management
Data Formats: Choosing and Adopting Community Accepted Standards
Curt Tilmes, NASA
Version 1.0, February 2013
Copyright 2013 Curt Tilmes

Local Data Management - Data Formats: Choosing and Adopting Community Accepted Standards; Version 1.0, February 2013

Overview
• Some guidelines for choosing and adopting community accepted standards

Background
• Most projects (rightly so) focus on the content of their data files, but you need to consider the format as well.
• Since you captured or created the data, and stored them in your own files, you know
  • how the data are organized,
  • how to read them,
  • how to use them,
  • characteristics of the data that could constrain their use.
• The goal of a good data format is to make it easier for others to read the data too.
• Many hours have gone into developing standards for formats – try to learn from them.

Why use community standards?
• If you try to develop your data format from scratch, you will forget something.
• Build on the experience and improvements built into the community standards over years of use.
• Many tools and analysis packages natively support reading data in community standard formats.
• Reduce development effort and support reuse.
• Positive feedback – they are more likely to be adopted by others.

Why use community standards?
• http://xkcd.com/927/

A few guidelines
• Consider your archive:
  • Do they have any recommendations?
• Consider your users:
  • Who wants these data? Why do they want them?
  • What do they want to do with them?
  • Will they be using your data in concert with other data?
• Consider heritage:
  • What worked well for similar data in the past?
  • What could be done better for newly created data?
• Consider tools:
  • Try to use data formats supported by the software you intend to use the data with.

Some examples
• HDF – Hierarchical Data Format
  • HDF4 and HDF5 versions are in use today.
  • A NASA variant called HDF-EOS is used within the Earth Observing System program.
  • The Aura project developed a common approach across their instruments and released guidelines as a Technical Note.
• NetCDF – Network Common Data Form
  • Widely used by agencies including NASA and NOAA.
  • The Climate and Forecast (CF) metadata conventions help standardize how items such as variable names, units, and coordinates are described in NetCDF files.

Adopting standards
• The standard gives you a starting point, not a complete solution.
• Communicate early with a broad range of data users: archivists, software engineers, scientists.
• Consider how you will be writing the data and how you will be reading the data.
• Get feedback before making final decisions.
• Start sharing sample data in the proposed format to nail down specifics and work out ambiguities.
• Document your use and application of the standard completely.
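
Sharing sample data early is easier when a few basic conventions can be checked automatically before the file goes out. The sketch below is only an illustration under stated assumptions: the function name `check_cf_basics` and the plain-dictionary layout are hypothetical, and it looks at just three CF-style attributes (`Conventions`, `units`, and `standard_name` or `long_name`). It is not a full CF compliance check and does not read real NetCDF or HDF files.

```python
# Minimal sketch: check a description of a sample file for a few CF-style attributes.
# The dict layout and the function name are hypothetical, chosen for illustration.
# Only a handful of attributes are checked, so this is not a full compliance test.

def check_cf_basics(global_attrs, variables):
    """Return a list of human-readable problems found in the sample metadata."""
    problems = []

    # CF files identify the convention version in a global 'Conventions' attribute.
    if "CF-" not in global_attrs.get("Conventions", ""):
        problems.append("global attribute 'Conventions' should name a CF version")

    for name, attrs in variables.items():
        # Units let readers and tools interpret the numbers
        # (assumption: every variable here holds a dimensional quantity).
        if "units" not in attrs:
            problems.append(f"variable '{name}' has no 'units' attribute")
        # At least one descriptive name per variable helps other users.
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(
                f"variable '{name}' has neither 'standard_name' nor 'long_name'"
            )
    return problems


# Hypothetical sample metadata, as it might be copied from a draft file.
sample_globals = {"Conventions": "CF-1.6", "title": "Sample ozone profiles"}
sample_variables = {
    "time": {"units": "days since 2013-01-01 00:00:00", "standard_name": "time"},
    "ozone": {"long_name": "ozone mixing ratio"},  # 'units' deliberately missing
}

for problem in check_cf_basics(sample_globals, sample_variables):
    print(problem)
```

A list of problems like this can be circulated with the sample file, so that archivists, software engineers, and scientists can comment on specifics before the format is finalized.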

Resources
• HDF: http://www.hdfgroup.org
• HDF-EOS: http://hdfeos.org
• HDF-EOS Aura File Format Guidelines:
  • http://disc.sci.gsfc.nasa.gov/Aura/additional/documentation/HDFEOS_Aura_File_Format_Guidelines.pdf
  • http://www.esdswg.org/spg/spgfolder/events/esdswg-meeting-october25-27-2005/auraasabestpracticerev2.pdf
• NetCDF: http://www.unidata.ucar.edu/software/netcdf
• CF: http://cf-pcmdi.llnl.gov/

Other Relevant Modules
• Local Data Management – Data Formats: Using Self-describing Data Formats
  • Learn more about the advantages of using formats for your data that have important metadata and other information embedded within them.

Recommended Citations
Tilmes, C. 2013. “Local Data Management – Data Formats: Choosing and Adopting Community Accepted Standards.” In Data Management for Scientists Short Course, edited by Ruth Duerr and Nancy J. Hoebelheinrich, Federation of Earth Science Information Partners: ESIP Commons. doi:10.7269/P33N21B6

Copyright 2013 Curt Tilmes.