The last decimetre for linked data and geo-samples: when pragmatics ace semantics and syntax
Linking Environmental Data and Samples
May 30, 2017, Canberra, AUSTRALIA
Peter Fox; [email protected], @taswegian – Tetherless World Constellation, Rensselaer Polytechnic Institute
*also Woods Hole Oceanographic Institution

Samples and Data?

Samples - born digital?
• Nope
• Well, mostly not
  – A topic for a bar discussion (or panel)
• But, increasingly their metadata can be!
• Opportunity
  – Ineffective – lack of current technical capability, e.g. linked-data

Premise 1
• “we” remain comfortable for “commodity offerings to rule our science”
  – Without being part of the change

Premise 2
• Producers rather than consumers dictate what is captured and in what format
  – Approach what technical solutions are pursued from a “use” perspective
  – Where would we be if OGC had been really interested in samples?

Premise 3
• “we've” not hacked into the data/metadata generation pipelines to the extent that we must so that consumers (the pragmatists) can use and learn from extant samples and sample collections
  – Stay tuned for this part…

Caveat emptor
• Parts of what follows <do> exist …
• However, not as part of a coherent whole; neither architecture nor services
• Thanks to Bryan Broderic and Mark Gahegan for long-in-the-past discussions, but they cannot be held responsible for what comes next ;-)

Premise 1
• “we” remain comfortable for “commodity offerings to rule our science”
  – Without being part of the change

Metadata encoding(s)
• EXIF/IPTC for images
• Geo-TIFF for images
• IGSN for samples (XML)
• Collect more automatically of course…
• However, it is how you use it, structure it, and what “schemas” to adopt…

Schema.org/datasets
(Slide from Ruth Duerr)

NSIDC landing page
(Slide from Ruth Duerr)

What “Google” sees…
(Slide from Ruth Duerr)

Thus…please…
• A Samples and SampleCollection extension for schema.org
• Revisions to Dataset and DatasetCollection schema to add “sample” fields, e.g.
fromSample, isSampleType, …
• This would mean
  – IGSN metadata in RDF (RDFa)
  – Landing pages (Samples, Collections, ?)
  – And more …

Premise 2
• Producers rather than consumers dictate what is captured and in what format, so
  – Approach what technical solutions are pursued from a “use” perspective
  – Cf. schema.org, just presented; this has implications for which fields are in the schema and which are populated, e.g.
    • isdivisible
    • isavailable (vs. exists)
  – sx.igsn.org (cf. dx.crossref.org or …) with content negotiation (sample “citation” – maybe)
  – Add samples to your ORCID profile ;-)

Data-Information-Knowledge Ecosystem
[Figure labels: Producers, Consumers, Experience; Data (Creation, Gathering); Information (Presentation, Organization); Knowledge (Integration, Conversation); Inference; Context; Sample; Transduction]

Producers and Consumers
[Figure labels, paired by role]
• Producers: Quality Control, Fitness for Purpose, Trustee
• Consumers: Quality Assessment, Fitness for Use, Trustor

Semiotic model
[Figure]

Semiotics
• Also called semiotic studies or semiology, is the study of sign processes (semiosis), or signification and communication, signs and symbols
[Figure labels: Compute Entropy / Conditional Entropy; “Safety / navigability”; “Egg code”; “Thickness, age, etc. of ice”]

Premise 3
• “we've” not hacked into the data/metadata generation pipelines to the extent that we must so that consumers (the pragmatists) can use and learn from extant samples and sample collections
  – See Premise 1 and 2
  – Revisit Architecture(s) and Services
  – Conceptual and Logical models v.
schema

(Information) Architecture
• Definition:
  – “is the art of expressing a model or concept of information used in activities that require explicit details of complex systems” (Wikipedia)
  – “… I mean architect as in the creating of systemic, structural, and orderly principles to make something work - the thoughtful making of either artifact, or idea, or policy that informs because it is clear.” (Wurman)

More detail to connect us
• “The term information architecture describes a specialized skill set which relates to the interpretation of information and expression of distinctions between signs and systems of signs.” (Wikipedia, emphasis added)
• “Information architecture is the categorization of information into a coherent structure, preferably one that the most people can understand quickly, if not inherently.”

Semiotic triangle
• When you build an information system (elements, relations, operation), it has “SYMBOLS” to stand for “SOMETHING”
• Design of your symbols and how they go together (architecture) enables the “THOUGHT” (or not)
[Figure labels: A sample I can USE. Sample, type, etc.]

And yet, I’m still not done…
[Image: http://4.bp.blogspot.com/-7mYclB2oypk/TWrlhBPvHxI/AAAAAAAAALc/mwjhBbuZ9kU/s1600/yawn4.jpg]

+Provenance (please)
• Mapping the many sample use cases into PROV-O
[Figure labels: Investigator, Drill core, X-ray, connected by isA relations]

Born digital?
• Until samples and people are born digital, social and cultural considerations will be present (whole session on that later in the week, so I did not want to pre-empt it)
• We know how to do everything I presented (and more I did not)
• Ready? Go.
• Me: [email protected], @taswegian

Informatics enables a new approach
• Use cases
  – Pragmatics
• Stakeholders
• Distributed authority
• Access control
• Semantics!
• Maintaining Identity

RPI Tetherless World Constellation
tw.rpi.edu
• Application areas
  – Government Data
  – Health care/Life Sciences
  – Environmental Informatics
• Future Web (Hendler)
  – Web Science
  – Policy
  – Social
• Xinformatics (Fox)
  – Data Science
  – Semantic eScience
  – Data Frameworks
• Semantic Foundations (McGuinness)
  – Knowledge Provenance
  – Inference, Trust
• Senior scientists + ~40 = Post-docs, Staff, Grad, UGrad
• Lots of technology but the oldest building on campus!

Met-uh-dat-ah
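
Worked sketch: sample metadata for a landing page

The "Thus…please…" and Premise 2 slides ask for a Samples extension to schema.org, sample fields on Dataset (fromSample, isSampleType, isdivisible, isavailable), and IGSN metadata exposed on landing pages. The sketch below shows roughly what such a record could look like. It is illustrative only: it uses JSON-LD rather than the RDFa named on the slide, and the `ex:` namespace, the `ex:Sample` type, the example.org URLs, the IGSN value, and the function name `sample_graph` are all hypothetical, since no such schema.org terms are defined in the talk.

```python
import json

# Hypothetical namespace standing in for the proposed (non-existent) extension.
EX = "https://example.org/sample-terms#"


def sample_graph(igsn, name, sample_type, divisible, available, dataset_names):
    """Build a JSON-LD graph with one sample and the datasets derived from it."""
    sample_id = f"https://example.org/igsn/{igsn}"  # placeholder landing-page URL
    sample = {
        "@id": sample_id,
        "@type": "ex:Sample",           # hypothetical type
        "identifier": igsn,             # schema.org property
        "name": name,                   # schema.org property
        "ex:isSampleType": sample_type,
        "ex:isdivisible": divisible,    # can the sample be split for reuse?
        "ex:isavailable": available,    # available for use, not merely existing
    }
    datasets = [
        {
            "@type": "Dataset",         # existing schema.org type
            "name": ds_name,
            "ex:fromSample": {"@id": sample_id},  # link back to the sample
        }
        for ds_name in dataset_names
    ]
    return {
        "@context": {"@vocab": "https://schema.org/", "ex": EX},
        "@graph": [sample] + datasets,
    }


# Usage: all values are made up for illustration.
doc = sample_graph(
    igsn="EXAMPLE0001",
    name="Drill core section",
    sample_type="Drill core",
    divisible=True,
    available=False,
    dataset_names=["X-ray scan of core section"],
)
print(json.dumps(doc, indent=2))
```

Keeping isavailable separate from mere existence follows the consumer-first argument of Premise 2: a consumer deciding whether a sample is usable needs that distinction in the populated fields, not only in the schema.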