The Ongoing Challenges of Citing the Results of Scholarly Research

Maureen C. Kelly / May 23, 2016
“Scholarly communication has built up an important tradition of citation. It reflects the fact that in all
areas of research [...] we progress by building on the past. And we acknowledge our debt to the past by
citation to it. By doing so, we assure that our sources can be checked, verified, validated. But that
implies that the material so referenced, so cited must be available for checking, verifying, validating.
What happens if the source data [is electronic] and has been erased, or worse yet, altered since it was
last used? The entire structure of scholarly progress would collapse.” -- Dr. Robert Hayes (1992).
Content Changed, Standards Followed
Dr. Hayes made an important observation. But we need to look very closely at the transition from print
journals to electronic records of research in order to fully understand the challenges we’ve faced in the
past few decades, and the ones that lie ahead. And we need to keep in mind that the goal is not
necessarily about creating traditional “citations.” Rather, it is about being able to reliably identify,
locate, and access prior research records.
We have a long tradition of communicating research results neatly wrapped in a journal article and
packaged in a journal issue. The practice dates back to 1665, when the Royal Society first published
Philosophical Transactions. That publication “pioneered the concepts of scientific priority and peer
review which, together with archiving and dissemination, provide the model for almost 30,000
scientific journals today” (Royal Society, 2016). And for most of those 350 years, journals were
distributed in printed formats.
Over the past half century, technology has driven significant changes in that paradigm. In the early
stages of change, e-journals were offered as complements to print journals. Libraries often subscribed
to both forms, and the digital version was basically a replica of the print one: a PDF version of the
printed journal pages. The content remained stable and citable. Publishers’ production systems and
library delivery systems adapted to meet those changes, but the content was still locked within
traditional journals and articles. There was an “official,” fixed presentation of the research results.
Changes in this process were initially driven by pressure to speed up the publication cycle. Journals
began to publish e-first articles, which created challenges for traditional citation metadata because the
page numbers were often unavailable when the e-first version was released. Sometimes the e-version
was later updated to include the pagination. Questions arose as to which was the version of record.
But still the content remained discoverable and citable.
Over the last ten years we have seen a more substantive shift in publishing and library practices as
e-journals have largely replaced print journals in libraries and in the economics of scholarly publishing.
Over that time, we have also seen a refocusing away from the journal and journal issue as the
container for scholarly content. Now, the focus is on electronic databases of articles where the journal
and issue information are used simply as supporting metadata. We also see cases in which the
electronic version of a journal issue contains more information than its print counterpart.
Fully electronic versions of scholarly content, with available XML and HTML, offer significant
advantages, but they also bring on new challenges. Content is packaged in large databases and is
remotely accessible. Search engines have replaced abstracting and indexing services as the tools for
discovery. Different versions of articles may be available from preprint servers and institutional
repositories. Google Scholar lets us search across multiple databases and often provides landing pages
for content that is behind a firewall. Gone are the days when we would go to the library and scan the
shelves or ask the librarian to locate an article for us; now, scholarly information discovery and access
have become a do-it-yourself enterprise for researchers.
As scholarly content moved to electronic formats, supporting standards followed. We still use style
guides to prepare our citations when submitting articles for publication, but we now have the
advantage of bibliographic data formats that are more actionable in an electronic environment.
In the late 1960s the ISBN came on the scene, followed by the ISSN in the early 1970s. The bar-coded
versions of these standards have become important to inventory and purchasing systems. (The ISSN
gained traction after the post office made it a requirement for second-class mailing permits.) But the
major standards initiative for streamlining content management in an electronic environment has been
the DOI (Digital Object Identifier) System. It complements rather than replaces standard citations by
providing a concise code for each digital object, rather than a user-friendly description of the resource.
The latest implementation of the DOI is presented as a URL (Uniform Resource Locator) structure. In
spite of this change in format, it remains a reliable identifier of the digital object rather than its
location on the Internet, which can be unstable. The URL resolves to an underlying registration system
rather than the open Internet. So it remains a surrogate for ‘the cite’ rather than ‘a site’. The work of
the DOI System has been complemented by work in the areas of library link resolvers,
most-appropriate-copy resolution, smart landing pages, and similar developments. The DOI has been
incredibly successful but it remains a work in progress as it continues to adapt to new content types
and to new discovery and delivery environments.
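To make the identifier-versus-location distinction concrete, here is a minimal Python sketch that takes a DOI written in any of the forms used in this article's reference list (a "doi:" prefix, an http://doi.org/ link, or an http://dx.doi.org/ link) and reduces it to the same bare identifier before presenting it as a URL. The function names and the regular expression are assumptions made for illustration; they are not part of the DOI System, and the pattern is a simplification rather than the full DOI syntax.

```python
import re
from urllib.parse import unquote

# Assumption: https://doi.org/ is used as the resolver prefix; the article's
# own references use the http://doi.org/ and http://dx.doi.org/ forms.
DOI_RESOLVER = "https://doi.org/"

# Simplified pattern (an assumption, not the full DOI syntax):
# "10.", a numeric registrant code, a slash, then a non-blank suffix.
_DOI_PATTERN = re.compile(r"(10\.\d{4,9}/\S+)")


def extract_doi(text: str) -> str:
    """Return the bare DOI found in a 'doi:' string, a URL, or a citation fragment."""
    match = _DOI_PATTERN.search(unquote(text.strip()))
    if match is None:
        raise ValueError(f"no DOI found in {text!r}")
    # Trailing sentence punctuation is not part of the identifier.
    return match.group(1).rstrip(".,;")


def doi_to_url(text: str) -> str:
    """Present the identifier as a URL that points at the DOI resolver."""
    return DOI_RESOLVER + extract_doi(text)


# Three spellings found in the reference list reduce to two identifiers.
print(extract_doi("doi:10.1787/787355886123"))
print(extract_doi("http://dx.doi.org/10.1787/787355886123"))
print(doi_to_url("http://doi.org/10.2481/dsj.OSOM13-043"))
```

However the DOI is written, the part that identifies the object is the same string; only the resolver prefix in front of it varies, which is why the URL form can change without breaking the citation.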
Content Continues to Change; Standards Follow
We have begun to see changes in the output of scholarly research that impact what constitutes a
publishable knowledge object. These changes go beyond whether the journal article is in print or
electronic format. New types of research results are causing us to reevaluate whether a traditional
journal article is a sufficient container for distributing scholarly knowledge.
Scholarly research methods have changed since 1665, and new types of research results have become
critical to the research process. It has become clear that the traditional journal article model is
insufficient. In 1997, I gave a talk at the ICSTI meeting followed by an article in ICSTI Forum (Kelly,
1997).
In that article, I argued that “neither print journal articles or books nor their electronic equivalents are
sufficient to the task ahead.” Rather, “we must look beyond the current, text-centric paradigm.” It was
becoming apparent as early as 1997 that we needed new channels for sharing research results in a way
that was more functional and reusable than a journal article presentation method could accommodate.
Research results were becoming increasingly complex and much value was being lost by reducing them
to static text and tables.
GenBank was the first major data initiative to break out of the journal article publishing paradigm. It
began at Los Alamos and transitioned to the National Center for Biotechnology Information (NCBI). In
the 1980s, as genetic sequences became a significant research output, journals struggled to publish
these strings of letters in text form. Yes, they really did print pages and pages of A-T-G-C combinations.
Imagine the challenge of copyediting such strings. But more important, imagine the loss of value that
resulted from reducing this knowledge into simple text strings. As GenBank became established,
publishers joined the move by declining to publish articles until the sequences had been deposited in
GenBank. Little by little, GenBank has grown in sophistication and functionality to become a
cornerstone of genetic research, regularly facilitating important new discoveries.
Many other communities, such as astronomy and geology, now rely on databases to facilitate
collaboration and discovery. Publishing an article in a traditional journal remains important for giving
recognition and creating career opportunities for researchers. But researchers increasingly need access
to the underlying data; they want to cite it and incorporate it into new research.
This will not be an easy transition for scholarly communications, as there are many challenges
associated with publishing research data. There are valid concerns regarding the challenges of peer
reviewing data and risks of data piracy. Sufficient metadata is needed to provide the context of its
collection and to support discovery and reuse. The list of challenges is long and valid, and these issues
must be addressed. Researchers need and deserve recognition for their work, and they need a
publication and citation environment that will work in this changing research environment.
Publishing and Citing Data
Given that scholarly research is all about collecting and analyzing data, we need to develop rigorous
practices for sharing and citing data collections. Botanical gardens, with their vast stores of pressed
plant leaves, are an early acknowledgement of the value in preserving data collections. As electronic
datasets grow in size and complexity, the challenges for making them citable also grow.
Datasets are more complex than text, and since they are not static (and risk failing Dr. Hayes’s test for
reliable citation), we need provisions for versioning. But for datasets to be citable requires more than
just the development of a data-citation format. We need an infrastructure that supports storage,
curation, distribution, discovery, and access control for these new knowledge assets.
University libraries have served that role for journal articles through the print and electronic publishing
eras. The value they bring to a university and its researchers is well understood. Yes, there are issues
about journal pricing, and yes, there are drivers for open-access publishing. But the costs of this
infrastructure have been considered a legitimate part of a university’s role in serving researchers and
students.
There is no comparable infrastructure available for archiving, curating and providing access (and rights
management) for our growing collections of datasets. The government, of course, plays a critical role in
supporting data collections such as GenBank, but government data repositories are not sufficient.
Each research subject area has different requirements and practices. Consider, for example, the
differences between genetic and astronomical data. Consider also the differences between
experimental data and field data. Experiments can be repeated and the results validated; field data
cannot be collected again under identical circumstances. Marcia McNutt has recently published a
useful discussion on this topic including the issue of citation standards for data (McNutt, 2016).
While there is currently no business model for constructing and maintaining the necessary
infrastructure for data collections, scholars haven’t waited for official solutions: instead they saw the
need and an opportunity to advance knowledge, and have moved forward. Infrastructure and
standards must follow. At present, we have a variety of solutions in play and a variety of data-citation
practices under development.
In 2008, NISO conducted a “Thought Leader Meeting on Research Data” at which the topic of data
citation was discussed. Among the observations made was the need to accommodate the differing
requirements found across disciplines and fields of research. One of the group’s recommendations was
for NISO to work collaboratively with other organizations to develop guidelines for data citation. In
2009 (rev. 2010), Toby Green prepared a white paper for OECD expressing concerns about long-term
discoverability and accessibility of datasets: “We Need Publishing Standards for Datasets and Data
Tables” (Green, 2009). Style guides now include provisions for citing electronic and web resources.
MLA even has guidelines for citing a Twitter post. They call for typical citation metadata, but with
special accommodations such as using the archive where the dataset is housed in lieu of the publisher.
When a DOI is unavailable, they recommend use of a persistent URL.
Since the DOI System was designed to support traditional publishers, it is not automatically suited to
accommodate large numbers of researchers directly. In 2009, DataCite.org was established by the
British Library, the Technical Information Center of Denmark, TU Delft Library, the National Research
Council’s Canada Institute for Scientific and Technical Information (NRC-CISTI), California Digital
Library, Purdue University, and the German National Library of Science and Technology. The group
assists members in using the DataCite Service for minting DOIs and registering associated metadata.
DataCite also offers a metadata schema (Starr, 2011) and recommends that a new DOI be issued when
changes are made to the dataset or new versions are created.
In September 2013, CODATA and ICSTI authored a report with the clever title: “Out of Cite, Out of
Mind: The Current State of Practice, Policy, and Technology for the Citation of Data” (CODATA-ICSTI
Task Group on Data Citation Standards and Practices, 2013). Its abstract echoes and updates the
observations of Dr. Hayes: “The use of published digital data, like the use of digitally published
literature, depends upon the ability to identify, authenticate, locate, access, and interpret them. Data
citations provide necessary support for these functions…” The report also stresses that “As
technological factors, such as faster processors, better storage, and increased bandwidth, have
enabled the much greater production and capture of data, the creation of standards to manage these
data has not kept pace.” The report offers a set of “guiding principles” as well as challenges to
implementation.
Publishers are also working to facilitate data publishing and citation. Nature.com produces and hosts
Scientific Data, an open-access, peer-reviewed journal for descriptions of scientifically valuable
datasets in the natural sciences. The primary article-type is a “Data Descriptor” that is designed to
make data more discoverable, interpretable, and reusable. It does not store the data but rather relies
upon public, community-recognized repositories. See http://www.nature.com/sdata/publish/forauthors#aims-scope
Thomson Reuters offers a Data Citation Index. The Data Citation Index captures all available metadata
for the data repositories it indexes. Since the metadata in those repositories can vary in format and
detail, Thomson Reuters is working to establish a more consistent, descriptive data-citation format.
See http://wokinfo.com//products_tools/multidisciplinary/dci/repositories/
Elsevier also has initiatives for supporting research data. The company offers its own data repository via
Mendeley. Each dataset is given its own DOI and is archived through Data Archiving and Networked
Services (DANS). Elsevier offers an Open Data pilot initiative where research data can be made openly
available on ScienceDirect under a CC-BY license. It also has a DataLink tool that supports data
discovery and includes a database search engine, an automatic data-citation generator, a data article
writing tool for the Genomics Data journal, and a data visualization tool. See
https://www.elsevier.com/about/open-science/research-data
BioMed Central is working with DataCite to address concerns raised about their Open Data policy and
the legal (copyright) status of data published in their Open Access journals. See
https://www.biomedcentral.com/about/policies/open-data
Dataverse.org at Harvard is an open source web application to share, preserve, cite, explore, and
analyze research data. Harvard's Institute for Quantitative Social Science (IQSS), the creator of the
application, is working on a set of guidelines for tiered access. The levels of access include Open;
Guestbook; Required Acceptance of Terms of Use; and Restricted Access, which requires a specific
access request. Dataverse’s statement of best practices (Data Science at The Institute for Quantitative
and Social Science, 2015) is very useful in laying out the many factors that come into play when using a
dataset.
Force11 has developed a Joint Declaration of Data Citation Principles (Data Citation Synthesis Group,
2014) that is endorsed by many commercial and scholarly publishers including Elsevier, Nature
Publishing Group, AGU, AIP, APS, PLIO and, of course, NISO. These principles describe what is needed
for a data citation to be functional:
1. Importance: Data should be considered legitimate, citable products of research. Data citations
should be accorded the same importance in the scholarly record as citations of other research
objects, such as publications.
2. Credit and Attribution: Data citations should facilitate giving scholarly credit and normative and
legal attribution to all contributors to the data, recognizing that a single style or mechanism of
attribution may not be applicable to all data.
3. Evidence: In scholarly literature, whenever and wherever a claim relies upon data, the
corresponding data should be cited.
4. Unique Identification: A data citation should include a persistent method for identification that
is machine actionable, globally unique, and widely used by a community.
5. Access: Data citations should facilitate access to the data themselves and to such associated
metadata, documentation, code, and other materials, as are necessary for both humans and
machines to make informed use of the referenced data.
6. Persistence: Unique identifiers, and metadata describing the data, and its disposition, should
persist – even beyond the lifespan of the data they describe.
7. Specificity and Verifiability: Data citations should facilitate identification of, access to, and
verification of the specific data that support a claim. Citations or citation metadata should
include information about provenance and fixity sufficient to facilitate verifying that the specific
timeslice, version and/or granular portion of data retrieved subsequently is the same as was
originally cited.
8. Interoperability and flexibility: Data citation methods should be sufficiently flexible to
accommodate the variant practices among communities, but should not differ so much that
they compromise interoperability of data citation practices across communities.
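Principles 4 and 7 in particular lend themselves to a small illustration. The Python sketch below shows one possible shape for a citation record that carries a persistent identifier, a version, and a fixity checksum, so that data retrieved later can be compared with what was originally cited. The class, its field names, the sample dataset, and the identifier "doi:10.0000/example" are all hypothetical placeholders; SHA-256 is used here only as one common choice of checksum and is not prescribed by the principles.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    """Hypothetical citation record; the field names are illustrative only."""
    creators: str
    year: int
    title: str
    version: str
    repository: str   # archive housing the dataset, in lieu of a publisher
    identifier: str   # persistent identifier, such as a DOI
    sha256: str       # fixity: checksum of the exact bytes that were cited

    def render(self) -> str:
        """Format the record as a human-readable citation string."""
        return (f"{self.creators} ({self.year}). {self.title}, "
                f"version {self.version}. {self.repository}. {self.identifier}")


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the dataset bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_citation(citation: DataCitation, retrieved: bytes) -> bool:
    """True when the retrieved bytes are identical to the bytes originally cited."""
    return checksum(retrieved) == citation.sha256


# Made-up dataset and placeholder identifier, for illustration only.
original = b"site,year,count\nA,2015,12\nB,2015,7\n"
citation = DataCitation(
    creators="Example, A.", year=2016, title="Illustrative field counts",
    version="1.0", repository="Example Data Archive",
    identifier="doi:10.0000/example", sha256=checksum(original),
)

print(citation.render())
print(matches_citation(citation, original))                  # unchanged data
print(matches_citation(citation, original + b"C,2015,3\n"))  # altered data
```

A record like this does not solve the archiving problem, but it does let a reader detect the situation Dr. Hayes warned about: source data that has been altered since it was last used.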
As the volume of data continues to grow, agreed-upon standards and practices, adapted for the needs
of individual areas of research, become ever more important in ensuring that data can be cited and
discovered. It is reassuring to see such a strong community-wide effort toward developing and
promoting reliable data-citation practices, but much work is required in the area of credentialing and
supporting the data repositories that are needed to maintain, curate, and provide access to datasets.
Publishing and Citing Software
Datasets, for all their challenges, are not the last hurdle for sustaining our ability to cite the knowledge
objects produced by today’s researchers. We have come a long way since it was sufficient for
researchers to tuck their data into Excel spreadsheets. Datasets are now more complex and require
more sophisticated tools to analyze and extract value.
In my 1997 ICSTI paper, I anticipated that new forms of research output would include “computational
models and simulations along with other collections of functional information.” As research becomes
more data intensive, software is developed to process the data; that software is an integral part of
making datasets functional. Elsevier estimates that 38 percent of researchers now spend at least one
day per week creating software to analyze the data they have collected.
It has become important for software to be treated as a valid part of the scholarly record. When
custom software is the means by which the data is processed and conclusions are drawn, it, like data,
needs to be published in a functional form. It cannot be usefully reduced to text in a journal article any
more than genetic sequences could usefully be published as printed strings of A-T-G-C combinations.
Versioning is a critical issue for publishing and citing software. And it is tricky to accomplish. While a
version of software can be cited, there is a possibility that the software includes a call-out to a code
library that may have been changed.
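The library problem can be sketched in a few lines of Python. The example below computes a fingerprint from the source code together with the exact versions of the libraries it calls, so that a change in a dependency is visible even when the cited code itself is untouched. The function name, the one-line "analysis", and the library name "stats_lib" with its version numbers are all hypothetical; this is an illustration of the concern under those assumptions, not an established citation mechanism.

```python
import hashlib
import json
from typing import Mapping


def software_fingerprint(source_code: str, dependencies: Mapping[str, str]) -> str:
    """Hash the code together with the exact library versions it ran against."""
    payload = json.dumps(
        {"source": source_code, "dependencies": dict(dependencies)},
        sort_keys=True,  # stable ordering so equal inputs give equal digests
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical analysis code and a hypothetical library name and versions.
analysis_code = "result = stats_lib.mean(samples)"

cited = software_fingerprint(analysis_code, {"stats_lib": "1.2.0"})
rerun_same = software_fingerprint(analysis_code, {"stats_lib": "1.2.0"})
rerun_changed = software_fingerprint(analysis_code, {"stats_lib": "1.3.0"})

print(cited == rerun_same)     # same code, same library version
print(cited == rerun_changed)  # same code, library has changed underneath it
```

The cited code is byte-for-byte identical in both reruns; only the recorded library version differs, and that alone is enough to make the second rerun a different object from the one that was cited.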
To this end, the software community has been active in developing software repositories and version
control tools. GitHub, one of the most widely used, supports private repositories and free, open-source
accounts. Another approach to making software reusable and citable has been the development and
use of “reproducible reports” (Visser, 2014). The motivation for this approach was offered by Donald E.
Knuth back in 1984, when he said, “Let us change our traditional attitude to the construction of
programs. Instead of imagining that our main task is to instruct a computer what to do, let us
concentrate rather on explaining to human beings what we want a computer to do” (Knuth, 1984).
Reproducible reporting provides a way for researchers to package together all the components of their
work, including the workflow, data, and code, into a sharable (and potentially citable) package.
Among the better known tools that have been developed to aid researchers in preparing reproducible
reports are Galaxy, Jupyter Notebook (formerly IPython), and knitr, a dynamic report generator for R.
We are also seeing commercial publishers providing support for making software citable. Articles
published in Nature Methods increasingly support supplementary software files, most of which include
source code. Nature encourages the use of code repositories such as GitHub prior to submission of an
article. Using these repositories expedites the peer-review process and avoids the necessity for
reviewers to test the code on their own computers.
Elsevier has started Original Software Publications to describe significant software and/or code. The
software will be peer-reviewed and considered "one body of work" for citation and indexing purposes.
The software/code itself will be deposited on the journal’s GitHub, and Elsevier states that, “all
software and code published is, and will remain, fully owned by their developers.”
Loose Ends and New Forms of Content
Technology has made it possible for scholarly content to be distributed in new and less formal ways. In
the past we talked about the “Invisible College” and grey literature. Now we talk about scholarly
collaboration networks (SCNs) and Social Sharing Networks (SSNs) such as Mendeley (now owned by
Elsevier) with five million members, Academia.edu with 30 million monthly users, and ResearchGate
with six million members, levels of usage that make these important channels for scholarly
communication. And to the extent that publishing means making information public, these networks
represent a new form of publishing. The topic, particularly concerns about sharing of journal articles on
these networks, has been discussed on the Scholarly Kitchen site (Meadows, 2015). It is difficult to
conceive of how such communication can be made citable, but it is an issue that warrants creative
consideration.
E-print servers such as arXiv (at Cornell), bioRxiv (at Cold Spring Harbor) and SSRN (Social Science
Research Network, recently acquired by Elsevier) are well-regarded content distribution channels in
their fields. Like scholarly collaboration networks, they provide for rapid communication of research
results. While they do not offer the final, citable form of the paper, they are widely used and important
components of the scholarly communication process.
New, informal channels continue to pop up. Consider @scholarlycom on Twitter. It’s part of
Columbia’s scholarly communication program and explores new ways to share, curate, and preserve
new knowledge. There are also Twitter handles from SSP (@ScholarlyPub) and Harvard (@oscharvard),
among others. While they do not constitute formal communication, they have become important
channels for sharing scholarly information. While these new, informal channels do not lend themselves
to traditional citation, they have become an integral part of how today’s scholars communicate. And as
mentioned earlier, they are sufficiently important that style guides now provide instructions on citing
this type of content.
This is just a sampling of how scholars continue to work and communicate in new ways. Some of these
assets and communications warrant citation in bibliographies (sometimes with URLs though not DOIs).
But once they have been cited, however informally, we are faced with Dr. Hayes’s concern about how
they will be made available over time for “checking, verifying, validating.”
My concerns for the future of citation are not limited to the standards and structure of citations for
new types of content. We need to think seriously about how scholarly research findings will be
archived, curated, and made accessible for future use and reference. The role of libraries as archives of
scholarship is changing, as are publishing business models. Scholarly societies, once such important
publishers of scholarly research, are being overtaken by large commercial publishers that have the
resources to invest in new functionality needed to deal with dataset and software citation. We can
hope that the government will continue to fill some of this role. But that will not be sufficient.
Commercial publishers will certainly play a significant role. But will open source publishing enterprises
like PLOS have the resources to preserve their collection? What role will libraries play in this new
paradigm?
Archiving is not easy; curating is not easy. It requires a long commitment and the resources to sustain
that commitment. The problem exists in many fields beyond publishing, from old movies to bacterial
cultures. But for us, it is a problem of the sustainability of the citations we create. Creating citations,
and creating standards for citations, will not be enough if they all resolve to dead links.
References
CODATA-ICSTI Task Group on Data Citation Standards and Practices. “Out of Cite, Out of Mind: The
Current State of Practice, Policy, and Technology for the Citation of Data.” Data Science Journal
12 (September 2013). http://doi.org/10.2481/dsj.OSOM13-043
Data Citation Synthesis Group. “Joint Declaration of Data Citation Principles.” Martone M. (ed.) San
Diego CA: FORCE11. (2014). https://www.force11.org/group/joint-declaration-data-citationprinciples-final
Data Science at The Institute for Quantitative and Social Science. “Harvard Dataverse General Terms of
Use.” (2015). http://best-practices.dataverse.org/harvard-policies/harvard-terms-of-use.html
Elsevier. “Original Software Publications.” (2016). https://www.elsevier.com/books-andjournals/content-innovation/original-software-publications
Green, T. “We Need Publishing Standards for Datasets and Data Tables.” OECD Publishing White Papers
(2009). doi:10.1787/787355886123 http://dx.doi.org/10.1787/787355886123
Hayes, Robert M. “The Needs of Science and Technology.” Science and Technology Libraries 12, no. 4
(1992): 3-33.
Kelly, Maureen C. “The Role of A&I Services in Facilitating Access to the E-Archive of Science.” ICSTI
Forum: The Quarterly Newsletter of the International Scientific and Technical Information, no.
26 (November 1997). http://www.informedstrategies.com/wpcontent/uploads/2015/10/Facilitating_Access_to_the_eArchive_of_Science_Nov_97_MCKelly_ICSTI_.pdf
Knuth, Donald E. Literate Programming. Center for the Study of Language and Information. (1984).
McNutt, M. “Liberating field science samples and data.” Science 351, Issue 6277 (4 March 2016):
1024-1026. http://science.sciencemag.org/content/351/6277/1024/ DOI: 10.1126/science.aad7048
Meadows, Alice. “Article Sharing on Scholarly Collaboration Networks – An Interview with Fred Dylla
about STM’s Draft Guidelines and Consultation.” Scholarly Kitchen. (February 24, 2015).
https://scholarlykitchen.sspnet.org/2015/02/24/article-sharing-on-scholarly-collaborationnetworks-an-interview-with-fred-dylla-about-stms-draft-guidelines-and-consultation
The Royal Society. “350 Years of Scientific Publishing.” (2016).
https://royalsociety.org/journals/publishing-activities/publishing350/.
Starr, Joan. “isCitedBy: A Metadata Scheme for DataCite.” D-Lib Magazine 17, no. 1/2.
(January/February 2011). http://www.dlib.org/dlib/january11/starr/01starr.html
Visser, Ingmar. “Why Reproducible Reporting?” Open Science Collaboration. (October 2014).
http://osc.centerforopenscience.org/2014/10/30/reproducible-reporting/