Citation Linking for Electronic Journal Articles
CNI Fall Task Force Meeting
Phoenix AZ
December 1999

What we’re going to talk about
• General model for reference linking (me)
  – NISO / DLF / SSP / NFAIS workshops
  – February 1999, Washington DC
  – June 1999, Boston MA
• “Appropriate copy” problem (Dale)
  – DLF Architecture Committee
  – SFX

Why talk about it?
• Publicize the activity so far
• Seeking interested parties
  – How can we move this effort forward?
  – Who can/should participate?

What are we talking about?
• Reference (or citation) linking
  – providing an actionable link from a reference to an object
  – focus on electronic journal articles
• References
  – from index databases (A&I services, search services, citation databases)
  – from article “references” section (bibliography)

What are we talking about (cont.)
• Links
  – maybe URL
  – maybe some other link-key (identifier)
• Objects
  – works / manifestations / items => creations
  – content vs. surrogates / substitutes

“Puddles”
• Closed systems where single agency controls both citations and content
• Publisher(s)
  – Elsevier’s ScienceDirect, Wiley’s InterScience
• Aggregator service
  – OCLC ECO
• Discipline
  – NASA Astrophysics Data System, PubMed

Puddles (cont.)
• User Community
  – OhioLink, University of Toronto
• Problems with Puddles:
  – ok when everything a user wants is inside the puddle
  – not ok when content is limited, arbitrary, or incongruent with user needs

Open Reference Linking
• Any link to any object, regardless of which system the link, the object, or the user is in.
• Assume multiplicity
• Require interoperability

WHAT WE ARE TRYING TO ACCOMPLISH
[Diagram labels: Any old system; Citation; LINK; CLICK LINK; MAGIC; Cited Article]

Model for open reference linking
[Diagram labels: Publisher; Reference Database; Location Database; Identifiers; Identifier; Citation; Client; Content URLs; URL; Content]

Pieces of the problem
• Get a link for a reference
• Resolve the link to one or more locations of the target document
• Identify the most appropriate copy or copies of the target document for the user

URL or Identifier
• Multiple locations
• Persistence
• Data management
• Nearly all implementations find identifier necessary
• Identifier = “name based”

How to get a link: derived vs. dumb
• Derived: Construct it from data in the reference
  – shared within a discipline (ADS)
  – national standard (SICI)
  – cope with multiplicity (S-Link-S)
• Dumb: Look it up from data in the reference
  – e.g. DOI-X

How to get a link: static vs. dynamic
• Static: Pre-constructed
  – embedded in the source document
  – stored in a table associated with the source
  – Advantage: opportunity to review and correct
• Dynamic: Supplied on-the-fly
  – looked up or calculated when citation or reference displayed
  – Advantage: currency and flexibility

Static and Dynamic Linking
          Static   Dynamic
Index     ISI      OhioLink
Article   ADS      OpenJournal

Model: how to get a link
[Diagram labels: Publisher; Reference Database; citation; Identifier(s); Client]

Resolve the link to location(s)
• For given identifier
  – look up in database mapping identifier to location(s)
  – return list of locations where items may be found
  – return additional information to distinguish between items (e.g. format)

Model: how to resolve a link
[Diagram labels: Publisher; Location Database; Identifier; URL(s); Client]

How to resolve the link
• In puddles
  – may be single type of link
  – may be handled by system software
• In open reference linking
  – will be multiple types of links
  – need to find appropriate resolution service(s)
  – need protocol for communicating with resolution service

How to find appropriate resolver
• Currently
  – Browser plug-in
  – Proxy server
  – Tunnel identifier in URL
• Future ?
  – URN model of distributed resolution
  – web browser support for user configuration of a hierarchy of identifier resolution services

WHAT IF MORE THAN 1 COPY EXISTS?
• Elsevier journals, for example, are available from
  – Elsevier ScienceDirect
  – University of Michigan PEAK
  – OhioLink
  – University of Toronto
  – Florida Center for Library Automation

WHICH URL?
[Diagram labels: NAME; Name Resolver; URL?; Sciencedirect.com?; Ohiolink.edu?; Utoronto.ca?; Umich.edu?; FCLA.edu?]
IT SOMETIMES DEPENDS ON WHO THE USER IS...

SOURCES OF MULTIPLE COPIES
• Aggregators
  – OCLC, EBSCO, Bell & Howell, Lexis/Nexis, IAC…
• “Local loading”
  – OhioLink, University of Toronto, University of Florida…
• E-print
  – xxx (LANL), Cogprints, RePEc…

WHY MULTIPLE COPIES
• Performance -- may want highly used objects “closer” to the user in network terms
• Different players can provide different service models using same content
  – e.g., gathering topically related materials into knowledge bases (Ovid)
  – published and unpublished articles in a single eprint service

WHY MULTIPLE COPIES (continued)
• Competition in repository services
  – Encourages functional innovation
  – Rationalizes prices for services
• Archiving
  – Institutional failure is as great a danger as technological failure, particularly when dealing with commercial players

CURRENT STATE
• Few working solutions (Linkout @ NIH, SFX prototype @ UGhent and LANL)
• DLF/CNRI discussion of the following 3 models
  – All intervene in the name resolution process to select the appropriate URL to return

LOCAL CACHE
[Diagram components: Local Name Resolver; Universal Name Resolver]
1. Name Resolution Request
2. Address (if found locally)
   OR
2. Name Resolution Request (if address not found locally)
3. Address

PROFILE-BASED FILTER
[Diagram components: Filter Server; Universal Name Resolver; Reference Server]
1. Name Resolution Request
2. Name Resolution Request
3. Addresses (URL1, URL2, URL3…)
4. Request Bibliographic Data (if appropriate source is ambiguous)
5. Bibliographic Data
6. Address

BROADCAST-RESULT-BASED FILTER
[Diagram components: Filter Server; Universal Name Resolver; Content Service 1; Content Service 2; Content Service 3]
1. Name Resolution Request
2. Name Resolution Request
3. Addresses (URL1, URL2, URL3…)
4. Availability Query (one to each content service)
5. Availability
6. Availability
7. Availability
8. Address

SOME ISSUES
• Ugly, ugly, ugly
  – In part because linking is to articles, most access based on serial title and year
• All solutions require a lot of coordination
• Users who are members of multiple “rights communities” are a major complication

SFX LINKING SYSTEM
[Diagram components: “Cookie Pusher” Portal Service; SFX Server; EXTERNAL “SFX AWARE” SERVICE]
1. Service cookie-pusher URL
2. Cookie info
3. Cookie + redirect to service
4. Service request + cookie
5. Article or citation + SFX links based on cookie
6. Request for links
7. Page of links

SFX vs Name Based Linking
• SFX
  – generalized for many kinds of links (including to paper copies…)
  – requires explicit cooperation of citation source
• SFX does not simplify providing appropriate link
  – but can work with both algorithmic and name-based links
  – and methodology provides bibliographic context for link derivation

So…
• Different approaches have different strengths
  – mix and match possible
• The big issue: who has the motivation to address this seriously?
• Interested? Contact us!

The requisite URLs:
Report on NISO/DLF/SSP/NFAIS meetings:
http://www.dlib.org/dlib/july99/caplan/07caplan.html
Paper on DLF/CNRI “appropriate copy” discussion:
http://www.niso.org/DLFarch.html
and contact information:
[email protected] [email protected]
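
Illustration: profile-based filter
The profile-based filter model from the slides can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not any system's actual implementation: the identifier, URLs, service names, community names, and both function names are hypothetical and chosen purely for illustration. A resolver returns every known location for an identifier, and a filter server picks the copy that matches the user's community profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """One copy of an article: its URL and the service that hosts it."""
    url: str
    service: str


# Hypothetical location database: identifier -> every known copy.
LOCATIONS = {
    "example-id-1": [
        Location("http://publisher.example/article/1", "publisher"),
        Location("http://consortium.example/article/1", "consortium"),
        Location("http://campus.example/article/1", "campus"),
    ],
}

# Hypothetical profiles: services each user community may use,
# listed in order of preference.
PROFILES = {
    "consortium-member": ["consortium", "publisher"],
    "campus-user": ["campus"],
}


def resolve(identifier):
    """Stand-in for the universal name resolver: return all known locations."""
    return list(LOCATIONS.get(identifier, []))


def appropriate_copy(identifier, community):
    """Stand-in for the filter server: pick the first location that
    matches the community's profile, or None if nothing matches."""
    candidates = resolve(identifier)
    for service in PROFILES.get(community, []):
        for location in candidates:
            if location.service == service:
                return location.url
    return None


if __name__ == "__main__":
    print(appropriate_copy("example-id-1", "consortium-member"))
    print(appropriate_copy("example-id-1", "campus-user"))
```

The same identifier yields a different URL for each community, which is the "it sometimes depends on who the user is" point. A user who belongs to several rights communities would need some rule for merging profiles, which this sketch does not attempt.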