Using Literary Warrant to Define a Version of the DDC for Automated Classification Services

OCLC Online Computer Library Center
Diane Vizine-Goetz, Research Scientist, OCLC Research
Julianne Beall, Assistant Editor, DDC
ISKO Conference, London, 13-16 July 2004
© 2004 OCLC Online Computer Library Center, Inc.


Exploratory Study

Defining a version of the DDC
– To facilitate automatic assignment of DDC numbers to electronic documents
– Based on literary warrant for topics in electronic resources


DDC for Automated Classification

Machine classification service
– A database of concepts used to classify a document
– Software that generates a prioritized list of concepts that characterize the content of the document (Scorpion)


Checking Literary Warrant

Primary source for checking literary warrant: BUBL
– Ca. 12,000 Internet resources
Canadian Information By Subject
– Ca. 10,000 Internet resources
KidsClick!
– Ca. 6,400 Internet resources

http://bubl.ac.uk/link/ddc.html


BUBL Site Statistics

Dewey Class   Number of sites   Site status OK   US sites   UK sites
500           135               103              59         27
510           167               36               65         43
520           186               133              84         25
530           139               111              68         20
540           118               82               38         33
550           247               196              127        30
Total         992               761              441        178

http://www.nlc-bnc.ca/caninfo/esub.htm

http://sunsite.berkeley.edu/KidsClick!/dewey.html


Defining a Version of the DDC

Starting point: classification numbers in Abridged Edition 14
True abridgment: the truncated number for a topic is always the same as the full number for the topic, except shorter, e.g.:
– 551.64 Forecasting and forecasts of specific phenomena
  • Cut back to 551.6 Climatology and weather


Database Record

– Class number
– Caption
– Superordinate hierarchy
– Notes that describe what is found in a class
– Relative Index entries
– Mapped terminology


Keywords from 551.64 Added to 551.6; 551.64 Deleted

Class-here note: methods of forecasting specific phenomena specific areas
Relative Index entries, e.g.,
– Acid rain—weather forecasting
– Hurricanes—weather forecasting
– Rain—weather forecasting
Subject Headings for Children
LCSH
– Storms—Forecasting


Enriching Terminology for Numbers Built from Table 1

Example: built number 520.6
520 Astronomy and allied sciences
Relative Index terms that approximate the whole of 520:
– Astronomy
– Celestial bodies
– Outer space
– Space—astronomy


Built Number 520.6

Relative Index terms from T1—06, e.g.:
– Associations
– Organizations
Combined entries for 520.6, e.g.:
– Astronomy—associations
– Astronomy—organisations
– Astronomy—organizations
– Celestial bodies—associations
– Celestial bodies—organisations
– Celestial bodies—organizations


Subdivisions Added or Enriched

Number     T1 notation    Class                                            Subdivision
505        1_05           Science                                          Serial publications
506        1_06           Science                                          Organizations
507.2      1_072          Science                                          Research; statistical methods
507.4      1_074          Science                                          Museums, collections, exhibits
509        1_09           Science                                          Historical, geographic, persons treatment
509.2      1_092          Science                                          Persons
510.28     1_0285         Mathematics                                      Auxiliary techniques and procedures; apparatus, equipment
510.5      1_05           Mathematics                                      Serial publications
510.6      1_06           Mathematics                                      Organizations
510.71     1_071          Mathematics                                      Education
510.9      1_09           Mathematics                                      Historical, geographic, persons treatment
520.6      1_06           Astronomy                                        Organizations
526.06     1_06           Cartography                                      Organizations
530.05     1_05           Physics                                          Serial publications
530.06     1_06           Physics                                          Organizations
530.071    1_071          Physics                                          Education
540.5      1_05           Chemistry                                        Serial publications
540.6      1_06           Chemistry                                        Organizations
540.71     1_071          Chemistry                                        Education
550.5      1_05           Earth sciences                                   Serial publications
550.6      1_06           Earth sciences                                   Organizations
550.71     1_071          Earth sciences                                   Education
551.4606   1_06           Hydrosphere and submarine geology Oceanography   Organizations
551.4607   1_071; 1_072   Hydrosphere and submarine geology Oceanography   Education and research


Added UK Spellings for Index Entries

512.7 Number theory
– Factorisation—number theory
– Factorization—number theory
– Number theory
– Prime numbers
519.6 Mathematical optimization
– Mathematical optimisation
– Mathematical optimization
– Optimisation—mathematical
– Optimization—mathematics


Results: Scorpion & BUBL

Match type         A14 base   A14.v1   A14.v2   A14.v3
Exact              139        139      129      183
Partial            155        155      186      167
Exact or partial   294        294      315      350
Non-match          455        455      434      399
Total              749        749      749      749

A14.v1: base file + UK spelling
A14.v2: base file + UK spelling + SS (standard subdivisions) added/enriched
A14.v3: base file + UK spelling + SS added/enriched + truncation


Next Steps

– Analyze where the truncation and the enriched terminology were useful and where not; revise the v3 database accordingly
– Extend approach to additional classes and projects (ePrints UK)


Links

Research : Projects : ePrints-UK
– http://www.oclc.org/research/projects/mswitch/epuk.htm
Dewey
– http://www.oclc.org/dewey/
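
The cut-back described under "Defining a Version of the DDC" and "Keywords from 551.64 Added to 551.6" can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function names `cut_back` and `merge_truncated` and the record layout (a class number mapped to a list of terms) are assumptions made for the example. The numbers and terms are taken from the slides.

```python
def cut_back(number, allowed):
    """Shorten a class number one character at a time until it is in `allowed`.

    Hypothetical helper: models "true abridgment", where the truncated
    number is the full number, only shorter. Returns None if no prefix
    of `number` is an allowed number.
    """
    candidate = number
    while candidate and candidate not in allowed:
        # Drop the last digit; also drop a trailing decimal point.
        candidate = candidate[:-1].rstrip(".")
    return candidate or None


def merge_truncated(records, allowed):
    """Move the terms of each record to the number it is cut back to."""
    merged = {}
    for number, terms in records.items():
        target = cut_back(number, allowed)
        if target is None:
            continue  # no allowed number to receive these terms
        bucket = merged.setdefault(target, [])
        for term in terms:
            if term not in bucket:  # avoid duplicate keywords
                bucket.append(term)
    return merged


# Terms copied from the slides; the record layout itself is an assumption.
records = {
    "551.6": ["Climatology and weather"],
    "551.64": [
        "Acid rain—weather forecasting",
        "Hurricanes—weather forecasting",
        "Rain—weather forecasting",
    ],
}
merged = merge_truncated(records, allowed={"551.6"})
```

After the call, `merged` holds a single record for 551.6 that carries its own caption plus the three index entries from 551.64, and 551.64 no longer appears as a separate record.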
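
The combined entries listed under "Built Number 520.6" pair each Relative Index term for the base number with each term from the Table 1 subdivision and add a UK spelling where one applies. The sketch below reproduces that pairing for the six entries shown on the slide. The helper names `uk_variants` and `combined_entries` are hypothetical. The "-ization" to "-isation" rule is a simplifying assumption and not a description of how the UK spellings were actually generated.

```python
from itertools import product

# Terms copied from the 520.6 slides.
BASE_TERMS = ["Astronomy", "Celestial bodies"]
T1_06_TERMS = ["Associations", "Organizations"]


def uk_variants(term):
    """Return the term, plus an '-isation' spelling if it contains '-ization'.

    Assumption: a single suffix rule is enough for this illustration.
    """
    variants = [term]
    if "ization" in term:
        variants.append(term.replace("ization", "isation"))
    return variants


def combined_entries(base_terms, subdivision_terms):
    """Build 'base—subdivision' index entries, including UK spellings."""
    entries = []
    for base, sub in product(base_terms, subdivision_terms):
        for variant in uk_variants(sub.lower()):
            entries.append(f"{base}—{variant}")
    return sorted(entries)


entries_520_6 = combined_entries(BASE_TERMS, T1_06_TERMS)
for entry in entries_520_6:
    print(entry)
```

With these inputs the sorted result matches the six combined entries on the slide, from "Astronomy—associations" through "Celestial bodies—organizations".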
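
The figures in "Results: Scorpion & BUBL" can be checked and turned into match rates with simple arithmetic. The sketch below uses only the counts from the results table. The dictionary layout and the function name `match_rate` are assumptions made for the example.

```python
# Counts copied from the results table (749 resources per database version).
RESULTS = {
    "A14 base": {"exact": 139, "partial": 155, "non_match": 455},
    "A14.v1": {"exact": 139, "partial": 155, "non_match": 455},
    "A14.v2": {"exact": 129, "partial": 186, "non_match": 434},
    "A14.v3": {"exact": 183, "partial": 167, "non_match": 399},
}


def match_rate(counts):
    """Return (matched, total, percent matched) for one database version."""
    matched = counts["exact"] + counts["partial"]
    total = matched + counts["non_match"]
    return matched, total, round(100 * matched / total, 1)


for version, counts in RESULTS.items():
    matched, total, percent = match_rate(counts)
    print(f"{version}: {matched}/{total} exact or partial ({percent}%)")
```

On these counts the share of exact or partial matches goes from 39.3% for the base file and v1, to 42.1% for v2, to 46.7% for v3.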