Information Retrieval: Basic Concepts
Yaşar Tonta
Hacettepe University
[email protected]
yunus.hacettepe.edu.tr/~tonta/
DOK324/BBY220 Principles of Information Retrieval
AB 2005, Gaziantep, 2-4 February 2005

Outline
• Definition of information
• Definition of document
• Logical structure of information retrieval systems
• Basic concepts
• Retrieval rules
• Performance measures

Knowledge in Philosophy
• Knowledge
– The activity of knowing
– The output obtained as a result of this activity
• Knowledge activities
– perceiving
– understanding
– thinking
– reasoning
– interpreting
– explaining
– verifying
– evaluating
Source: Kuçuradi, 1995, p. 97

Information in Information Studies
• Information-as-process
• Information-as-knowledge
• Information-as-object

Different Perspectives on Information
          ABSTRACT                      CONCRETE
ENTITY    Information-as-knowledge:     Information-as-object:
          knowledge                     data, documents, recorded information
PROCESS   Information-as-process:       Information processing: data processing,
          becoming informed             document processing, knowledge engineering
Source: Buckland, 1991, p. 6

Document
• docere: to teach, to inform
• –ment: means
• "all concrete and symbolic indexical signs preserved or recorded in order to represent, reconstruct, or prove a physical or intellectual phenomenon" (Suzanne Briet)
• Examples of documents: clay tablet, sculpture, papyrus, map, manuscript, book, journal, picture, film, cassette, CD-ROM, DVD, Web page, digital documents, etc.

The Document in Different Disciplines
• Document: form + sign + medium
• Form:
– Calligraphers, music and film producers, pattern recognition specialists, librarians, archivists, museum professionals
• Sign:
– Linguists, computer scientists, artificial intelligence specialists
• Medium:
– Archivists, historians, jurists, diplomatics scholars, publishers, librarians, and others
Information Management
• The application of management principles to the acquisition, organization, control, dissemination, and use of information relevant to the effective operation of organizations of all kinds
• "providing the right information, in the right form, to the right person, at the right cost, at the right time, in the right place, in order to make the right decision"

Knowledge Management
• A management practice based on using an organization's intellectual capital to accomplish its mission
• Intellectual capital: the knowledge obtained from the experience, services, and products that the organization's employees have developed or accumulated.
• Knowledge:
– Explicit (information-as-object)
– Tacit (information-as-knowledge)

What Does an Information Manager Manage?
• The tacit knowledge stored in the human brain?
• The objects (documents) assumed to carry information?
• Or both?
– Librarianship
– Archival science
– Documentation, document management
– Records management, administrative documentation (records management, document management)
– Data management, information resources management, information technology management
– Information science, information studies
– Information management (management of documents that carry information)

Information Management
• Acquisition, organization, maintenance, use, preservation, and archiving of documents
• Identifying and meeting users' information needs
• Designing, building, and operating information systems
• Information technology management

Information Retrieval
• "the technique and process of collecting, classifying, cataloging, and storing information, searching large amounts of data, and producing (or displaying) the desired information from that data"

The Basic Dilemma of Information Retrieval
• "The need to describe something you do not know in order to find information about it" (Hjerppe)

Discovering, Describing, Organizing, and Retrieving Information
[Diagram: Discovery, Description, Organization, Retrieval]

Logical Organization of a Document Retrieval System
[Diagram components: Documents, Users, Indexing, Thesaurus, Dictionary, Query formulation, Index records, Retrieval rule, Formal query statement]
Source: Maron, 1984

The Ideal Information Retrieval System
• Should retrieve all of the relevant documents and only the relevant documents
• The concept of "relevance"
– Objective relevance
– Subjective relevance
• Bringing similar information together and separating dissimilar information

Background Concepts for IR
• User Information Needs
• Controlled Vocabularies (Pre- and Postcoordination)
• Indexing Languages
• IR definitions and concepts
– Documents
– Queries
– Collections
– Evaluation
– Relevance

User Information Need
• Why build IR systems at all?
• People have different and highly varied needs for information
• People often do not know what they want, or may not be able to express it in a usable form
– Boulding's "Image"
• How to satisfy these user needs for information?

Controlled Vocabularies
• Vocabulary control is the attempt to provide a standardized and consistent set of terms (such as subject headings, names, classifications, etc.) with the intent of aiding the searcher in finding information.
• Controlled vocabularies are a kind of metadata:
– Data about data
– Information about information

Pre- and Postcoordination
• Precoordination relies on the indexer (librarian, etc.) to construct some adequate representation of the meaning of a document.
• Postcoordination relies on the user or searcher to combine more atomic concepts in the attempt to describe the documents that would be considered relevant.
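The difference between pre- and postcoordination can be illustrated with a small sketch. Everything in it is a made-up assumption for illustration: the three-document collection, the descriptor names, the compound headings, and the helper names `build_index` and `postcoordinate` do not come from the slides. It shows one common way to read the two bullets above: a precoordinated heading is combined by the indexer ahead of time, while a postcoordinate search combines atomic descriptors at query time with a Boolean AND.

```python
from collections import defaultdict

# Hypothetical toy collection: document id -> atomic descriptors assigned by an indexer.
DOCS = {
    "d1": {"libraries", "automation"},
    "d2": {"libraries", "history"},
    "d3": {"automation", "factories"},
}

# Hypothetical precoordinated headings: the indexer has already combined the concepts.
PRECOORDINATED = {
    "Libraries--Automation": {"d1"},
    "Libraries--History": {"d2"},
}


def build_index(docs):
    """Build an inverted index mapping each atomic descriptor to the documents carrying it."""
    index = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def postcoordinate(index, *terms):
    """Combine atomic descriptors at search time with a Boolean AND (set intersection)."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


INDEX = build_index(DOCS)

# Postcoordination: the searcher combines the atomic terms.
print(sorted(postcoordinate(INDEX, "libraries", "automation")))  # ['d1']

# Precoordination: the searcher looks up a heading the indexer built in advance.
print(sorted(PRECOORDINATED["Libraries--Automation"]))  # ['d1']
```

In this sketch both routes reach the same document, but the work sits in different places: the precoordinated route can only answer combinations the indexer anticipated, whereas the postcoordinate route can try any combination, including ones that match nothing.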
Structure of an IR System
[Diagram of an Information Storage and Retrieval System, with these elements:
Search Line: Interest profiles & Queries; Formulating query in terms of descriptors; Storage of profiles; Store1: Profiles/Search requests
Storage Line: Documents & data; Indexing (Descriptive and Subject); Storage of Documents; Store2: Document representations
Rules of the game = Rules for subject indexing + Thesaurus (which consists of Lead-In Vocabulary and Indexing Language)
Comparison/Matching; Potentially Relevant Documents]
Adapted from Soergel, p. 19

Uses of Controlled Vocabularies
• Library Subject Headings, Classification and Authority Files.
• Commercial Journal Indexing Services and databases
• Yahoo, and other Web classification schemes
• Online and Manual Systems within organizations
– SunSolve
– MacArthur

Types of Indexing Languages
• Uncontrolled Keyword Indexing
• Indexing Languages
– Controlled, but not structured
• Thesauri
– Controlled and Structured
• Classification Systems
– Controlled, Structured, and Coded
• Faceted Classification Systems

Thesauri
• A Thesaurus is a collection of selected vocabulary (preferred terms or descriptors) with links among Synonymous, Equivalent, Broader, Narrower and other Related Terms

Thesauri (cont.)
• National and International Standards for Thesauri
– ANSI/NISO Z39.19-1994: American National Standard Guidelines for the Construction, Format and Management of Monolingual Thesauri
– ANSI/NISO Draft Standard Z39.4-199x: American National Standard Guidelines for Indexes in Information Retrieval
– ISO 2788: Documentation - Guidelines for the establishment and development of monolingual thesauri
– ISO 5964: Documentation - Guidelines for the establishment and development of multilingual thesauri

Development of a Thesaurus
• Term Selection.
• Merging and Development of Concept Classes.
• Definition of Broad Subject Fields and Subfields.
• Development of Classificatory structure
• Review, Testing, Application, Revision.

Categorization Summary
• Processes of categorization underlie many of the issues having to do with information organization
• Categorization is messier than our computer systems would like
• Human categories have graded membership, consisting of family resemblances.
• Family resemblance is expressed in part by which subset of features is shared
• It is also determined by underlying understandings of the world that do not get represented in most systems

Classification Systems
• A classification system is an indexing language often based on a broad ordering of topical areas. Thesauri and classification systems both use this broad ordering and maintain a structure of broader, narrower, and related topics. Classification schemes commonly use a coded notation for representing a topic and its place in relation to other terms.

Classification Systems (cont.)
• Examples:
– The Library of Congress Classification System
– The Dewey Decimal Classification System
– The ACM Computing Reviews Categories
– The American Mathematical Society Classification System

Central Concepts in IR
• Documents
• Queries
• Collections
• Evaluation
• Relevance

Documents
• What do we mean by a document?
– Full document?
– Document surrogates?
– Pages?
• Buckland, "What is a Document", "What is a 'Digital Document'"
• Are IR systems better called Document Retrieval systems?
• A document is a representation of some aggregation of information, treated as a unit.

Collection
• A collection is some physical or logical aggregation of documents
– A database
– A Library
– An index?
– Others?

Queries
• A query is some expression of a user's information needs
• Can take many forms
– Natural language description of need
– Formal query in a query language
• Queries may not be accurate expressions of the information need
– Differences between conversation with a person and formal query expression

Evaluation
• Why Evaluate?
• What to Evaluate?
• How to Evaluate?

Why Evaluate?
• Determine if the system is desirable
• Make comparative assessments
• Others?

What to Evaluate?
• How much of the information need is satisfied.
• How much was learned about a topic.
• Incidental learning:
– How much was learned about the collection.
– How much was learned about other topics.
• How inviting the system is.

What to Evaluate?
What can be measured that reflects users' ability to use the system?
(Cleverdon, 1966) Effectiveness:
– Coverage of Information
– Form of Presentation
– Effort required/Ease of Use
– Time and Space Efficiency
– Recall
  • proportion of relevant material actually retrieved
– Precision
  • proportion of retrieved material actually relevant

Relevance
• In what ways can a document be relevant to a query?
– Answer precise question precisely.
– Partially answer question.
– Suggest a source for more information.
– Give background information.
– Remind the user of other knowledge.
– Others ...

Relevance
• "Intuitively, we understand quite well what relevance means. It is a primitive 'y'know' concept, as is information for which we hardly need a definition. … if and when any productive contact [in communication] is desired, consciously or not, we involve and use this intuitive notion of relevance."
  Saracevic, 1975, p. 324

Relevance
• How relevant is the document
– for this user, for this information need.
• Subjective, but
• Measurable to some extent
– How often do people agree a document is relevant to a query?
• How well does it answer the question?
– Complete answer? Partial?
– Background Information?
– Hints for further exploration?
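Recall and precision, as defined in the effectiveness list above, presuppose relevance judgements of the kind just discussed: some assessor has to say which documents are relevant before either ratio can be computed. The following is a minimal sketch of the two definitions. The function name `recall_precision`, the document identifiers, and the choice to return 0.0 when a denominator is empty are all assumptions made for illustration, not anything specified in the slides.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all retrieved documents
    Assumption for this sketch: an empty denominator yields 0.0.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical query result: 4 documents retrieved, 5 judged relevant, 2 in common.
RETRIEVED = {"d1", "d2", "d3", "d4"}
RELEVANT = {"d2", "d4", "d5", "d6", "d7"}

recall, precision = recall_precision(RETRIEVED, RELEVANT)
print(f"recall={recall:.2f} precision={precision:.2f}")  # recall=0.40 precision=0.50
```

Because the relevant set comes from human judgement, two assessors who disagree about relevance will generally produce different recall and precision figures for the same retrieved set.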
Relevance Research and Thought
• Review to 1975 by Saracevic
• Reconsideration of user-centered relevance by Schamber, Eisenberg and Nilan, 1990
• Special Issue of JASIS on relevance (April 1994, 45(3))

Saracevic
• Relevance is considered as a measure of effectiveness of the contact between a source and a destination in a communications process
– Systems view
– Destinations view
– Subject Literature view
– Subject Knowledge view
– Pertinence
– Pragmatic view

Define your own relevance
• Relevance is the (A) gage of relevance of an (B) aspect of relevance existing between an (C) object judged and a (D) frame of reference as judged by an (E) assessor
• Where…
From Saracevic, 1975 and Schamber 1990

A. Gages
• Measure
• Degree
• Extent
• Judgement
• Estimate
• Appraisal
• Relation

B. Aspect
• Utility
• Matching
• Informativeness
• Satisfaction
• Appropriateness
• Usefulness
• Correspondence

C. Object judged
• Document
• Document representation
• Reference
• Textual form
• Information provided
• Fact
• Article

D. Frame of reference
• Question
• Question representation
• Research stage
• Information need
• Information used
• Point of view
• Request

E. Assessor
• Requester
• Intermediary
• Expert
• User
• Person
• Judge
• Information specialist

Schamber, Eisenberg and Nilan
• "Relevance is the measure of retrieval performance in all information systems, including full-text, multimedia, question-answering, database management and knowledge-based systems."
• Systems-oriented relevance: Topicality
• User-Oriented relevance
• Relevance as a multi-dimensional concept

Schamber, et al.
Conclusions
• "Relevance is a multidimensional concept whose meaning is largely dependent on users' perceptions of information and their own information need situations
• Relevance is a dynamic concept that depends on users' judgements of the quality of the relationship between information and information need at a certain point in time.
• Relevance is a complex but systematic and measurable concept if approached conceptually and operationally from the user's perspective."

Froehlich
• Centrality and inadequacy of Topicality as the basis for relevance
• Suggestions for a synthesis of views

Janes' View of Relevance
[Diagram: Satisfaction, Topicality, Relevance, Utility, Pertinence]