Searching

© Copyright 2006 Haim Levkowitz

Outline
• Goals and Objectives
• Topic Headlines
• Introduction
• Directories
• Open Directory Project
• Search Engines
• Metasearch Engines
• Search techniques
• Intelligent Agents
• Invisible Web
• Summary

Goals and Objectives
• Goals
  • Understand searching
  • Find relevant information fast
  • Know what / how / where to search
• Objectives …

Objectives
• Subject directories
• Open Directory Project
• Search and metasearch engines
• Search techniques
• Intelligent agents
• The visible web
• The invisible web
• Search techniques for the invisible web

Topic Headlines
• Introduction
  • Directory / search engines
• Directories
  • Subject tree manually / use engine
• Open Directory Project
  • …
• Search Engines
  • Ranking search results
  • How do search engines do their job
• Metasearch Engines
  • Multiple search engines at once
  • Search internet more effectively
• Search Techniques
• Intelligent Agents
• Invisible Web

Introduction
• What / how / where to search
• Search results: web pages
• Main tools
  • Directory – subject guide organized by major topics and subtopics
  • Search engines – “crawler / bot”
• Each has its own database
  • Directory's is compiled by humans
  • Engine’s is generated automatically

Directories
• Human-powered search engines
• Organize information in a hierarchical tree by subject
• General → specific
• Two ways to search a directory
  • Manual – browse subjects hierarchically
  • Search engine – enter search terms
• Example – Yahoo Directory

Open Directory Project
• Search results are ranked
• Human editors rank web pages
• As the number of pages for a topic increases, ranking becomes more time-consuming and costly
• Open Directory Project
  • Ranking system turned over to users
  • Users become editors and evaluate web sites in their area of expertise
  • A lot more content
  • http://dmoz.org

Search Engines
• Examples
  • Google: google.com
  • Yahoo: yahoo.com
  • Ask: ask.com
  • MSN Search: search.msn.com
  • AOL Search: search.aol.com
  • Answers: answers.com
• Tips on using search engines and much more: http://www.searchenginewatch.com

Search Engines
• Crawler-based
• Three important parts
  • “Spider / crawler / robot”: follows links from page to page to gather pages for the database
  • Indexer: identifies web page content and stores it in the database
  • Searcher: sifts through the engine’s index to find pages that match the query, then ranks the matches
• Relevance ranking algorithm is crucial
• Different engines give different results

Metasearch Engines
• Multi-engine search
• Skip an engine that is down
• No database of their own
• Examples:
  • http://www.dogpile.com
  • http://www.metacrawler.com
  • http://www.profusion.com

Search Techniques
• Searching guidelines:
  • Change the query to improve results
  • Search string = key words, not an exact phrase
• Advanced searching techniques:
  • Words and exact phrase
  • Boolean search – AND, OR, NOT
  • Title search – web page title
  • Site search – limit search to a particular site
  • URL search, link search
  • Wildcard (fuzzy) search – *
  • Features search – special features of engines

Intelligent Agents
• Three retrieval paradigms:
  • Statistical – correlations of word counts in documents
  • Semantic – natural language processing and artificial intelligence
  • Contextual – use thesaurus and encoded relationships
• Intelligent agent: gathers information / performs tasks based on human input
  • E.g., the spider part of a search engine

Intelligent Agents
• Advantages:
  • More intelligent search
  • Create and update own knowledge database
  • Perform tasks quicker
  • Communicate and cooperate with other
    agents
  • Customizable
  • Continuously scan internet for information
  • Free user from mundane tasks

Invisible Web
• Hidden web content
  • Database contents
  • Dynamically generated pages
• Estimated to be larger than visible web
• Search invisible web:
  • Directories (Invisible Web Catalogue)
  • Databases

Summary
• Search engines
• Directories
• Open Directory Project
• Most popular search engines
• Metasearch engines
• Intelligent agents
• Invisible web
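
Example: indexer, searcher, and Boolean search

As a companion to the Search Engines and Search Techniques slides, the following is a minimal Python sketch of the indexer and searcher parts, with Boolean AND / OR / NOT matching. It is an illustration under stated assumptions, not how any engine named in these slides works: the page collection `PAGES` is made up and stands in for what a spider would have fetched, the names `tokenize`, `build_index`, `boolean_search`, and `rank` are hypothetical, and the ranking is a plain word-count score chosen for simplicity (a "statistical" approach in the sense of the Intelligent Agents slide).

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for what a spider would have fetched.
PAGES = {
    "a.example/dogs": "Dogs are loyal pets. Dogs need walks.",
    "b.example/cats": "Cats are independent pets.",
    "c.example/both": "Cats and dogs can share a home.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Indexer: map each word to {url: number of occurrences}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] = index[word].get(url, 0) + 1
    return index


def boolean_search(index, all_urls, must=(), any_of=(), must_not=()):
    """Searcher: AND over `must`, OR over `any_of`, NOT over `must_not`."""
    results = set(all_urls)
    for word in must:
        results &= set(index.get(word, {}))
    if any_of:
        either = set()
        for word in any_of:
            either |= set(index.get(word, {}))
        results &= either
    for word in must_not:
        results -= set(index.get(word, {}))
    return results


def rank(index, urls, words):
    """Order matches by total count of the query words, highest first."""
    def score(url):
        return sum(index.get(w, {}).get(url, 0) for w in words)
    # Ties are broken alphabetically by URL so the order is deterministic.
    return sorted(urls, key=lambda u: (-score(u), u))


INDEX = build_index(PAGES)

# Query: pets AND NOT cats
hits = boolean_search(INDEX, PAGES, must=["pets"], must_not=["cats"])
print(rank(INDEX, hits, ["pets"]))

# Query: dogs OR cats, ranked by how often the words occur
hits = boolean_search(INDEX, PAGES, any_of=["dogs", "cats"])
print(rank(INDEX, hits, ["dogs", "cats"]))
```

The first query keeps only pages that contain "pets" but not "cats"; the second returns every page that mentions either word, best match first. Swapping in a different `score` function changes the order of results without changing which pages match, which is one reason different engines give different results for the same query.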