SEARCH ENGINES:
SEARCHING?
FIND IT!
Foong Soon Fook
This article was compiled to give you a clear view of how to use one of the most
important tools of the World Wide Web (WWW): search engines. Most users of
search engines do not use them effectively. If you are a searcher, one of the questions
you have most probably asked is "Why won't this search engine
just give me what I asked for?" It is widely accepted that, about half of the time,
the results you receive have little to do with the subject you searched for.
First, I'll give a moderately technical guide to
search engines and the way they work, so that those of you interested in the
mechanics of the search engine will not be left out. After that, I will focus
on making you an efficient searcher who gets relevant, high-precision search
results.
What are search engines?
I assume you already have an impression of what a search engine is. To put it
plainly, it is an Internet tool which will search a database of collected
websites containing WORDS that you have searched for. It will then show
you the search results as a list of links to those websites. There are two types
of search engines. They are the individual search engines and the metasearch
engines. The difference between the two is that the individual search engines
compile their own databases while the metasearch engines do not compile
databases. Instead, they search the databases of multiple individual search
engines simultaneously.
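To make the idea of a metasearch engine concrete, here is a minimal sketch in Python. The two "engines" are hypothetical stand-ins that return fixed lists of made-up sites; a real metasearch engine would send the query to live search engines (and usually in parallel, whereas this sketch asks them one after another).

```python
from typing import Callable, List

# Hypothetical individual engines: each takes a query and returns a ranked
# list of site names from its own database. The results are invented.
def engine_a(query: str) -> List[str]:
    return ["site1.example", "site2.example", "site3.example"]

def engine_b(query: str) -> List[str]:
    return ["site2.example", "site4.example"]

def metasearch(query: str, engines: List[Callable[[str], List[str]]]) -> List[str]:
    """Ask every engine for results and merge them, dropping duplicates."""
    merged: List[str] = []
    seen = set()
    for engine in engines:
        for site in engine(query):
            if site not in seen:
                seen.add(site)
                merged.append(site)
    return merged

results = metasearch("cat shampoo", [engine_a, engine_b])
print(results)
```

Note that the metasearch function keeps no database of its own; it only combines what the individual engines hand back.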
How do search engines work?
So, how do these wonderful tools know about things on the web? The secret
is, they send out "spiders" or "bots" to crawl through the web space from link
to link, identifying pages and compiling their database as they go along. Sites that
no other pages link to may be missed by "spiders" altogether. Once the
spiders get to a web site, they typically index most of the words on the
publicly available pages at the site.
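The link-to-link crawl described above can be sketched in a few lines of Python. This is only an illustration: the tiny "web" below is a made-up dictionary of page names, not real sites, and a real spider would fetch pages over the network. The point to notice is that a page nothing links to is never reached.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
WEB = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["shampoo"],
    "shampoo": [],
    "orphan": ["home"],  # no other page links to this one
}

def crawl(start):
    """Follow links breadth-first from `start`; return pages in visit order."""
    visited = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in WEB.get(page, []):
            if link not in seen:
                seen.add(link)
                visited.append(link)
                queue.append(link)
    return visited

found = crawl("home")
print(found)
print("orphan" in found)
```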
Then, when you search the web with a particular search engine, the engine
will scan its index of sites and match your keywords and phrases with those
in the texts of documents within the engine's database. Now, this brings us to
a revelation that when you are using a search engine, you are not searching
the entire web as it exists at this moment. You are actually searching a
portion of the web, captured in a fixed index created at an earlier date.
How much earlier? It's hard to say. Spiders regularly return to the web pages
they index to look for changes. When changes occur, the index is updated to
reflect the new information. However, the process of updating can take a
while, depending upon how often the spiders make their rounds and then,
how promptly the information they gather is added to the index. Until a page
has been both "spidered" AND "indexed," you won't be able to access the
new information.
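Here is a rough sketch of what "searching an index" means, again in Python. The page texts are invented for illustration, and real engines store far more than a word-to-page table. The sketch builds the index once and then shows that a page added afterwards cannot be found until the index is rebuilt.

```python
import re
from collections import defaultdict

# Hypothetical pages as a spider captured them at an earlier date.
PAGES = {
    "a.example": "Feeding cats and washing cats",
    "b.example": "Copy cats in science fiction",
    "c.example": "Dog shampoo reviews",
}

def build_index(pages):
    """Map every word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

INDEX = build_index(PAGES)
print(sorted(search(INDEX, "cats")))

# A page published after the index was built is invisible to the search
# until the spider returns and the index is updated.
PAGES["d.example"] = "New page about cats"
print("d.example" in search(INDEX, "cats"))
```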
Are all search engines the same?
Well, we already know one difference between search engines: the
content of their databases. In fact, no two search engines are
identical. Search engines use selected software programs to search their
indexes for matching keywords and phrases, presenting their findings to you
in some kind of relevance ranking. Although software programs may be
similar, no two search engines are exactly the same in terms of size, speed
and content; no two search engines use exactly the same ranking schemes,
and not every search engine offers you exactly the same search options.
Therefore, your search is going to be different on every engine you use. The
difference may not be a lot, but it could be significant. Recent estimates put
search engine overlap at approximately 60 percent and unique content at
around 40 percent.
How do search engines rank websites?
In ranking web pages, search engines follow a certain set of rules. These may
vary from one engine to another. Their goal, of course, is to return the most
relevant pages at the top of their lists. To do this, they look for the location
and frequency of keywords and phrases in the web page document and,
sometimes, in the HTML META tags. They check out the title field and scan
the headers and text near the top of the document. Some of them assess
popularity by the number of links that are pointing to sites; the more links,
the greater the popularity.
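As a rough illustration of these ranking rules, the Python sketch below scores a page by where and how often a keyword appears and by how many links point to it. The weights (5, 2, 1) and the sample pages are my own assumptions chosen only to show the idea; no real search engine publishes its formula in this form.

```python
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def score(page, keyword, inbound_links):
    """Toy relevance score; the weights are arbitrary illustration values."""
    keyword = keyword.lower()
    title_hits = tokenize(page["title"]).count(keyword)
    body_words = tokenize(page["body"])
    body_hits = body_words.count(keyword)
    early_hits = body_words[:20].count(keyword)  # keyword near the top
    return 5 * title_hits + 2 * early_hits + body_hits + inbound_links

# Hypothetical pages and inbound link counts.
pages = {
    "a.example": {"title": "Cat shampoo guide",
                  "body": "Choosing a shampoo for your cat."},
    "b.example": {"title": "Pet news",
                  "body": "Dogs, birds, and one cat."},
}
links = {"a.example": 3, "b.example": 1}

ranked = sorted(pages, key=lambda u: score(pages[u], "cat", links[u]),
                reverse=True)
print(ranked)
```

The page with the keyword in its title and more links pointing to it comes out on top, which is the behaviour the rules above describe.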
What are the pros and cons of search engines?
Search engines provide access to a fairly large portion of the publicly
available pages on the Web, which itself is growing exponentially. Search
engines are the best means devised yet for searching the web. Stranded in the
middle of this global electronic library of information without either a card
catalog or any recognizable structure, how else are you going to find what
you're looking for? On the down side, the sheer number of words indexed by
search engines increases the likelihood that they will return hundreds of
thousands of irrelevant responses to simple search requests. Remember, they
will return lengthy documents in which your keyword appears only once.
What are examples of search engines?
Here are some examples of search engines:
• Google.com - www.google.com
• AllTheWeb - www.alltheweb.com
• Altavista - av.com
• Hotbot - www.hotbot.com
• Excite - www.excite.com
So, when do I use search engines?
Search engines are best at finding unique keywords, phrases, quotes, and
information buried in the full-text of web pages. Because they index word by
word, search engines are also useful in retrieving tons of documents. Now,
this question leads us to another question "When do you NOT use search
engines?" Well, we have an alternative - Subject Directories.
What are subject directories?
Subject directories, unlike search engines, are created and maintained by
human editors, not electronic spiders or robots. The editors review and select
sites for inclusion in their directories on the basis of previously determined
selection criteria. The resources they list are usually annotated. Directories
tend to be smaller than search engine databases, typically indexing only the
home page or top level pages of a site. They may include a search engine for
searching their own directory (or the web, if a directory search yields
unsatisfactory or no results). Today, the line between subject directories and
search engines is blurring. Most subject directories have partnered with
search engines to query their databases and search the web for additional
sources, while search engines are acquiring subject directories or creating
their own.
How do subject directories work?
When you initiate a keyword search of a directory's contents, the directory
attempts to match your keywords and phrases with those in its written
descriptions. It's not far from the way a search engine works but instead of
searching spider-compiled databases, subject directories search human-compiled
databases. Subject directories come in assorted flavors. There are
general directories, academic directories, commercial directories and various
other categories.
What are the pros and cons of subject directories?
Directory editors typically organize directories hierarchically into browsable
subject categories and sub-categories. When you're clicking through several
subject layers to get to an actual Web page, this kind of organization may
appear cumbersome, but it is also the directory's strength. Because of the
human oversight maintained in subject directories, they usually deliver a
higher quality of content and fewer results out of context than search engines.
Unlike search engines, most directories do not compile databases of their
own. Instead of storing pages, they point to them. This situation sometimes
creates problems because, once accepted for inclusion in a directory, the Web
page could change content and the editors might not realize it. The directory
might continue to point to a page that has been moved or that no longer
exists. Dead links are a real problem for subject directories, as is a perceived
bias toward e-commerce sites.
What are examples of subject directories?
Here are some examples of subject directories:
• About.com - about.com
• Encyclopedia Britannica's Internet Guide - www.britannica.com
• Infomine: Scholarly Internet Resource Collections - infomine.ucr.edu
• Librarians' Index to the Internet - lii.org
• Looksmart - www.looksmart.com
• Yahoo! - www.yahoo.com
When do I use subject directories?
Subject directories are best for browsing and for searches of a more general
nature. They are good sources for information on popular topics,
organizations, commercial sites and products. When you'd like to see what
kind of information is available on the web in a particular field or area of
interest, go to a directory and browse through the subject categories.
Why won't this search engine just give me what I asked for?
At last, we've come down to this ever present question. Let's go back up a
bit. Well, all the way up actually, to "What are search engines?" You will see
that I have put the word "WORDS" in capitals and that wasn't a typing error.
I just want to stress the point that these search engines (and very often
subject directories) basically search for the WORDS that you key in. They will not
think for you or try to categorize what you have just typed in. If you typed in
the word "cats", you'd get EVERYTHING on cats, whether it's about
feeding cats, washing cats or copy cats. Any website containing the word
"cats" will be listed. In that case, you'll get hundreds of thousands of "hits"
or "matches". This would be really helpful if you wanted to know
EVERYTHING about something, but what if you wanted to narrow down
your search to a specific subject?
How do I narrow down my search?
The best tip is to be more specific. For example, if you
want to know about brands of cat shampoo, don't search for "keeping cats
clean"; type in "cat shampoo". Here are a few more tips to get you going.
They work for most search engines.
• Use the plus (+) and minus (-) signs in front of words to force
their inclusion and/or exclusion in searches.
EXAMPLE: +anorexia -bulimia
(NO space between the sign and the keyword)
• Use double quotation marks (" ") around phrases to ensure they are
searched exactly as is, with the words side by side in the same
order.
EXAMPLE: "computer technology in science fair"
(Do NOT put quotation marks around a single word.)
• Put your most important keywords first in the string.
EXAMPLE: +science projects computer
• Type keywords and phrases in lower case to find both lower and
upper case versions. Typing capital letters will usually return
only an exact match.
EXAMPLE: president retrieves both president and President
• Use truncation and wildcards (e.g., *) to look for variations in
spelling and word form.
EXAMPLE: librar* returns library, libraries, librarian, etc.
• Combine phrases with keywords, using the double quotes and the
plus (+) and/or minus (-) signs.
EXAMPLE: +"lung cancer" +bronchitis -smoking
(In this case, if you use a keyword with a + sign, you must put the
+ sign in front of the phrase as well. When searching for a phrase
alone, the + sign is not necessary.)
• When searching a document for your keyword(s), use the "find"
command on that page.
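For readers curious how an engine might interpret these signs, here is a minimal sketch in Python. It is my own simplified interpretation of the +, -, quotation mark and * rules above, not the parser of any actual search engine, and the function names are hypothetical. Matching is done in lower case, so it also ignores capitalisation.

```python
import re
from fnmatch import fnmatchcase

def parse_query(query):
    """Split a query into required (+), excluded (-) and optional terms."""
    required, excluded, optional = [], [], []
    # Each token: an optional sign, then a "quoted phrase" or a bare word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        term = (phrase or word).lower()
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def term_matches(term, text):
    words = re.findall(r"[a-z0-9]+", text.lower())
    if " " in term:
        # Phrase: the words must appear side by side in the same order.
        phrase = term.split()
        n = len(phrase)
        return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    # Single word; a trailing * works as a truncation wildcard.
    return any(fnmatchcase(w, term) for w in words)

def page_matches(query, text):
    required, excluded, optional = parse_query(query)
    if any(term_matches(t, text) for t in excluded):
        return False
    if not all(term_matches(t, text) for t in required):
        return False
    # With no required terms, at least one optional term must match.
    return bool(required) or any(term_matches(t, text) for t in optional)

print(parse_query('+"lung cancer" +bronchitis -smoking'))
print(page_matches("+anorexia -bulimia", "Anorexia and bulimia compared"))
print(page_matches("librar*", "Public libraries near you"))
```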
You may also use something called Boolean Searches. In order to use this,
the search engine must be Boolean enabled. Boolean operators, such as
AND, OR and NOT, are used to combine search sets in a variety of ways
and appear within Internet search engines in a range of disguises. A very
brief overview:
• Search phrase: cats and dogs
means find web pages in which both terms occur
• Search phrase: cats or dogs
means find web pages in which either term occurs
• Search phrase: cats not dogs
means find web pages in which the term cats appears but not dogs
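The three Boolean operators correspond neatly to set operations, as this small Python sketch shows. The table of terms and page names is made up for illustration; it stands in for whatever index a Boolean-enabled engine keeps.

```python
# Hypothetical table: each term maps to the set of pages containing it.
TERM_PAGES = {
    "cats": {"p1", "p2", "p3"},
    "dogs": {"p2", "p4"},
}

def boolean_search(left, operator, right, table=TERM_PAGES):
    a = table.get(left, set())
    b = table.get(right, set())
    op = operator.upper()
    if op == "AND":   # both terms occur
        return a & b
    if op == "OR":    # either term occurs
        return a | b
    if op == "NOT":   # first term occurs, second does not
        return a - b
    raise ValueError(f"unsupported operator: {operator}")

print(sorted(boolean_search("cats", "and", "dogs")))
print(sorted(boolean_search("cats", "or", "dogs")))
print(sorted(boolean_search("cats", "not", "dogs")))
```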
Google.com - an example of a search engine
Now that you know the basics of searching, let me introduce you to my
personal favourite search engine. Google was launched in 1999 by some
students at Stanford University and it has come to be my preferred choice. It
has a clean and uncluttered interface. Everything is placed there for a reason
and there are no irrelevant distractions. This straightforward, easy-to-use
engine is noted for its highly relevant results. It returns pages based on the
number of sites linking to them and how often they are visited, indicating
their popularity. Google also saves the last copy of each page it visits under
the "Cached" link. If you're seeking a page that no longer exists, you may
still be able to find a copy of it at Google. When searching for a specific Web
site, try Google's "I'm feeling lucky" button on the main search page. It's very
good at coming up with exactly what you're looking for.
Search options for Google
Main Search page supports:
• (+) sign
• (-) sign
• Double quotes (" ") for phrases
Advanced Search page supports:
• Boolean type searching with boxes (all = AND; any = OR;
without = NOT)
• Limiting results to different fields (text, title, URL) on a page
• Limiting by language, domain and content
• Displaying results from 10-100 per page
Other search options and features
• "I'm Feeling Lucky" button (goes directly to top-ranked site for
your query)
• "Similar pages" search (brings up list of related sites)
• Field searching options, by link: (type link:) by title (type
allintitle:), by URL (allinurl:)
• PDF files included in indexing
• Cached page archives (showing copy of page from the last
time Google indexed it)
• Results from related pages clustered by indentation
• Customized display options
• No truncation/wildcards/stemming
• No case sensitivity
• Universities (finds university sites)
• Apple/Macintosh (finds Mac sites)
Yahoo! - an example of a subject directory
Yahoo! is a human-compiled subject directory and commercial portal. It is
the oldest major directory on the web, launched in mid 1994, and is a good
starting point for information of general appeal. The Main Page contains
many links and has the inevitable "busy" appearance. The Advanced Search
page has a cleaner, less cluttered look. Recently Yahoo! partnered with
Google, in order to provide Web page matches for search terms falling
outside the realm of Yahoo! sites and categories.
Search options for Yahoo!
Main page supports:
• Yahoo's subject category searches (defaulting to Google for web
searches)
• (+) and (-) signs
• Double quotes (" ") for phrases
• Field searching of title (t:) and URL (u:)
Advanced Search Page (labeled "Search Options") supports:
• (+) and (-) signs
• Double quotes (" ") for phrases
• Field searching of title (t:) and URL (u:)
• Boolean-type searching with radio buttons (all = AND; any = OR;
"exact phrase match" and "Intelligent default")
• Yahoo subject category searches
• Google web searches
• Usenet newsgroups searches
Other search options and features
• Yahoo! News, for breaking news and headlines
• Topic and region-specific Yahoos!
• Automatic truncation, with wildcard (*)
• No case sensitivity
• No stop words
As of April 2002, the rankings of search engines and subject directories
according to search ratings conducted by "Spidap" are as listed:
• Biggest, Fastest: FAST (alltheweb)
Runner-up: Google
• Coolest, Easiest, Most Fun: Ask Jeeves
• Most Comprehensive Results: Google
• Highest Overall Usability Rating: Google
Runner-up: Yahoo
• Best Search Engine For Kids: Ask Jeeves For Kids
• Most Relevant Results: Google
Runner-up: AltaVista
• Most Likely to Find a Hit When Others Can't: Northern Light
Runner-up: AltaVista
Below are some suggested search engines for more specific information
needs:
High quality images
• Google image search - http://www.google.com/imghp?hl=en - most
comprehensive image search on the web.
• Ditto - http://www.ditto.com/ - purely an image search engine.
• Altavista image search - http://www.altavista.com/sites/search/simage -
offers the option to go directly to the directory category related to the
search term.
• Ixquick - http://ixquick.com/ - metasearch engine. It searches a number of
prominent engines simultaneously including Altavista, Fast search and
Yahoo.
A few good hits fast
• Google - http://www.google.com/ - fast search, large index.
• Ixquick - http://www.ixquick.com - metasearch using phrases, Boolean,
wildcards, capitals. Weighs value of hits by using major engines' top ten
results.
Quality, evaluated pathfinders prepared by a subject expert
• Pinakes - http://www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html - a
launchpad to major academic subject and multi-subject gateways.
• About.com - http://home.about.com/index.htm - screened and trained
volunteers create general-interest subject guides.
• WWW Virtual Library - http://vlib.org/Overview.html - worldwide volunteers
maintain oldest academic subject-organized catalog of links to full-text,
databases, and gateways.
• Argus Clearinghouse - http://www.clearinghouse.net/index.html - academic
guides identify, describe, and evaluate information resources. States criteria
for collection development.
• BUBL LINK / 5:15 - http://bubl.ac.uk/link/index.html - academic catalog
(European focus) organized by Dewey number and subject terms.
Balanced information from verified sources for a school research project to
take home
• Nueva's Library Catalog - http://libcat.nuevaschool.org/uhtbin/webcat/ - is
geared to the school's curriculum.
Biographical information
• Biographical Dictionary - http://www.s9.com/biography/ - search for quick
identification of a name.
• Biography.com - http://www.biography.com/search/ - search for
paragraph-length biographies.
• Lives - http://amillionlives.com/ - alphabetical links of biographies,
autobiographies, memoirs, diaries, letters, narratives and oral histories.
Perspectives from other countries and regions
• Abyz News Links - http://www.abyznewslinks.com/index.html - links to
international newspapers, news media, internet services, magazines, and
press agencies.
• World Press Review - http://www.worldpress.org/index.shtm - succinct
overviews of issues from international perspectives.
• Opinion-Pages - http://www.opinion-pages.org/ - searches editorials, opinions,
commentaries and columnists.
• News Directory - http://www.newsd.com/ - list of English-language media by
type (newspapers, magazines, television stations), and then by topic or
region.
Statistical data
• Statistical Information - http://nuevaschool.org/~debbie/library/cur/math/stats.html -
help page at Nueva
Primary academic sources
• Academic Info - http://www.academicinfo.net/index.html - browse by
subjects, or search by keywords, of digital collections offering unique online
content, including annotated subject directories.
Information found in presentations, spreadsheets and other formats
• Google Advanced Search - http://www.google.com/advanced_search -
includes Adobe Acrobat, Word documents, Excel spreadsheets, PowerPoint
presentations and Rich Text Format.
• Search Adobe PDF Online - http://searchpdf.adobe.com/ - see summary before
downloading.
Photos, art, designs, videos, music, noises, media types (Java, mp3) or file
extensions (gif)
• FAST Multimedia Search - http://multimedia.alltheweb.com/cgi-bin/advsearch
• Google Image Search - http://images.google.com/ - over 150
million digital images.
• ClassroomClipart - http://classroomclipart.com/ - browse categories suitable
for K-12.
• FindSounds - http://findsounds.com/ - locates sound effects and sample
sounds.
Free or inexpensive software
• CNET - http://shareware.cnet.com/ - meta search engine for shareware.
Geographical maps
• Cornell Digital Earth - http://atlas.geo.cornell.edu/webmap/ - interactive maps
displaying geological, geographical, and geophysical data.
• National Geographic Maps - http://plasma.nationalgeographic.com/mapmachine/ -
physical, political.
Translation assistants
• SYSTRANET for the Web - http://www.systransoft.com/Systranet.html
• WordReference for your Website - http://www.wordreference.com/forWebsites.htm -
immediately translate any word to another language by
double clicking on any selected word.
• Travlang Translating Dictionaries - http://dictionaries.travlang.com/ - provide
a variety of useful tools for
individuals interested in learning a foreign language.
Making pages available for offline viewing.
After locating some important materials, you might want to make a Web
page available offline so that you can read its content when your computer is
not connected to the Internet. For example, you can view Web pages on your
laptop computer when you don't have a network or Internet connection. Or
you might want to read Web pages anywhere without having to connect to a
phone line.
To make the current web page available offline
1. On the File menu, click Save As.
2. Double-click the folder you want to save the page in.
3. In the File name box, type a name for the page.
4. In the Save as type box, select a file type.
To make the current web page with all its links available offline
1. On the Favorites menu, click Add to Favorites.
2. Select the Make available offline check box.
3. To specify a schedule for updating that page, and how much content
to download, click Customize.
4. Follow the instructions on your screen.
Before you go offline, make sure you have the latest version of your
pages by clicking the Tools menu and then clicking Synchronize.
To make a remote web site with complete directory structure available
offline
Finally, let me introduce you to an impressive internet tool - "Websnake".
This tool is unique in that it uses an "intelligent pull" technology to search
and retrieve files from the World Wide Web. Websnake is able to copy a
remote website including the complete directory structure (inclusive of all
links) of the site to your hard drive for local viewing through a web browser.
A trial version of Websnake is available for download at
www.pcworld.com/downloads/file_download/0,fid,3782,fileidx,1,00.asp
References
http://nuevaschool.org/~debbie/library/research/adviceengine.html
http://searchenginewatch.com/links/major.html
http://www.niss.ac.uk/lis/search-engines.html
http://www.sc.edu/beaufort/library