Search Engines - A Figure of Speech, Inc.

A WHITE PAPER ON
SEARCH ENGINES
June, 2002
TABLE OF CONTENTS
Overview
Search Engines – Introduction
Crawler-Based Search Engines
Human-Powered Directories
"Hybrid Search Engines" Or Mixed Results
Search Engine Submission vs. Optimization
Submitting to Crawlers (Google, Inktomi, FAST, Teoma & AltaVista)
Submitting to Directories (i.e., Yahoo, LookSmart, and The Open Directory)
Submitting to Yahoo!
Submitting To The Open Directory
Other Quick Listings
Search Engine Submission Services
How Search Engines Rank Pages
Getting Ranked – Meta Tags
TITLE tag
META NAME="Keywords"
META NAME="Description"
Page Text
How to maximize your traffic for the least effort
In Conclusion
The Major Search Engines
Overview
“How can I get my site listed with the major search engines?” This is a common question
for companies that want prospects and customers to find their Web site through an online
search. Search engines might seem simple, but in reality they are very complicated. This
paper details how search engines work and how to rank at the top.
Note that some of the material in this white paper was found at
http://www.searchenginewatch.com/webmasters/submit.html and
http://hotwired.lycos.com/webmonkey/01/23/index1a.html?tw=e-business
Please visit these sites if you have additional questions or need further information.
The most important thing I tell my Web clients is not to rely on search engines to drive
traffic to their site, unless they are willing to set a budget and pay to rank. Proactive
marketing can prove to be more valuable and provide better results. Unless you are an
e-commerce company (where search engine optimization would be key because sales
revenues depend on Internet traffic), learn to proactively market your Web site and the
content you have provided in it. Basic steps would include putting your Web address on all
printed materials, including it in your voicemail/phone recordings, and placing links to your
site on partner or vendor sites. More aggressive PR/marketing would include TV, radio,
direct mail campaigns, advertising, promotional materials, etc.
Search Engines – Introduction
The term "search engine" is often used generically to describe both crawler-based search
engines and human-powered directories. These two types of search engines gather their
listings in radically different ways. For a listing of the major search engines, please refer to
the section "The Major Search Engines" at the end of this document.
Crawler-Based Search Engines
Crawler-based search engines (such as HotBot) create their listings automatically. They
"crawl" or "spider" the web, then people search through what they have found. If you
change your web pages, crawler-based search engines eventually find these changes, and
that can affect how you are listed.
Human-Powered Directories
A human-powered directory (such as Yahoo) depends on humans for its listings. You submit
a short description to the directory for your entire site, or editors write one for sites they
review. A search looks for matches only in the descriptions submitted. Changing your web
pages has no effect on your listing.
Note that the techniques that improve a listing with a crawler-based search engine have
nothing to do with improving a listing in a directory.
"Hybrid Search Engines" Or Mixed Results
In the Web's early days, a search engine presented either crawler-based results or
human-powered listings. Today, it is extremely common for both types of results to
be presented. Usually, a hybrid search engine will favor one type of listing over the other.
For example, Yahoo is more likely to present human-powered listings. However, it does also
present crawler-based results (as provided by Google), especially for more obscure queries.
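The mixed-results behavior described above can be illustrated with a minimal sketch. It assumes a directory that matches only on editor-written descriptions and a crawler index that matches on page text, with directory matches shown first. The function name, the sample URLs, and the simple substring matching are all hypothetical and are not how any real search engine is implemented.

```python
def search_hybrid(query, directory, crawled_pages):
    """Return directory matches first, then crawler matches not already listed.

    directory:     maps URL -> editor-written description (hypothetical data)
    crawled_pages: maps URL -> page text found by a crawler (hypothetical data)
    """
    term = query.lower()
    # A directory search looks only at the submitted descriptions.
    directory_hits = [url for url, description in directory.items()
                      if term in description.lower()]
    # A crawler search looks at the text of the pages themselves.
    crawler_hits = [url for url, text in crawled_pages.items()
                    if term in text.lower() and url not in directory_hits]
    return directory_hits + crawler_hits


# Made-up sample data, for illustration only.
directory = {
    "http://art.example": "Guide to modern art",
    "http://cars.example": "Car reviews",
}
crawled_pages = {
    "http://art.example": "Paintings and sculpture from the 20th century",
    "http://essay.example": "An essay on cubism and art",
}

print(search_hybrid("art", directory, crawled_pages))
print(search_hybrid("cubism", directory, crawled_pages))
```

In this sketch a general query such as "art" is answered from the directory first, while a more obscure query such as "cubism" falls through to the crawler results.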
Search Engine Submission vs. Optimization
As far as search engines go, submission and optimization (or maximization) are definitely
two different things. Getting the search engines to ‘know’ your pages are online is one
challenge, but having your site rank at the top takes a lot more.
Listing or submitting your site to the different search engines should result in some traffic
for your site. Many people opt to use the free listing services that search engines provide,
but paid submissions will speed up the listing process and almost certainly generate more
search engine related traffic for your web site. Given this, it is highly recommended that
any site owner establish a search engine submission budget, especially if the site is a
commercial site (note that some search engines/directories don’t allow free listings for
commercial sites).
Submitting to Crawlers (Google, Inktomi, FAST, Teoma & AltaVista)
Crawler-based search engines automatically visit web pages to compile their listings. This
means that, unlike directories, you are likely to have several if not many pages listed with
them.
Google, once considered a niche site for nerds, is the Wall Street Journal's pick for best
search engine on the Net, and the traffic numbers seem to agree. Inktomi, the number two
traffic generator, doesn't run its own search site. Instead, the company provides the
technology behind MSN Search and AOL Search, two top referrers, as well as Hotbot and
over a dozen more. Portal sites like Excite, Lycos, and AltaVista still draw lots of traffic, but
together Google and Inktomi outweigh the entire rest of the field.
Note that beyond Yahoo!, most search engine traffic comes from two places: Google and
Inktomi. Inktomi submission costs $39/year, but Inktomi ranks search results largely on the
links to your page from other domains, so if you don't have reciprocal links, you will most
likely rank at the bottom.
Submitting to Directories (i.e., Yahoo, LookSmart, and The Open Directory)
As mentioned above, directories are search engines powered by human beings. Human
editors compile all the listings that directories have. Getting listed with the Web's key
directories is very important, because many people see their listings.
In addition, if you are listed with them, then crawler-based search engines are more likely
to find your site and add it to their listings for free.
Submitting to Yahoo!
One of the web's oldest and most important directories is Yahoo! Getting listed with Yahoo is
absolutely essential to any site owner. Yahoo has two submission options: "Standard,"
which is free, and "Yahoo Express," which involves a submission fee.
Note that anyone can use Standard submission to submit for free to a non-commercial
category; however, commercial sites must pay the submission fee to get listed.
It is also important to note that Google signed a contract with Yahoo to be the secondary
search engine behind the Yahoo directory. What this means is that when
you search on Yahoo, its directory listings come up first. The results then default to the
secondary search engine, which is now Google (this role used to be filled by Inktomi's
database).
The flat $300 annual fee that Yahoo charges should easily pay for itself in traffic for most
people, especially when compared to other paid listing programs. Note that this fee doesn't
guarantee that you will be listed, only that you'll get a yes or no answer about being
accepted within seven business days (note that the vast majority of decent sites are
accepted).
As mentioned above, submitting your site with Yahoo ($300/year) should be the bare
minimum. According to one specialist, the Yahoo directory accounts for half the traffic
referred to most sites. So get your site listed on Yahoo, and your traffic can literally double
overnight.
Submitting to LookSmart
Another important directory is LookSmart. This is because LookSmart provides the main
listings used by the popular MSN Search service. LookSmart's listings are also distributed
to other search engines. As with Yahoo, getting listed with LookSmart is essential for any
site owner. As with Yahoo, LookSmart has a free submit option for its non-commercial
categories and a paid option for its commercial ones.
Submitting To The Open Directory
The Open Directory is a volunteer-built guide to the web. It provides the main results to
Netscape Search and powers the Google Directory. It also powers some results for a variety
of other services. Given this, being listed with The Open Directory is essential to any site
owner.
The good news about submission with The Open Directory is that it’s absolutely free. The
bad news is that this means there's no guaranteed turnaround time to getting a yes or no
answer about whether you've been accepted.
Other Quick Listings
If you want to be listed with the search engines quickly, you'll need to budget for the
following fees. By paying an "inclusion" fee to some of the crawler-based search engines,
you can shorten the usual month-long delay before appearing to only a few days. The fees
shown below are charged by the major crawlers that offer such programs:
Crawler               Budget
Inktomi               $40
AltaVista             $40
FAST (via Lycos)      $30
Ask Jeeves/Teoma      $30
Total                 $140
Another key option to getting listed faster is to consider using paid listing programs. The
budget below will get you going for at least a month, in most cases. For LookSmart, it will
cover you for a year.
Paid Listings         Budget
LookSmart             $230
Overture (GoTo)       $50
Google                $25
FindWhat              $25
Total                 $330
Now let's put it all together. Here's the ideal amount you would budget, if you want to show
up in the widest range of important search engines within a matter of days:
Search Engine Submission    Budget
Yahoo                       $300
Crawlers                    $140
Paid Listings               $330
Again, you might be able to get listed without spending a penny; however, if your goal is to
be seen right away in as many places as possible, it will cost you. At the very least, it is
highly recommended to budget enough to cover Yahoo.
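As a quick check of the figures above, the following sketch adds up the fees quoted in this paper. The dollar amounts come from the tables; the variable and function names are only illustrative.

```python
# Fees in US dollars, as quoted in the tables above.
crawler_fees = {
    "Inktomi": 40,
    "AltaVista": 40,
    "FAST (via Lycos)": 30,
    "Ask Jeeves/Teoma": 30,
}
paid_listing_fees = {
    "LookSmart": 230,
    "Overture (GoTo)": 50,
    "Google": 25,
    "FindWhat": 25,
}
YAHOO_FEE = 300


def total(fees):
    """Sum the values of a name -> fee mapping."""
    return sum(fees.values())


crawler_total = total(crawler_fees)
paid_total = total(paid_listing_fees)
overall_budget = YAHOO_FEE + crawler_total + paid_total

print("Crawlers:", crawler_total)
print("Paid listings:", paid_total)
print("Overall budget:", overall_budget)
```

With these figures the three categories come to $770 in total, of which Yahoo alone accounts for $300.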
Search Engine Submission Services
There are many automated search engine submission services that you can use to submit
your site to as many search engines as possible. One recommendation is Submit It, an early
player that did so well, Microsoft bought them — Submit It is now part of MSN bCentral, and
it charges a minimum fee of US$59 to keep a few URLs submitted for a year.
Submit It does submit your site to the busiest directory sites, except for the biggies: Yahoo,
LookSmart (which MSN serves under its logo), and the Open Directory Project (which
powers Lycos, Hotbot, and Netcenter categories). Some of these directories charge for
submission, but $400-500 total will get your most important pages into the most trafficked
places.
Once you've submitted your pages, be ready to wait a month, two, or three before they're
crawled and indexed. It's frustrating, but processing a billion Web pages takes time — at a
nonstop rate of one hundred per second, it would still take almost four months.
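The four-month estimate can be reproduced with simple arithmetic. The sketch below uses the page count and crawl rate from the text and assumes an average month of 30 days; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: an average month of 30 days


def crawl_days(pages, pages_per_second):
    """Days needed to process `pages` at a constant, nonstop rate."""
    return pages / pages_per_second / SECONDS_PER_DAY


days = crawl_days(1_000_000_000, 100)
months = days / DAYS_PER_MONTH

print(f"{days:.1f} days, or about {months:.1f} months")
```

One billion pages at one hundred pages per second is ten million seconds, which works out to roughly 116 days.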
How Search Engines Rank Pages
How do crawler-based search engines go about determining relevancy, when confronted
with billions of web pages to sort through? They follow a set of rules, known as algorithms.
Exactly how a particular search engine's algorithm works is a closely-kept trade secret.
However, all major search engines follow some of the same general rules, including:
• location and frequency of keywords on a web page
• meta tags (keywords and description)
• link analysis (how pages link to each other)
• click-through measurement (what results are chosen more often)
It is definitely important to add meta description and meta keyword tags to your web pages.
Some search engines will give you a boost if you have them. But don't expect that to
necessarily be enough to put you in the top ten. Meta tags are mainly a design element you
can tap into, a crutch that helps information-poor pages get better recognized by the
search engines.
Getting Ranked – Meta Tags
Most people who are concerned with search engine optimization focus obsessively on
keywords and HTML tags. But when it comes to getting ranked by search engines, the only
tags that matter are the TITLE tag and the Keywords and Description meta tags. And you have to
be very careful about how you handle each one.
TITLE tag
The Title tag makes a big difference, especially with Google. It should be short (fewer than 40
characters seems to work best) and, most importantly, should match the search queries
people will be using to find your site. This could lead to a struggle with the marketing
managers: They'll want your site's page titles to contain the company name and/or a
positioning statement. Ask them what good that will do if no one ever sees the pages.
This is a good TITLE tag that will generate traffic from people searching for "picasso":
<TITLE>Pablo Picasso</TITLE>
This is a mediocre one:
<TITLE>Artstuff: Pablo Picasso</TITLE>
This one will put you out of business:
<TITLE>Artstuff: Your Number One Online Resource for Fine Art Solutions!!!</TITLE>
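The two rules of thumb for titles (keep it under 40 characters and make it match the search query) can be turned into a rough self-check. The sketch below is only an illustration: the function name and messages are made up, and the 40-character limit is the guideline from this paper, not a published search engine rule.

```python
MAX_TITLE_LENGTH = 40  # rule of thumb from this paper, not an official limit


def check_title(title, query):
    """Return a list of problems found in a page title for a given query."""
    problems = []
    if len(title) >= MAX_TITLE_LENGTH:
        problems.append(f"title is {len(title)} characters; aim for fewer than {MAX_TITLE_LENGTH}")
    if query.lower() not in title.lower():
        problems.append(f"title does not contain the query '{query}'")
    return problems


titles = [
    "Pablo Picasso",
    "Artstuff: Pablo Picasso",
    "Artstuff: Your Number One Online Resource for Fine Art Solutions!!!",
]
for title in titles:
    print(title, "->", check_title(title, "picasso") or "no problems found")
```

A check this simple flags the third title on both counts but cannot tell the good title from the mediocre one, so it is a first filter rather than a substitute for judgment.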
META NAME="Keywords"
Keyword spamming is the favorite trick for search engine optimization. But
many of the sites that stuff a zillion keywords into their pages are hoping to get clicks to
their pages just to show ads, and they don't care if they get any repeat business. If you
want to draw real customers, focus on the keywords you think your users will be searching
for.
For our Picasso page, something like this would work (note that capitalization doesn't
matter):
<META NAME="keywords" content="Pablo Picasso, Pablo, Picasso, painting, cubist,
painting, ceramics, collage, Spain, Guernica, Paris, 20th century, Girl Before a Mirror">
Repeating the most important keyword twice seems to work with some search engines, but
repeating more than that will cause some of them to ignore the whole page. Although none
of the representatives from the search companies would confirm specific behavior, it seems
that they tend to ignore keyword lists longer than 1024 characters.
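The two guidelines above (repeat a keyword at most twice, and keep the list within 1024 characters) can be checked with a short script. This is a sketch under the assumptions stated in this paper; the search companies have not confirmed these thresholds, and the function and constant names are illustrative.

```python
from collections import Counter

MAX_KEYWORDS_LENGTH = 1024  # apparent cutoff described above, unconfirmed
MAX_REPEATS = 2             # repeating a keyword twice seems to be tolerated


def check_keywords(content):
    """Report the length of a keywords list and any keyword repeated too often."""
    # Keywords are comma-separated; case is ignored when counting repeats.
    keywords = [k.strip().lower() for k in content.split(",") if k.strip()]
    counts = Counter(keywords)
    overused = sorted(k for k, n in counts.items() if n > MAX_REPEATS)
    return {
        "length": len(content),
        "too_long": len(content) > MAX_KEYWORDS_LENGTH,
        "overused": overused,
    }


picasso_keywords = (
    "Pablo Picasso, Pablo, Picasso, painting, cubist, painting, ceramics, "
    "collage, Spain, Guernica, Paris, 20th century, Girl Before a Mirror"
)
print(check_keywords(picasso_keywords))
print(check_keywords("art, art, art, picasso"))
```

The Picasso list passes both checks, since "painting" appears only twice, while the second list is flagged for repeating "art" three times.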
META NAME="Description"
This field gets used for the page summary on Inktomi and some other engines, so don't
cram it with keywords: A scary-looking description on a search engine's results page could
discourage people from clicking through to your page, even if it scores high.
Page Text
It never hurts to have the search terms you want to match near the top of the page. But
cramming in a list of spam-style keywords can also backfire — Google will display them
under the page title on its results page, and Inktomi will show them (as do many others) if
there is no Description tag.
Stuffing long strings of repeated keywords into pages used to magically get them to the top
of search engine results, but that was before the search engineers realized what was going
on and learned how to prevent this from happening. Once in a while you'll see a
"spamdexed" page near the top of your results, but this trick works less and less frequently
these days.
It is important to note that some search engines index more web pages than others. Some
search engines also index web pages more often than others. The result is that no two search
engines have exactly the same collection of web pages to search through. That naturally
produces differences when comparing their results.
How to maximize your traffic for the least effort
• Get yourself into Yahoo's directory.
• Make sure your site is thoroughly crawled by Google and Inktomi.
• Get lots of links to your site from domains that a lot of other sites link to, since that's
  how Google and Inktomi determine relevance when ranking search results.
For all other search engines, implement a blanket strategy that gets you reasonable results.
By not chasing each one of them separately, you can put your company's time and money
to more important uses.
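To show why links from well-linked domains matter, here is a toy scoring sketch. It is not the actual Google or Inktomi algorithm, which is a trade secret. It assumes that each inbound link counts as one vote, plus one extra point for every domain that links to the linking domain. All function names and domains are hypothetical.

```python
def inbound_link_counts(links):
    """Count the distinct domains linking to each target domain."""
    sources = {}
    for source, target in links:
        if source != target:  # ignore a domain linking to itself
            sources.setdefault(target, set()).add(source)
    return {target: len(linkers) for target, linkers in sources.items()}


def weighted_link_scores(links):
    """Score each domain: one point per inbound link, plus the linker's own inbound count."""
    counts = inbound_link_counts(links)
    scores = {}
    for source, target in set(links):  # set() drops duplicate links
        if source != target:
            scores[target] = scores.get(target, 0) + 1 + counts.get(source, 0)
    return scores


# Made-up link data: (linking domain, linked-to domain).
links = [
    ("hub.example", "yoursite.example"),
    ("blog.example", "hub.example"),
    ("news.example", "hub.example"),
    ("tiny.example", "rival.example"),
]
print(weighted_link_scores(links))
```

In this example "yoursite.example" and "rival.example" each have one inbound link, but the link from the well-linked "hub.example" gives the first site the higher score.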
In Conclusion
Developing a budget is important and so is monitoring traffic (so that you can determine if
paying to list is giving you a positive ROI or if other options should be reviewed).
Offline businesses all have basic start-up costs that must be met, such as business licenses,
phone bills, Yellow Pages ads and so on. For online businesses, directory submission fees
should also be considered basic start-up costs, just as domain name registration and web
hosting fees are a crucial part of anyone's budget.
The Major Search Engines
AllTheWeb.com (FAST Search)
http://www.alltheweb.com
AllTheWeb.com (also known as FAST Search) consistently has one of the largest indexes
of the web. FAST also offers large multimedia and mobile/wireless web indexes,
available from its site. The site is a showcase for FAST's
search technologies. FAST's results are provided to numerous portals, including those
run by Terra Lycos. FAST Search launched in May 1999.
AltaVista
http://www.altavista.com
AltaVista is one of the oldest crawler-based search engines on the web. It has a large
index of web pages and a wide range of power searching commands. It also offers news
search, shopping search and multimedia search. AltaVista opened in December 1995. It
was owned by Digital, then run by Compaq (which purchased Digital in 1998), then spun
off into a separate company that is now controlled by CMGI.
AOL Search
http://search.aol.com/
AOL Search allows its members to search across the web and AOL's own content from
one place. The "external" version, listed above, does not list AOL content. The main
listings for categories and web sites come from the Open Directory (see below). Inktomi
(see below) also provides crawler-based results, as backup to the directory information.
Ask Jeeves
http://www.askjeeves.com
Ask Jeeves is a human-powered search service that aims to direct you to the exact page
that answers your question.
Google
http://www.google.com
Google is a top choice for web searchers. It offers the largest collection of web pages of
any crawler-based search engine. Google makes heavy use of link analysis as a primary
way to rank these pages. This can be especially helpful in finding good sites in response
to general searches such as "cars" and "travel," because users across the web have in
essence voted for good sites by linking to them. The system works so well that Google
has gained widespread praise for its high relevancy. Google provides web page search
results to a variety of partners, including Yahoo and Netscape Search (see below).
Google also provides the ability to search images, Usenet discussions, and its
own version of the Open Directory (see below).
HotBot
http://www.hotbot.com
In most cases, HotBot's first page of results comes from the Direct Hit service,
and then secondary results come from the Inktomi search engine, which is also
used by other services. It gets its directory information from the Open Directory project
(see below). HotBot launched in May 1996 as Wired Digital's entry into the search
engine market. Lycos purchased Wired Digital in October 1998 and continues to run
HotBot as a separate search service.
iWon
http://www.iwon.com
iWon's results come from both Overture and Inktomi. iWon gives away daily, weekly and
monthly prizes in a marketing model unique among the major services. It launched in
Fall 1999.
Inktomi
http://www.inktomi.com
Originally, there was an Inktomi search engine at UC Berkeley. The creators then formed
their own company with the same name and created a new Inktomi index, which was
first used to power HotBot. Now the Inktomi index also powers several other services.
All of them tap into the same index, though results may be slightly different. This is
because Inktomi provides ways for its partners to use a common index yet distinguish
themselves. There is no way to query the Inktomi index directly, as it is only made
available through Inktomi's partners with whatever filters and ranking tweaks they may
apply.
LookSmart
http://www.looksmart.com
LookSmart is a human-compiled directory of web sites. In addition to being a standalone service, LookSmart provides directory results to MSN Search, Excite and many
other partners. Inktomi provides LookSmart with search results when a search fails to
find a match from among LookSmart's reviews. LookSmart launched independently in
October 1996, was backed by Reader's Digest for about a year, and then company
executives bought back control of the service.
Lycos
http://www.lycos.com
Lycos started out as a search engine, depending on listings that came from spidering the
web. In April 1999, it shifted to a directory model similar to Yahoo. Its main listings
come from AllTheWeb.com with some results from the Open Directory project. In
October 1998, Lycos acquired the competing HotBot search service, which continues to
be run separately.
MSN Search
http://search.msn.com
Microsoft's MSN Search service is a LookSmart-powered directory of web sites, with
secondary results that come from Inktomi. Direct Hit data is also made available.
Netscape Search
http://search.netscape.com
Netscape Search's results come primarily from the Open Directory and Netscape's own
"Smart Browsing" database, which does an excellent job of listing "official" web sites.
Secondary results come from Google. At the Netscape Netcenter portal site, other search
engines are also featured.
Open Directory
http://dmoz.org/
The Open Directory uses volunteer editors to catalog the web. Formerly known as
NewHoo, it was launched in June 1998. It was acquired by Netscape in November 1998,
and the company pledged that anyone would be able to use information from the
directory through an open license arrangement. Netscape itself was the first licensee.
Netscape-owner AOL also uses Open Directory information, as do Google and Lycos.
Yahoo
http://www.yahoo.com
Yahoo is the web's most popular search service and has a well-deserved reputation for
helping people find information easily. The secret to Yahoo's success is human beings. It
is the largest human-compiled guide to the web, employing about 150 editors in an
effort to categorize the web. Yahoo has well over 1 million sites listed. Yahoo also
supplements its results with those from Google. If a search fails to find a match within
Yahoo's own listings, then matches from Google are displayed. Google matches also
appear after all Yahoo matches have first been shown. Yahoo is the oldest major web
site directory, having launched in late 1994.