SEEM4540 Open Systems for E-Commerce

Lecture 10 – SEO
SEO
SEO – Search Engine Optimization
The process of improving the visibility of a website in search
engine results.
The goal is to get traffic from search engines.
It is about keywords and links.
Although the idea is simple, putting SEO concepts to work is
extremely difficult.
Copyright (c) 2012. Gabriel Fung. All rights reserved.
How Does a Search Engine Work?
Although nobody (except the creators) knows exactly how a
search engine calculates the popularity, reputation, similarity,
etc. of a web page, the general principles are much the same
across search engines.
Getting a site to the top of a search engine eventually comes
down to the text content of your pages and your external and
internal links. Two important concepts here:
The relevance of a web page to a search query
Related technique: tf-idf scheme
The trustworthiness of a web page, which lets it rank high
Related technique: page content + network graph
Bear In Mind That…
Don't think YOU can fool a search engine…
A search engine company has hundreds or even thousands of
geniuses, geeks and experts working for it, and you have only
yourself (or a team of fewer than 5 people…)
If you can search the Internet for a "tactic" that boosts your site
ranking, then the employees of a search engine company can
find that tactic too, and they will soon close that "hole".
And then, if you continue to use the blocked tactic, your site will
most likely rank lower…
Not To Do – Buy Traffic Other Than PPC
If a search engine catches you trying to "modify your rank",
it will classify you as a spammer, and your site will be
penalized.
You can buy pay-per-click (PPC) advertising from the search
engine itself.
However, search engines do not seem to like traffic that is
bought elsewhere.
E.g., traffic sold on eBay, Taobao, etc.
Not To Do – Unrelated Link
Try to avoid having links that are not related to your web
page or your web site.
Most search engines try to classify your web page based on
its links!
Not To Do – Others
Do not build a site in Flash
Do not build a site with frames
Do not have too many links (whether inbound or outbound)
Do not hide large amounts of text from visitors while showing
it to search engines
To Do – Have A Nice Page Title
The page title is the most important factor for SEO:
Keywords in page titles may help your pages rank higher.
Make sure a page title is highly relevant to the page content.
Some people suggest that a page title should be no more than 12
words, and that the first 8 words should contain one or more of
the important keywords.
SEO is not branding. The company name should be placed at the
end, not at the beginning.
Make it a keyword phrase that is human readable, rather than
just a set of independent keywords.
If possible, make every page title on your web site unique.
To Do – Have A Nice Page Title (cont'd)
The page title is also often shown by search engines on the
search results pages.
To make sure most search engines display your full page title
on the search results pages, keep the title to no more than
65 characters.
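The title guidelines from the last two slides can be turned into a small automated check. The following is only a sketch: the function name `check_title` and its parameters are made up for illustration, and the thresholds (65 characters, 12 words, a keyword within the first 8 words) are simply the rules of thumb quoted above, not limits published by any search engine.

```python
def check_title(title, keywords, max_chars=65, max_words=12, lead_words=8):
    """Return a list of warnings for a page title; an empty list means no issue found."""
    warnings = []
    words = title.split()
    if len(title) > max_chars:
        warnings.append(f"title has {len(title)} characters (limit {max_chars})")
    if len(words) > max_words:
        warnings.append(f"title has {len(words)} words (limit {max_words})")
    # Look for at least one important keyword near the start of the title.
    lead = " ".join(words[:lead_words]).lower()
    if not any(keyword.lower() in lead for keyword in keywords):
        warnings.append(f"no keyword in the first {lead_words} words")
    return warnings


# Hypothetical title: keyword phrase first, company name at the end.
print(check_title("Handmade Leather Wallets for Men | Example Co.", ["leather wallets"]))
```

A check like this cannot tell whether a title is human readable or relevant to the page; those points still need a person.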
To Do – Have a Nice Description
Some search engines look at the information in the "description"
metadata:
Make sure it is human readable, not just a set of keywords.
Make sure every page has a unique page description.
Make sure it is written by a human, not generated by a machine.
Google seems to put very little emphasis on this data, whereas
Yahoo and Bing seem to give it much more weight.
Although there is no length limit, a general practice is to limit
the description to 256 characters.
A more "realistic" limit is 156 characters. Most search engines
give more weight to the first 156 characters.
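The uniqueness and length rules for descriptions can also be checked mechanically. This is a minimal sketch under stated assumptions: `check_descriptions` is a hypothetical helper, the input is assumed to be a dictionary from page URL to description text that you have already collected, and 156 and 256 are the rule-of-thumb limits from this slide.

```python
from collections import Counter


def check_descriptions(pages, soft_limit=156, hard_limit=256):
    """pages maps URL -> meta description text. Returns URL -> list of warnings."""
    # Count how often each description occurs across the site, ignoring case.
    counts = Counter(desc.strip().lower() for desc in pages.values())
    report = {}
    for url, desc in pages.items():
        warnings = []
        if not desc.strip():
            warnings.append("missing description")
        elif counts[desc.strip().lower()] > 1:
            warnings.append("description is not unique")
        if len(desc) > hard_limit:
            warnings.append(f"longer than {hard_limit} characters")
        elif len(desc) > soft_limit:
            warnings.append(f"key message should fit in the first {soft_limit} characters")
        report[url] = warnings
    return report
```

Whether a description reads as if a human wrote it is not something this check can judge.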
To Do – Original Content
Most search engines give a higher ranking to content they have
not found before.
If you copy some content from another web page X, this most
likely means (not only to search engines, but also to human
beings) that the content you copied is valuable (otherwise, you
wouldn't copy it, right?)
As a result, page X should receive a higher ranking than yours
(as the content is useful and X is the original author)
Pretty obvious and true. Right?
If possible, make sure your site does not have duplicate content
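One simple way to look for duplicate content inside your own site is to compare a hash of each page's text. The sketch below is an assumption-laden illustration, not how a search engine detects duplicates: `find_duplicates` is a hypothetical helper, it expects the visible text of each page to be extracted already, and it only finds pages that are identical after whitespace and case are normalized (near-duplicates are not detected).

```python
import hashlib
import re
from collections import defaultdict


def find_duplicates(pages):
    """pages maps URL -> visible text. Returns groups of URLs with identical normalized text."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and ignore case so trivial differences do not hide a copy.
        normalized = re.sub(r"\s+", " ", text).strip().lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]
```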
To Do – Get Backlinks
If possible, try to get a domain authority (a.k.a. online business
authority) to index your site or to link to some pages of
your site.
A link from a domain authority back to your site is sometimes
referred to as a trusted backlink.
Examples of authority domains include Wikipedia, the W3C,
Apple, Windows, etc., i.e., very well-established web sites.
The more trusted backlinks you have, the more trust a search
engine will have in your site. You need multiple trusted
backlinks from multiple domain authorities to get the most
from a search engine.
To Do – Get Backlinks (cont'd)
Try to get trusted backlinks that sit within page text and carry
keywords in the link text
A link to a page with actual words as the link text helps a great
deal in all search engines.
E.g., "Information Systems Management" as opposed to
"http://www.se.cuhk.edu.hk/~seem3490/" or "click here".
Make sure the trusted backlinks are not in, for example, the
sidebar or the bottom of a page with many links in those areas,
the banner areas, etc.
Ensure the links are not "machine generated"
Get trusted backlinks from pages that, in turn, have a lot of
trusted backlinks linking to them.
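The difference between descriptive link text, a bare URL and "click here" can be illustrated with a small classifier. This is only a sketch: `anchor_text_quality` and the `GENERIC` phrase list are invented for illustration, and the list is deliberately short rather than exhaustive.

```python
import re

# Illustrative list of link texts that say nothing about the target page.
GENERIC = {"click here", "here", "read more", "more", "link"}


def anchor_text_quality(text):
    """Classify link text as 'empty', 'bare URL', 'generic' or 'descriptive'."""
    cleaned = text.strip().lower()
    if not cleaned:
        return "empty"
    if re.match(r"(https?://|www\.)", cleaned):
        return "bare URL"
    if cleaned in GENERIC:
        return "generic"
    return "descriptive"


for text in ["Information Systems Management",
             "http://www.se.cuhk.edu.hk/~seem3490/",
             "click here"]:
    print(text, "->", anchor_text_quality(text))
```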
To Do – Have Internal Links
Within your own web site, consider linking to your other
pages from within the text, with descriptive words as the link text
Make sure you type the links manually.
Most search engines do not like machine-generated links.
A search engine will decide which page on your web site is the
most important or most relevant. You should help search
engines by linking to your important pages.
To Do – Make Sure Every Link is Visible
Search engines crawl your entire site by following all the
links on your site (most likely starting from the links on the index
page), just like a human being would by clicking on the links on
your pages.
Make sure links are easily understandable by a machine. Do
not use very complicated scripts to hide or manipulate the
links.
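To see why script-driven links are a problem, it helps to look at a page the way a very simple crawler would. The sketch below uses Python's standard `html.parser` module to collect only plain `<a href="...">` links; the class `LinkCollector`, the function `visible_links` and the sample page are made up for illustration. Real search engine crawlers are far more sophisticated and may execute some JavaScript, so treat this as a worst-case view rather than a description of any particular engine.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, which is what a naive crawler would follow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Skip links that only work by running a script.
            if href and not href.lower().startswith("javascript:"):
                self.links.append(href)


def visible_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


page = """
<a href="/products/">Products</a>
<span onclick="location.href='/hidden/'">Hidden</span>
<a href="javascript:go('/also-hidden/')">Also hidden</a>
"""
print(visible_links(page))  # only the first link is found
```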
To Do – Make Sure Every Link is Valid
Think about it: if a page has many invalid links and you are the
user browsing that page, your user experience will not
be good, right?
The same holds for the "robot experience" of the search
engine.
If a page has broken links, its ranking will most likely suffer.
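Broken links can be found with a short script that requests every link and looks at the HTTP status code. This is a minimal sketch using only the standard library; `http_status` and `broken_links` are hypothetical helper names, and treating any status of 400 or above (or no response at all) as "broken" is an assumption made here for simplicity.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        # HEAD avoids downloading the body; some servers reject it, so GET may be needed.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with an error status such as 404
    except (ValueError, OSError):
        return None  # malformed URL, DNS failure, timeout, refused connection, ...


def broken_links(urls, get_status=http_status):
    """Return (url, status) pairs for links that fail or answer with status >= 400."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Passing the status function as a parameter keeps `broken_links` easy to try out without network access, for example with a dictionary of canned status codes.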
To Do – Use the ALT Attribute (Wisely)
In HTML, "alt" is an attribute meaning "alternative text". Use
it for non-text elements such as images.
Similar rules apply as for the "description" metadata (e.g.,
unique, short, etc.)
Do not over-optimize it! If you put too much information in it,
your page may in fact be penalized.
To Do – Use the ALT Attribute (Wisely) (cont'd)
John Mu from Google:
The alt attribute should be used to describe the image. So if you
have an image of a big blue pineapple chair, you should use the alt
text that best describes it, which is alt="big blue pineapple chair".
The title attribute should be used when the image is a hyperlink to
a specific page. The title attribute should contain information
about what will happen when you click on the image. For
example, if the image will get larger, it should read something
like title="View a larger version of the big blue pineapple chair
image".
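The alt advice can be checked across a page with a short audit script. The sketch below is illustrative only: `ImageAudit` and `audit_images` are hypothetical names, and the 100-character threshold for "too long" is an arbitrary assumption, since the slides only say the text should be short and not over-optimized.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Records <img> tags whose alt text is missing or suspiciously long."""

    def __init__(self, max_alt_chars=100):  # threshold is an assumption, not a standard
        super().__init__()
        self.max_alt_chars = max_alt_chars
        self.issues = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.issues.append((src, "missing alt"))
        elif len(alt) > self.max_alt_chars:
            self.issues.append((src, "alt text too long"))


def audit_images(html):
    parser = ImageAudit()
    parser.feed(html)
    return parser.issues


print(audit_images('<img src="chair.jpg" alt="big blue pineapple chair"><img src="logo.png">'))
```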
To Do – Your Purpose
Most well-known search engines classify your site
(automatically) when they crawl and index it
Quite obviously, this auto-classification has a huge impact
on the rankings.
Make sure the purpose of your web site can be easily
understood (at least by human beings)
To Do – Authorship
Google supports an "authorship" tag. It seems that Google may
rank your site higher if it has proper authorship:
If you have a Google profile (or Google Plus), you can implement
this to get a more eye-catching listing in Google results
To Do – Optimize the Existing HTML Elements
Try to make use of the existing HTML elements.
Search engines (e.g. Google, Bing, Yahoo) place higher
value on the text inside tags such as <h1></h1> (title) and
<h2></h2> (subtitle).
<p> and <ul> may be important as well.
Make sure every page has an <h1></h1> (title).
Technically, this is always possible no matter what the look and
feel of your web page is. You can always modify the look and
feel of <h1> using CSS styles.
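Whether every page really has an <h1> can be verified with a few lines of code. The following sketch collects the text of <h1> and <h2> elements with the standard `html.parser` module; `HeadingCollector` and `h1_titles` are hypothetical names, and nested headings are not handled.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (tag, text) pairs for every <h1> and <h2> on a page."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # [tag, text so far] while inside a heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._current = [tag, ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[1] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            self.headings.append((tag, self._current[1].strip()))
            self._current = None


def h1_titles(html):
    parser = HeadingCollector()
    parser.feed(html)
    return [text for tag, text in parser.headings if tag == "h1"]


print(h1_titles('<h1 class="hero">SEO <em>Basics</em></h1><h2>Titles</h2>'))
```

An empty result means the page has no <h1>; the class attribute in the example shows that styling the heading does not affect the check.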
To Do – Others
Try to understand and make good use of the "robots" meta tag:
<meta name="robots" content="index, follow">
<meta name="robots" content="noindex, nofollow">
<meta name="robots" content="noarchive">
:
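The meaning of the three examples above can be spelled out in code. This sketch interprets the content attribute of a robots meta tag; `parse_robots_meta` is a hypothetical helper, it covers only the directives shown here plus "none" (shorthand for "noindex, nofollow"), and it assumes the usual default that indexing and following are allowed unless stated otherwise.

```python
def parse_robots_meta(content):
    """Interpret the content attribute of <meta name="robots" content="...">."""
    directives = {part.strip().lower() for part in content.split(",") if part.strip()}
    return {
        # May the page appear in the index?
        "index": "noindex" not in directives and "none" not in directives,
        # May the crawler follow the links on the page?
        "follow": "nofollow" not in directives and "none" not in directives,
        # May the search engine show a cached copy?
        "archive": "noarchive" not in directives,
    }


for content in ["index, follow", "noindex, nofollow", "noarchive"]:
    print(content, "->", parse_robots_meta(content))
```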
Try to use URL rewriting.
Instead of "http://xxxxx/?page_id=8", try to have
"http://xxxxx/article/seo_optimization/"
It means more to the search engines
It is quite technical. In Apache, it is configured in ".htaccess"
(hypertext access) files
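The idea behind URL rewriting is a mapping from the readable URL that visitors and search engines see to the internal URL that the site actually serves. In Apache this mapping is written as rewrite rules; the Python sketch below only illustrates the mapping itself. The `PAGE_IDS` table, the `/article/<slug>/` pattern and the `rewrite` function are assumptions made for this example.

```python
import re

# Hypothetical lookup table: readable slug -> internal page id.
PAGE_IDS = {"seo_optimization": 8}

ARTICLE_URL = re.compile(r"^/article/(?P<slug>[a-z0-9_-]+)/?$")


def rewrite(path):
    """Map a readable path to the internal query-string path; leave other paths alone."""
    match = ARTICLE_URL.match(path)
    if match and match.group("slug") in PAGE_IDS:
        page_id = PAGE_IDS[match.group("slug")]
        return f"/?page_id={page_id}"
    return path


print(rewrite("/article/seo_optimization/"))
```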
To Do – Others (cont'd)
Make sure the www version and the non-www version of your
site are identical!
E.g., http://www.cuhk.edu.hk and http://cuhk.edu.hk should
have the same content (although, at the time of writing this
lecture, this is not the case for this particular site…)
Importance of Your Keywords
In general, every word on your website has a score!
When a user inputs a query, the web sites can then be
ranked based on their "overall scores".
Certainly, except for the search engine creators, no one knows
how they compute an "overall score". Yet, in general, we know
that they follow an important scheme called the "tf-idf scheme".
tf: term frequency
idf: inverse document frequency
http://www.tfidf.com/
TF-IDF Scheme
TF-IDF stands for "Term Frequency, Inverse Document
Frequency". It is a way to score the importance of words (or
"terms") in a document based on how frequently they appear
in that document and across multiple documents.
Intuitively...
If a word appears frequently in a document, it's important. Give
the word a high score.
But if a word appears in many documents, it's not a unique
identifier. Give the word a low score.
Therefore, common words like "the" and "for", which appear in
many documents, will be scaled down. Words that appear
frequently in a single document will be scaled up.
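The intuition above can be made concrete with a few lines of code. This sketch uses one common textbook variant: tf is the count of a term divided by the document length, and idf is the natural logarithm of (number of documents / number of documents containing the term). Other variants exist, nobody outside the search engine companies knows which weighting they really use, and the function name `tf_idf` and the three sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the cheap leather wallet the best leather".split(),
    "the best phone for the price".split(),
    "the weather for the weekend".split(),
]
scores = tf_idf(docs)
for term in ("the", "best", "leather"):
    print(term, round(scores[0][term], 3))
```

In the first document, "the" appears in every document and so scores zero, "best" appears in two documents and scores low, and "leather" appears twice in this document and nowhere else, so it scores highest.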
TF-IDF Scheme (cont'd)
A graphical representation:
Some Myths
Myth: A search engine will not allow new web sites to rank well
It all depends on the quality of the incoming links. In fact,
sometimes your site will rank high for a while and then disappear
for months. This is called the "honeymoon period".
Myth: You need to submit your site to a search engine to rank high,
otherwise it will not rank you high or may not even have
a record of your site
A search engine will find your site through backlinks. It also
crawls the Internet constantly. The ranking does not depend
on whether you have submitted your site or not.
Some Myths (cont'd)
Myth: The "keywords" metadata is important in page query and
ranking
The fact is that, nowadays, most search engines don't use it
anymore!
Google and Bing do not care about it. Yahoo may. For other
search engines, it depends.
Myth: You should repeat the same keyword many times on your page!
Personally, I prefer to have many related but different keywords
Myth: You should have a keyword every N words in the
document
Well… I have never felt that this is important…
Some Myths (cont'd)
Myth: Having links to external pages will only hurt
the ranking of your web page.
I don't think so.
When you link to some directly related sites or
directly related domain authorities, you are setting up
a "community", and it helps the search engine to "classify" the
nature of your web site.
Myth: An XML navigation page is useful in ranking
No. There is no relationship between ranking and an XML
navigation page. However, it is good to have one, as it may be
displayed on the search results pages.
Some More Information
A search engine is just a piece of software. It does not
"understand" the contents. So it can never understand
whether a piece of content is good. Instead, it tries to
understand whether a piece of content is popular.
References
http://www.w3.org/QA/Tips/good-titles.html
http://searchengineland.com/writing-html-title-tags-humans-google-bing-59384
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35624
https://support.google.com/webmasters/answer/35769?hl=en
http://aimotion.blogspot.hk/2011/12/machine-learning-with-python-meeting-tf.html