Google’s PageRank
By Zack Kenz
Outline
Intro to web searching
Review of Linear Algebra
Weather example
Basics of PageRank
Solving the Google Matrix
Calculating the PageRank
Wrapping up

Some Search Engine History
Early searching was based on page content only
Bonuses for word placement
Paying for placement
Natural language searches (think: Ask Jeeves)
Meta search engines

Why Google?

No one else had exploited the link structure of the internet
Content-based engines were relatively easy to exploit with concealed text
Adaptive to a growing internet
Simpler, faster
PageRank, According to Google

“PageRank relies on the uniquely democratic nature of
the web by using its vast link structure as an indicator
of an individual page's value. In essence, Google
interprets a link from page A to page B as a vote, by
page A, for page B.”

“Google looks at considerably more than the sheer
volume of votes, or links a page receives; for example,
it also analyzes the page that casts the vote. Votes
cast by pages that are themselves ‘important’ weigh
more heavily and help to make other pages
‘important.’”
Linear Algebra Terms
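Row Stochastic Matrix: A square matrix with nonnegative entries in which every row sums to 1
Eigenvector: A nonzero vector x such that Ax=λx for a scalar λ
Eigenvalue: A scalar λ that gives a nontrivial solution x for Ax=λx
Dominant Eigenvalue (eigenvector): The eigenvalue of largest absolute value (and its corresponding eigenvector)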
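
The terms above can be checked numerically. The following is a minimal sketch using NumPy; the 3x3 matrix `A` and the helper `is_row_stochastic` are hypothetical and exist only for illustration.

```python
import numpy as np

# Hypothetical row stochastic matrix: nonnegative entries, each row sums to 1.
A = np.array([
    [0.5, 0.25, 0.25],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
])

def is_row_stochastic(M, tol=1e-12):
    """Return True if M is nonnegative and every row sums to 1."""
    return bool(np.all(M >= 0) and np.allclose(M.sum(axis=1), 1.0, atol=tol))

# The dominant eigenvalue is the one with the largest absolute value.
eigenvalues = np.linalg.eigvals(A)
dominant = eigenvalues[np.argmax(np.abs(eigenvalues))]

# Because every row sums to 1, the vector of ones satisfies A x = 1 * x,
# so it is an eigenvector for the eigenvalue 1.
ones = np.ones(3)

print(is_row_stochastic(A), dominant, A @ ones)
```

For a row stochastic matrix no eigenvalue can exceed 1 in absolute value, which is why the dominant eigenvalue found here is 1.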

Tomorrow’s Weather

Example on the board
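
A minimal sketch of the kind of weather model this slide points to, assuming a two-state Markov chain (sunny, rainy). The transition probabilities in `P` are made up for illustration and are not the values used on the board.

```python
import numpy as np

# Hypothetical transition matrix (row stochastic).
# Row 0: today sunny -> P(sunny tomorrow), P(rainy tomorrow)
# Row 1: today rainy -> P(sunny tomorrow), P(rainy tomorrow)
P = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
])

# Today is certainly sunny: a row probability vector.
today = np.array([1.0, 0.0])

# Tomorrow's forecast is one step of the chain.
tomorrow = today @ P

# Repeating the step many times approaches the long-run forecast,
# which no longer depends on today's weather.
forecast = today.copy()
for _ in range(100):
    forecast = forecast @ P

print(tomorrow, forecast)
```

The long-run forecast satisfies forecast = forecast · P, which is the same eigenvector relationship PageRank relies on.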
Scoring Web Pages

Random web surfer

Goal: Assign a score to each of over 25 billion web pages and store the scores

Score based on the probability of the random surfer going to a particular page
Surf’s Up!
Hyperlink Matrix
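
A minimal sketch of building a hyperlink matrix H, assuming the usual construction in which a page with k outlinks gives probability 1/k to each page it links to. The four-page web in `links` and the helper `hyperlink_matrix` are hypothetical.

```python
import numpy as np

# Hypothetical web: page index -> list of pages it links to.
# Page 3 has no outlinks.
links = {0: [1, 2], 1: [2, 3], 2: [0], 3: []}

def hyperlink_matrix(links, n):
    """Build the n x n matrix H with H[i, j] = 1/k if page i links to
    page j and has k outlinks in total, and 0 otherwise."""
    H = np.zeros((n, n))
    for page, outlinks in links.items():
        for target in outlinks:
            H[page, target] = 1.0 / len(outlinks)
    return H

H = hyperlink_matrix(links, 4)
print(H)
```

Rows for pages with outlinks sum to 1; the row for page 3 is all zeros, so H is not yet row stochastic.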
Dangling Nodes
Web Link Surfer Matrix
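
A minimal sketch of one common fix for dangling nodes, assuming the construction S = H + dw described in the Wills paper listed under Sources: d is a column vector marking pages with no outlinks and w is a row probability vector (uniform here, which is an assumption). The matrix `H` is a hypothetical four-page example.

```python
import numpy as np

# Hypothetical hyperlink matrix; page 3 is a dangling node (zero row).
H = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
n = H.shape[0]

# d[i] = 1 if page i has no outlinks, else 0.
d = (H.sum(axis=1) == 0).astype(float)

# Assumption: a surfer stuck on a dangling page jumps to any page
# with equal probability.
w = np.full(n, 1.0 / n)

# Replace each zero row of H with w; other rows are unchanged.
S = H + np.outer(d, w)
print(S)
```

After the fix every row of S sums to 1, so S is row stochastic.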
One More Fix



Need to account for the fact that a surfer can type in URLs instead of using links
Add in a personalization vector v, a row probability vector
Multiplying a column vector of ones e by v gives an additional personalization matrix ev
Google Matrix
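Recall that S is the row stochastic link matrix (with the dangling node fix) and ev is the personalization matrix (a column vector of ones e times the personalization vector v)
G = αS + (1 − α)ev
α is a damping factor, usually 0.85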
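
A minimal sketch of assembling the Google matrix, assuming the form G = αS + (1 − α)ev from the Wills paper listed under Sources. The helper `google_matrix`, the uniform default for `v`, and the four-page matrix `S` are hypothetical.

```python
import numpy as np

def google_matrix(S, alpha=0.85, v=None):
    """Return G = alpha * S + (1 - alpha) * e v, where e is a column of
    ones and v is a row probability vector (uniform if not given)."""
    n = S.shape[0]
    if v is None:
        v = np.full(n, 1.0 / n)  # assumption: no page is favored
    e = np.ones(n)
    return alpha * S + (1.0 - alpha) * np.outer(e, v)

# Hypothetical row stochastic matrix for a four-page web.
S = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
])

G = google_matrix(S)
print(G)
```

With a strictly positive v, every entry of G is positive and every row still sums to 1.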
The True Google Matrix?
Solution of the Google Matrix





Since the Google matrix is row stochastic, it has an eigenvalue of λ=1
λ=1 is the largest eigenvalue in absolute value and is not repeated
Let π be the corresponding left eigenvector
The eigensystem πG = π has a unique solution for π once its entries are normalized to sum to 1
π, then, is a row probability vector
π contains every page’s PageRank
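
A minimal sketch of finding the left eigenvector π for λ=1 with a general-purpose eigensolver. The four-page matrix `S`, the damping factor 0.85, and the uniform personalization vector are assumptions made for illustration.

```python
import numpy as np

# Hypothetical row stochastic link matrix for a four-page web.
S = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
])
n = S.shape[0]
alpha = 0.85

# Google matrix with a uniform personalization vector.
G = alpha * S + (1.0 - alpha) * np.full((n, n), 1.0 / n)

# Left eigenvectors of G are right eigenvectors of G transposed.
eigenvalues, eigenvectors = np.linalg.eig(G.T)

# Pick the eigenvector whose eigenvalue is closest to 1.
k = np.argmin(np.abs(eigenvalues - 1.0))
pi = np.real(eigenvectors[:, k])

# Normalize so the entries sum to 1 (this also fixes the sign).
pi = pi / pi.sum()
print(pi)
```

This approach only suits small examples; it is not how scores would be computed for billions of pages.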
Computing Scores:
The Linear Algebra Way
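Recall that the PageRank vector π satisfies πG = π, so π solves the linear system π(G − I) = 0 with its entries summing to 1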
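
A minimal sketch of solving for the PageRank vector directly as a linear system, assuming the formulation π(G − I) = 0 with the entries of π summing to 1. The four-page matrix `S`, the damping factor, and the uniform personalization vector are hypothetical.

```python
import numpy as np

# Hypothetical row stochastic link matrix for a four-page web.
S = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
])
n = S.shape[0]
alpha = 0.85
G = alpha * S + (1.0 - alpha) * np.full((n, n), 1.0 / n)

# pi (G - I) = 0 is equivalent to (G - I)^T pi^T = 0.
A = (G - np.eye(n)).T

# The equations are linearly dependent, so replace the last one
# with the normalization condition sum(pi) = 1.
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(pi)
```

A direct solve costs on the order of n³ operations for a dense matrix, which is impractical at web scale.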
Computing Scores:
The Power Method



λ = 1 is the dominant eigenvalue of G, and the PageRank vector π is the dominant left eigenvector
As a result, the power method applied to G converges to the PageRank vector
Given a starting vector π^(0), such as a uniform probability vector, the power method calculates successive iterates π^(k) = π^(k−1)G until a stopping condition is reached
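
A minimal sketch of the power method described above. The function name `power_method`, the uniform starting vector, the 1-norm stopping condition, and the four-page matrix `S` are assumptions made for illustration.

```python
import numpy as np

def power_method(G, tol=1e-10, max_iter=1000):
    """Iterate pi <- pi G from a uniform start until successive iterates
    differ by less than tol in the 1-norm. Returns (pi, iterations)."""
    n = G.shape[0]
    pi = np.full(n, 1.0 / n)
    for iteration in range(1, max_iter + 1):
        new_pi = pi @ G
        if np.abs(new_pi - pi).sum() < tol:
            return new_pi, iteration
        pi = new_pi
    return pi, max_iter

# Hypothetical four-page example with damping factor 0.85.
S = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
])
n = S.shape[0]
G = 0.85 * S + 0.15 * np.full((n, n), 1.0 / n)

pi, iterations = power_method(G)
print(pi, iterations)
```

Because G is row stochastic and the starting vector sums to 1, every iterate remains a probability vector, so no renormalization step is needed.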
Speeding Things Up
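
One commonly described speed-up, sketched here as an assumption about what this slide covers, is to never form G at all. Since the entries of π sum to 1, πG = απH + (α(π·d))w + (1 − α)v, so each step only needs the sparse link structure. The function `pagerank_sparse`, the uniform `v` and `w`, and the four-page web are hypothetical.

```python
import numpy as np

def pagerank_sparse(links, n, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power method using only outlink lists, never the dense matrix G."""
    v = np.full(n, 1.0 / n)  # personalization vector (assumed uniform)
    w = np.full(n, 1.0 / n)  # dangling-node vector (assumed uniform)
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        link_part = np.zeros(n)   # accumulates pi H
        dangling_mass = 0.0       # accumulates pi . d
        for page in range(n):
            outlinks = links.get(page, [])
            if outlinks:
                share = pi[page] / len(outlinks)
                for target in outlinks:
                    link_part[target] += share
            else:
                dangling_mass += pi[page]
        new_pi = alpha * (link_part + dangling_mass * w) + (1.0 - alpha) * v
        if np.abs(new_pi - pi).sum() < tol:
            return new_pi
        pi = new_pi
    return pi

# Hypothetical web: page index -> pages it links to; page 3 is dangling.
links = {0: [1, 2], 1: [2, 3], 2: [0], 3: []}
pi = pagerank_sparse(links, 4)
print(pi)
```

Each iteration touches every link once, so the cost grows with the number of links rather than with n².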
Wrapping Up:
The Overall Page Scoring

PageRank is still only a portion of what determines the order of search results

Results are based on many factors, especially page content
Wrapping Up:
Improving PageRank
Avoiding link spamming – tweak the personalization vector and α
Power method convergence algorithms
Dummy node

Questions?
Sources








Rebecca S. Wills. Google’s PageRank: The Math Behind the Search Engine. Department of Mathematics, North Carolina State University. 1 May 2006.
Amy N. Langville and Carl D. Meyer. Fiddling with PageRank. Department of Mathematics, North Carolina State University. 15 August 2003.
http://www.searchenginehistory.com/
http://www.google.com/technology/ and http://www.google.com
David C. Lay. Linear Algebra and Its Applications, 3rd ed. Pearson Education, 2003.
Dr. Biebighauser
http://eperformance.co.uk/uploaded_images/google%20beta-786468.jpg
http://webmechanics.uoregon.edu/Images/Surf%20web.jpg
http://www.modmyifone.com/iphone_wallpapers/file.php?n=282&w=l
http://www.smashingmagazine.com/images/pagerank/google-pagerank.jpg