05 - Page Rank Algorithm

 1. PageRank (PR) is an algorithm used by Google Search to rank websites in their search engine results. PageRank was named _________________

  after Bill Gates - his middle name is 'Page' (Bill Page Gates)

  after the scheduling page algorithm that is very similar to it

  after one of the founders of 'Bing' - Mrs Bingo Rank

  after one of the founders of Google: Mr Larry Page (and the use of a pun with the word 'web page')

 2. Read this excerpt - provided by Google - and fill in the blanks.
According to Google:
====================
PageRank works by counting the __________________________to a page to determine
 a rough estimate of how important the website is. The underlying assumption is 
that more important websites are likely to receive more links from other websites.

  characters in a web address

  number of Facebook or Google Docs likes

  number and quality of links

  numeric value associated with a page's URL (character count)

 3. Currently, PageRank is the only algorithm in the world used to order search results

  FALSE

  TRUE

 4. Analyse the following image that shows mathematical page ranks for a simple network. Which of the following statements are correct?
Statement 1
=============
Page E has a higher PageRank than Page C, even though it appears smaller,
because there are clearly more links to E than to C

Statement 2
=============
Page C has a higher PageRank than Page E, even though there are fewer links to C; 
the one link to C comes from an important page and hence is of high value
pagerankswikiexample.jpg

  Statement 2 is correct

  Neither statement is correct. Page C and E have equal page ranking.

  Statement 1 is correct

  Both statements are incorrect

 5. Because of the size of the actual web, the Google search engine uses an approximate, iterative computation of PageRank values

  FALSE

  TRUE

 6. The size of each face is proportional to the __________________________ (this demonstrates PageRank)
pagerankcartoon.png

  size of the first link that points to it

  numeric value of the web site address

  total size of the other faces which are pointing to it.

  size of the link to which it points

 7. PageRank is a link analysis algorithm and it assigns a _______________ to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set

  hard coded weighting

  character set weighting

  numerical weighting

  string based weighting

 8. A ____ is a page with many out-links

  Authority

  Engine

  Hub

  Centre

 9. A _____ is a page with many in-links

  Authority

  Engine

  Centre

  Hub
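
The two definitions in questions 8 and 9 can be illustrated by simply counting links. The sketch below is only an illustration: the four-page graph and the names `links`, `hub_like`, and `authority_like` are hypothetical and do not come from the quiz.

```python
# Hypothetical graph: each key is a page, each value lists the pages it links to.
links = {"A": ["B", "C", "D"], "B": ["D"], "C": ["D"], "D": []}

# Out-links: how many pages each page points to.
out_links = {page: len(targets) for page, targets in links.items()}

# In-links: how many pages point to each page.
in_links = {page: 0 for page in links}
for targets in links.values():
    for target in targets:
        in_links[target] += 1

# The page with the most out-links and the page with the most in-links.
hub_like = max(out_links, key=out_links.get)
authority_like = max(in_links, key=in_links.get)
```

In this made-up graph, page A has the most out-links and page D has the most in-links.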

 10. Read the following excerpt and analyse the image on page ranking. Fill in the blanks
Assume the web consists of five websites: Twtr, Amzn, Fb, Medm, and Mspc. 
Also assume that these sites are linked. For example, Fb may link to 
Amzn, and Amzn may link somewhere else.

Initially, there is no order preference; that is, each page has an equal 
probability of getting the highest rank.

In other words, each page is initialized with a rank of 1/N, 
where N is the total number of webpages in the graph. 

In this example, each webpage would start with a ______________
pagerankingimage.png

  zero ranking

  5/5 ranking

  1/100 ranking

  1/5 ranking
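
The initialization step described in the excerpt can be sketched in a few lines. The site names are taken from the excerpt; the variable names `sites`, `n`, and `initial_rank` are assumptions made for this illustration.

```python
# The five sites named in the excerpt.
sites = ["Twtr", "Amzn", "Fb", "Medm", "Mspc"]

# N is the total number of webpages in the graph.
n = len(sites)

# Every page starts with the same rank, 1/N.
initial_rank = {site: 1 / n for site in sites}
```

Because every page gets the same share, the starting ranks add up to 1.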

 11. PageRank, or PR(A), can be calculated using a _______ algorithm.

  static

  iterative

  borderline

  googlestic

 12. Read the excerpt below and select the statement that best fits the blanks
PageRank is defined like this:

We assume page A has pages T1...Tn which point to it (i.e., are citations). 
The parameter d is a damping factor which can be set between 0 and 1. We usually 
set d to 0.85. There are more details about d in the next section. Also C(A) is 
defined as the number of links going out of page A. The PageRank of a page A is 
given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

PR(Tn) - Each page has a notion of its own self-importance. 
That’s “PR(T1)” for the first page in the web all the way up to “PR(Tn)” __________________?

  the page itself (in other words, it is not necessary to iterate and continue to check the other pages)

  itself (the first page)

  for the last page

  in the first index (e.g. index 0)
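
To see how the formula in the excerpt is evaluated for a single page, here is a minimal sketch. The function name `pagerank_of` and the input values are hypothetical; only the formula and the default d = 0.85 come from the excerpt.

```python
def pagerank_of(inbound, d=0.85):
    """Apply PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    `inbound` holds one (PR(Ti), C(Ti)) pair for each page Ti linking to A.
    """
    return (1 - d) + d * sum(pr / c for pr, c in inbound)


# Hypothetical inputs: T1 has PR 1.0 and 2 outgoing links,
# T2 has PR 1.0 and 1 outgoing link.
pr_a = pagerank_of([(1.0, 2), (1.0, 1)])
# 0.15 + 0.85 * (0.5 + 1.0) = 1.425, up to floating-point rounding
```

Each linking page passes on its own PageRank divided by its number of outgoing links, so a link from a page with few outgoing links counts for more.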

 13. This is where it gets tricky. The PR of each page depends on the PR of the pages pointing to it. But we won’t know what PR those pages have until _______________

  the PR of the first link is calculated

  the pages pointing to them have their PR calculated

  all the back links on the entire world wide web are counted - and this counting may never end

  the PR / back links is calculated for the first and last link
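
One common way around this circular dependency is to start from a guess and repeat the formula until the values settle. The sketch below assumes a hypothetical three-page graph in which every page has at least one outgoing link; the function name `iterate_pagerank`, the graph, and the fixed count of 100 iterations are all assumptions for illustration, not Google's implementation.

```python
def iterate_pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(Ti)/C(Ti)).

    `links` maps each page to the list of pages it links to.
    Assumes every page has at least one outgoing link.
    """
    pages = list(links)
    # Starting guess: every page gets 1/N.
    pr = {page: 1 / len(pages) for page in pages}
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(Ti)/C(Ti) over the pages Ti that link to this page.
            incoming = sum(
                pr[src] / len(links[src]) for src in pages if page in links[src]
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical graph: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = iterate_pagerank(links)
```

In this made-up graph the values settle with C ranked highest, then A, then B. Each pass uses only the values from the previous pass, so no page has to wait for another page's final value.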

 14. Have a look at the formula for the page rank algorithm. Fill in the blanks
The original PageRank algorithm was described by Lawrence Page and Sergey Brin
 in several publications. It is given by

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))	

	PR(A) is the PageRank of page A,
	PR(Ti) is the PageRank of pages Ti which link to page A,
	C(Ti) is the number of ______________________________ and
	d is a damping factor which can be set between 0 and 1.

  inbound links on page A

  outbound links on page A

  outbound links on page Ti

  inbound and outbound links summed together (on page A)

 15. The PageRank of page A is ____________ defined by the PageRanks of those pages which link to page A

  furtively

  binarily

  recursively

  forcefully