03 - Indexes and Web Crawlers

 1. How do search engines know where to look? How can search engines recommend a few pages out of the trillions that exist? The answer lies with _________________.

  web bugs

  web browsers

  web insects

  web crawlers

 2. Web crawlers are computer programs that scan the web, "reading" everything they find.



 3. Crawlers are also known as spiders, _____ and automatic indexers.





 4. These crawlers scan web pages to see what words they contain, and where those words are used. The crawler turns its findings into a giant _____.





 5. The index is basically a __________________________.
For example, when you ask a search engine for pages about moose, the search engine checks its index and gives you a list of pages that mention moose.

  big list of bugs and errors on the internet.

  big list of letters that can be sorted.

  big list of web browsers that exist.

  big list of words and the web pages that feature them.
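
The moose lookup described in question 5 can be sketched in a few lines of Python. This is only an illustration, not how any real search engine is implemented: the page addresses, the page text, and the `build_index` function are all made up for the example.

```python
from collections import defaultdict

# Hypothetical pages, invented for illustration only.
pages = {
    "example.com/animals": "moose live in northern forests",
    "example.com/recipes": "a simple soup recipe",
    "example.com/parks": "visitors may see moose and deer",
}

def build_index(pages):
    """Map each word to the sorted list of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Lowercase the text and split on whitespace to get the words.
        for word in text.lower().split():
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}

index = build_index(pages)

# Answering a search means looking the word up, not re-reading every page.
print(index["moose"])  # ['example.com/animals', 'example.com/parks']
```

Because the index is built ahead of time, the search engine only has to look up one word instead of scanning every page each time someone searches.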

 6. Crawlers ______ the web regularly so they always have an up-to-date index of the web.





 7. Once the crawler has found information by crawling over the web, the program builds the index. The index contains the words as well as their ____________.

  Chinese spelling




 8. The Google Search index contains hundreds of billions of web pages and is well over 100,000,000 gigabytes in size.



 9. Google does not want to recommend disreputable websites, so if you engage in spammy practices you may be penalised by having your website _____________.



  put to the top of the search results

  indexed without your permission

 10. ___________ is the web crawler software used by Google, which collects documents from the web to build a searchable index for the Google Search engine.