01:198:170 Chapter Notes - Chapter 5: Tertiary Source, Pagerank, Secondary Source
Document Summary
Comp Apps, Chapter 5: Locating information on the WWW.
Search engine: a collection of computer programs that helps us find information on the web. It works in two steps: first crawling, then query processing.
Crawler: software that crawls the web, visiting pages by following links. It starts with a to-do list of URLs, and whenever it notices a URL while crawling a page, it adds that URL to the list. Its main work is to build an index.
Index: a list of tokens (such as words) associated with a page. For each token, the crawler creates a list of the URLs associated with it.
AND-query: a multi-word search in which every returned page must contain all of the requested terms. To locate pages containing multiple words, the query processor fetches the index lists for all the terms and finds the URLs that appear in every list. Put a marker at the start of each token's list. If all markers point to the same URL, save it. Then advance the marker(s) pointing to whichever URL comes earliest in the alphabet, and repeat until a marker reaches the end of its list. This only works if each list is kept in alphabetical order.
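The crawl, index, and AND-query steps in these notes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a real search engine is built: the toy in-memory web `TOY_WEB`, its example URLs, the function names `crawl_and_index` and `and_query`, and the choice to split tokens on whitespace are all made up for the example.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, URLs linked from that page).
TOY_WEB = {
    "a.example/home": ("red apple pie", ["b.example/fruit", "c.example/pie"]),
    "b.example/fruit": ("red apple and green pear", ["a.example/home"]),
    "c.example/pie": ("apple pie recipe", ["d.example/pear"]),
    "d.example/pear": ("green pear tart", []),
}


def crawl_and_index(start_urls, web):
    """Work through a to-do list of URLs and build token -> sorted URL list."""
    todo = deque(start_urls)   # the crawler's to-do list
    seen = set(start_urls)     # URLs already placed on the to-do list
    index = {}
    while todo:
        url = todo.popleft()
        if url not in web:
            continue           # unknown page in this toy web: skip it
        text, links = web[url]
        # Assumption: tokens are lowercase, whitespace-separated words.
        for token in set(text.lower().split()):
            index.setdefault(token, []).append(url)
        # Every URL noticed on the page goes onto the to-do list once.
        for link in links:
            if link not in seen:
                seen.add(link)
                todo.append(link)
    # Keep each URL list in alphabetical order for the marker walk below.
    return {token: sorted(urls) for token, urls in index.items()}


def and_query(index, terms):
    """Return the URLs that appear in the index list of every term."""
    lists = [index.get(term.lower(), []) for term in terms]
    if not lists or any(not lst for lst in lists):
        return []
    markers = [0] * len(lists)  # one marker at the start of each list
    hits = []
    while all(m < len(lst) for m, lst in zip(markers, lists)):
        current = [lst[m] for m, lst in zip(markers, lists)]
        if all(url == current[0] for url in current):
            hits.append(current[0])              # all markers agree: save it
            markers = [m + 1 for m in markers]
        else:
            earliest = min(current)              # earliest URL in the alphabet
            markers = [m + 1 if url == earliest else m
                       for m, url in zip(markers, current)]
    return hits


index = crawl_and_index(["a.example/home"], TOY_WEB)
print(and_query(index, ["apple", "pie"]))  # ['a.example/home', 'c.example/pie']
print(and_query(index, ["red", "pear"]))   # ['b.example/fruit']
```

Because every list is sorted, each marker only ever moves forward, so the query processor reads each list at most once instead of comparing every URL against every other URL.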