Sunday, June 5, 2011

Search Tools

Web search tools help users find information, including web pages, businesses, people, multimedia files, document databases, and more. These tools can be broadly classified into directories, search engines, and metasearch engines.


Directories

Google Directory
Yahoo! Directory
DMOZ Open Directory Project

A directory is a human-compiled, hierarchical list of web pages organized by category. One of the first directories, Jerry's Guide (created by Jerry Yang and David Filo), became the original Yahoo! Directory. Drilling down through subcategories helps organize the subject matter logically. However, relying on human editors is also a weakness, because the review process can be lengthy and is subject to editorial bias. Breadcrumb trails are a common way to link subcategories, categories, and the home page.
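To make the breadcrumb idea concrete, here is a minimal sketch. The category names and the PARENT table are made up for illustration and do not come from any real directory; the only assumption is that each category records its parent, so the trail can be rebuilt by walking back up to the home page.

```python
# Hypothetical directory: each category points to its parent (None = home page).
PARENT = {
    "Home": None,
    "Computers": "Home",
    "Internet": "Computers",
    "Searching": "Internet",
}


def breadcrumb(category, parent=PARENT):
    """Return the trail from the home page down to the given category."""
    trail = []
    while category is not None:
        trail.append(category)
        category = parent[category]  # step up one level in the hierarchy
    return " > ".join(reversed(trail))


print(breadcrumb("Searching"))  # Home > Computers > Internet > Searching
```

Each name in the printed trail would normally be a link, which lets the reader jump back to any level of the hierarchy.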

Most directories have strategic partnerships with search engines so that users can search the directory index by keyword. For example, Google Directory combines Google search with the DMOZ directory pages.

Search Engines 

Yahoo! Search
AOL Search

Search engines include general-purpose search engines such as Google, Ask.com, and Bing, as well as others such as AltaVista, Gigablast, and Cuil. There are also specialty search engines such as BizRate (shopping), Technorati (blogs), and Fact Monster (kid-friendly searches). Some search engines draw on the web page indexes of other search engines. For example, AOL Search results are "enhanced" by Google, and AltaVista is owned by Yahoo!, which provides its indexes.

Search engines use a program called a spider, bot (short for robot), or Web crawler to browse the web and add information to their index. Yahoo! Slurp, Googlebot, and MSNbot (Bing) are examples of Web crawlers.
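As a rough illustration of how a crawler follows links, here is a minimal sketch. It is not how Googlebot or any real crawler is implemented. The PAGES dictionary is a made-up site held in memory so the example runs without network access, and the names LinkParser and crawl are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical site kept in memory; a real crawler would fetch pages over HTTP.
PAGES = {
    "/": '<title>Home</title><a href="/about">About</a> <a href="/sitemap">Map</a>',
    "/about": '<title>About</title><a href="/">Home</a>',
    "/sitemap": '<title>Site map</title><a href="/about">About</a> '
                '<a href="/contact">Contact</a>',
    "/contact": "<title>Contact</title>",
}


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, pages=PAGES):
    """Visit pages breadth-first from start; return URLs in discovery order."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # unknown page: nothing to parse
            continue
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                order.append(link)
                queue.append(link)
    return order


print(crawl("/"))  # ['/', '/about', '/sitemap', '/contact']
```

The set of already-seen pages is what keeps the crawler from looping forever when pages link back to each other.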

Search engines scan for information like the following to create their indexes:
  • Page title - the title shown in the browser's title bar
  • URL - specifically the domain name
  • Meta tag keywords - descriptive keywords coded into the web page's HTML that are readable by the web crawler but invisible to the reader
  • Occurrence of keywords - how often words are used and where they appear on the page
  • Full text searching - all the words on a page
  • Internal links within the page to the other pages at the site, for example a site map
  • Number and relevancy of other pages that link to a page
The web page information is stored in a database on a server, creating an index. Web crawlers continuously crawl the web for new and updated information.
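The signals in the list above can be sketched as a tiny inverted index, that is, a table mapping each word to the pages that contain it. This is only a sketch under stated assumptions: the two documents are invented, and the TITLE_WEIGHT value is an arbitrary illustration, not a figure used by any real search engine.

```python
import re
from collections import defaultdict

# Made-up pages for illustration only.
DOCS = {
    "example.com/tea": {
        "title": "Green Tea Guide",
        "text": "Green tea is a tea made from unoxidized leaves.",
    },
    "example.com/coffee": {
        "title": "Coffee Basics",
        "text": "Coffee is brewed from roasted beans, not tea leaves.",
    },
}

TITLE_WEIGHT = 3  # assumption: a word in the title counts more than one in the body


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {url: score}, scoring by frequency and position."""
    index = defaultdict(dict)
    for url, doc in docs.items():
        for word in tokenize(doc["text"]):
            index[word][url] = index[word].get(url, 0) + 1
        for word in tokenize(doc["title"]):
            index[word][url] = index[word].get(url, 0) + TITLE_WEIGHT
    return index


def search(index, word):
    """Return URLs containing the word, highest score first."""
    hits = index.get(word.lower(), {})
    return sorted(hits, key=lambda url: (-hits[url], url))


index = build_index(DOCS)
print(search(index, "tea"))  # ['example.com/tea', 'example.com/coffee']
```

The tea page ranks first because the word appears twice in its text and once in its title, while the coffee page mentions it only once in passing.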

Search engines try to present the most relevant and useful results at the top of their results lists to attract more users. On most search engines, advertisers pay for their ads to appear for relevant keywords. The ads are usually labeled and placed on the right side of the web page.

Metasearch Engines


When a user submits a query in a metasearch engine's text box, the metasearch engine sends the query to a number of search engines at one time and then compiles the results from all of them into a single list. A good metasearch result will have no duplicate entries, categorize the hits by topic, order the hits by relevance, and indicate which search engines provided the results. Some metasearch engines mix paid or sponsored hits with unpaid hits in the same search results list; therefore, the source of each hit must be carefully reviewed.
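The merging step can be sketched as follows. This is a minimal illustration, not the method of any particular metasearch engine: it assumes the results have already been fetched from each engine, the engine names and URLs are made up, and relevance is approximated with a simple reciprocal-rank score (1 for first place, 1/2 for second, and so on), which is a stand-in for whatever ranking a real service uses. Categorizing hits by topic is left out.

```python
def merge_results(results_by_engine):
    """Combine ranked URL lists from several engines into one list.

    results_by_engine maps an engine name to its URLs, best first.
    """
    merged = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            # One entry per URL, so duplicates across engines collapse.
            entry = merged.setdefault(
                url, {"url": url, "engines": [], "score": 0.0}
            )
            if engine not in entry["engines"]:
                entry["engines"].append(engine)  # record who returned the hit
                entry["score"] += 1.0 / rank     # reciprocal-rank score
    # Highest combined score first; ties broken alphabetically by URL.
    return sorted(merged.values(), key=lambda e: (-e["score"], e["url"]))


# Hypothetical results from two engines for the same query.
sample = {
    "engine_a": ["a.example", "b.example", "c.example"],
    "engine_b": ["b.example", "d.example", "a.example"],
}

for hit in merge_results(sample):
    print(hit["url"], hit["engines"])
```

A page returned by both engines appears only once in the merged list, along with the names of the engines that found it.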