Sorting through Search Engines
By Alex Lange
Like the Web itself, many intranets have developed spontaneously without centralized planning, which makes navigating them difficult and time-consuming. The intranet where I work, at VLSI Technology, was no exception -- we'd amassed datasheets, databooks, specifications, plans, presentations, policies, procedures, and other information, and we needed a search engine to sift through it all. As the central Webmaster in the information-technology group, I had to select a search-engine system, justify its cost, and get it into operation.
Terminology
By "search engine" I do not mean publicly available services such as Yahoo!, Excite, or AltaVista, nor do I mean the algorithms and logic used to find patterns in large amounts of data. In the context of this article, "search engine" will refer to two things:
- The search engine proper -- software with which you formulate a question and execute a search for information.
- Behind-the-scenes software -- also known as a "spider," "robot," "bot," "crawler," or "indexer" -- that creates searchable databases or indexes.
It's also helpful to distinguish between "searching," the act performed by the end user, and "crawling," the act of returning to a data source to update the indexes.
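To make the split between the two pieces concrete, here is a minimal Python sketch. It is not how any particular product works; the function names (crawl, search), the in-memory dictionary of documents, and the sample file names are all assumptions made for illustration. The indexer builds a simple inverted index (word to document names), and the search side only ever reads that index.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl(documents):
    """Indexer side: visit every document and build an inverted index
    mapping each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Search side: return the documents that contain every query word.
    Only the index is consulted; the documents themselves are not read."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical intranet content, standing in for real files or pages.
documents = {
    "policy.txt": "Travel policy and expense procedures",
    "chip.txt": "Datasheet for the controller chip",
}
index = crawl(documents)

# A new document is invisible to searches until the next crawl.
documents["plan.txt"] = "Product plan for the next controller chip"
before_recrawl = search(index, "controller chip")

index = crawl(documents)  # returning to the data source updates the index
after_recrawl = search(index, "controller chip")

print(sorted(before_recrawl))
print(sorted(after_recrawl))
```

The point of the sketch is the timing: the end user's search is only as current as the last crawl, which is why the two activities are worth naming separately.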
Requirements
The search-engine users needed easy-to-use, powerful searching; as the Webmaster, I was looking for easy maintenance and versatile indexing.