 New Architect > Archives > 2001 > 09 > Access  

The World's Information Desk

A Discussion with Google's Craig Silverstein

With an index of 1.3 billion documents that refreshes every 28 days, few companies can say they handle more data on the Web than Google. Web Techniques talked to Craig Silverstein, Google's director of technology, to learn how it's done—and where Google is going.

Web Techniques: If any task truly resembles looking for the proverbial needle in a haystack, searching the Web is it. How did Google's engineers approach that problem?

Craig Silverstein: We owe a huge debt to the large body of research in information retrieval that's been developed since the 1960s. But we added two elements that weren't yet in wide use: significant HTML analysis (that is, we looked not only at the text itself but also at the markup used on the text) and link analysis. The link analysis research, which became the PageRank algorithm, is what really drove the new company.
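
To make the idea of HTML analysis concrete, here is a minimal Python sketch that scores each word on a page by the markup enclosing it. It is only an illustration of the general technique, not Google's method: the class name `MarkupAwareScorer`, the helper `score_terms`, and every value in `TAG_WEIGHTS` are assumptions invented for this example.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical weights: the tags and numbers are illustrative assumptions,
# not values taken from the interview or from Google.
TAG_WEIGHTS = {"title": 5.0, "h1": 3.0, "b": 1.5}


class MarkupAwareScorer(HTMLParser):
    """Accumulate a score per word based on the tags that enclose it."""

    def __init__(self):
        super().__init__()
        self.open_tags = []               # stack of currently open tags
        self.scores = defaultdict(float)  # word -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; ignore stray closers.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Plain text counts 1.0; the strongest enclosing tag can raise that.
        weight = max([1.0] + [TAG_WEIGHTS.get(t, 1.0) for t in self.open_tags])
        for word in data.lower().split():
            self.scores[word] += weight


def score_terms(html):
    """Return a dict mapping each lowercased word to its markup-weighted score."""
    scorer = MarkupAwareScorer()
    scorer.feed(html)
    scorer.close()
    return dict(scorer.scores)
```

With these assumed weights, a word that appears in a page's title or a heading ends up with a higher score than the same word in ordinary body text, which is the kind of signal that plain text matching alone cannot see.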

WT: So how does PageRank work?

CS: It takes advantage of the fact that the Web has links. We can use the Web's link structure to get a quality score for every page on the Web. If a lot of high-PageRank pages point to your site, then your site also gets a high PageRank. PageRank wasn't developed for Web search, actually. But when Larry Page, the developer, started studying it, he discovered that the PageRank of a page corresponded closely to his intuitive idea of the quality or importance of a Web page. Intuitively, if Yahoo, the New York Times, and the maintainer of the most popular Barbie Doll site all link to your Web page—I won't try to guess what your Web page might be about—that reflects well on your page.
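
The scoring Silverstein describes, where a page inherits importance from the pages that link to it, can be sketched as a simple iterative computation. The Python below is a minimal illustration under stated assumptions, not Google's production code: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy link graph are all chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict mapping each page to a score; the scores sum to 1.

    links maps a page name to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the score spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph echoing the example in the interview.
links = {
    "yahoo": ["nytimes", "your_page"],
    "nytimes": ["yahoo", "your_page"],
    "barbie_site": ["yahoo", "your_page"],
    "your_page": ["yahoo"],
    "unlinked_page": ["yahoo"],
}
ranks = pagerank(links)
```

In this toy graph, `your_page` is linked from three other pages while `unlinked_page` has no inbound links at all, so `your_page` ends up with the higher score even though both link out to the same place.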




Copyright © 2006 CMP Media, LLC