Data Mining on the Web
There's Gold in that Mountain of Data
By Dan R. Greening
When visitors interact with your site, they provide information about themselves and how they respond to your content: which links they click, where they spend most of their time, which search terms they use, and when they browse. Some visitors may even fill out a lifestyle survey or provide names and addresses. Complex content also contains important information, such as words in articles, job descriptions and resumes, and features of competitive or complementary products. All this information is often stored in a database.
As a result, you have a lot of information on your Web visitors and content, but you probably aren't making the best use of it. Data warehouse reporting systems, such as those provided by traffic analyzers, aggregate and report facts over different dimensions. (See my article titled "Tracking Users," Web Techniques, July 1999.)
These warehouse reporting systems are commonly called online analytical processing (OLAP) systems. OLAP systems can report only on directly observed and easily correlated information; they rely on you to discover patterns and decide what to do with them. An OLAP system won't tell you that people frequently buy potato chips, onion soup mix, and sour cream at the same time, and it won't discover that some people love any movie that contains an explosion. The data is often too complex for a person to find such patterns by browsing OLAP reports.
To solve this problem, marketers and business analysts use data-mining techniques.
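To make the contrast concrete, here is a minimal sketch in Python. The baskets are invented for illustration, and the function names item_totals and frequent_itemsets are hypothetical, not part of any product discussed here. The first function produces the kind of per-item total an OLAP report might show; the second counts which groups of items show up together, which is the sort of pattern a data-mining tool looks for. It counts every combination by brute force, which is only practical for tiny examples; real tools prune the search.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"potato chips", "onion soup mix", "sour cream"},
    {"potato chips", "onion soup mix", "sour cream", "soda"},
    {"potato chips", "soda"},
    {"bread", "sour cream"},
    {"potato chips", "onion soup mix", "sour cream", "bread"},
]


def item_totals(baskets):
    """OLAP-style report: the number of baskets that contain each item."""
    return Counter(item for basket in baskets for item in basket)


def frequent_itemsets(baskets, size, min_count):
    """Count every group of `size` items bought together and keep the
    groups that appear in at least `min_count` baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so the same group always yields the same tuple key.
        for group in combinations(sorted(basket), size):
            counts[group] += 1
    return {group: n for group, n in counts.items() if n >= min_count}


totals = item_totals(baskets)
triples = frequent_itemsets(baskets, size=3, min_count=3)

print(totals.most_common())
print(triples)
```

The per-item totals show that potato chips and sour cream are popular, but they say nothing about the two being bought together with onion soup mix. Only the second function surfaces that combination, and it does so without anyone having to guess in advance which items to compare.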