
 New Architect > Archives > 1997 > 07 > Features  

Parallel and Distributed Searching

Speeding up your database searches?

By Andrew Davison

Many developers assume that parallelism and distributed programming always speed up sequential code. However, when I tested this rose-tinted view by timing the code for various search programs, the results were sobering. This article presents those results and describes how a simple search program for a 24-MB database can be parallelized and distributed across several machines. Along the way, I'll discuss the C fork() and wait() library functions, shared memory, and remote procedure calls (RPCs).

The test database holds membership details in ASCII, with one line per member, sorted by last name. Because of the size of the database, it is stored in four 6-MB files: a2f.txt, g2l.txt, m2r.txt, and s2z.txt. A search query can be given any string, and must return the total number of database lines that contain it, printing up to 40 matching lines. Since the string may appear anywhere in a member's details, a simple linear search is used to scan the database.
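As a rough illustration of the linear search described above, here is a minimal C sketch. It is not the article's own code: the function names search_stream and search_database and the 1,024-byte line limit are assumptions made for the example. Only the 40-line print limit and the line-by-line scan come from the description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE  1024  /* assumed upper bound on one member record */
#define MAX_PRINT 40    /* print limit from the query description */

/*
 * Scan one open file line by line. Lines containing `query` are counted,
 * and printed while *printed is below MAX_PRINT. Returns the match count.
 * Caveat: a record longer than MAX_LINE - 1 bytes is read in pieces, so
 * it may be miscounted; the sketch assumes records are shorter than that.
 */
long search_stream(FILE *fp, const char *query, int *printed)
{
    char line[MAX_LINE];
    long matches = 0;

    assert(fp != NULL && query != NULL && printed != NULL);

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, query) != NULL) {
            matches++;
            if (*printed < MAX_PRINT) {
                fputs(line, stdout);
                (*printed)++;
            }
        }
    }
    return matches;
}

/*
 * Search the component files one after another. Returns the total number
 * of matching lines, or -1 if any file cannot be opened.
 */
long search_database(const char *const files[], int nfiles, const char *query)
{
    long total = 0;
    int printed = 0;  /* shared so the 40-line limit spans all files */

    for (int i = 0; i < nfiles; i++) {
        FILE *fp = fopen(files[i], "r");
        if (fp == NULL) {
            perror(files[i]);
            return -1;
        }
        total += search_stream(fp, query, &printed);
        fclose(fp);
    }
    return total;
}
```

A caller would pass the four component file names (a2f.txt, g2l.txt, m2r.txt, and s2z.txt) and the query string to search_database, then report the returned total. Sharing one printed counter across the files keeps the 40-line limit global rather than per file.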

Using fgrep

One implementation approach is to use the UNIX fgrep (fast grep) command to print out matching lines. For example, fgrep Andrew a2f.txt will print all the lines containing "Andrew" in a2f.txt, while a linear search of the entire database would be coded as fgrep Andrew a2f.txt ; fgrep Andrew g2l.txt ; fgrep Andrew m2r.txt ; fgrep Andrew s2z.txt.

Each database component file takes about seven seconds to search; the total time for the entire database is about 27 seconds.




Copyright © 2006 CMP Media, LLC