
 New Architect > Archives > 1998 > 11 > Last Page  

One or More Things I Learned From Regular Expressions

The real business of business-to-business commerce on the Web is exchanging information. In the past, businesses might have exchanged information by sending it on tape or disk, where the receiving company would convert the data and import it into its own database. With the Internet, data can be exchanged in an instant. Still, the question is, how do I get data out of my database and into your database? Or data from your database into my database?

This is a data-conversion problem. To exchange information, each of us must write programs to import or export it. Writing conversion programs that read or write a simple format like a comma-delimited file is straightforward. Writing conversion programs for unstructured text like HTML documents is more challenging, but fun. I wrote a book, Sed and Awk, about two UNIX utilities used to build conversion tools. My book could have been titled: All I Know About Programming I Learned from Writing Regular Expressions.

A regular expression is a way to describe patterns in text. Programs like sed, grep, awk, Perl, and even the vi editor let you use regular expressions to match text and then perform various manipulations. For instance, a simple sed script could extract all the headings from a set of HTML files. The regular expression <H[1-3]> will match lines containing <H1>, <H2>, or <H3>. When running it on a sample file, I might notice that although the program does indeed match what was specified, it missed the lowercase <h2>, for example, because the pattern is case-sensitive.
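
The heading example is easy to try out. What follows is only a minimal sketch in Python rather than sed, and the sample text, the function name matching_lines, and the relaxed pattern are all assumptions made for illustration. It runs the pattern from the text against a few lines of HTML, then runs a variant that also accepts lowercase tags.

```python
import re

# The pattern from the text: matches <H1>, <H2>, or <H3>, uppercase only.
STRICT = re.compile(r"<H[1-3]>")

# Hypothetical refinement (not from the article): accept either case of H.
RELAXED = re.compile(r"<[Hh][1-3]>")


def matching_lines(pattern, text):
    """Return the lines of text that contain a match for pattern."""
    return [line for line in text.splitlines() if pattern.search(line)]


# Made-up sample input, standing in for a real HTML file.
SAMPLE = """<H1>Regular Expressions</H1>
<p>Some body text.</p>
<h2>Character classes</h2>
<H3>Ranges</H3>
<H4>Too deep to count</H4>"""

strict_hits = matching_lines(STRICT, SAMPLE)
relaxed_hits = matching_lines(RELAXED, SAMPLE)

print(strict_hits)   # the uppercase H1 and H3 lines only
print(relaxed_hits)  # the same lines plus the lowercase h2 line
```

The strict pattern does exactly what it was told to do and nothing more. Widening the character class is one way to catch the lowercase tag; passing re.IGNORECASE to re.compile would be another.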






Copyright © 2006 CMP Media, LLC