Simple XML Processing and Queries
By Randal L. Schwartz
The buzz is still abuzz about XML. You've probably seen XML about 47 times in this issue before reaching my column (unless, of course, my column is the first one to which you turn). XML processing in Perl is a breeze, but it's also a rapidly evolving technology, so I decided to tackle a simple problem in an unexpected way to show off some of the strategies.
The problem begins with the Perl Mongers. These Perl user groups are all registered with "the mother ship" at the pm.org site. The pages there give a nice textual description of each group, but they don't let you search through the groups very easily.
Using the HTML::Parser module in a Perl script of your own, you could scan through the output of the pages looking for specific words, but there's a better solution. Fortunately, the creators of pm.org provide two versions of the list: one in the usual HTML format for your browser, and one in XML format for scripts and machines. The XML is exactly what we need! We can write a script that searches specific tags in the structured XML document, which will yield better results than scanning the HTML for words.
Example 1 shows a slightly modified sample of the XML document that I got from pm.org. Note that the document includes a lot of information, all tagged appropriately. For our application, we'll fetch this XML document from the pm.org site, and extract the name of the group, the city/state/country location, and the contact info for the "tsar."
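To make the extraction step concrete before going further, here is a minimal sketch of pulling those three pieces of data out of a group list with XML::Parser's event handlers. The element names (perl_mongers, group, location, tsar, and so on) and the sample record are hypothetical stand-ins invented for illustration, not the actual pm.org markup, and the data is inlined rather than fetched. XML::Parser is a CPAN module and is assumed to be installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;    # CPAN module; assumed to be installed

# Hypothetical sample data: these element names are illustrative
# assumptions, not the real pm.org document structure.
my $xml = <<'END_XML';
<perl_mongers>
  <group id="1">
    <name>Example.pm</name>
    <location>
      <city>Springfield</city>
      <state>Oregon</state>
      <country>USA</country>
    </location>
    <tsar>
      <name>Jane Doe</name>
      <email>jane@example.com</email>
    </tsar>
  </group>
</perl_mongers>
END_XML

my @groups;     # one hash per <group> element
my @path;       # stack of currently open element names
my $current;    # the group record being built, if any

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            push @path, $element;
            $current = {} if $element eq 'group';
        },
        Char => sub {
            my ($expat, $text) = @_;
            return unless $current;
            # Build a key from the path below <group>, e.g. "location/city".
            my @below = @path;
            shift @below while @below and $below[0] ne 'group';
            shift @below;    # drop 'group' itself
            return unless @below;
            # Text may arrive in several chunks, so append.
            $current->{ join '/', @below } .= $text;
        },
        End => sub {
            my ($expat, $element) = @_;
            pop @path;
            if ($element eq 'group') {
                push @groups, $current;
                undef $current;
            }
        },
    },
);
$parser->parse($xml);

for my $g (@groups) {
    printf "%s: %s, %s, %s (tsar: %s <%s>)\n",
        @{$g}{qw(name location/city location/state location/country
                 tsar/name tsar/email)};
}
```

Keying each value on its path below the group element keeps the group's name distinct from the tsar's name, even though both live in elements called name in this made-up sample. For live data, the inline string could be replaced with the result of a fetch (LWP::Simple's get function is one option) and the keys adjusted to match the real element names.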