Migrating from HTML to XML
By Peter Fischer
As the Internet world shifts its focus to XML and related technologies, what happens to HTML? Everywhere you go, products are becoming "XMLitized" as vendors rush to gain market share. That is great for companies that are only now beginning to build their infrastructures, but what about the rest of us, whose sites have existed for years and have accumulated documents built on older HTML technology? How are we to take our millions of HTML documents and bring them into the next generation of Internet computing? Fortunately, the market for tools in this space is growing, and technologies like Extensible Hypertext Markup Language (XHTML) are making it easier to migrate your repository of existing HTML documents.
The Motive
HTML began as a simple markup language for describing the structure of documents. It quickly evolved into a monster used to control how data is displayed, and it now includes many proprietary tags that aren't supported by every browser. Purely visual elements, like the <font> tag, only add to HTML's bloat. With the advent of newer devices whose displays are not as visually oriented, like handhelds and mobile phones, HTML is no longer capable of standing up to the new challenges of Internet- and Web-based computing. We're left with a legacy of information captured in HTML that can't evolve to support new computing platforms and paradigms.
XML promises relief. Because content creators must focus on the structure of their documents as opposed to their display, XML documents contain clean information that can be repurposed for various forms of presentation.
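To make the idea of repurposing concrete, here is a minimal sketch in Python that renders one structured XML document in two ways: as HTML for a desktop browser and as plain text for a small display. The element names (article, title, author, para) and the function names are hypothetical, chosen only for illustration. They are not part of any standard vocabulary or of a particular migration tool.

```python
from html import escape
import xml.etree.ElementTree as ET

# Hypothetical content markup: it records what each piece of text *is*,
# and says nothing about fonts, colors, or layout.
ARTICLE_XML = """<article>
  <title>Migrating from HTML to XML</title>
  <author>Peter Fischer</author>
  <para>XML promises relief.</para>
  <para>Clean content can be repurposed.</para>
</article>"""


def render_html(root):
    """Render the article as HTML for a visually oriented browser."""
    lines = [
        f"<h1>{escape(root.findtext('title'))}</h1>",
        f"<p><em>By {escape(root.findtext('author'))}</em></p>",
    ]
    lines += [f"<p>{escape(para.text)}</p>" for para in root.findall("para")]
    return "\n".join(lines)


def render_text(root):
    """Render the same article as plain text for a small, text-only display."""
    lines = [root.findtext("title").upper(), "By " + root.findtext("author")]
    lines += [para.text for para in root.findall("para")]
    return "\n".join(lines)


# Parse once, present twice: the source document is never edited.
root = ET.fromstring(ARTICLE_XML)
html_view = render_html(root)
text_view = render_text(root)

print(html_view)
print()
print(text_view)
```

Because the presentation rules live in the two rendering functions and not in the document, supporting another device means writing another renderer and leaving the content untouched. In practice a stylesheet language such as XSLT often fills this role; plain Python is used here only to keep the sketch self-contained.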