Webagra-vation
By Bob Kaehms
To invent, you need a good imagination and a pile of junk. -- Thomas Edison
At times we all get caught up in the promises and expectations of the next great product. Whether it improves our sex life or adds zing to our Web site, we all too often find ourselves buying into that magic pill, HTML editor, or asset-management software that will do it all, and more! Other times we have to step back, reexamine our roots, and take stock of our assets, procedures, and basic premises in order to exercise, elucidate, or enumerate the underlying technique of the trade, be it tackling in football or table layout in HTML -- that is the essence of Back to Basics.
Sometimes when we Web professionals study old Web sites we wonder if we can tackle the problems at all. Faced with the Web equivalent of the programmer's spaghetti code -- the broken, relative, and missing links, the inconsistent treatment of graphic elements and tables, the include files that aren't (because someone changed servers), and the navigational aids that lead nowhere (because the department maintaining those pages no longer exists) -- we may want to throw it all away and start over. At times like these, we may find ourselves on a frantic search for a good Perl script, or some other way to strip it all back down to its basic form.
My favorite is lynx -dump my.html > raw-html. If you try this, lynx will dump formatted plain text to STDOUT (redirected here into the file raw-html): it strips out the table markup, replaces images with [INLINE] or the contents of their ALT attributes, and annotates your links with numeric references.
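If lynx isn't at hand, the same idea can be roughed out in a few lines of Python. What follows is only a sketch of the behavior described above, not a reimplementation of lynx: the DumpParser class and the dump function are names invented for this illustration, and the output merely approximates what lynx prints.

```python
from html.parser import HTMLParser


class DumpParser(HTMLParser):
    """Hypothetical helper: a rough imitation of a text dump, not lynx itself."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []  # pieces of visible text, in document order
        self.links = []   # href values, in the order they were seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # Use the ALT text when present; fall back to a placeholder
            # when ALT is missing or empty.
            self.chunks.append(" " + (attrs.get("alt") or "[INLINE]") + " ")
        elif tag == "a" and attrs.get("href"):
            # Number each link and remember its target for the reference list.
            self.links.append(attrs["href"])
            self.chunks.append("[%d]" % len(self.links))
        elif tag in ("br", "p", "tr", "li"):
            self.chunks.append("\n")  # these tags start a new line of text
        elif tag in ("td", "th"):
            self.chunks.append(" ")   # keep neighboring table cells apart

    def handle_data(self, data):
        self.chunks.append(data)


def dump(html):
    """Return the visible text of `html` plus a numbered list of link targets."""
    parser = DumpParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = [" ".join(line.split()) for line in "".join(parser.chunks).split("\n")]
    body = "\n".join(line for line in lines if line)
    refs = ["%d. %s" % (n, href) for n, href in enumerate(parser.links, 1)]
    if refs:
        return body + "\n\nReferences\n" + "\n".join(refs)
    return body


if __name__ == "__main__":
    sample = ('<table><tr><td><a href="a.html">Home</a></td>'
              '<td><img src="x.gif"><img src="y.gif" alt="Logo"></td></tr></table>')
    print(dump(sample))
```

The sketch makes no attempt at real layout, and it will happily print the contents of SCRIPT and STYLE blocks, so treat it as a way to see what a page is made of rather than as a substitute for the real thing.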