Thursday, June 21, 2007

XmlStarlet: Command-Line XML Processing

As a general rule, I love XML.  I've been using it for a good 10 years now, and it's never failed me.  Except when I hit that wall of whether to deal with it as an XML file or a text file.  After all, there are a million text utilities out there for searching, replacing, counting, analyzing, editing and so forth.  With your XML file, once you take it out of its application context you're left with either treating it like a fairly wordy text file or loading it up into a dedicated XML editor.  Even then, it's hard to accomplish bulk tasks like "Transform these 300 XML files using this style sheet" or "Extract all the different values for Career/Title that are used".

XmlStarlet gives you the ability to manipulate your XML files directly from the command line, and it is awesome.  Here are just some of the things you can do:

  1. XSL Transformations. 
  2. Search and extract.  Far more useful than just grepping for your text, you can use XPath to go exactly to the element (or elements) you want.
  3. Editing.  Insert/delete/move stuff around.
  4. Validate against a DTD or Schema.
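
For a rough idea of what those four look like in practice, here's a minimal sketch.  The sample files (people.xml, names.xsl, people.dtd) and their element names are made up for illustration, and it assumes the binary is installed as xmlstarlet (some installs name it xml instead).

```shell
#!/bin/sh
# Sketch only: the file names and element names below are hypothetical.
set -e
command -v xmlstarlet >/dev/null 2>&1 || { echo "xmlstarlet not found; skipping"; exit 0; }

work=$(mktemp -d)
cd "$work"

# A small sample document to work on.
cat > people.xml <<'EOF'
<?xml version="1.0"?>
<People>
  <Person><Name>Ann</Name><Career><Title>Engineer</Title></Career></Person>
  <Person><Name>Bob</Name><Career><Title>Editor</Title></Career></Person>
</People>
EOF

# A stylesheet that lists each person's name as plain text.
cat > names.xsl <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="People/Person">
      <xsl:value-of select="Name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

# A DTD describing the sample document.
cat > people.dtd <<'EOF'
<!ELEMENT People (Person*)>
<!ELEMENT Person (Name, Career)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Career (Title)>
<!ELEMENT Title (#PCDATA)>
EOF

# 1. XSL transformation: apply names.xsl to people.xml.
xmlstarlet tr names.xsl people.xml

# 2. Search and extract: print every Career/Title value, one per line.
xmlstarlet sel -t -m "//Career/Title" -v . -n people.xml

# 3. Editing: update Bob's title and write the result to a new file.
xmlstarlet ed -u "//Person[Name='Bob']/Career/Title" -v "Senior Editor" people.xml > edited.xml

# 4. Validation: check the document against the DTD.
xmlstarlet val -d people.dtd people.xml
```

Each subcommand (tr, sel, ed, val) writes to standard output, so the results can be redirected or piped like any other text.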

Remember, if you can do it from the command line, you can do it from a script.  Highly recommended app.
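
Here's a sketch of what that could look like for the two bulk tasks mentioned earlier: running one style sheet over a whole batch of files, and collecting the distinct Career/Title values.  The directory layout, file names and sample documents are all hypothetical stand-ins, so adjust them to your own data.

```shell
#!/bin/sh
# Sketch only: directory layout, file names, and element names are hypothetical.
set -e
command -v xmlstarlet >/dev/null 2>&1 || { echo "xmlstarlet not found; skipping"; exit 0; }

work=$(mktemp -d)
mkdir "$work/in" "$work/out"

# Generate three small sample documents to stand in for a real batch.
i=1
for title in Engineer Editor Engineer; do
  printf '<Person><Career><Title>%s</Title></Career></Person>\n' "$title" > "$work/in/person$i.xml"
  i=$((i + 1))
done

# A tiny stylesheet that emits the title as plain text.
cat > "$work/title.xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="Person/Career/Title"/>
  </xsl:template>
</xsl:stylesheet>
EOF

# Apply the same stylesheet to every file in the batch.
for f in "$work"/in/*.xml; do
  xmlstarlet tr "$work/title.xsl" "$f" > "$work/out/$(basename "$f" .xml).txt"
done

# Collect the distinct Career/Title values used across all files.
xmlstarlet sel -t -m "//Career/Title" -v . -n "$work"/in/*.xml | sort -u > "$work/titles.txt"
cat "$work/titles.txt"
```

The same loop works for 3 files or 300; only the glob changes.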


1 comment:

steveo said...

That looks pretty useful. Thanks, Duane!

I should note that it's very easy to install under Ubuntu Linux because it is in the repositories that I use:
sudo apt-get install xmlstarlet