ONLamp.com

Processing XML with Xerces and the DOM

Modifying the DOM Tree in Memory

Having in-memory, tree-style access to an XML document is useful for plucking out pieces of data, but half of the benefit of DOM is the ability to modify that tree and save it back to a file. The program step4 rounds out the examples by altering the in-memory tree and saving it back to the original config file. All of this happens in the XMLConfigData::commit() method.



commit() first calls updateObject() and updateXML(). These update the last-modified date and sync the object with the backing DOM tree, respectively. updateXML() involves updating some attributes, replacing a node, and adding some new nodes.

Updating an attribute is similar to getting its value: call the element node's setAttribute() member. This excerpt sets the <config> element's lastupdate attribute:

xercesc::DOMElement* configElement =
  finder_.getConfigElement() ;

configElement->setAttribute(
  finder_.ATTR_LASTUPDATE_.asXMLString() ,
  sm.convert( getLastUpdate() )
) ;

The sample code doesn't update the <login> tag's user or password attributes, but those would follow the same formula.

Updating the <reports> node takes a little more work. You could delete all of the child <report> elements and create new ones. However, for purely illustrative purposes, step4 takes the long route: it creates a new <reports> element, populates that element with new <report> children, and then swaps the old <reports> for the new.

The parent document owns all nodes, by default. To create an element, call the parent document's createElement() member:

xercesc::DOMElement* newReportsElement =
  xmlDoc->createElement( finder_.TAG_REPORTS_.asXMLString() )  ;

Next, create new <report> elements and add them under the new <reports> element:

for( ... each report in the XMLConfigData object ... ){
  xercesc::DOMElement* element =
    xmlDoc->createElement( ... ) ;

  newReportsElement->appendChild( element ) ;
}

Finally, swap the old and new elements:

xercesc::DOMElement* oldReportsElement =
  finder_.getReportsElement() ;

configElement->replaceChild(
  newReportsElement ,
  oldReportsElement
) ;

You don't have to free the oldReportsElement pointer explicitly: replaceChild() only detaches the node from the tree, and the parent DOMDocument still owns it.
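The ownership rule is easier to see in a stripped-down model. The following sketch does not use Xerces at all; ToyDocument, ToyElement, and rebuildReports() are hypothetical stand-ins built on the standard library, and the "name" attribute on each report is invented for illustration. It assumes only the behavior described above: the document allocates and owns every element, and replaceChild() detaches the old node without destroying it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM element; not part of Xerces.
struct ToyElement {
    std::string tag;
    std::map<std::string, std::string> attributes;
    std::vector<ToyElement*> children;  // non-owning: the document owns nodes

    explicit ToyElement(std::string t) : tag(std::move(t)) {}

    // Creates the attribute if it is missing, otherwise overwrites it.
    void setAttribute(const std::string& name, const std::string& value) {
        attributes[name] = value;
    }

    ToyElement* appendChild(ToyElement* child) {
        children.push_back(child);
        return child;
    }

    // Swaps oldChild for newChild and returns the detached old node,
    // or nullptr if oldChild is not a child of this element.
    ToyElement* replaceChild(ToyElement* newChild, ToyElement* oldChild) {
        auto it = std::find(children.begin(), children.end(), oldChild);
        if (it == children.end()) return nullptr;
        *it = newChild;
        return oldChild;
    }
};

// Hypothetical stand-in for the owning document.
class ToyDocument {
public:
    ToyElement* createElement(const std::string& tag) {
        std::unique_ptr<ToyElement> node(new ToyElement(tag));
        nodes_.push_back(std::move(node));
        return nodes_.back().get();
    }
    std::size_t ownedNodeCount() const { return nodes_.size(); }

private:
    // Every node ever created lives here until the document is destroyed.
    std::vector<std::unique_ptr<ToyElement>> nodes_;
};

// Mirrors the step4 flow: build a new <reports>, fill it, swap it in.
ToyElement* rebuildReports(ToyDocument& doc, ToyElement* config,
                           ToyElement* oldReports,
                           const std::vector<std::string>& reportNames) {
    ToyElement* newReports = doc.createElement("reports");
    for (const std::string& name : reportNames) {
        ToyElement* report = doc.createElement("report");
        report->setAttribute("name", name);  // attribute name is illustrative
        newReports->appendChild(report);
    }
    config->replaceChild(newReports, oldReports);
    return newReports;
}
```

In this model the old <reports> pointer stays valid after the swap and is reclaimed only when the ToyDocument goes out of scope, which is why calling delete on it yourself would be a double free.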

Writing XML

The last part of commit() takes care of saving the DOM tree back to disk, using a LocalFileFormatTarget object. Xerces also supports storing XML in a memory buffer (MemBufFormatTarget) and writing to standard output (StdOutFormatTarget). You're free to implement your own target for custom output by subclassing XMLFormatTarget.

A DOMWriter object is responsible for writing out the data. step4 configures the DOMWriter to add spacing and formatting to make the document more human-readable:

xercesc::DOMWriter* writer =  ... create new writer ...
writer->setFeature(
  xercesc::XMLUni::fgDOMWRTFormatPrettyPrint ,
  true
);

Finally, step4 calls the writer to write out the document:

writer->writeNode( outfile , *xmlDoc ) ;

Note that because the parent document owns neither the LocalFileFormatTarget nor the DOMWriter, the code deletes both explicitly.
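The split between a writer and its output target can be illustrated without Xerces. The sketch below is a toy model under stated assumptions: ToyTarget, ToyMemoryTarget, ToyStdOutTarget, ToyNode, and ToyWriter are hypothetical names, not Xerces classes, and the two-space indent is an arbitrary choice. It shows only the general pattern: the writer walks the tree and decides on formatting, while the target decides where the bytes go.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical output abstraction, loosely analogous to a format target.
class ToyTarget {
public:
    virtual ~ToyTarget() = default;
    virtual void write(const std::string& text) = 0;
};

// Collects output in an in-memory buffer.
class ToyMemoryTarget : public ToyTarget {
public:
    void write(const std::string& text) override { buffer_ += text; }
    const std::string& str() const { return buffer_; }

private:
    std::string buffer_;
};

// Sends output to standard output.
class ToyStdOutTarget : public ToyTarget {
public:
    void write(const std::string& text) override { std::cout << text; }
};

// Minimal element-only tree; attributes and text are left out for brevity.
struct ToyNode {
    std::string tag;
    std::vector<ToyNode> children;
};

// Serializes a ToyNode tree to any ToyTarget, optionally pretty-printed.
class ToyWriter {
public:
    void setPrettyPrint(bool on) { pretty_ = on; }

    void writeNode(ToyTarget& target, const ToyNode& node) const {
        writeIndented(target, node, 0);
    }

private:
    void writeIndented(ToyTarget& target, const ToyNode& node,
                       std::size_t depth) const {
        // With pretty-printing off, indent and newline are both empty.
        const std::string indent(pretty_ ? depth * 2 : 0, ' ');
        const std::string newline(pretty_ ? "\n" : "");

        if (node.children.empty()) {
            target.write(indent + "<" + node.tag + "/>" + newline);
            return;
        }
        target.write(indent + "<" + node.tag + ">" + newline);
        for (const ToyNode& child : node.children) {
            writeIndented(target, child, depth + 1);
        }
        target.write(indent + "</" + node.tag + ">" + newline);
    }

    bool pretty_ = false;
};
```

Because the writer sees only the ToyTarget interface, switching from an in-memory buffer to standard output (or to a file-backed target you write yourself) requires no change to the serialization code.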

If you check step4's output, you'll notice the in-memory DOM tree has become well-formed XML. Furthermore, the file is an accurate representation of the DOM tree managed by the program: unmodified nodes remain as is, including comments. (Remember, comments are valid XML constructs; they're just not valid elements.)

That's all for Xerces-C++ and DOM. My next article will show Xerces's SAX side and explain XML validation using DTD and schema.

Resources

  • You can download this article's sample code.
  • Despite its title, Elliotte Rusty Harold's Processing XML with Java is a useful reference for XML processing in all languages. The book is available for purchase as a hard copy, or you can read it all online.
  • The Xerces-C++ web site has links to documentation and downloads. Binaries are available for several platforms. While no RPMs are available, the source bundle includes a spec file for building your own.

Q Ethan McCallum grew from curious child to curious adult, turning his passion for technology into a career.
