XMLTool
The XMLTool viewtool can be used to parse a remote XML file and iterate over each object returned in the XML file as needed.
The XMLTool supports full XPath expressions and method chaining, and includes built-in caching for remote XML files with a configurable TTL. It blocks access to private subnets that are not explicitly allowed, and it disables DOCTYPE declarations and external entities to prevent XXE attacks.
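To make the XXE protection concrete, here is a minimal Java sketch of how DOCTYPE declarations and external entities can be switched off with the standard JAXP API. It is not the XMLTool source: the class and method names (`HardenedParserSketch`, `newHardenedBuilder`, `parses`) are hypothetical, and the feature URIs are the generic Xerces/SAX ones supported by the JDK's built-in parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration only; the real viewtool may be configured differently.
class HardenedParserSketch {

    // Builds a parser that refuses DOCTYPE declarations and external entities.
    static DocumentBuilder newHardenedBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Any <!DOCTYPE ...> in the input becomes a fatal parse error.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Defense in depth: never resolve external general or parameter entities.
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Parser does not support the hardening features", e);
        }
    }

    // Returns true if the XML string parses under the hardened settings.
    static boolean parses(String xml) {
        try {
            newHardenedBuilder().parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String plain = "<catalog><book id=\"1\"/></catalog>";
        String xxe = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]><foo>&xxe;</foo>";
        System.out.println("plain XML accepted: " + parses(plain));
        System.out.println("DOCTYPE payload accepted: " + parses(xxe));
    }
}
```

With these settings a document that carries a DOCTYPE is rejected before any entity can be resolved, which is why a template that reads such a feed should expect an empty result rather than partial content.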
Toolbox Configuration

    <tool>
        <key>xmltool</key>
        <scope>application</scope>
        <class>com.dotmarketing.viewtools.xmltool</class>
    </tool>
Core Methods
Method | Arguments | Description |
---|---|---|
read(o) | Object o (URL or String) | Reads and parses XML from a URL or file path. Uses caching with configurable TTL (default 30 minutes). Returns a new XmlTool instance with the parsed document. |
parse(o) | Object o (XML String) | Parses XML from a string. Returns a new XmlTool instance with the parsed XML document as root node. |
get(o) | Object o (attribute name, number, or XPath) | Multi-purpose getter: first looks for an attribute with the given name, then treats the argument as a node index, and finally evaluates it as an XPath expression. Returns the matching content or an XmlTool instance. |
find(o) | Object o (XPath expression) | Performs XPath selection on the current nodes. If the expression contains no '/', prepends '//' to search all descendants. Returns a new XmlTool with the matching nodes. |
find(xpath) | String xpath | Performs XPath selection with string parameter. Supports full XPath syntax. Returns new XmlTool instance wrapping results. |
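
The `find(o)` shorthand (prepend `//` when the expression contains no `/`) can be illustrated with the JDK's own XPath engine. This is a hedged sketch of the rule as the table describes it, not the viewtool's implementation; `FindShorthandSketch`, `normalize`, and `countMatches` are hypothetical names, and the sample XML is invented for the demonstration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of the "no slash means descendant search" rule.
class FindShorthandSketch {

    // A bare name such as "title" becomes "//title"; anything with a '/' is left alone.
    static String normalize(String expression) {
        return expression.indexOf('/') < 0 ? "//" + expression : expression;
    }

    // Parses the XML string and counts the nodes selected by the normalized expression.
    static int countMatches(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(normalize(expression), doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog>"
                + "<book><title>A</title></book>"
                + "<book><title>B</title></book>"
                + "</catalog>";
        System.out.println(normalize("title"));                          // //title
        System.out.println(countMatches(xml, "title"));                  // 2
        System.out.println(countMatches(xml, "/catalog/book[1]/title")); // 1
    }
}
```

In template terms, this is why `$xml.find("title")` and `$xml.find("//title")` would be expected to select the same nodes, while an absolute path selects only what it names.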
Navigation Methods
Method | Arguments | Description |
---|---|---|
getParent() | None | Returns new XmlTool wrapping the parent Element of the first/sole node. Returns null if no parent exists. |
parents() | None | Returns new XmlTool wrapping all unique parent Elements of currently held nodes (immediate parents only, not ancestors). |
children() | None | Returns new XmlTool wrapping all child Elements of currently held Element nodes. |
getFirst() | None | Returns XmlTool wrapping only the first node from internal node list. Returns self if only one node. |
getLast() | None | Returns XmlTool wrapping only the last node from internal node list. Returns self if only one node. |
get(Number n) | Number n (index) | Returns XmlTool wrapping the node at specified index. Returns null for invalid indices. |
Information Methods
Method | Arguments | Description |
---|---|---|
getName() | None | First attempts to get a "name" attribute or element; if none exists, falls back to the node name via getNodeName(). |
getNodeName() | None | Returns the name of the first/sole node. Returns null if empty. |
getPath() | None | Returns the XPath that identifies the first/sole node. |
getText() | None | Returns concatenated text content of all internally held nodes, trimmed. Most useful with single node. |
attr(Object o) | Object o (attribute name) | Returns value of specified attribute for first/sole Element node. Returns null for non-Elements or missing attributes. |
attributes() | None | Returns Map<String, String> of all attributes for first/sole Element node. Returns null for non-Elements. |
Utility Methods
Method | Arguments | Description |
---|---|---|
isEmpty() | None | Returns true if no nodes are internally held by this instance. |
size() | None | Returns the number of nodes internally held. Returns 0 if empty. |
iterator() | None | Returns an Iterator over the internally held nodes, for use in #foreach loops. |
node() | None | Returns the first/sole Node from internal list. Returns null if empty. |
toString() | None | Returns XML string representation of all held nodes (or super.toString() if empty). Attributes return just their values. |
Configuration Methods
Method | Arguments | Description |
---|---|---|
setTTL(long ttl_time) | long ttl_time (minutes) | Static method to set cache Time To Live in minutes. Default is 30 minutes. |
getTTL() | None | Static method that returns current TTL setting in minutes. |
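
To show what a TTL in minutes means for repeated `read()` calls, here is a minimal, hypothetical Java sketch of a time-based cache. It is not the viewtool's cache: `TtlCacheSketch`, the injected clock, and the `fetcher` callback are assumptions made for the illustration, and the sketch uses instance methods where the tool documents static ones.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration of TTL-based caching; names are not from the viewtool.
class TtlCacheSketch {

    private static final class Entry {
        final String value;
        final long storedAtMillis;

        Entry(String value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock; // injected so expiry can be demonstrated without waiting
    private long ttlMinutes = 30;     // same default as the documented TTL

    TtlCacheSketch(LongSupplier clock) {
        this.clock = clock;
    }

    void setTTL(long minutes) {
        this.ttlMinutes = minutes;
    }

    long getTTL() {
        return ttlMinutes;
    }

    // Returns the cached body while it is younger than the TTL; otherwise fetches again.
    String read(String url, Function<String, String> fetcher) {
        long now = clock.getAsLong();
        Entry cached = entries.get(url);
        if (cached != null && now - cached.storedAtMillis < ttlMinutes * 60_000L) {
            return cached.value;
        }
        String fresh = fetcher.apply(url);
        entries.put(url, new Entry(fresh, now));
        return fresh;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        int[] fetches = {0};
        TtlCacheSketch cache = new TtlCacheSketch(() -> now[0]);
        Function<String, String> fetcher = url -> {
            fetches[0]++;
            return "<catalog/>";
        };

        cache.read("https://example.com/data.xml", fetcher);
        cache.read("https://example.com/data.xml", fetcher);
        System.out.println("fetches within TTL: " + fetches[0]);   // 1

        now[0] = 31 * 60_000L; // 31 minutes later, past the 30-minute default
        cache.read("https://example.com/data.xml", fetcher);
        System.out.println("fetches after expiry: " + fetches[0]); // 2
    }
}
```

The practical consequence for templates is that changes to a remote file may take up to one TTL period to appear; lowering the TTL trades freshness against additional remote requests.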
Example Usage Patterns

    ## Read remote XML with caching
    #set($xml = $xmltool.read("https://example.com/data.xml"))

    ## Navigate and extract data
    $xml.find("//book").getFirst().get("title").text
    $xml.children().get(0).attr("id")

    ## XPath queries
    $xml.find("//book[@category='fiction']")
    $xml.find("/catalog/book[1]/title")
Community Example 1
Here's a usage example created by Chris Falzone of the dotCMS community. Thanks, Chris!
First, we'll start with an example of an XML page that can be parsed: http://www.w3schools.com/XML/cd_catalog.xml
The following VTL example reads that XML file on a dotCMS page and then renders its first five objects as table rows:

    #set($myXML = $xmltool.read("https://www.w3schools.com/XML/cd_catalog.xml"))
    <table border="1" style="width:100%;">
      <tr>
        <th>Title</th>
        <th>Artist</th>
        <th>Country</th>
        <th>Company</th>
        <th>Price</th>
        <th>Year</th>
      </tr>
      #foreach($cd in $myXML.children().iterator())
        #set($cdXML = $xmltool.parse($cd))
        #if($velocityCount <= 5)
        <tr>
          <td>$cdXML.TITLE.text </td>
          <td>$cdXML.ARTIST.text </td>
          <td>$cdXML.COUNTRY.text </td>
          <td>$cdXML.COMPANY.text </td>
          <td>$cdXML.PRICE.text </td>
          <td>$cdXML.YEAR.text</td>
        </tr>
        #end
      #end
    </table>
This renders on a page as follows:
Title | Artist | Country | Company | Price | Year |
---|---|---|---|---|---|
Empire Burlesque | Bob Dylan | USA | Columbia | 10.90 | 1985 |
Hide your heart | Bonnie Tyler | UK | CBS Records | 9.90 | 1988 |
Greatest Hits | Dolly Parton | USA | RCA | 9.90 | 1982 |
Still got the blues | Gary Moore | UK | Virgin records | 10.20 | 1990 |
Eros | Eros Ramazzotti | EU | BMG | 9.90 | 1997 |
Community Example 2
Jonathan Sanchez offers an example drawing from an RSS feed:

    #set($myXML = $xmltool.read("https://news.leavitt.com/category/press-releases/affiliation/feed/"))
    <h3>Raw XML Fetched:</h3>
    -------
    #if(!$myXML.isEmpty() )
      <p>Found <strong>${myXML.size()}</strong> <CD> elements.</p>
      #foreach($node in $myXML.get('/rss/channel/item/media:content'))
        $node
      #end
    #else
      <p><strong>WARNING:</strong> No <CD> elements under $myXML.CATALOG.</p>
    #end
This renders as follows:

    <h3>Raw XML Fetched:</h3>
    -------
    <p>Found <strong>1</strong> <CD> elements.</p>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/07/Stalwart-Insurance-Partners_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/07/johnstown-pa-insurance-agency_750x-500.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/05/Aspen-Insurance_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/05/Lawrenceville_Georgia_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/04/Peterson-Strouse_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/01/Rainsville-Group_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/11/Image_Troy-Insurance_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/11/BANNER_HRBenefitsIntoLGNorthwest_750x500.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/10/BSG-Group_web750.jpg" medium="image"/>
    <media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/10/Troy-Insurance_web750.jpg" medium="image"/>