XMLTool

The XMLTool viewtool parses a remote XML file and lets you iterate over the elements it contains.

XMLTool supports full XPath expressions and method chaining, and includes built-in caching for remote XML files with a configurable TTL. It blocks access to private subnets that are not explicitly allowed, and disables DOCTYPE declarations and external entities to prevent XXE attacks.

Toolbox Configuration#


<tool>
    <key>xmltool</key>
    <scope>application</scope>
    <class>com.dotmarketing.viewtools.XmlTool</class>
</tool>

Core Methods#


| Method | Arguments | Description |
| --- | --- | --- |
| read(o) | Object o (URL or String) | Reads and parses XML from a URL or file path. Results are cached with a configurable TTL (default 30 minutes). Returns a new XmlTool instance with the parsed document. |
| parse(o) | Object o (XML String) | Parses XML from a string. Returns a new XmlTool instance with the parsed XML document as the root node. |
| get(o) | Object o (attribute name, number, or XPath) | Multi-purpose getter: first looks for an attribute with the given name, then treats the argument as a node index, and finally evaluates it as an XPath expression. Returns the matching content or an XmlTool instance. |
| find(o) | Object o (XPath expression) | Performs XPath selection on the current nodes. If the expression contains no '/', '//' is prepended for a descendant search. Returns a new XmlTool with the matching nodes. |
| find(xpath) | String xpath | Performs XPath selection with a string parameter. Supports full XPath syntax. Returns a new XmlTool instance wrapping the results. |
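
As a minimal sketch of how the methods in this table fit together, the following VTL parses a small inline document and queries it. The catalog markup and the variable names ($doc, $books, $fiction, $firstBook) are hypothetical and only serve the illustration; the method behavior is taken from the descriptions above, not verified against a running instance.

```velocity
## Hypothetical inline document, parsed from a string instead of fetched
#set($doc = $xmltool.parse('<catalog><book id="b1" category="fiction"><title>Dune</title></book><book id="b2" category="science"><title>Cosmos</title></book></catalog>'))

## No '/' in the expression, so find() runs a descendant search (as if "//book")
#set($books = $doc.find("book"))

## A full XPath expression with a predicate is passed through as written
#set($fiction = $doc.find("/catalog/book[@category='fiction']"))

## get() with a number is treated as a node index
#set($firstBook = $books.get(0))

## get() with a string is tried as an attribute name first
Book id: $!firstBook.get("id")

## An explicit path avoids any ambiguity about which nodes are searched
First title: $!doc.find("/catalog/book[1]/title").getText()
```

The quiet reference form ($!) is used so that a null result renders as an empty string rather than as the literal reference text.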

Navigation Methods#


| Method | Arguments | Description |
| --- | --- | --- |
| getParent() | None | Returns new XmlTool wrapping the parent Element of the first/sole node. Returns null if no parent exists. |
| parents() | None | Returns new XmlTool wrapping all unique parent Elements of currently held nodes (immediate parents only, not ancestors). |
| children() | None | Returns new XmlTool wrapping all child Elements of currently held Element nodes. |
| getFirst() | None | Returns XmlTool wrapping only the first node from internal node list. Returns self if only one node. |
| getLast() | None | Returns XmlTool wrapping only the last node from internal node list. Returns self if only one node. |
| get(Number n) | Number n (index) | Returns XmlTool wrapping the node at specified index. Returns null for invalid indices. |
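
The traversal methods above can be combined roughly as follows. This is a sketch under the assumption that parse() hands back the root element of the document, which matches how the community example further down iterates over children(); the catalog markup and variable names are made up for the illustration.

```velocity
## Hypothetical two-entry catalog
#set($doc = $xmltool.parse('<catalog><book id="b1"><title>Dune</title></book><book id="b2"><title>Cosmos</title></book></catalog>'))

## Child Elements of the currently held node(s)
#set($books = $doc.children())

## Narrow the node list to a single entry
#set($first = $books.getFirst())
#set($last = $books.getLast())
#set($second = $books.get(1))

First id: $!first.attr("id")
Last id: $!last.attr("id")

## getParent() is documented to return null when no parent exists,
## so test the result before calling methods on it
#set($parent = $first.getParent())
#if($parent)
  Parent element: $parent.getNodeName()
#end
```

Because get(Number n) returns null for an invalid index, the same #if guard is worth applying to $second when the number of children is not known in advance.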

Information Methods#


| Method | Arguments | Description |
| --- | --- | --- |
| getName() | None | First attempts to get "name" attribute/element, falls back to node name via getNodeName(). |
| getNodeName() | None | Returns the name of the first/sole node. Returns null if empty. |
| getPath() | None | Returns the XPath that identifies the first/sole node. |
| getText() | None | Returns concatenated text content of all internally held nodes, trimmed. Most useful with single node. |
| attr(Object o) | Object o (attribute name) | Returns value of specified attribute for first/sole Element node. Returns null for non-Elements or missing attributes. |
| attributes() | None | Returns Map<String, String> of all attributes for first/sole Element node. Returns null for non-Elements. |
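
A short sketch of the information methods on a single element follows. The book markup and the attribute names are hypothetical; the null handling follows the descriptions in the table above.

```velocity
## Hypothetical single-element document
#set($book = $xmltool.parse('<book id="b1" lang="en"><title>Dune</title></book>'))

Node name: $!book.getNodeName()
Path: $!book.getPath()
Text: $!book.getText()

## attr() returns null for a missing attribute; $! renders null as empty
Language: $!book.attr("lang")
ISBN: $!book.attr("isbn")

## attributes() is described as a Map<String, String>, so entrySet()
## yields name/value pairs; guard against the documented null result
#set($attrs = $book.attributes())
#if($attrs)
  #foreach($entry in $attrs.entrySet())
    $entry.key = $entry.value
  #end
#end
```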

Utility Methods#


| Method | Arguments | Description |
| --- | --- | --- |
| isEmpty() | None | Returns true if no nodes are internally held by this instance. |
| size() | None | Returns the number of nodes internally held. Returns 0 if empty. |
| iterator() | None | Returns Iterator that creates new XmlTool instances for each held node. Returns null if empty. |
| node() | None | Returns the first/sole Node from internal list. Returns null if empty. |
| toString() | None | Returns XML string representation of all held nodes (or super.toString() if empty). Attributes return just their values. |
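
Since iterator() is documented to return null for an empty node list, a loop is safer behind an isEmpty() check. The sketch below shows one way to write that guard; the markup and variable names are hypothetical, and the extra null test on $books is a precaution in case a query with no matches returns nothing at all.

```velocity
## Hypothetical catalog with two entries
#set($doc = $xmltool.parse('<catalog><book><title>Dune</title></book><book><title>Cosmos</title></book></catalog>'))
#set($books = $doc.find("/catalog/book"))

## Only iterate when there is something to iterate over
#if($books && !$books.isEmpty())
  <p>$books.size() book element(s) matched.</p>
  <ul>
  #foreach($book in $books.iterator())
    ## Each $book is its own XmlTool instance wrapping one node
    <li>$!book.getNodeName()</li>
  #end
  </ul>
#else
  <p>No book elements matched.</p>
#end
```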

Configuration Methods#


| Method | Arguments | Description |
| --- | --- | --- |
| setTTL(long ttl_time) | long ttl_time (minutes) | Static method to set cache Time To Live in minutes. Default is 30 minutes. |
| getTTL() | None | Static method that returns current TTL setting in minutes. |
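
The following sketch adjusts the cache lifetime before reading a remote file. It assumes that the static methods can be called through the $xmltool reference and that an integer literal is accepted for the long parameter; neither is confirmed by the table above, so treat this as an illustration to verify on your own instance.

```velocity
## Current cache lifetime, in minutes
TTL before: $!xmltool.getTTL()

## Assumption: the static setter is reachable through the tool reference.
## The quiet form ($!) keeps the call from printing anything if it returns nothing.
$!xmltool.setTTL(10)

TTL after: $!xmltool.getTTL()

## Subsequent reads are cached for the new lifetime
#set($xml = $xmltool.read("https://example.com/data.xml"))
```

Because the methods are described as static, the new value presumably applies to every template that uses the tool, not only to the page that sets it.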

Example Usage Patterns#


## Read remote XML with caching
#set($xml = $xmltool.read("https://example.com/data.xml"))

## Navigate and extract data
$xml.find("//book").getFirst().get("title").text
$xml.children().get(0).attr("id")

## XPath queries
$xml.find("//book[@category='fiction']")
$xml.find("/catalog/book[1]/title")

Community Example 1#

Here's a usage example created by Chris Falzone of the dotCMS community. Thanks, Chris!

First, we'll start with an example of an XML page that can be parsed: https://www.w3schools.com/XML/cd_catalog.xml

To parse the XML file on a dotCMS page, use the following VTL example, which reads the XML file and renders its first five entries as table rows:

## Fetch and parse the remote catalog
#set($myXML = $xmltool.read("https://www.w3schools.com/XML/cd_catalog.xml"))
<table border="1" style="width:100%;">
    <tr>
        <th>Title</th>
        <th>Artist</th>
        <th>Country</th>
        <th>Company</th>
        <th>Price</th>
        <th>Year</th>
    </tr>
    ## Each child of the root element is one catalog entry
    #foreach($cd in $myXML.children().iterator())
        ## Re-parse the entry so its child elements can be read by name
        #set($cdXML = $xmltool.parse($cd))
        ## Render only the first five entries
        #if($velocityCount <= 5)
            <tr>
                <td>$cdXML.TITLE.text</td>
                <td>$cdXML.ARTIST.text</td>
                <td>$cdXML.COUNTRY.text</td>
                <td>$cdXML.COMPANY.text</td>
                <td>$cdXML.PRICE.text</td>
                <td>$cdXML.YEAR.text</td>
            </tr>
        #end
    #end
</table>

This renders on a page as follows:

| Title | Artist | Country | Company | Price | Year |
| --- | --- | --- | --- | --- | --- |
| Empire Burlesque | Bob Dylan | USA | Columbia | 10.90 | 1985 |
| Hide your heart | Bonnie Tyler | UK | CBS Records | 9.90 | 1988 |
| Greatest Hits | Dolly Parton | USA | RCA | 9.90 | 1982 |
| Still got the blues | Gary Moore | UK | Virgin records | 10.20 | 1990 |
| Eros | Eros Ramazzotti | EU | BMG | 9.90 | 1997 |

Community Example 2#

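Jonathan Sanchez offers an example that pulls the media:content elements from an RSS feed. The status messages still mention <CD> elements and $myXML.CATALOG, apparently carried over from the previous example, and size() reports the number of nodes held by $myXML rather than the number of matched items: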

#set($myXML = $xmltool.read("https://news.leavitt.com/category/press-releases/affiliation/feed/"))
<h3>Raw XML Fetched:</h3> 

-------

#if(!$myXML.isEmpty() )
<p>Found <strong>${myXML.size()}</strong> &lt;CD&gt; elements.</p>
#foreach($node in $myXML.get('/rss/channel/item/media:content'))
$node
#end
#else
<p><strong>WARNING:</strong> No &lt;CD&gt; elements under $myXML.CATALOG.</p>
#end

This renders as follows:

<h3>Raw XML Fetched:</h3> 

-------

<p>Found <strong>1</strong> &lt;CD&gt; elements.</p>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/07/Stalwart-Insurance-Partners_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/07/johnstown-pa-insurance-agency_750x-500.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/05/Aspen-Insurance_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/05/Lawrenceville_Georgia_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/04/Peterson-Strouse_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2025/01/Rainsville-Group_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/11/Image_Troy-Insurance_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/11/BANNER_HRBenefitsIntoLGNorthwest_750x500.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/10/BSG-Group_web750.jpg" medium="image"/>
<media:content xmlns:media="http://search.yahoo.com/mrss/" url="https://news.leavitt.com/wp-content/uploads/2024/10/Troy-Insurance_web750.jpg" medium="image"/>
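
Each media:content element in the output above carries a url attribute, so a natural next step is to read that attribute instead of printing the raw node. The sketch below is one way to do that with attr(); it assumes find() resolves the namespaced path the same way get() does in the example above, and the $feed and $media variable names are illustrative.

```velocity
#set($feed = $xmltool.read("https://news.leavitt.com/category/press-releases/affiliation/feed/"))
#set($media = $feed.find("/rss/channel/item/media:content"))

## Guard against an empty or missing result before iterating
#if($media && !$media.isEmpty())
  <p>Found <strong>$media.size()</strong> media:content element(s).</p>
  #foreach($node in $media.iterator())
    ## attr() reads a single attribute from the current element
    <img src="$!node.attr('url')" alt="">
  #end
#else
  <p><strong>WARNING:</strong> No media:content elements found.</p>
#end
```

Here size() is called on the matched node list, so the count reflects the media:content elements rather than the root of the feed.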