JAXB Binder and XPath

I came across javax.xml.bind.Binder while reading SOA Using Java Web Services (an excellent book). I had never used this class before, so I set out to find how it could be used. I found that it is mentioned far less often than classes like Marshaller or Unmarshaller, but it is very useful.

Binder is usually used for partial binding: unmarshalling a JAXB object from part of an XML DOM tree. The JAXB specification describes three use cases for the class. Two relate to partial binding and the third is about using XPath navigation. The last one interests me the most, because one module of my product is a perfect fit for this technique.

Below is the XML schema I created to simulate the functionality of that module:

<schema xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://ws.news.com/query"
    xmlns:tns="http://ws.news.com/query"
    elementFormDefault="qualified"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     
    <element name="Query" type="tns:Query"/>
     
    <complexType name="Query">
        <sequence>
            <element name="TimeOut" type="int"/>
            <element name="Hit" type="int"/>  
            <element name="Filter" type="tns:Filter"/>  
        </sequence>
    </complexType> 
     
    <complexType name="Filter">
        <group ref="tns:Searchable"/>
    </complexType>
     
    <element name="And" type="tns:BooleanExpr"/> 
    <element name="Or" type="tns:BooleanExpr"/>
     
    <group name="Searchable"> 
        <choice>
            <element name="Company" type="string"/>
            <element name="Section" type="string"/>
            <element name="TitleText" type="string"/>
            <element name="TitleAndBodyText" type="string"/>
            <element ref="tns:And"/>
            <element ref="tns:Or"/>
        </choice>
    </group>
     
    <complexType name="BooleanExpr">
        <sequence>
            <group ref="tns:Searchable" minOccurs="2" maxOccurs="unbounded"/> 
        </sequence>
    </complexType>
</schema>

The schema describes the request format of a kind of search engine. Users can search for items associated with metadata (Company, Section) or for items that contain a particular string. In my real production code, it’s a news server. The interesting thing is that the schema allows the user to group searchable indexes using boolean operators such as And and Or. A boolean operator can itself contain nested boolean operators, so the content tree of a request can grow to any depth.

The job of the module I mentioned is to extract all occurrences of the JAXB objects corresponding to a particular searchable index, decorate their content, and then replace the original content with the newly decorated one. Below is an example of a simple request:

<Query xmlns="http://ws.news.com/query"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
    <TimeOut>10</TimeOut>
    <Hit>60</Hit>
    <Filter>
        <Or>
            <And>
                <Or>
                    <Section>News</Section>
                    <Section>Announcement</Section>
                    <Section>Product</Section>
                </Or> 
                <Company>BBL.BK</Company> 
            </And>
            <And>
                <Or>
                    <Section>News</Section>
                    <Section>Announcement</Section>
                    <Section>Product</Section>
                </Or>
                <Company>PTT.BK</Company>
            </And>
            <And>
                <Section>Trade</Section>
                <Company>SCB.BK</Company>
                <Company>MSFT.O</Company>
                <Company>IBM.N</Company>
            </And>
        </Or>
    </Filter>
</Query>

Manipulating JAXB objects is normally easier than operating on the low-level DOM. But traversing a whole JAXB object hierarchy is not much better than traversing a DOM tree, especially when the JAXB objects involved are not straightforward. Take the generated BooleanExpr class, for example:

@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "BooleanExpr", propOrder = {
    "searchable"
})
public class BooleanExpr {
  @XmlElementRefs({
    @XmlElementRef(name = "Or", namespace = "http://ws.news.com/query", type = JAXBElement.class),
    @XmlElementRef(name = "TitleAndBodyText", namespace = "http://ws.news.com/query", type = JAXBElement.class),
    @XmlElementRef(name = "TitleText", namespace = "http://ws.news.com/query", type = JAXBElement.class),
    @XmlElementRef(name = "Section", namespace = "http://ws.news.com/query", type = JAXBElement.class),
    @XmlElementRef(name = "And", namespace = "http://ws.news.com/query", type = JAXBElement.class),
    @XmlElementRef(name = "Company", namespace = "http://ws.news.com/query", type = JAXBElement.class)
  })
  protected List<JAXBElement<?>> searchable;
 
  public List<JAXBElement<?>> getSearchable() {
    if (searchable == null) {
        searchable = new ArrayList<JAXBElement<?>>();
    }
    return this.searchable;
  }
}

Data binding between Java and XML is not a perfect world. XML is a very large and complex standard, and it is very difficult, if not impossible, to define a seamless mapping between a Java representation and the whole XML information set. Some XML constructs cannot be mapped to Java with all of their XML constraints preserved.

The content of BooleanExpr is a choice model group, which combined with the maxOccurs="unbounded" constraint makes the generated getSearchable() method less than pretty: it returns a list of JAXBElement<?> instead of typed properties. Traversing a BooleanExpr therefore requires checking which kind of object is being handled at each step:

public static void handleBooleanExpr(BooleanExpr expr) {
    List<JAXBElement<?>> searchableList = expr.getSearchable();
    for (JAXBElement<?> elem : searchableList) {
        if (elem.getName().equals(andQname) || elem.getName().equals(orQname)) {
            // And/Or wrap a nested BooleanExpr, so recurse into it.
            handleBooleanExpr((BooleanExpr) elem.getValue());
        } else if (elem.getName().equals(companyQname)) {
            decorate(elem);
        }
    }
}
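
The traversal code compares element names against andQname, orQname and companyQname, which are not shown in this post. Below is a minimal sketch of how they might be declared, assuming they are plain QName constants in the schema's target namespace. The class name SearchableNames and the helper isBooleanExpr are hypothetical and exist only for illustration.

```java
import javax.xml.namespace.QName;

// Hypothetical holder for the element names used during traversal.
class SearchableNames {
    // Target namespace taken from the schema above.
    static final String QUERY_NS = "http://ws.news.com/query";

    static final QName andQname = new QName(QUERY_NS, "And");
    static final QName orQname = new QName(QUERY_NS, "Or");
    static final QName companyQname = new QName(QUERY_NS, "Company");

    // True when the element name denotes a nested boolean expression.
    // QName.equals compares namespace URI and local part only; the prefix is ignored.
    static boolean isBooleanExpr(QName name) {
        return andQname.equals(name) || orQname.equals(name);
    }
}
```

Because QName equality ignores the prefix, the comparison works no matter which prefix, if any, the incoming request binds to the namespace.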

I am using just one choice model group in the example because I don’t want to make it too complicated. You can probably guess that the code to traverse the JAXB objects gets bloated quickly once there are three or more choice model groups.

If our JAXB object were a DOM document, XPath would be the clear choice for this kind of task. But to use DOM I would have to marshal the JAXB object to DOM, apply an XPath query to perform the decoration, then unmarshal the modified DOM back to a JAXB object, and repeat this round trip every time I want to use XPath on the request. It would be nice to operate on the request both as a JAXB object and through XPath. JAXB Binder allows you to do just that:

    public void decorateCompany(Query query) throws JAXBException, XPathExpressionException {
        Binder<Node> binder = _ctx.createBinder();
        Node queryDOMView = createBlankDOMDocument(true);

        // Marshal the Query object to a blank DOM document.
        // The Binder maintains the association between the two views.
        QName qname = new QName("http://ws.news.com/query", "Query");
        binder.marshal(new JAXBElement<Query>(qname, Query.class, query), queryDOMView);

        // Search for all occurrences of Company using XPath.
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new QueryNamespaceContext());
        NodeList compList = (NodeList) xpath.evaluate("//query:Company", queryDOMView, XPathConstants.NODESET);

        // Perform the decoration on the DOM view.
        for (int i = 0; i < compList.getLength(); i++) {
            Node comp = compList.item(i);
            comp.setTextContent(decorate(comp.getTextContent()));
        }

        // Synchronize the changes back to the Query object.
        binder.updateJAXB(queryDOMView);
    }
     
    public Node createBlankDOMDocument(boolean namespaceAware) { 
        DocumentBuilderFactory fact = DocumentBuilderFactory.newInstance();
        fact.setNamespaceAware(namespaceAware);
        DocumentBuilder builder;
        try {
            builder = fact.newDocumentBuilder();
 
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        }
 
        return builder.newDocument();
    }
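
The QueryNamespaceContext passed to the XPath object is not shown in this post. Below is a minimal sketch of what such a class might look like, assuming it only needs to map the prefix query to the schema's target namespace. The demo class and its countCompanies helper are hypothetical. They parse a request string into a namespace-aware DOM instead of going through Binder, so the sketch runs with the JDK alone. A fully compliant NamespaceContext would also handle the reserved xml and xmlns prefixes, which are omitted here.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo: evaluates the same XPath expression against a parsed request.
class QueryNamespaceContextDemo {
    // Counts Company elements in the given request XML.
    static int countCompanies(String xml) {
        try {
            DocumentBuilderFactory fact = DocumentBuilderFactory.newInstance();
            fact.setNamespaceAware(true); // required for prefixed XPath steps to match
            Document doc = fact.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new QueryNamespaceContext());
            NodeList found = (NodeList) xpath.evaluate("//query:Company", doc, XPathConstants.NODESET);
            return found.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Query xmlns=\"http://ws.news.com/query\"><Filter><And>"
                + "<Section>Trade</Section><Company>SCB.BK</Company><Company>IBM.N</Company>"
                + "</And></Filter></Query>";
        System.out.println(countCompanies(xml)); // prints 2
    }
}

// Assumed shape of the namespace context: one prefix, one namespace.
class QueryNamespaceContext implements NamespaceContext {
    static final String QUERY_NS = "http://ws.news.com/query";
    static final String QUERY_PREFIX = "query";

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        return QUERY_PREFIX.equals(prefix) ? QUERY_NS : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        return QUERY_NS.equals(namespaceURI) ? QUERY_PREFIX : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        String prefix = getPrefix(namespaceURI);
        return prefix == null
                ? Collections.<String>emptyIterator()
                : Collections.singletonList(prefix).iterator();
    }
}
```

The request uses a default namespace rather than a prefix, but XPath 1.0 has no notion of a default namespace, so the expression needs an explicit prefix and a context that resolves it.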

Binder maintains the association between a JAXB object and its corresponding XML information set. You can bind the Query object to a DOM document, then modify the JAXB object and push the modification to the associated DOM with updateXML. Or you can modify the DOM tree and then synchronize the changes back to the JAXB object with updateJAXB.

This gives us the best of both worlds. It’s easy to get simple properties like Hit or TimeOut from the Query object, and I also have the option of using low-level XML tools like XPath to search the whole Query object graph for particular information.