XML Interoperability of Serialized Entities in Java and .NET

Abstract:

To exchange structured data directly between the two platforms, we need an easy way to take the marshalled (serialized) form of an object and turn it back into an object in memory. Both Java and .NET have standard ways of marshalling objects to XML. In the past I’ve found it a little frustrating to have to adopt large frameworks or external machinery just to move structured data between the JVM and the CLR. We should be able to bring these worlds together with a simple set of out-of-the-box (OOTB) idioms, plus a convenient one-liner for moving back and forth between object and stringified forms. For this I have created a minimal helper class for both languages that does the following:

  • Provides a common API between languages for moving between XML string and Objects (entities)
  • Provides adaptation capabilities between canonical XML representations for both Java’s JAXB and .NET’s XmlSerializer
  • Provides a façade to the underlying language and framework mechanics for going between representations
  • Implementation of SerializationHelper.java
  • Implementation of SerializationHelper.cs

The Need for Interoperable XML Representation of Entities in Java and .NET

Both the Java and .NET ecosystems provide many ways to serialize data: XML, JSON, binary, YAML, and so on. In this article I focus on the base case: the standard platform-level mechanisms for moving between XML and object graphs in memory. The web services stacks on both platforms are, of course, built on top of their respective XML binding or serialization standards. Those standards differ, however, in some slight but important ways. I am not trying to build a bulletproof, general-purpose adapter between the languages; I’ll leave that to the WS-* people. I do think, though, that XML marshalling with little or no additional framework or specialized stack is a common and often overlooked capability. Here are some scenarios where this kind of capability makes sense:

  • Intersystem Messaging
  • Transforming and Adapting Data Structures
  • Database stored and shared XML
  • Queue-based storage and shared XML
  • File-based storage and shared XML
  • Web Request/Response shared XML

The Specifications:

Java:

JAXB (Java Architecture for XML Binding)

JSR: 222

.NET:

XmlSerializer

Version >= .NET 2.0

First, we need to understand how the XML produced by JAXB and by XmlSerializer differs by default. To start, we’ll create the same entity in both Java and C#, then compare the output.

The entity: DataObject

.NET Entity Class:

[Serializable]
public class DataObject
{
   public string Id { get; set; }
   public string Name { get; set; }
   public bool Processed { get; set; }
}

Java Entity Class:

public class DataObject implements Serializable {

	private String id;
	private String name;
	private boolean processed = false;

	public String getId() {
		return id;
	}

	public void setId(String id) {
		this.id = id;
	}

	public String getName() {
		return name;
	}

	public void setName(String name) {
		this.name = name;
	}

	public boolean isProcessed() {
		return processed;
	}

	public void setProcessed(boolean processed) {
		this.processed = processed;
	}
}

Java Entity XML:

<DataObject>
  <id>ea9b96a6-1f8a-4563-9a15-b1ec0ea1bc34</id>
  <name>blah</name>
  <processed>false</processed>
</DataObject>

.NET Entity XML:

<DataObject xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Id>b3766011-a1ab-41bf-9ce2-8566fca5736f</Id>
  <Name>blah</Name>
  <Processed>false</Processed>
</DataObject>

The notable differences in the XML are these:

  • xsi and xsd namespaces are put in by .NET and not by Java
  • The casing of the element names is different. The names follow the property naming convention of the language the entity was written in:
    • Java: camelCase (lower-case first letter)
    • .NET: PascalCase
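
To make the two differences concrete, here is a small sketch that inspects both documents with the StAX parser that ships with the JDK. It involves no JAXB; the class name XmlShapeProbe and its helper methods are hypothetical and exist only for this illustration. It lists the element names as a reader sees them and counts the namespace declarations on the root element, which StAX reports separately from ordinary attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of SerializationHelper.
class XmlShapeProbe {

    static final String NET_XML =
        "<DataObject xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
        + " xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<Id>b3766011-a1ab-41bf-9ce2-8566fca5736f</Id>"
        + "<Name>blah</Name><Processed>false</Processed></DataObject>";

    static final String JAVA_XML =
        "<DataObject><id>ea9b96a6-1f8a-4563-9a15-b1ec0ea1bc34</id>"
        + "<name>blah</name><processed>false</processed></DataObject>";

    private static XMLStreamReader newReader(String xml) throws XMLStreamException {
        return XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    }

    /** Returns the local names of all elements, in document order. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = newReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return names;
    }

    /** Counts the namespace declarations (xmlns:...) on the root element. */
    static int rootNamespaceCount(String xml) {
        try {
            XMLStreamReader reader = newReader(xml);
            reader.nextTag(); // move from START_DOCUMENT to the root element
            int count = reader.getNamespaceCount();
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames(NET_XML));  // [DataObject, Id, Name, Processed]
        System.out.println(elementNames(JAVA_XML)); // [DataObject, id, name, processed]
        System.out.println(rootNamespaceCount(NET_XML) + " vs " + rootNamespaceCount(JAVA_XML));
    }
}
```

The extra xsi and xsd declarations are never bound to an element in this document, so the casing of the child element names is the difference that actually needs adapting.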

Let’s have a look at how we can use a class called SerializationHelper to round-trip objects to XML and back to objects. We want it to easily dehydrate (stringify) and rehydrate (objectify) data objects.

The implementation of this class in both Java and C# provides the following API:

String serialize(Object object)
Object deserialize(String str, Class klass)

This is useful for quickly converting objects to XML and vice versa.

I’ll walk you through how to use it with some tests.

Round Tripping (Java Usage):

@Test
public void can_round_trip_a_pojo_to_xml() throws Exception
{
	SerializationHelper helper = new SerializationHelper();
	DataObject obj = buildDataObject();

	String strObj = helper.serialize(obj);

	DataObject obj2 = (DataObject) helper.deserialize(strObj, DataObject.class);

	Assert.isTrue(obj.getId().equals(obj2.getId()));
	Assert.isTrue(obj.getName().equals(obj2.getName()));

}

Round Tripping (C# Usage):

[TestMethod]
public void can_round_trip_a_poco_to_xml()
{
    SerializationHelper helper = new SerializationHelper();
    DataObject obj = BuildDataObject();

    string strObj = helper.serialize(obj);

    DataObject obj2 = (DataObject)helper.deserialize(strObj, typeof(DataObject));

    Assert.IsTrue(obj.Id.Equals(obj2.Id));
    Assert.IsTrue(obj.Name.Equals(obj2.Name));
}

No problem. A simple single-line expression reverses the representation. Now let’s see if we can move the stringified representations between runtimes and turn them into objects there.

Adapting .NET XML to Java (Java Usage):

@Test
public void can_materialize_an_object_in_java_from_net_xml() throws Exception
{
	SerializationHelper helper = new SerializationHelper();

	String netStrObj = Files.toString(new File("DOTNET_SERIALIZED_DATAOBJECT.XML"), Charsets.UTF_8);

	DataObject obj2 = (DataObject) helper.deserialize(netStrObj, DataObject.class);

	Assert.isTrue(obj2.getName().equals("blah"));
}

Behind the scenes, the SerializationHelper uses a StreamReaderDelegate that intercepts the inbound XML and camel-cases the element names before JAXB attempts to bind them onto the DataObject instance.
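
The interception idea can be tried in isolation, without JAXB on the classpath. The sketch below is a simplified variant that uses only the JDK’s StAX classes; the names CamelCaseReaderDemo, LowerFirstReader and readFields are hypothetical. It wraps a plain XMLStreamReader in a StreamReaderDelegate that lower-cases the first letter of every element name except the root, then collects the child elements into a map so the renamed keys can be checked.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

// Hypothetical demo class; it illustrates the delegate pattern only.
class CamelCaseReaderDemo {

    /** Lower-cases the first letter of every element name except the root's. */
    static class LowerFirstReader extends StreamReaderDelegate {
        private final String rootName;

        LowerFirstReader(XMLStreamReader reader, String rootName) {
            super(reader);
            this.rootName = rootName;
        }

        @Override
        public String getLocalName() {
            String name = super.getLocalName();
            if (name.isEmpty() || name.equals(rootName)) {
                return name; // leave the root element name untouched
            }
            return (Character.toLowerCase(name.charAt(0)) + name.substring(1)).intern();
        }
    }

    /** Reads the direct text content of each child element, keyed by adapted name. */
    static Map<String, String> readFields(String xml, String rootName) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = new LowerFirstReader(
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml)),
                rootName);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName(); // already camel-cased
                    if (!name.equals(rootName)) {
                        fields.put(name, reader.getElementText());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        String netXml = "<DataObject><Id>42</Id><Name>blah</Name>"
                      + "<Processed>false</Processed></DataObject>";
        System.out.println(readFields(netXml, "DataObject"));
    }
}
```

Because the renaming happens inside getLocalName, whatever consumes the reader (JAXB in the helper, a simple loop here) never sees the PascalCase names.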

SerializationHelper.java:

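import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringWriter;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBElement;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.stream.util.StreamReaderDelegate;
import javax.xml.transform.stream.StreamResult;

public class SerializationHelper {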

	public String serialize(Object object) throws Exception{
		StringWriter resultWriter = new StringWriter();
		StreamResult result = new StreamResult( resultWriter );
		XMLStreamWriter xmlStreamWriter =
		           XMLOutputFactory.newInstance().createXMLStreamWriter(result);

		JAXBContext context = JAXBContext.newInstance(object.getClass());
		Marshaller marshaller = context.createMarshaller();
		marshaller.marshal(new JAXBElement(new QName(object.getClass().getSimpleName()), object.getClass(), object), xmlStreamWriter);

		String res = resultWriter.toString();
	    return res;
	}

	public Object deserialize(String str, Class klass) throws Exception{

        InputStream is = new ByteArrayInputStream(str.getBytes("UTF-8"));
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(is);
        reader = new CamelCaseTransfomingReaderDelegate(reader, klass);

		JAXBContext context = JAXBContext.newInstance(klass);
		Unmarshaller unmarshaller = context.createUnmarshaller();

		JAXBElement elem = unmarshaller.unmarshal(reader, klass);
		Object object = elem.getValue();

		return object;
	}

	//adapts to Java property naming style
	private static class CamelCaseTransfomingReaderDelegate extends StreamReaderDelegate {

		Class klass = null;

        public CamelCaseTransfomingReaderDelegate(XMLStreamReader xsr, Class klass) {
        	super(xsr);
        	this.klass = klass;
        }

        @Override
        public String getLocalName() {
            String nodeName = super.getLocalName();
            if (!nodeName.equals(klass.getSimpleName()))
            {
            	nodeName = nodeName.substring(0, 1).toLowerCase() +
            			   nodeName.substring(1, nodeName.length());
            }
            return nodeName.intern(); //NOTE: intern very important!..
        }
    }
}

Note that the deserialize method does a just-in-time fixup of the property-name XML nodes so that they meet the expectation of the default JAXB unmarshalling behavior (a camelCased field name).
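
The comment on intern() in the delegate suggests that the consumer of the reader may compare element names by reference rather than with equals(). Whether a given JAXB implementation does so is an assumption here, not something verified in this article. What is certain is the Java behavior underneath: a name built at run time is a different object from an equal string literal until it is interned. A minimal sketch, with the hypothetical class name InternDemo:

```java
// Hypothetical demo of why the delegate interns the adapted name.
class InternDemo {

    /** Same lower-casing expression as the delegate, but without intern(). */
    static String lowerFirst(String name) {
        return name.substring(0, 1).toLowerCase() + name.substring(1);
    }

    public static void main(String[] args) {
        String expected = "id";             // string literals are interned
        String computed = lowerFirst("Id"); // built at run time, a distinct object

        System.out.println(computed.equals(expected));     // true
        System.out.println(computed == expected);          // false
        System.out.println(computed.intern() == expected); // true
    }
}
```

If the names are matched by identity anywhere downstream, the non-interned string would silently fail to match, which would explain why the note calls intern() very important.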

Now let’s go the other way, from XML produced by the default JAXB serializer to .NET objects, using the same API. For this I’ll switch back to C#.

Adapting Java XML to .NET (C# Usage):

[TestMethod]
public void can_materialize_an_object_in_net_from_java_xml()
{
    string javaStrObj = File.ReadAllText("JAVA_SERIALIZED_DATAOBJECT.XML");

    SerializationHelper helper = new SerializationHelper();

    DataObject obj2 = (DataObject)helper.deserialize(javaStrObj, typeof(DataObject));

    Assert.IsTrue(obj2.Name.Equals("blah"));
}

In this case, I’m using a custom XmlReader that adapts the XML from Java-style property names to .NET-style ones. The pattern for adapting the XML into a consumable form is roughly the same in Java and .NET. This is the convenience and power that an intermediary stream reader gives you: it changes the node names as needed so that they bind to the correct property names. The nice thing is that this happens just-in-time, as the XML is being deserialized into a local object.

Here is the C# implementation of the same SerializationHelper API in .NET.

SerializationHelper.cs:

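using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class SerializationHelper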
{

    public string serialize(object obj)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            XmlSerializer xs = new XmlSerializer(obj.GetType());
            xs.Serialize(stream, obj);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public object deserialize(string serialized, Type type)
    {
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(serialized)))
        {
            using (var reader = new PascalCaseTransfomingReader(stream))
            {
                XmlSerializer xs = new XmlSerializer(type);
                return xs.Deserialize(reader);
            }
        }
    }

    private class PascalCaseTransfomingReader : XmlTextReader
    {
        public PascalCaseTransfomingReader(Stream input) : base(input) { }

        public override string this[string name]
        {
            get { return this[name, String.Empty]; }
        }

        public override string LocalName
        {
            get
            {
                // Capitalize first letter of elements and attributes.
                if (base.NodeType == XmlNodeType.Element ||
                    base.NodeType == XmlNodeType.EndElement ||
                    base.NodeType == XmlNodeType.Attribute)
                {
                    return base.NamespaceURI == "http://www.w3.org/2000/xmlns/" ?
                           base.LocalName : MakeFirstUpper(base.LocalName);
                }
                return base.LocalName;
            }
        }

        public override string Name
        {
            get
            {
                if (base.NamespaceURI == "http://www.w3.org/2000/xmlns/")
                    return base.Name;
                if (base.Name.IndexOf(":") == -1)
                    return MakeFirstUpper(base.Name);
                else
                {
                    // Turn local name into upper, not the prefix.
                    string name = base.Name.Substring(0, base.Name.IndexOf(":") + 1);
                    name += MakeFirstUpper(base.Name.Substring(base.Name.IndexOf(":") + 1));
                    return NameTable.Add(name);
                }
            }
        }

        private string MakeFirstUpper(string name)
        {
            if (name.Length == 0) return name;
            if (Char.IsUpper(name[0])) return name;
            if (name.Length == 1) return name.ToUpper();
            Char[] letters = name.ToCharArray();
            letters[0] = Char.ToUpper(letters[0]);
            return NameTable.Add(new string(letters));
        }

    }
}

I think it’s important to have a thorough understanding and good control of the basics of serialization. In some cases, we’re just consuming a serialized object from a message queue, a file, or a database. The ability to move entities between process and stack boundaries should be easy.

It should take only one line of code.
