How To Cleanly Integrate Java and Clojure In The Same Package

emacs-my-app.png

A hybrid Java/Clojure library designed to demonstrate how to set up Java interop using Maven

This is a complete Maven-first Clojure/Java interop application. It details how to create a Maven application, enrich it with Clojure code, call into Clojure from Java, and hook up the entry points for both Java and Clojure within the same project.

Further, it contains my starter examples of using the fantastic Incanter Statistical and Graphics Computing Library in Clojure. I include both a pom.xml and a project.clj showing how to pull in the dependencies.

The outcome is a consistent Maven-archetyped project wherein Maven and Leiningen play nicely together. This lets you apply the best of both tools. For the Emacs user, I include support for CIDER and Swank. nREPL by itself is present for general-purpose use as well.

Starting a project

Maven first

Create Maven project

Follow these steps:

mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

cd my-app

mvn package

java -cp target/my-app-1.0-SNAPSHOT.jar com.mycompany.app.App
Hello World

Add Clojure code

Create a clojure core file

mkdir -p src/main/clojure/com/mycompany/app

touch src/main/clojure/com/mycompany/app/core.clj

Give it some goodness…

(ns com.mycompany.app.core
  (:gen-class)
  (:use (incanter core stats charts)))

(defn -main [& args]
  (println "Hello Clojure!")
  (println "Java main called clojure function with args: "
           (apply str (interpose " " args))))


(defn run []
  (view (histogram (sample-normal 1000))))

Notice that we’ve added the Incanter library and created a run function that pops up a histogram of sample data.
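For readers coming from Java, the argument handling in -main above is just a space-delimited join. Here is a rough plain-Java equivalent (the class and method names are mine, for illustration only):

```java
// Plain-Java equivalent of (apply str (interpose " " args)) from -main:
// interpose a space between the args and concatenate them.
public class ArgsJoin {
    static String joinArgs(String... args) {
        return String.join(" ", args);
    }

    public static void main(String[] args) {
        System.out.println("Java main called clojure function with args: "
                + joinArgs("foo", "bar"));
    }
}
```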

Add dependencies to your pom.xml

  <dependencies>
    <dependency>
      <groupId>org.clojure</groupId>
      <artifactId>clojure</artifactId>
      <version>1.7.0</version>
    </dependency>
    <dependency>
      <groupId>org.clojure</groupId>
      <artifactId>clojure-contrib</artifactId>
      <version>1.2.0</version>
    </dependency>
    <dependency>
      <groupId>incanter</groupId>
      <artifactId>incanter</artifactId>
      <version>1.9.0</version>
    </dependency>
    <dependency>
      <groupId>org.clojure</groupId>
      <artifactId>tools.nrepl</artifactId>
      <version>0.2.10</version>
    </dependency>
   <!-- Pick your poison: Swank or CIDER. Just make sure the version of nREPL matches. -->
    <dependency>
      <groupId>cider</groupId>
      <artifactId>cider-nrepl</artifactId>
      <version>0.10.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>swank-clojure</groupId>
      <artifactId>swank-clojure</artifactId>
      <version>1.4.3</version>
    </dependency>
  </dependencies>

Java main class

Modify your Java main to call your Clojure main, as follows:

package com.mycompany.app;

// for clojure's api
import clojure.lang.IFn;
import clojure.java.api.Clojure;

// for my api
import clojure.lang.RT;

public class App
{
  public static void main( String[] args )
  {

    System.out.println("Hello Java!" );

    try {

      // running my clojure code
      RT.loadResourceScript("com/mycompany/app/core.clj");
      IFn main = RT.var("com.mycompany.app.core", "-main");
      main.applyTo(RT.seq(args));

      // running the clojure api
      IFn plus = Clojure.var("clojure.core", "+");
      System.out.println(plus.invoke(1, 2).toString());

    } catch(Exception e) {
      e.printStackTrace();
    }

  }
}

Maven plugins for building

You should add in these plugins to your pom.xml

  • Add the maven-assembly-plugin

    Create an Uberjar

    Bind the maven-assembly-plugin to the package phase. This creates a single jar that bundles all of the project's dependencies (an uberjar), suitable for stand-alone deployment.

      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
    
              <!-- use clojure main -->
              <!-- <mainClass>com.mycompany.app.core</mainClass> -->
    
              <!-- use java main -->
              <mainClass>com.mycompany.app.App</mainClass>
    
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    
  • Add the clojure-maven-plugin

    Add this plugin to give your project the mvn clojure:… commands

    A full list of these is posted later in this article.

      <plugin>
        <groupId>com.theoryinpractise</groupId>
        <artifactId>clojure-maven-plugin</artifactId>
        <version>1.7.1</version>
        <configuration>
          <mainClass>com.mycompany.app.core</mainClass>
        </configuration>
        <executions>
          <execution>
            <id>compile-clojure</id>
            <phase>compile</phase>
            <goals>
              <goal>compile</goal>
            </goals>
          </execution>
          <execution>
            <id>test-clojure</id>
            <phase>test</phase>
            <goals>
              <goal>test</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    
  • Add the maven-compiler-plugin

    Add Java version targeting

    This is always good to have if you are working against multiple versions of Java.

      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.3</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    
  • Add the exec-maven-plugin

    Add this plugin to give your project the mvn exec:… commands

    The exec-maven-plugin is nice for running your project from the command line, build scripts, or from inside an IDE.

      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.4.0</version>
        <executions>
          <execution>
            <goals>
              <goal>exec</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <mainClass>com.mycompany.app.App</mainClass>
        </configuration>
      </plugin>
    
  • Add the maven-jar-plugin

    With this plugin you can manipulate the manifest of your default package. In this case, I’m not adding a main class, because the uberjar above already serves as the stand-alone assembly with all the dependencies. However, I’ve included this section for cases where a non-stand-alone assembly is needed.

      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>2.6</version>
        <configuration>
          <archive>
            <manifest>
    
              <!-- use clojure main -->
              <!-- <mainClass>com.mycompany.app.core</mainClass> -->
    
              <!-- use java main -->
              <!-- <mainClass>com.mycompany.app.App</mainClass> -->
    
            </manifest>
          </archive>
        </configuration>
      </plugin>
    

Using Maven

  • building
    mvn package
    
    • Run from cli with
      • run from java entry point:
        java -cp target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar com.mycompany.app.App
        
      • Run from Clojure entry point:
        java -cp target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar com.mycompany.app.core
        
      • Run with entry point specified in uberjar MANIFEST.MF:
        java -jar target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar
        
    • Run from maven-exec-plugin
      • With plugin specified entry point:
        mvn exec:java
        
      • Specify your own entry point:
        • Java main
          mvn exec:java -Dexec.mainClass="com.mycompany.app.App"
          
        • Clojure main
          mvn exec:java -Dexec.mainClass="com.mycompany.app.core"
          
      • Feed args with this directive
        -Dexec.args="foo"
        
    • Run with clojure-maven-plugin
      • Clojure main
        mvn clojure:run
        
      • Clojure test
        • Add a test

          In order to be consistent with Maven's test location convention, create a path and Clojure test file like this:

          mkdir -p src/test/clojure/com/mycompany/app
          
          touch src/test/clojure/com/mycompany/app/core_test.clj
          

          Add the following content:

          (ns com.mycompany.app.core-test
            (:require [clojure.test :refer :all]
                      [com.mycompany.app.core :refer :all]))
          
          (deftest a-test
            (testing "Rigorous Test :-)"
              (is (= 0 0))))
          
      • Testing
        mvn clojure:test
        

        Or

        mvn clojure:test-with-junit
        
      • Available Maven clojure:… commands

        Here is the full set of options available from the clojure-maven-plugin:

        mvn ...
        
        clojure:add-source
        clojure:add-test-source
        clojure:compile
        clojure:test
        clojure:test-with-junit
        clojure:run
        clojure:repl
        clojure:nrepl
        clojure:swank
        clojure:nailgun
        clojure:gendoc
        clojure:autodoc
        clojure:marginalia
        

        See documentation:

        https://github.com/talios/clojure-maven-plugin

Add Leiningen support

  • Create project.clj

    Next to your pom.xml, create the Clojure project file

    touch project.clj
    

    Add this content

    (defproject my-app "1.0-SNAPSHOT"
     :description "My Incanter Project"
     :url "http://joelholder.com"
     :license {:name "Eclipse Public License"
               :url "http://www.eclipse.org/legal/epl-v10.html"}
     :dependencies [[org.clojure/clojure "1.7.0"]
                    [incanter "1.9.0"]]
     :main com.mycompany.app.core
     :source-paths ["src/main/clojure"]
     :java-source-paths ["src/main/java"]
     :test-paths ["src/test/clojure"]
     :resource-paths ["resources"]
     :aot :all)
    

    Note that we’ve set the source code and test paths for both Java and Clojure to match the Maven way of doing this.

    This gives us a consistent way of hooking into the code from both lein and mvn. Additionally, I’ve added the Incanter library here. The dependency should be expressed in the project file because, when we run nREPL from this directory, we want it to be available in our namespace, i.e. com.mycompany.app.core.

  • Run with Leiningen
    lein run
    
  • Test with Leiningen
    lein test
    

Running with org-babel

This blog entry was exported to HTML from the README.org of this project, which sits in the base directory of the project. By using it to describe the project and include executable blocks of code from the project itself, we’re able to provide working examples of how to use the library in its documentation. People can simply clone our project and try out the library by executing its documentation. Very nice.

Make sure you jack-in to cider first:

M-x cider-jack-in (I have it mapped to F9 in my Emacs)

Clojure code

The Clojure code block

#+begin_src clojure :tangle ./src/main/clojure/com/mycompany/app/core.clj :results output 
  (-main)
  (run)
#+end_src

Blocks are run in org-mode with C-c C-c

(-main)
(run)
Hello Clojure!
Java main called clojure function with args:

Note that we ran both our -main and run functions here. -main prints out the text shown above. The run function actually opens the Incanter image viewer and shows us a picture of our graph.

run.png

I have purposefully not invested in styling these graphs, in order to keep the code examples simple and focused; however, Incanter produces really beautiful output. Here’s a link to get you started:

http://incanter.org/

Playing with Incanter

(use '(incanter core charts pdf))
;;; Create the x and y data:
(def x-data [0.0 1.0 2.0 3.0 4.0 5.0])
(def y-data [2.3 9.0 2.6 3.1 8.1 4.5])
(def xy-line (xy-plot x-data y-data))
(view xy-line)
(save-pdf xy-line "img/incanter-xy-line.pdf")
(save xy-line "img/incanter-xy-line.png")

PNG

incanter-xy-line.png

Resources

Finally, here are some resources to move you along the journey. I drew on the links cited below, along with a night of hacking, to arrive at a nice, clean interop skeleton. Feel free to use my code, available here:

https://github.com/jclosure/my-app

For the eager, here is a link to my full pom:

https://github.com/jclosure/my-app/blob/master/pom.xml

Working with Apache Storm (multilang)

Starter project:

This incubator project from the Apache Foundation demos drinking from the Twitter hose with twitter4j and fishing in the streams with Java, Clojure, Python, and Ruby. Very cool and very powerful.

https://github.com/apache/storm/tree/master/examples/storm-starter

Testing Storm Topologies in Clojure:

http://www.pixelmachine.org/2011/12/17/Testing-Storm-Topologies.html

Vinyasa

Read this to give your Clojure workflow more flow:

https://github.com/zcaudate/vinyasa

Wrapping up

Clojure and Java are siblings on the JVM; they should play nicely together. Maven enables them to be easily mixed, whether in the same project or between projects. For a more in-depth example of creating and consuming libraries written in Clojure, see Michael Richards’ article detailing how to use Clojure to implement interfaces defined in Java. He uses a Factory Method to abstract the mechanics of getting the implementation back into Java, which makes the Clojure code virtually invisible from an API perspective. Very nice. Here’s the link:

http://michaelrkytch.github.io/programming/clojure/interop/2015/05/26/clj-interop-require.html

Happy hacking!..

The Mind Think and Practice of Designing Testable Routing

Integrating applications, data, and automated workflow presents a uniquely entrenched set of dependencies for software developers. The need for an endpoint to exist in order to integrate with it seems basic and undeniable. Additionally, the data that an endpoint produces or consumes needs to be available for the integration to work. It turns out that these constraints are in fact the goal of integration.

So, given that for integration the endpoints must be available and functioning in order to send and receive data, and that they are external both to one another and to the middleware, how then can we develop the integration code without these dependencies being in place at all times? Furthermore, how do we test messaging without having actual messages, and how do we isolate our routes from the things that they integrate?

These questions seem rhetorical, and yet there is a very real “chicken-and-egg” problem inherent in the general domain of integration: how do you develop an integration without the things you need to integrate?

The concepts for good developer practices are ubiquitous. Developing under test with a set of specifications describing integration logic is in reality no different from any other type of software development.

In this article, I am going to walk you through exactly how to do this in a simple series of concepts and steps that will enable you to fly as an integration developer. But first, we need to level our knowledge regarding the following concerns:

  1. Conceptual Ingredients
  2. Architectural Patterns
  3. Framework Facilities
  4. Testing Idioms

Designing applications to be testable

Applications in general must be designed with affordances that enable them to be tested automatically. The ease or difficulty of testing an application is most directly impacted by how it is architected. The best approach is to design the application under test from the very beginning, à la Test-Driven Development.

By writing a test for a required facility before creating the facility, we guarantee that it will be testable. This seems too obvious, and yet many developers skip straight to the implementation, writing the required code only to find that its design is difficult to cleanly test. Integration code is actually no different.

Just because it uses facilities external to itself doesn’t mean that it cannot be designed with clean seams and isolatable subsystems. To this end, TDD of integration code can yield excellent results. It lets you go fast, get it right quickly, and ensure that the design is not a snarled-up ball of mud, deeply coupled to a network, external servers, databases, the web, or other external things.

Finally, there is the advantage of having the tests run as part of your build and Continuous Integration pipeline. This is like having living armor: as you continue to develop and evolve the application, you have a set of automated alarm bells that go off when problems are unintentionally introduced. When tests fail, you are alerted immediately and are presented with the opportunity to fix the cause fast.

Much of what I’ve said here is just common wisdom related to the test-first mentality. The point, however, is that integration applications are just applications, in that they run, process, and change the state of data. This makes them ideal for TDD, which allows you to focus on calibrating the internal logic and design directly to your requirements.

The key to successful test-driven integration development and reaping its benefits is to understand what facilities exist within your application framework for design and testing. Spring has most of the architectural concerns already thought out for you. Let’s take a quick survey of what’s important.

Object lifecycles

Singleton – created only once in the application runtime
Prototype – a new instance is created every time the bean is requested
Scoped – new instances of a given class are created and destroyed based on an orthogonal scope, e.g. WebRequest

Managed versus Unmanaged objects

Managed Objects – objects whose lifecycle has been delegated to the application framework, e.g. Spring Framework. Note that the default lifecycle in Spring is Singleton. This is by design in order to encourage an architecture that instantiates an object facility once and then reuses it throughout the running of the app.

<?xml version="1.0" encoding="UTF-8"?>

<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:amq="http://activemq.apache.org/schema/core" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd http://activemq.apache.org/schema/core http://activemq.apache.org/schema/core/activemq-core.xsd">


   <bean id="xmlProcessingService" class="com.mycompany.integration.project.services.XmlProcessingService"></bean>
 

</beans>

Unmanaged Objects – objects whose lifecycle will be handled by the developer

import com.mycompany.integration.project.services.XmlProcessingService;

XmlProcessingService xmlProcessingService = new XmlProcessingService();

Patterns of intent are key to determining what lifecycle any given object in your domain should have. If there are class-level fields that are intended to change on a per usage basis, then that bean should be declared as a prototype. However, this does not mean that a bean with fields cannot be a singleton. It should be singleton only if those fields are set once and not changed again. This promotes clean state sharing across concurrency boundaries within the system. We don’t want to be changing state out from under an object shared between threads, nor do we want to have to deal with locking, mutexes, or potential race conditions. By simply being mindful of what the lifecycle of an object should be, we are safe to use it within that scope.
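The lifecycle distinction can be sketched in plain Java, independent of Spring. The Counter class below is a hypothetical stand-in for any bean with mutable state; the point is that a singleton accumulates state across callers, while a prototype starts fresh on every use:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stateful bean; AtomicInteger keeps increments thread-safe.
class Counter {
    private final AtomicInteger count = new AtomicInteger();

    int increment() {
        return count.incrementAndGet();
    }
}

public class LifecycleDemo {
    public static void main(String[] args) {
        // Singleton lifecycle: one shared instance, state accumulates across callers
        Counter singleton = new Counter();
        singleton.increment();
        singleton.increment();
        System.out.println("singleton count: " + singleton.increment()); // prints 3

        // Prototype lifecycle: a fresh instance per use, state starts clean
        Counter prototype = new Counter();
        System.out.println("prototype count: " + prototype.increment()); // prints 1
    }
}
```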

Domain Facilities

Domain Services
represent transactional concerns in an application
responsibilities are narrowly focused
promote composability and reusability
generally are managed objects (usually singletons)

Utilities
represent small reusable concerns
generally not managed objects

Routes
represent transition of data
represent transactional pipelines
a pipeline is composed of routing steps
routing steps are usages of facilities of the camel framework

The ingredients of a Camel Route

Now let’s have a look at our example application. We will use it to demonstrate the architecture. Note that testing is purposefully kept minimalistic in order to highlight the salient concepts and reduce distractions.

Example Application

To set up the context of our environment, let’s have a look at what a reasonable project structure looks like. In our case, we will be using a Maven-based project archetype, with the Fuse nature enabled in Eclipse. We’ll use good old JUnit for our work here. The good news is that Spring and Camel together provide excellent test support classes. These will make our testing convenient and straightforward.

Models

This app will integrate data related to Customers and Orders, using a sample Xml payload and corresponding model classes. These have been generated from an Xml Schema Definition (xsd) available here. I have pregenerated the model classes from the xsd.

xsd

They have been filed into a namespace dedicated to models.

package com.mycompany.integration.project.models;

Domain Services

There are 2 domain services.

XmlProcessingService – responsible for processing message body contents as XmlDocument.
ModelProcessingService – responsible for processing message body contents as an Object Graph.

They have been organized into a package dedicated to services.

package com.mycompany.integration.project.services;

Utilities

There are a handful of simple utilities in this application.

ModelBuilder – helper for serialization/deserialization concerns.
FileUtils – helper for interacting with filesystems.
Perf – helper for timing execution of code.
StackTraceInfo – helper for inferring the context of code execution at runtime.

And as you would guess, utilities have been organized into a package dedicated to them.

package com.mycompany.integration.project.utils;
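The Perf helper listed above is a good example of an unmanaged utility. As a sketch of what such a timing helper might look like (the timeMillis signature here is my own assumption, not the project's actual API):

```java
// Hypothetical sketch of a Perf-style timing utility: static, stateless,
// and unmanaged, so no framework wiring is needed to use or test it.
public class Perf {
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(() -> {
            // stand-in for the code being measured
            for (int i = 0; i < 1_000; i++) { }
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```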

The project structure looks like this.

The domain is organized according to the role of its classes and test classes follow the same convention

It is generally a good idea to organize your ApplicationContext into logically similar units as well. I prefer a role-based approach. When designing the composition model, I ask myself, “what kind of object is this that I’m declaring as a bean?”. The answer to this question usually yields a clear answer to the following concerns:

  1. Is there already a Spring context file for this kind of thing?
  2. Do I need to create a new context file for this kind of thing and if so what is the category of this kind of thing? The name of the context file should align to the category.

Have a look at the composition model of the context files in our project.

xml_imports
Spring context space is organized into bounded-contexts

Unit Testing

There is a world of highly opinionated approaches to testing, and they are all right. In this article, I want to focus on the nuts and bolts. Specifically, we are going to focus on unit testing the 3 categories of things we discussed earlier: Utilities, Domain Services, and Routes.

Testing Utilities

Example:

package com.mycompany.integration.project.tests.utils;

import org.junit.Test;

import com.mycompany.integration.project.models.*;
import com.mycompany.integration.project.utils.*;

public class ModelBuilderTests {

	String xmlFilePath = "src/exemplar/CustomersOrders-v.1.0.0.xml";

	@Test
	public void test_can_fast_deserialize() throws Exception {
		// arrange
		String xmlString = FileUtils.getFileString(xmlFilePath);

		// act
		CustomersOrders customersOrders = CustomersOrders.class
				.cast(ModelBuilder.fastDeserialize(xmlString,
						CustomersOrders.class));

		// assert
		assert (customersOrders != null);
	}

	@Test
	public void test_can_deserialize() throws Exception {
		// arrange
		String axmlString = FileUtils.getFileString(xmlFilePath);

		// act
		CustomersOrders customersOrders = CustomersOrders.class
				.cast(ModelBuilder.deserialize(axmlString,
						CustomersOrders.class));

		// assert
		assert (customersOrders != null);
	}
}

Testing utilities can be quite easy. Since the purpose of utilities is to function as stand-alone units of functionality, isolating them as a SUT (System Under Test) is not difficult. Standard black-box testing of inputs and outputs applies.

Testing Domain Services

Domain services usually represent transactional components. They are generally stateless and provide simple facilities to handle single or related sets of responsibilities. Collaborators are objects used by a given domain service. They are often other domain services; e.g. a PurchasingService collaborates with an OrderingService. When unit testing a single domain service, its collaborators are usually mocked to isolate the service as the SUT. We will look at mocking in detail later.
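The PurchasingService/OrderingService pairing mentioned above can be sketched with a hand-rolled mock. Both classes here are hypothetical stand-ins; in a real project you would likely use a mocking library such as Mockito:

```java
// Hypothetical collaborator interface; in a real app this would place orders.
interface OrderingService {
    boolean placeOrder(String item);
}

// Hypothetical domain service under test; its collaborator is injected.
class PurchasingService {
    private final OrderingService ordering;

    PurchasingService(OrderingService ordering) {
        this.ordering = ordering;
    }

    boolean purchase(String item) {
        return ordering.placeOrder(item);
    }
}

public class MockingSketch {
    public static void main(String[] args) {
        // Mock collaborator with a canned response: no real ordering backend needed
        OrderingService mock = item -> true;
        PurchasingService sut = new PurchasingService(mock);
        System.out.println(sut.purchase("widget")); // prints true
    }
}
```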

Example:

package com.mycompany.integration.project.tests.services;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.annotation.DirtiesContext.ClassMode;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

import com.mycompany.integration.project.services.XmlProcessingService;
import com.mycompany.integration.project.utils.FileUtils;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations={"classpath:META-INF/spring/domain.xml"})
@DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD)
public class XmlProcessingServiceTests {

	String xmlFilePath = "src/exemplar/CustomersOrders-v.1.0.0.xml";
	
	@Autowired
    private XmlProcessingService xmlProcessingService;
	
	@Test
	public void test_service_gets_injected() throws Exception {
		assert(xmlProcessingService != null);
	}

	@Test
	public void test_process_an_xml_transaction() throws Exception {
		// Arrange
		String xml = FileUtils.getFileString(xmlFilePath);
		
		// Act
		Boolean result = xmlProcessingService.processTransaction(xml);
		
		// Assert
		assert(result);
	}
}

Note that we are instructing JUnit to run the tests in this class with:

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations={"classpath:META-INF/spring/domain.xml"})
@DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD)

These annotations tell JUnit to load the ApplicationContext file domain.xml from the classpath and to reset the context to its default state after each run. The latter ensures that we don’t bleed state between tests. We don’t even need a setUp() method in the class because of these annotations.

Now, because this class is Spring aware and Spring managed, the XmlProcessingService instance gets automatically @Autowired into it, through another simple annotation. This facility allows complex composition of Domain Services and their Collaborators to be handled by your Spring configuration, while you the developer just pull in what you need and test it.

A final important distinction of domain services is that they should always be singleton managed objects. This means that our application framework (Spring) will be the creator and custodian of these objects. Whenever we need one, we’ll either ask the Spring ApplicationContext via service lookup or have it injected as a dependency. Composing an application in Spring is actually quite straightforward, but it is outside the scope of our present study. If you want to know more about it, read up on it here.

Testing Camel Routes Expressed In The Spring DSL

It’s important to remember when testing a CamelContext that the SUT we are interested in is the route. Well, what is the route composed of? It’s a set of steps, specifically step-wise treatments that are applied to messages as they traverse the route. Thus, what we are interested in testing is that these steps happen, that they are correctly sequenced, and that together they produce the desired result. The state that we examine in routing tests is the Message itself, and sometimes the Camel routing Exchange.

The killer trick for testing a CamelContext with routes declared in Spring DSL is this:

In src/test/resources, create a test-camel-context.xml that imports your real camel-context.xml from the classpath. Then, in the test-camel-context.xml file, add the InterceptSendToMockEndpointStrategy bean to “mock all endpoints”, and add a DirectComponent to override your activemq broker bean definition from the real camel-context.xml.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">

	<!-- the Camel route is defined in another XML file -->
	<import resource="classpath:META-INF/spring/camel-context.xml"></import>
	
	<!-- mock endpoint and override activemq bean with a noop stub (DirectComponent) -->
	<bean id="mockAllEndpoints" class="org.apache.camel.impl.InterceptSendToMockEndpointStrategy"></bean>
    <bean id="activemq" class="org.apache.camel.component.direct.DirectComponent"></bean>
</beans>

This in effect mocks all endpoints and provides a fake broker bean, so that you don’t have to have an instance of ActiveMQ actually available. This is what I mean by isolating your tests away from their integration points. This Spring ApplicationContext can now support unit testing in a vacuum.

Route Testing Examples

Before diving in and showing off the code, it’s worth taking a step back and asking, “what is it that we are going to want to know to determine if our routes are functioning as intended?”. The answer is again straight down the middle of standard unit testing’s goals. We have state that should be pushed through a pipeline of routing steps. It travels in the form of a Camel Message. The structure of a message is just like anything a courier would carry for us: it has a payload (the body) and metadata describing the payload in the form of the message’s headers.

{ 
  message:
    headers:
      status: "SUCCESS"
      foo: "bar"
    body:
      "Hi, I am a message payload..."
}
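To make the shape above concrete, here is a minimal plain-Java model of such a message. This is an illustration only, not Camel's actual org.apache.camel.Message API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal illustrative model of a Camel-style message: a body plus headers.
public class MessageSketch {
    static class Message {
        final Map<String, Object> headers = new HashMap<>();
        Object body;
    }

    public static void main(String[] args) {
        Message msg = new Message();
        msg.headers.put("status", "SUCCESS");
        msg.headers.put("foo", "bar");
        msg.body = "Hi, I am a message payload...";

        // Route tests assert expectations against exactly this kind of state
        System.out.println("SUCCESS".equals(msg.headers.get("status"))); // prints true
    }
}
```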

These are simple key-value data structures that help both the developer and Camel get the message to the right place and ensure that everything went as expected. Thus, the idea of having expectations about the correctness of a message at various stages in the route is at the heart of route testing. Luckily, Camel includes the CamelSpringTestSupport class, which gives us an API with expectation-based semantics. With it driving our routing tests, we simply tell the test framework what expectations we have about the message and then feed the route an example message we want it to process. If all of the expectations are met, the test passes. Otherwise, the framework tells us which ones were not met.

Example camel-context.xml:

<!-- Configures the Camel Context -->
<camelContext id="customers_and_orders_processing" xmlns="http://camel.apache.org/schema/spring">
	<route id="process_messages_as_models">
		<from uri="file:src/data1" />
		<process ref="customersOrdersModelProcessor" id="process_as_model" />
		<to uri="file:target/output1" />
	</route>
	<route id="process_messages_as_xml">
		<from uri="file:src/data2" />
		<process ref="customersOrdersXmlDocumentProcessor" id="process_as_xml" />
		<to uri="file:target/output2" />
	</route>
	<route id="process_jetty_messages_as_xml">
		<from uri="jetty:http://0.0.0.0:8888/myapp/myservice/?sessionSupport=true" />
		<process ref="customersOrdersXmlDocumentProcessor" id="process_jetty_input_as_xml" />
		<to uri="file:target/output3" />
		<transform>
			<simple>
				OK
			</simple>
		</transform>
	</route>
</camelContext>

Below is the test class for this Camel Context. Note the naming convention alignment of route Id to test method name. Observing this convention will make clear which tests are targeting which routes. This is nice because the test output in your build reports will be easy to read and understand.

Example Route 1: process_messages_as_models

process_messages_as_models_test() -> process_messages_as_models
It expects the route to run File to File through the Model Deserialization Processor.

Example Route 2: process_messages_as_xml

process_messages_as_xml_test() -> process_messages_as_xml
It expects the route to run File to File through the XmlDocument manipulation Processor.

Example Route 3: process_jetty_messages_as_xml

process_jetty_messages_as_xml_test() -> process_jetty_messages_as_xml
It expects the route to run Http to File through the XmlDocument manipulation Processor.

Example RouteTester.java:

package com.mycompany.integration.project.tests.routes;

import org.apache.camel.CamelContext;
import org.apache.camel.ConsumerTemplate;
import org.apache.camel.ProducerTemplate;
import org.apache.camel.component.mock.MockEndpoint;
import org.apache.camel.spring.SpringCamelContext;
import org.apache.camel.test.spring.CamelSpringTestSupport;
import org.junit.Before;
import org.junit.Ignore;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.support.AbstractXmlApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

import com.mycompany.integration.project.models.*;
import com.mycompany.integration.project.utils.FileUtils;
import com.mycompany.integration.project.utils.StackTraceInfo;

public class RouteTester extends CamelSpringTestSupport {

	public String testXmlContextPath = "/test-camel-context.xml";
	
	
	@Autowired
	protected CamelContext camelContext;

	@Override
	public String isMockEndpoints() {
		// override this method and return the pattern for which endpoints to
		// mock.
		// use * to indicate all
		return "*";
	}

	private ProducerTemplate producer;
	private ConsumerTemplate consumer;

	protected CamelContext getCamelContext() throws Exception {
		applicationContext = createApplicationContext();
		return SpringCamelContext.springCamelContext(applicationContext);
	}

	@Override
	protected AbstractXmlApplicationContext createApplicationContext() {
		return new ClassPathXmlApplicationContext(testXmlContextPath);
	}

	
	String inputExemplarFilePath = "src/exemplar/CustomersOrders-v.1.0.0.xml";
	String inputExemplar;
	
	String outputExemplarFilePath = "src/exemplar/CustomersOrders-v.1.0.0-transformed.xml";
	String outputExemplar;
	
	@Before
	public void setUp() throws Exception {

		System.out.println("Calling setUp");
		
		// load i/o exemplars
		inputExemplar = FileUtils.getFileString(inputExemplarFilePath);
		outputExemplar = FileUtils.getFileString(outputExemplarFilePath);
		
		camelContext = getCamelContext();

		camelContext.start();

		producer = camelContext.createProducerTemplate();
		consumer = camelContext.createConsumerTemplate();

	}

	@Test
	public void process_messages_as_models_test() throws Exception {

		System.out.println("Calling " + StackTraceInfo.getCurrentMethodName());
		
		String inputUri = "file:src/data1";
		String outputUri = "file:target/output1";
		
		// Set expectations
		int SEND_COUNT = 1;
		
		MockEndpoint mockOutput = camelContext.getEndpoint("mock:" + outputUri, MockEndpoint.class);
		//mockOutput.expectedBodiesReceived(outputExemplar);
		mockOutput.expectedHeaderReceived("status", "SUCCESS");
		mockOutput.expectedMessageCount(SEND_COUNT);
		

		// Perform Test

		for (int i = 0; i < SEND_COUNT; i++) {
			System.out.println("sending message.");

			// do send/receive, aka. run the route end-to-end
			producer.sendBody(inputUri, inputExemplar); 
			String output = consumer.receiveBody(outputUri, String.class); 
		}

	
		// ensure that the order got through to the mock endpoint
		mockOutput.setResultWaitTime(10000);
		mockOutput.assertIsSatisfied();
	}
	
	@Test
	public void process_messages_as_xml_test() throws Exception {

		System.out.println("Calling " + StackTraceInfo.getCurrentMethodName());
		
		// Set expectations
		int SEND_COUNT = 1;

		String inputUri = "file:src/data2";
		String outputUri = "file:target/output2";
		
		MockEndpoint mockOutput = camelContext.getEndpoint("mock:" + outputUri, MockEndpoint.class);
		//mockOutput.expectedBodiesReceived(outputExemplar);
		mockOutput.expectedHeaderReceived("status", "SUCCESS");
		mockOutput.expectedMessageCount(SEND_COUNT);

		// Perform Test

		for (int i = 0; i < SEND_COUNT; i++) {
			System.out.println("sending message.");

			// do send/receive, aka. run the route end-to-end
			producer.sendBody(inputUri, inputExemplar); 
			String output = consumer.receiveBody(outputUri, String.class); 
		}

		// ensure that the order got through to the mock endpoint
		mockOutput.setResultWaitTime(100000);
		mockOutput.assertIsSatisfied();
	}

	
	@Test
	public void process_jetty_messages_as_xml_test() throws Exception {

		System.out.println("Calling " + StackTraceInfo.getCurrentMethodName());
		
		// Set expectations
		int SEND_COUNT = 1;

		String inputUri = "jetty:http://0.0.0.0:8888/myapp/myservice/?sessionSupport=true";
		String outputUri = "file:target/output3";
		
		MockEndpoint mockOutput = camelContext.getEndpoint("mock:" + outputUri, MockEndpoint.class);
		mockOutput.expectedBodiesReceived(inputExemplar);
		mockOutput.expectedHeaderReceived("status", "SUCCESS");
		mockOutput.expectedMessageCount(SEND_COUNT);

		// Perform Test

		for (int i = 0; i < SEND_COUNT; i++) {
			System.out.println("sending message.");

			// do send/receive, aka. run the route end-to-end
			String result = producer.requestBody(inputUri, inputExemplar, String.class); 
			String output = consumer.receiveBody(outputUri, String.class); 
			
			assertEquals("OK", result);
		}

		// ensure that the order got through to the mock endpoint
		mockOutput.setResultWaitTime(10000);
		mockOutput.assertIsSatisfied();
	}
}

Discussion of the RouteTester class

The important thing to know about my RouteTester.java example is this: it extends CamelSpringTestSupport, which requires you to override its createApplicationContext() method. This method tells it where to find the Spring ApplicationContext you want it to test. In our case that context is a Camel Context, so I’ve set the path to “/test-camel-context.xml”. This boots up the Camel Context, and now we can run its routes from inside our @Test methods.

Furthermore, there is a VERY IMPORTANT and VERY SIMPLE convention you need to understand in order to use the mocking framework. It is this:

If you want to mock an endpoint, say “file:src/data1”, the syntax to mock it will look like this “mock:file:src/data1”.

That’s it… Once you understand this, you can see how easy it is to wrap your endpoints, whether they be producers or consumers, in mocks that prevent them from actually running or needing to be there, and instead give you programmatic access to both feed them and receive from them in your tests. The expectation-based semantics the mocks give you are pretty awesome; they just make sense to human brains.

For example in the process_jetty_messages_as_xml_test() test, we tell the “output routing step”, file:target/output3, to expect the following to be the case:

mockOutput.expectedBodiesReceived(inputExemplar);
mockOutput.expectedHeaderReceived("status", "SUCCESS");
mockOutput.expectedMessageCount(SEND_COUNT);

Basically, we told it to expect that the output matches the exemplar payload we want, that the “status” header is set to “SUCCESS”, and that the message count matches what we sent.

If any of these expectations are not met, then the test will fail and we’ll get a comprehensible message from the framework. Here’s an example of when the “status” header doesn’t meet the expectation.

Header with name “status” for message: 0. Expected: “SUCCESS” but was “BAD”

This is great! We know exactly why it failed and can quickly investigate, fix, and retest to ensure that we fix the bug.

We did not need a real Jetty server or an HTTP client to call it, nor did we need a filesystem to put the file into. More importantly, we found out that there was a problem while our hands were on the keyboard during dev time, not in production or during some heavy manual regression pass. Best of all, this test helped us now and will continue to run every time we build. This means Jenkins, or whatever other CI tooling you’re using, will also provide you with elegant, automatic, and perpetual test running, so that in the future, when you accidentally break something indirectly related to this route, perhaps a change to one of the collaborating domain services, you get an email telling you exactly what’s wrong.

So, we’ve gone through a lot of content here, touching on a number of topics that are all directly or indirectly related to getting a high-quality test capability in place for your integration code. With the CamelSpringTestSupport, the excellent Apache Camel framework showcases just how powerful testing within it can be. Given that Camel is a mature and widely-used integration solution, it has evolved to accommodate good testing practices. Developers only need to be aware and in command of the testing layer of the framework to put it to work in their daily practices.

In this article, I distilled some of the more important design and testing concepts and showed you how to apply them with the tooling. Mind you, these concepts are not specific to tools, platforms, or any particular flavor of integration technology. In fact, they are generally applicable to good development practice. Going forward, I would encourage you to investigate Domain-Driven Design, the Spring Framework, Apache Camel, and the Arrange-Act-Assert and behavior-based unit testing paradigms. The rabbit hole is deep, and one could spend a career learning and applying these things to their professional work. However, this is a well-understood area of software architecture, and the best stuff to be had is near the surface. My hope is that you’ve found this work insightful and that it finds its way into your thought process and keystrokes in the future.

If you would like to contact me with questions and discussion, I’m available via twitter and the comments of this article. The code in this article can be found here. If you’ve made it this far, you are ready to grab it and begin using these techniques yourself. Good luck and enjoy.

Surfing the ReferencePipeline in Java 8

Java 8 includes a new Stream Processing API. At its core is the ReferencePipeline class, which gives us a DSL for working with Streams in a functional style. You can get an instance of a ReferencePipeline flowing with a single expression.


IntStream.range(1, 50)
         .mapToObj(i -> new Thread(() -> System.out.println(i)))
         .forEach(thread -> thread.start());

The MapReduce DSL has the essential set of list processing operations such as querying, mapping, and iterating. The operations can be chained to provide the notion of linked inline strategies. When the stream of data is pulled through this pipeline, each data element passes through the operation chain.

Streams of data can be folded or reduced to a single value.

For example, here is how you can query a stream and accumulate the matches into a single value:


String value = Stream.of("foo", "bar", "baz", "quux")
                     .filter(s -> s.contains("a") || s.endsWith("x"))
                     .map(s -> s.toUpperCase())
                     .reduce("", (acc, s) -> acc + s); // identity form returns String, not Optional<String>

(Figure: pipeline flow and anatomy of the code sample)

These functions are monadic operations that enable us to query, transform, and project objects down the pipeline. The ReferencePipeline contains these functions. It is constructed to return copies of itself after each method invocation, giving us a chainable API. This API can be considered an architectural scaffolding for describing how to process Streams of data.

See here how a pipeline can take in lines of CSV and emit structured row objects in the form of Arrays:


//start with a stream

String csv = "a,b,c\n"
		   + "d,e,f\n"
		   + "g,h,i\n";

//process the stream

Stream<String[]> rows = new BufferedReader(
		new InputStreamReader(new ByteArrayInputStream(
				csv.getBytes("UTF-8"))))
		.lines()
		.skip(1)
		.map(line -> line.split(","));

//print it out

rows.forEach(row -> System.out.println(row[0]
									 + row[1]
									 + row[2]));

Notice that the stream processing routine is designed as a single expression. The .lines() method initiates the pipeline. Then we skip the header (it is not data), and project out an array of fields with the .map(). This is nice. We’re able to use a higher-order thought process when describing the algorithm to the JVM. Instead of diving into procedural control flow, we simply tell the system what we want it to do by using a step-wise functional prescription. This style of programming leads to more readable and comprehensible code. As you would expect, behind the scenes the compiler converts the lambda syntax into instances of Java’s functional interfaces, aka it gets “de-sugared” into Predicates (filters), Comparators (sorters), Functions (mappers), and BinaryOperators (accumulators). The lambda expressions make it so we do not have to get our hands dirty with the details of Java’s functional programming object model.
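To see what the compiler is hiding, here is a sketch of the earlier filter/map/reduce chain written against the functional interfaces directly, with the filter and the mapper as explicit anonymous classes rather than lambdas:

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Stream;

class Desugared {
    public static String run() {
        // The lambda s -> s.contains("a") de-sugars to a Predicate<String>...
        Predicate<String> hasA = new Predicate<String>() {
            @Override
            public boolean test(String s) { return s.contains("a"); }
        };
        // ...and s -> s.toUpperCase() to a Function<String, String>.
        Function<String, String> upper = new Function<String, String>() {
            @Override
            public String apply(String s) { return s.toUpperCase(); }
        };
        Optional<String> result = Stream.of("foo", "bar", "baz")
                                        .filter(hasA)
                                        .map(upper)
                                        .reduce((acc, s) -> acc + s);
        return result.orElse("");
    }
}
```

Both forms compile to the same pipeline; the lambda version simply saves the ceremony.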

Consider the following example.

I want to download historical stock pricing data from Yahoo and turn it into PricingRecord objects in my system.

Yahoo Finance API

Its data can be acquired with a simple HTTP GET call:


http://real-chart.finance.yahoo.com/table.csv?s=AMD&d=6&e=18&f=2014&g=d&a=7&b=9&c=1996&ignore=.csv

First note the shape of the CSV that Yahoo’s API returns:

Date Open High Low Close Volume Adj Close*
2008-12-29 15.98 16.16 15.98 16.08 158600 16.08

Our CSV looks like this:

2008-12-29,15.98,16.16,15.98,16.08,158600,16.08

Let’s compose a simple recipe that uses the Stream API to pull this data from the web and turn it into objects that we can work with in our system.

Our program should do these things:

  1. Take in a set of stock symbols in the form of a Stream of strings.
  2. Transform the Stream of symbols into a new Stream containing Streams of PricingRecords.
    • This will be done by making a remote call to Yahoo’s API.
    • The CSV returned should be mapped directly into PricingRecord objects.
  3. Since we’re pulling the data for multiple symbols, we should do the API calls for each concurrently. We can achieve this by parallelizing the flow of stream elements through the operation chain.

Here is the solution implemented as a single composite expression. Note how we acquire a Stream<String>, process it, and emit a Map<String, List<PricingRecord>>.

Using a ReferencePipeline as a builder:


//start with a stream

Stream<String> stockStream = Stream.of("AMD", "APL", "IBM", "GOOG");

//generate data with a stream processing algorithm

Map<String, List<PricingRecord>> pricingRecords = stockStream
		.parallel()
		.map(symbol -> {

			try {
				String csv = new JdkRequest(String.format("http://real-chart.finance.yahoo.com/table.csv?s=%s&d=6&e=18&f=2014&g=d&a=7&b=9&c=1996&ignore=.csv", symbol))
							 .header("Accept", "text/xml")
						     .fetch()
						     .body();

				return new BufferedReader(new InputStreamReader(new ByteArrayInputStream(csv.getBytes("UTF-8"))))
					   .lines()
					   .skip(1)
					   .map(line -> line.split(","))
					   .map(arr -> new PricingRecord(symbol, arr[0],arr[1], arr[2], arr[3], arr[4], arr[5], arr[6]));

			} catch (Exception e) {
				e.printStackTrace();
			}

			// on failure, emit an empty substream so the downstream
			// flatMap always receives a Stream<PricingRecord>
			return Stream.<PricingRecord>empty();
		})
		.flatMap(records -> records)
		.collect(Collectors.groupingBy(PricingRecord::getSymbol));
		.collect(Collectors.groupingBy(PricingRecord::getSymbol));


//print it out..

pricingRecords.forEach((symbol, records) -> System.out.println(String
		.format("Symbol: %s\n%s",
				symbol,
				records.stream()
					   .parallel()
					   .map(record -> record.toString())
					   .reduce((x, y) -> x + y))));

The elegance of this solution may not at first be apparent, but note a few key characteristics that emerge from a closer look. Notice that we get concurrency for free with the .parallel(). We do this near the beginning of the pipeline, in order to feed the downstream .map() function in a multithreaded manner.

Notice also that we’re projecting a 2-dimensional Stream out of the .map() function: the top-level stream contains substreams, so the composite type it returns is actually a Stream<Stream<PricingRecord>>. This is a common scenario in stream-based programming, and the way to unpack and harvest the substreams is the .flatMap() function, which flattens a Stream<Stream<T>> into a Stream<T>. (The .substream(n) method seen in early JDK 8 builds was dropped before the final release; .skip() and .limit() cover that use case.) In my example, we use .flatMap() to merge each symbol’s substream of PricingRecord objects into one flat stream.
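To make the flattening concrete outside of the stock example, here is a minimal sketch of unpacking a Stream<Stream<T>> with .flatMap():

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlattenDemo {
    public static List<Integer> flatten() {
        // A 2-dimensional stream: each element is itself a Stream<Integer>.
        Stream<Stream<Integer>> nested = Stream.of(
                Stream.of(1, 2),
                Stream.of(3, 4));
        // flatMap unpacks each substream into the top-level stream.
        return nested.flatMap(s -> s).collect(Collectors.toList());
    }
}
```

The result is the flat list [1, 2, 3, 4] rather than two nested streams.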

Finally, look at the last expression in the chain, the .collect(). To collect the stream is to terminate it, which means to enumerate its full contents. Basically this means loading them into memory, though there are many ways in which you might want to do this. For this we have what are called Collectors; they allow us to describe how we want the contents of the stream organized when they are pulled out.

Usage Tip:

If you want a flat list use:
Collectors.toList // ArrayList

If you want a map or dictionary-like structure, use:
Collectors.groupingBy // Map

The .groupingBy() collector that I use above allows us to aggregate our stream into groups based on a common characteristic. The Maps that .groupingBy() projects are very useful because you can represent both the input to the function and its output as a “pair”, e.g. AbstractMap.SimpleEntry (a key-value pair).
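Here is a small, self-contained example of the two Collectors from the tip above, grouping by a hypothetical characteristic (string length):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectDemo {
    // groupingBy aggregates elements that share a characteristic (here: length)
    public static Map<Integer, List<String>> byLength() {
        return Stream.of("foo", "bar", "quux")
                     .collect(Collectors.groupingBy(String::length));
    }

    // toList simply enumerates the stream into a flat List
    public static List<String> asList() {
        return Stream.of("foo", "bar").collect(Collectors.toList());
    }
}
```

byLength() yields a Map with the three-letter words under key 3 and "quux" under key 4, while asList() gives you a plain flat list.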

For completeness I should provide the PricingRecord class:

public class PricingRecord {
	private String symbol;
	private String date;
	private double open;
	private double high;
	private double low;
	private double close;
	private int volume;
	private double adjustedClose;

	public PricingRecord (String symbol, String date, String open, String high, String low, String close, String volume, String adjustedClose){
		this.setSymbol(symbol);
		this.date = date;
		this.open = Double.parseDouble(open);
		this.high = Double.parseDouble(high);
		this.low = Double.parseDouble(low);
		this.close = Double.parseDouble(close);
		this.volume = Integer.parseInt(volume);
		this.adjustedClose = Double.parseDouble(adjustedClose);
	}

	public String toString(){
		return String.format("symbol=%s,date=%s,open=%.2f,close=%.2f,high=%.2f,low=%.2f,volume=%s,adjustedClose=%.2f", this.symbol, this.date, this.open, this.close, this.high, this.low, this.volume, this.adjustedClose);

	}

	public String getDate() {
		return date;
	}
	public void setDate(String date) {
		this.date = date;
	}
	public double getOpen() {
		return open;
	}
	public void setOpen(double open) {
		this.open = open;
	}
	public double getHigh() {
		return high;
	}
	public void setHigh(double high) {
		this.high = high;
	}
	public double getLow() {
		return low;
	}
	public void setLow(double low) {
		this.low = low;
	}
	public double getClose() {
		return close;
	}
	public void setClose(double close) {
		this.close = close;
	}
	public int getVolume() {
		return volume;
	}
	public void setVolume(int volume) {
		this.volume = volume;
	}
	public double getAdjustedClose() {
		return adjustedClose;
	}
	public void setAdjustedClose(double adjustedClose) {
		this.adjustedClose = adjustedClose;
	}
	public String getSymbol() {
		return symbol;
	}
	public void setSymbol(String symbol) {
		this.symbol = symbol;
	}

}

This is a simple entity class. It just serves to represent our logical notion of a PricingRecord, however I want you to notice the .toString(). It simply prints out the object, but notice in the last expression of the code example how we’re able to print a concatenation of these objects out to the console as a String. The .reduce() function allows us to accumulate the result of each symbol’s data and print it out in a logically separated and intelligible way. Reducers are what I like to think of as “distillers” of information in the streams we process. In this way, they can be made to aggregate and refine information from the stream as it passes through the pipeline.

Finally, in order to run the code example as is, you’ll need to pull in the jcabi-http library to get the nice fluent web api that I’m using. Add this to your pom.xml and resolve the imports.

<dependency>
	<groupId>com.jcabi</groupId>
	<artifactId>jcabi-http</artifactId>
	<version>1.8</version>
</dependency>

The introduction of this functional programming model into Java is a leap forward for the platform. It signals a shift not only in the way we write code in the language, but also in how we think about solving problems with it. So-called higher-order problem solving requires higher-order tools. This technology is compelling because it gives us these high-level tools and a pleasant syntax that lets us focus on the problem to be solved, not on the cruft it takes to solve it. This makes life better for everyone.

NTLM Authentication in Java with JCifs

In enterprise software development contexts, one of the frequent needs we encounter is working with FileSystems remotely via CIFS, sometimes referred to as SMB.  If you are using Java in these cases, you’ll want JCifs, a pure Java CIFS implementation.  In this post, I’ll show you how to remotely connect to a Windows share on an Active Directory domain and read/write a file.

In your pom.xml place this dependency:

<dependency>
    <groupId>jcifs</groupId>
    <artifactId>jcifs</artifactId>
    <version>1.3.17</version>
</dependency>

Here is a simple class with a main, you can run to see how it works:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.UnknownHostException;

import jcifs.UniAddress;
import jcifs.smb.NtlmPasswordAuthentication;
import jcifs.smb.SmbException;
import jcifs.smb.SmbFile;
import jcifs.smb.SmbFileInputStream;
import jcifs.smb.SmbFileOutputStream;
import jcifs.smb.SmbSession;

public class Program {

	public static void main(String[] args) throws SmbException, UnknownHostException, Exception {
	    
        final UniAddress domainController = UniAddress.getByName("DOMAINCONTROLLERHOSTNAME");
		 
	    final NtlmPasswordAuthentication credentials = new NtlmPasswordAuthentication("DOMAIN.LOCAL", "USERNAME", "SECRET");
	 
	    SmbSession.logon(domainController, credentials);
	    
	    SmbFile smbFile = new SmbFile("smb://localhost/share/foo.txt", credentials);
	    
	    //write to file; close the stream so the bytes are flushed to the share
	    try (SmbFileOutputStream out = new SmbFileOutputStream(smbFile)) {
	        out.write("testing....and writing to a file".getBytes());
	    }
	    
	    //read from file
	    String contents = readFileContents(smbFile);
	    
	    System.out.println(contents);

	}

	private static String readFileContents(SmbFile sFile) throws IOException {

		BufferedReader reader = new BufferedReader(
				new InputStreamReader(new SmbFileInputStream(sFile)));

		StringBuilder builder = new StringBuilder();
		
		String lineReader = null;
		while ((lineReader = reader.readLine()) != null) {
			builder.append(lineReader).append("\n");
		}
		return builder.toString();
	}

}

As you can see, it’s quite trivial to reach out across your network and interact with files and directories in Windows/Samba shares. Being able to authenticate via NTLM is convenient and tidy for this purpose, and the filesystem API is straightforward and powerful.

Enjoy the power..

XML Interoperability of Serialized Entities in Java and .NET

Abstract:

In order to exchange structured data directly between the platforms, we must be able to easily take the marshalled or serialized definition of the object and turn it into an object in memory.  There are standard ways of marshalling of objects to XML in both Java and .NET.  I have found it a little frustrating in the past when I’ve had to adopt large frameworks or external machinery in order to easily move structured data between the JVM and CLR.   It seems that we should be able to bring these worlds together in a simple set of OOTB idioms, while providing a convenient way (one liner) to move back and forth between object and stringified forms.   For this I have created a minimal helper class for both languages that does the following:

  • Provides a common API between languages for moving between XML string and Objects (entities)
  • Provides adaptation capabilities between canonical XML representations for both Java’s JAXB and .NET’s XmlSerializer
  • Provides a façade to the underlying language and framework mechanics for going between representations
  • Implementation of SerializationHelper.java
  • Implementation of SerializationHelper.cs

The Need for Interoperable Xml Representation of Entities in Java and .NET

Both the Java and .NET ecosystems provide many ways to work with XML, JSON, binary, YAML, etc. serialization.  In this article I’m focused on the base case: the standard platform-level mechanisms for moving between XML and object graphs in memory.  The Web Services stacks in both platforms are of course built on top of their respective XML binding or serialization standards.  The standards, however, differ in some slight but important ways.  Here I do not seek to build a bulletproof, general-purpose adapter between languages; I’ll leave that to the WS-* people.  However, I think there is a common and often overlooked ability to do XML marshalling with little to no additional framework or specialized stack.  Here are some scenarios that make sense with this kind of capability:

  • Intersystem Messaging
  • Transforming and Adapting Data Structures
  • Database stored and shared XML
  • Queue-based storage and shared XML
  • File-based storage and shared XML
  • Web Request/Response shared XML

The Specifications:

Java:

JAXB (Java XML Binding)

JSR: 222

.NET

XmlSerializer

Version >= .NET 2.0

First, we need to understand the default differences between the XML output by JAXB and XmlSerializer. To start we’ll create the same entity in both Java and C#. Then we can compare them.

The entity: DataObject

.NET Entity Class:

[Serializable]
public class DataObject
{
   public string Id { get; set; }
   public string Name { get; set; }
   public bool Processed { get; set; }
}

Java Entity Class:

public class DataObject implements Serializable {

	private String id;
	private String name;
	private boolean processed = false;

	public String getId() {
		return id;
	}

	public void setId(String id) {
		this.id = id;
	}

	public String getName() {
		return name;
	}

	public void setName(String name) {
		this.name = name;
	}

	public boolean isProcessed() {
		return processed;
	}

	public void setProcessed(boolean processed) {
		this.processed = processed;
	}
}

Java Entity XML:

<DataObject>
  <id>ea9b96a6-1f8a-4563-9a15-b1ec0ea1bc34</id>
  <name>blah</name>
  <processed>false</processed>
</DataObject>

.NET Entity XML:

<DataObject xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Id>b3766011-a1ab-41bf-9ce2-8566fca5736f</Id>
  <Name>blah</Name>
  <Processed>false</Processed>
</DataObject>

The notable differences in the XML are these:

  • xsi and xsd namespaces are put in by .NET and not by Java
  • The casing of the element names are different.  In fact, they follow the style convention used to create the entity.  The property naming styles between the languages are as follows:
    • Java: CamelCase
    • .NET: PascalCase
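The casing mismatch itself is easy to bridge. Here is a sketch of the adaptation rule (a hypothetical helper, not part of either framework; the reader delegate shown later applies the same one-character transformation inline):

```java
class NameStyle {
    // PascalCase (".NET style") -> camelCase ("Java style"): lower the first letter
    public static String toCamelCase(String name) {
        if (name == null || name.isEmpty()) return name;
        return name.substring(0, 1).toLowerCase() + name.substring(1);
    }

    // camelCase -> PascalCase: upper the first letter
    public static String toPascalCase(String name) {
        if (name == null || name.isEmpty()) return name;
        return name.substring(0, 1).toUpperCase() + name.substring(1);
    }
}
```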

Let’s have a look at how we can use a class called SerializationHelper to round-trip objects to XML and back to objects. We want it to easily dehydrate (stringify) and rehydrate (objectify) data objects.

The implementation of this class in both Java and C# provides the following api:

String serialize(Object object)
Object deserialize(String str, Class klass)

This is useful for quickly reversing objects to XML and vice versa.

I’ll walk you through how to use it with some tests.

Round Tripping (Java Usage):

@Test
public void can_round_trip_a_pojo_to_xml() throws Exception
{
	SerializationHelper helper = new SerializationHelper();
	DataObject obj = buildDataObject();

	String strObj = helper.serialize(obj);

	DataObject obj2 = (DataObject) helper.deserialize(strObj, DataObject.class);

	Assert.isTrue(obj.getId().equals(obj2.getId()));
	Assert.isTrue(obj.getName().equals(obj2.getName()));

}

Round Tripping (C# Usage):

[TestMethod]
public void can_round_trip_a_poco_to_xml()
{
    SerializationHelper helper = new SerializationHelper();
    DataObject obj = BuildDataObject();

    string strObj = helper.serialize(obj);

    DataObject obj2 = (DataObject)helper.deserialize(strObj, typeof(DataObject));

    Assert.IsTrue(obj.Id.Equals(obj2.Id));
    Assert.IsTrue(obj.Name.Equals(obj2.Name));
}

No problem. A simple single-line expression reverses the representation. Now let’s see if we can move the stringified representations between runtimes to become objects.

Adapting .NET XML to Java (Java Usage):

@Test
public void can_materialize_an_object_in_java_from_net_xml() throws Exception
{
	SerializationHelper helper = new SerializationHelper();

	String netStrObj = Files.toString(new File("DOTNET_SERIALIZED_DATAOBJECT.XML"), Charsets.UTF_8);

	DataObject obj2 = (DataObject) helper.deserialize(netStrObj, DataObject.class);

	Assert.isTrue(obj2.getName().equals("blah"));
}

Behind the scenes, a StreamReaderDelegate under the hood of the SerializationHelper intercepts the inbound XML and camel-cases the names before it attempts to bind them onto the DataObject instance.

SerializationHelper.java:

public class SerializationHelper {

	public String serialize(Object object) throws Exception{
		StringWriter resultWriter = new StringWriter();
		StreamResult result = new StreamResult( resultWriter );
		XMLStreamWriter xmlStreamWriter =
		           XMLOutputFactory.newInstance().createXMLStreamWriter(result);

		JAXBContext context = JAXBContext.newInstance(object.getClass());
		Marshaller marshaller = context.createMarshaller();
		marshaller.marshal(new JAXBElement(new QName(object.getClass().getSimpleName()), object.getClass(), object), xmlStreamWriter);

		String res = resultWriter.toString();
	    return res;
	}

	public Object deserialize(String str, Class klass) throws Exception{

        InputStream is = new ByteArrayInputStream(str.getBytes("UTF-8"));
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(is);
        reader = new CamelCaseTransformingReaderDelegate(reader, klass);

		JAXBContext context = JAXBContext.newInstance(klass);
		Unmarshaller unmarshaller = context.createUnmarshaller();

		JAXBElement elem = unmarshaller.unmarshal(reader, klass);
		Object object = elem.getValue();

		return object;
	}

	//adapts to Java property naming style
	private static class CamelCaseTransformingReaderDelegate extends StreamReaderDelegate {

		Class klass = null;

        public CamelCaseTransformingReaderDelegate(XMLStreamReader xsr, Class klass) {
        	super(xsr);
        	this.klass = klass;
        }

        @Override
        public String getLocalName() {
            String nodeName = super.getLocalName();
            if (!nodeName.equals(klass.getSimpleName()))
            {
            	nodeName = nodeName.substring(0, 1).toLowerCase() +
            			   nodeName.substring(1, nodeName.length());
            }
            return nodeName.intern(); // NOTE: intern matters; the unmarshaller may compare element names by reference
        }
    }
}

Note that the deserialize method does a just-in-time fixup of the property-name XML nodes so they meet the expectation (a camelCased field name) of the default JAXB unmarshalling behavior.
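To see the fixup in isolation, here is a minimal, self-contained sketch of the same StreamReaderDelegate trick over a hand-written .NET-style payload. The XML literal, class names, and the `camelCase` helper are illustrative, not the actual serialized files from the tests; only the JDK's built-in StAX API is used, so no JAXB dependency is needed to run it.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

public class CamelCaseDemo {

    // Lower-cases the first letter of every element name except the root
    // (class-named) element: the same rule the delegate above applies.
    static String camelCase(String nodeName, String rootName) {
        if (nodeName.equals(rootName)) {
            return nodeName;
        }
        return (nodeName.substring(0, 1).toLowerCase() + nodeName.substring(1)).intern();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical .NET-style payload: PascalCase property elements.
        String netXml = "<DataObject><Name>blah</Name></DataObject>";

        XMLStreamReader raw = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(netXml));

        XMLStreamReader reader = new StreamReaderDelegate(raw) {
            @Override
            public String getLocalName() {
                return camelCase(super.getLocalName(), "DataObject");
            }
        };

        // Walk the stream: a consumer like JAXB now sees <name> instead of <Name>.
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                System.out.println(reader.getLocalName());
            }
        }
        // prints:
        // DataObject
        // name
    }
}
```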

Now let's go from XML produced by the default JAXB serializer to .NET objects with the same API. For this I'll switch back to C#.

Adapting Java XML to .NET (C# Usage):

[TestMethod]
public void can_materialize_an_object_in_net_from_java_xml()
{
    string javaStrObj = File.ReadAllText("JAVA_SERIALIZED_DATAOBJECT.XML");

    SerializationHelper helper = new SerializationHelper();

    DataObject obj2 = (DataObject)helper.deserialize(javaStrObj, typeof(DataObject));

    Assert.IsTrue(obj2.Name.Equals("blah"));
}

In this case, I’m using a custom XmlReader that adapts the XML from Java-style property names to .NET-style ones. The pattern for adapting the XML into a consumable form is roughly the same in Java and .NET; this is the convenience and power an intermediary stream reader gives you. It rewrites the node names just in time, as the XML is being deserialized into a local object, so that they bind to the correct property names.

Here is the C# implementation of the same SerializationHelper API in .NET.

SerializationHelper.cs:

public class SerializationHelper
{

    public string serialize(object obj)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            XmlSerializer xs = new XmlSerializer(obj.GetType());
            xs.Serialize(stream, obj);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public object deserialize(string serialized, Type type)
    {
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(serialized)))
        {
            using (var reader = new PascalCaseTransfomingReader(stream))
            {
                XmlSerializer xs = new XmlSerializer(type);
                return xs.Deserialize(reader);
            }
        }
    }

    private class PascalCaseTransfomingReader : XmlTextReader
    {
        public PascalCaseTransfomingReader(Stream input) : base(input) { }

        public override string this[string name]
        {
            get { return this[name, String.Empty]; }
        }

        public override string LocalName
        {
            get
            {
                // Capitalize first letter of elements and attributes.
                if (base.NodeType == XmlNodeType.Element ||
                    base.NodeType == XmlNodeType.EndElement ||
                    base.NodeType == XmlNodeType.Attribute)
                {
                    return base.NamespaceURI == "http://www.w3.org/2000/xmlns/" ?
                           base.LocalName : MakeFirstUpper(base.LocalName);
                }
                return base.LocalName;
            }
        }

        public override string Name
        {
            get
            {
                if (base.NamespaceURI == "http://www.w3.org/2000/xmlns/")
                    return base.Name;
                if (base.Name.IndexOf(":") == -1)
                    return MakeFirstUpper(base.Name);
                else
                {
                    // Turn local name into upper, not the prefix.
                    string name = base.Name.Substring(0, base.Name.IndexOf(":") + 1);
                    name += MakeFirstUpper(base.Name.Substring(base.Name.IndexOf(":") + 1));
                    return NameTable.Add(name);
                }
            }
        }

        private string MakeFirstUpper(string name)
        {
            if (name.Length == 0) return name;
            if (Char.IsUpper(name[0])) return name;
            if (name.Length == 1) return name.ToUpper();
            Char[] letters = name.ToCharArray();
            letters[0] = Char.ToUpper(letters[0]);
            return NameTable.Add(new string(letters));
        }

    }
}

I think it’s important to have a thorough understanding of, and good control over, the basics of serialization. In many cases we’re just consuming a serialized object from a message queue, a file, or a database. Moving entities across process and technology boundaries should be easy.

It should take only one line of code.
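The article's SerializationHelper isn't reproduced here; as a stand-in to illustrate that one-line shape with nothing but the JDK, here is a sketch using java.beans.XMLEncoder/XMLDecoder (a different serializer than JAXB, used purely to show the principle; class and method names are hypothetical):

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class OneLinerDemo {

    // Serialize any bean-style object graph to an XML string.
    static String serialize(Object o) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder enc = new XMLEncoder(out)) {
            enc.writeObject(o);
        }
        return out.toString();
    }

    // The one-liner: stream in, object out.
    static Object deserialize(String xml) {
        return new XMLDecoder(new ByteArrayInputStream(xml.getBytes())).readObject();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("blah");

        String xml = serialize(names);

        @SuppressWarnings("unchecked")
        List<String> roundTripped = (List<String>) deserialize(xml);
        System.out.println(roundTripped.get(0)); // prints: blah
    }
}
```

The point is the shape of the API, not the serializer: once the naming-convention mismatch is handled by an adapting reader, materializing an object from a string really is a single expression.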