Saturday, 9 October 2010

Improving JarClassLoader

Making a Java application easy to run and platform independent can still turn out to be a challenge. I recently wrote a graphical console application for a students' lab experiment. What makes it special is that it requires access to the serial interface and therefore native libraries. Yet it has to run in the lab's environment and on the students' PCs. So I have to support at least 32-bit and 64-bit Linux and Windows (XP and 7).

Obviously, I chose Java Web Start for a start and happily found that it supports embedding native libraries. You can even choose the libraries to link depending on the operating system (system property "os.name") and architecture ("os.arch"). But then additional requirements came up. For one experiment it turned out that the application must be started with varying arguments from a batch file. Of course there is javaws, but it does not support passing arbitrary command line arguments to the application. And writing JNLP files with embedded arguments for all required combinations soon becomes a maintenance problem. This problem is aggravated because we have one lab room without a network connection. This requires another set of JNLP files using local files instead of network resources.

The solution would be to package everything in a single JAR that can be used both for Web Start and for command line start with "java -jar ...". The latter way to start an application has the old and well-known problem that you cannot embed jarred libraries in the JAR. You have to unpack them, and this may lead to unexpected effects if the libraries use information from META-INF. Worse, you cannot simply pack native libraries in the JAR. But there is help: the JarClassLoader. While there are other utilities for the job, this is the only one that claims to support loading native libraries from the JAR. It does not claim to support Web Start, though. And that's where the first issue came up.

The JarClassLoader needs the location of the JAR in order to search through it. It uses this code to find the location:
    ProtectionDomain pd = getClass().getProtectionDomain();
    CodeSource cs = pd.getCodeSource();
    URL urlTopJAR = cs.getLocation();
Provided that the JarClassLoader.class is packed in the JAR, this returns the URL of the JAR. JarClassLoader assumes this to be a file and proceeds accordingly. But with Web Start it is the network resource (something like "http://some.host/my.jar"). (With older versions of Web Start [pre JDK6] it can also be the URL of the cached file as described here.) So I changed the way this URL is handled to:
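    // getProtocol() returns the scheme without the colon, e.g. "jar" or "http"
    if (!urlTopJAR.getProtocol().equals("jar")) {
        urlTopJAR = new URL("jar:" + urlTopJAR.toString() + "!/");
    }
    ...
    // openConnection() on a "jar:" URL yields a JarURLConnection
    loadJar(((JarURLConnection)urlTopJAR.openConnection()).getJarFile());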
Now JarClassLoader works both for Web Start and for command line start.
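
As a rough, self-contained illustration of the URL handling described above (not the actual JarClassLoader code), the following sketch wraps a plain location in a "jar:" URL. The class name JarUrlSketch, the helper toJarUrl and the example locations are made up for this illustration. Note that URL.getProtocol() returns the scheme without a trailing colon.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class, not part of JarClassLoader.
final class JarUrlSketch {

    /** Wraps a plain JAR location in a "jar:" URL; leaves "jar:" URLs untouched. */
    public static URL toJarUrl(URL location) {
        // getProtocol() yields "jar", "http", "file", ... (no colon).
        if ("jar".equals(location.getProtocol())) {
            return location;
        }
        try {
            // "!/" separates the JAR location from the entry path inside it.
            return new URL("jar:" + location.toExternalForm() + "!/");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot wrap " + location, e);
        }
    }

    /** Convenience overload for string input. */
    public static URL toJarUrl(String location) {
        try {
            return toJarUrl(new URL(location));
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a URL: " + location, e);
        }
    }

    public static void main(String[] args) {
        // Example locations only; no connection is opened here.
        System.out.println(toJarUrl("http://some.host/my.jar"));
        System.out.println(toJarUrl("file:/tmp/my.jar"));
        System.out.println(toJarUrl("jar:file:/tmp/my.jar!/"));
    }
}
```

Calling openConnection() on the resulting URL and casting to JarURLConnection would then give access to the JarFile, whether the JAR comes from the network or from the local disk.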

Another issue is loading the native libraries. JarClassLoader keeps its promise: the libraries load fine. But it cannot handle different architectures. Take Linux as an example. A dynamic library is usually named "libMyCode.so" no matter whether it is compiled for 32-bit or 64-bit. If both libraries have to coexist on the system, they are kept in different directories (typically "/usr/lib/" and "/usr/lib64"). When loading such a library from Java, you do it with "System.loadLibrary("MyCode")" (System.load, by contrast, expects an absolute file path). This eventually calls the classloader's findLibrary method. JarClassLoader returns the path to the first entry that matches the library name; it does not consider the architecture.

The solution I implemented is to put the native libraries in subdirectories within the JAR and extend the JarClassLoader's findLibrary method to first find all entries matching the library name and then score them using os.name and os.arch. Finally the best match is returned. If you're interested, download the complete modified JarClassLoader. I have also submitted the modifications to the maintainer, so maybe they'll become "official" one day.
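
The scoring idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the modified JarClassLoader itself: the directory layout ("lib/&lt;os&gt;/&lt;arch&gt;/..."), the weights, and the names NativeLibScoreSketch, osToken, score and bestMatch are all invented for the example. It also assumes that the architecture directory is named exactly as os.arch reports it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; layout and weights are assumptions.
final class NativeLibScoreSketch {

    /** Reduces an os.name value such as "Windows 7" to a directory token. */
    public static String osToken(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) return "windows";
        if (os.startsWith("linux")) return "linux";
        return os.replace(' ', '_');
    }

    /** Scores an entry path: +2 for a matching OS directory, +1 for a matching arch directory. */
    public static int score(String entryPath, String osName, String osArch) {
        String[] segments = entryPath.toLowerCase(Locale.ROOT).split("/");
        String os = osToken(osName);
        String arch = osArch.toLowerCase(Locale.ROOT);
        int score = 0;
        // Only directory segments count; the last segment is the file name.
        for (int i = 0; i < segments.length - 1; i++) {
            if (segments[i].equals(os)) score += 2;
            if (segments[i].equals(arch)) score += 1;
        }
        return score;
    }

    /** Returns the best-scoring entry with the given file name, or null if none matches. */
    public static String bestMatch(List<String> entries, String fileName,
                                   String osName, String osArch) {
        String best = null;
        int bestScore = -1;
        for (String entry : entries) {
            String name = entry.substring(entry.lastIndexOf('/') + 1);
            if (!name.equals(fileName)) continue;
            int s = score(entry, osName, osArch);
            if (s > bestScore) { // ties keep the earlier entry
                best = entry;
                bestScore = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
                "lib/linux/i386/libMyCode.so",
                "lib/linux/amd64/libMyCode.so",
                "lib/windows/x86/MyCode.dll",
                "lib/windows/amd64/MyCode.dll");
        // mapLibraryName turns "MyCode" into the platform file name.
        String fileName = System.mapLibraryName("MyCode");
        System.out.println(bestMatch(entries, fileName,
                System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

A real findLibrary override would additionally have to extract the chosen entry to a temporary file and return that file's absolute path, since the operating system cannot load a library directly from inside a JAR.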

Probably I have missed some resource on the web and the problem has been solved before. But anyway, I now have a solution that fits my needs perfectly. And it shows that the promise "Write once, run anywhere" can be kept, although it requires some effort from time to time.

Thursday, 15 April 2010

Last week I started running Windows 7 as a KVM guest on my Ubuntu laptop (the host), using libvirt for configuration. Installation went fine and Windows 7 came up as expected. After some reboots, however, I noticed that Windows asked me to select the network type (home, work, public) after each boot. I tried a few more times (you always hope nasty effects will go away magically), but the problem remained.

Looking closer, I found that Windows 7 considered itself to be connected to a new network after each boot (reporting "Network N", with N counting up). I checked the software bridge created by libvirt. It seemed OK, providing NAT access to the Internet to my Windows guest; and both the guest and the gateway always had the same IP addresses assigned. I googled a bit, but found no solution. Then I came across an article in the German computer journal "c't", explaining "Network Location Awareness" (NLA). I freely admit that I had never heard about this before. (Having Ubuntu as host and Windows as a guest OS only may give a hint why.)

So I learned that the important issue is not the IP address but the gateway's MAC address as seen by Windows at the network interface. Apparently (I didn't try), even if the IP address changes, Windows 7 still considers itself to be connected to a specific network as long as the MAC address of the gateway remains the same.

Checking again, I found that indeed the gateway as seen by the guest (i.e. interface vnet0) had a different MAC address after each boot. (You can find out the MAC address used by Windows to reach a certain IP address by issuing an "arp -a" in a command window.) Normally, this is not a problem because MAC addresses are considered "second class" when configuring networks. DHCP provides a new client in the network with a default gateway network address (IP address), and it is up to the address resolution protocol (ARP) to find the corresponding MAC address. NLA kind of reverses this scheme by making the MAC address the decisive bit of information for network configuration.

Unfortunately, libvirt provides no possibility to configure a fixed MAC address for the bridge's interface. (Don't mix this up with the possibility to set the MAC address of the guest's emulated ethernet interface.) I've filed this as a bug report for libvirt, but of course this will not result in a short-term solution.

An obvious approach is to configure the MAC address of the bridge interface the client is attached to (using ifconfig vnet0 hw ether "Gateway MAC address", with Gateway MAC address being one of the previously used arbitrary addresses). But this cannot easily be done, because this interface is created dynamically when the client VM is started. Searching the web for another possibility to set a fixed MAC address, I found the approach to set the MAC address on the host side of the bridge, i.e. virbr0. When I tried this (ifconfig virbr0 hw ether "Gateway MAC address") and started the VM, two things happened: Windows 7 reconnected to a known network (good!), and Windows 7 reported "no internet connectivity" (bad!).

Let's look at the successful part first. After receiving the IP-address of the default gateway, Windows sends an ARP request to get the MAC address. This is obviously answered by the bridge, replying with the MAC address of the host side interface of the bridge (virbr0).

But why is there no connectivity? Are packets from the host not sent to the guest, or packets from the guest not sent to the host? Using ebtables to insert log statements in the packets' paths on my Linux system (more about processing of packets here) I found out that packets sent by the guest to the MAC address of the gateway (the Gateway MAC address) are simply discarded. I didn't look in the code to find out why. I would have expected the bridge to forward them to the host side interface of the bridge and process them there as incoming packets. But obviously packets with the bridge's MAC address are treated in some special way. But wait, why were things working before I started meddling around with the setup? I rebooted the host and the guest and found that in the default setup both the host side interface (virbr0) and the guest side interface (vnet0) of the bridge have the same MAC address. This is a bit unusual if you consider physical bridges that have different ethernet controllers for each port.
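
One way to compare the MAC addresses of interfaces such as virbr0 and vnet0 on the host is to list them programmatically. The following is a minimal sketch using the standard java.net.NetworkInterface API; the class name MacListSketch and the helper formatMac are invented for this illustration, and the interface names printed depend entirely on the machine it runs on.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;

// Hypothetical diagnostic helper, not part of any tool mentioned in the text.
final class MacListSketch {

    /** Formats raw address bytes as lower-case, colon-separated hex. */
    public static String formatMac(byte[] mac) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < mac.length; i++) {
            if (i > 0) sb.append(':');
            // Mask to treat the signed byte as an unsigned value.
            sb.append(String.format("%02x", mac[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws SocketException {
        // Prints every interface that exposes a hardware address.
        for (NetworkInterface nif
                : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            byte[] mac = nif.getHardwareAddress(); // null e.g. for loopback
            if (mac != null) {
                System.out.println(nif.getName() + " " + formatMac(mac));
            }
        }
    }
}
```

Running it once before and once after restarting the guest would show whether the two bridge interfaces share an address and whether that address changes between boots.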

Now I tried setting the MAC-addresses of both virbr0 and vnet0 to the Gateway MAC address. Success! Everything worked now. But this brought me back to the problem that the guest side interface is created during VM startup and I would somehow have to receive an event and change the MAC-address right after the creation of this interface.

Working with ebtables, however, had taught me a lot of tricks to manipulate addresses during packet processing. Among those is the possibility to change the destination MAC address. So I added a filter that changes the destination MAC address of all packets targeted at the gateway to the MAC address of the receiving interface (ebtables -t broute -A BROUTING -d "Gateway MAC address" -j redirect --redirect-target CONTINUE). Note that you do not need to know the MAC address of the receiving interface to create such a filter, because the rule automatically uses the MAC address of the receiving interface as the new destination MAC address. This is also the reason why I do not have to restrict this rule to the guest side interface. Packets received on the host side interface should already have the "Gateway MAC address" as destination, so there is no problem in reassigning that same address again.

The short of it is that with two statements in a startup script
  • ifconfig virbr0 hw ether "Gateway MAC address"
  • ebtables -t broute -A BROUTING -d "Gateway MAC address" -j redirect --redirect-target CONTINUE
I have finally solved my Windows 7 network problems - and learned a lot about Linux bridges.