Thursday, 15 April 2010

Last week I started running Windows 7 as a KVM guest on my Ubuntu laptop (the host), using libvirt for configuration. Installation went fine and Windows 7 came up as expected. After some reboots, however, I noticed that Windows asked me to select the network type (home, work, public) after each boot. I tried a few more times (you always hope nasty effects will go away magically), but the problem remained.

Looking closer, I found that Windows 7 considered itself to be connected to a new network after each boot (reporting "Network N", with N counting up). I checked the software bridge created by libvirt. It seemed OK, providing NAT access to the Internet for my Windows guest, and both the guest and the gateway always had the same IP addresses assigned. I googled a bit, but found no solution. Then I came across an article in the German computer magazine "c't" explaining "Network Location Awareness" (NLA). I freely admit that I had never heard about this before. (Having Ubuntu as the host and Windows only as a guest OS may give a hint why.)

So I learned that the important thing is not the IP address but the gateway's MAC address as seen by Windows at the network interface. Apparently (I didn't try), even if the IP address changes, Windows 7 still considers itself to be connected to a specific network as long as the MAC address of the gateway remains the same.

Checking again, I found that the gateway as seen by the guest (i.e. interface vnet0) did indeed have a different MAC address after each boot. (You can find out the MAC address used by Windows to reach a certain IP address by issuing "arp -a" in a command window.) Normally, this is not a problem because MAC addresses are considered "second class" when configuring networks. DHCP provides a new client in the network with the IP address of the default gateway, and it is up to the Address Resolution Protocol (ARP) to find the corresponding MAC address. NLA kind of reverses this scheme by making the MAC address the decisive bit of information for network configuration.
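Comparing the address reported by Windows with what the host sees is easier when both use the same notation: Windows prints MAC addresses with dashes, Linux with colons. The following is only a minimal sketch. The function names normalize_mac and iface_mac are made up for illustration, and the optional second argument of iface_mac (an alternative sysfs root) is an assumption that merely makes the function easy to try out.

```shell
# Convert a MAC address as printed by Windows "arp -a" (dashes, any case)
# into the lower-case, colon-separated form that Linux tools print.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F-' 'a-f:'
}

# Print the MAC address of a host interface by reading it from sysfs.
# $1 = interface name (e.g. vnet0 or virbr0)
# $2 = sysfs root, defaults to /sys/class/net (hypothetical test hook)
iface_mac() {
    cat "${2:-/sys/class/net}/$1/address"
}
```

With the guest running, comparing the output of normalize_mac (fed with the gateway entry from "arp -a") to the output of iface_mac vnet0 shows whether the two sides agree.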

Unfortunately, libvirt provides no way to configure a fixed MAC address for the bridge's interface. (Don't mix this up with the option to set the MAC address of the guest's emulated Ethernet interface.) I've filed this as a bug report for libvirt, but of course this will not result in a short-term solution.

An obvious approach is to configure the MAC address of the bridge interface the guest is attached to (using ifconfig vnet0 hw ether "Gateway MAC address", with "Gateway MAC address" being one of the previously used arbitrary addresses). But this cannot easily be done, because this interface is created dynamically when the guest VM is started. Searching the web for another way to set a fixed MAC address, I found the suggestion to set the MAC address on the host side of the bridge, i.e. virbr0. When I tried this (ifconfig virbr0 hw ether "Gateway MAC address") and started the VM, two things happened: Windows 7 reconnected to a known network (good!), and Windows 7 reported "no internet connectivity" (bad!).
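Since a mistyped address is easy to miss here, a small check before touching the interface may help. This is a sketch under assumptions: is_mac and pin_mac_cmd are hypothetical helper names, and the second function only prints the command instead of running it. It uses "ip link set ... address", the iproute2 counterpart of "ifconfig ... hw ether".

```shell
# Succeed if $1 looks like a colon-separated MAC address.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Print (not run) the command that pins the MAC address of an interface.
# $1 = interface name, $2 = MAC address
pin_mac_cmd() {
    is_mac "$2" || { echo "invalid MAC address: $2" >&2; return 1; }
    echo "ip link set dev $1 address $2"
}
```

Calling pin_mac_cmd virbr0 with the chosen address prints the command to run as root; a malformed address is rejected before anything is changed.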

Let's look at the successful part first. After receiving the IP-address of the default gateway, Windows sends an ARP request to get the MAC address. This is obviously answered by the bridge, replying with the MAC address of the host side interface of the bridge (virbr0).

But why is there no connectivity? Are packets from the host not sent to the guest, or packets from the guest not sent to the host? Using ebtables to insert log statements in the packets' paths on my Linux system (more about processing of packets here) I found out that packets sent by the guest to the MAC address of the gateway (the "Gateway MAC address") are simply discarded. I didn't look in the code to find out why. I would have expected the bridge to forward them to the host side interface of the bridge and process them there as incoming packets. But apparently packets with the bridge's MAC address are treated specially in some way. But wait, why were things working before I started meddling with the setup? I rebooted the host and the guest and found that in the default setup both the host side interface (virbr0) and the guest side interface (vnet0) of the bridge have the same MAC address. This is a bit unusual if you consider physical bridges that have different Ethernet controllers for each port.
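For the log statements mentioned above, rules along the following lines can be used. This is a hedged sketch, not necessarily the exact rules from my session: the helper names log_rule and log_rules and the log prefixes are invented for illustration, and the functions only print the ebtables commands. The three chains are the ones a frame from the guest passes on its way to the host: BROUTING, then PREROUTING in the nat table, then INPUT in the filter table.

```shell
# Print one ebtables command that logs frames addressed to a MAC
# and then lets them continue through the chain unchanged.
# $1 = table, $2 = chain, $3 = destination MAC address
log_rule() {
    echo "ebtables -t $1 -A $2 -d $3 --log-level info --log-prefix $1-$2 -j CONTINUE"
}

# Print logging rules for every chain between bridge port and host.
# $1 = destination MAC address (the "Gateway MAC address")
log_rules() {
    log_rule broute BROUTING   "$1"
    log_rule nat    PREROUTING "$1"
    log_rule filter INPUT      "$1"
}
```

After running the printed commands as root, the log entries show up in the kernel log, and the last prefix that appears tells you how far a frame got.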

Now I tried setting the MAC-addresses of both virbr0 and vnet0 to the Gateway MAC address. Success! Everything worked now. But this brought me back to the problem that the guest side interface is created during VM startup and I would somehow have to receive an event and change the MAC-address right after the creation of this interface.

Working with ebtables, however, had taught me a lot of tricks for manipulating addresses during packet processing. Among those is the possibility to change the destination MAC address. So I added a filter that changes the destination MAC address of all packets targeted at the gateway to the MAC address of the receiving interface (ebtables -t broute -A BROUTING -d "Gateway MAC address" -j redirect --redirect-target CONTINUE). Note that you do not need to know the MAC address of the receiving interface to create such a filter, because the rule automatically uses the MAC address of the receiving interface as the new destination MAC address. This is also the reason why I do not have to restrict this rule to the guest side interface. Packets received on the host side interface should already have the "Gateway MAC address" as destination, so there is no problem in assigning that same address again.

The short of it is that with two statements in a startup script
  • ifconfig virbr0 hw ether "Gateway MAC address"
  • ebtables -t broute -A BROUTING -d "Gateway MAC address" -j redirect --redirect-target CONTINUE
I have finally solved my Windows 7 network problems - and learned a lot about Linux bridges.
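Put together, the two statements could live in a script like the following. Treat it as a sketch: the variable names GATEWAY_MAC, BRIDGE and DRY_RUN, the helper run and the function fix_gateway_mac are assumptions made for illustration, and the address shown is only a placeholder for one of the addresses Windows has already seen. By default the script just prints the commands.

```shell
#!/bin/sh
# All names below are illustrative assumptions, not libvirt settings.
GATEWAY_MAC=${GATEWAY_MAC:-02:00:00:00:00:01}   # placeholder address
BRIDGE=${BRIDGE:-virbr0}
DRY_RUN=${DRY_RUN:-1}                           # 1 = only print commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

fix_gateway_mac() {
    # Give the host side of the bridge the address Windows already knows.
    run ifconfig "$BRIDGE" hw ether "$GATEWAY_MAC"
    # Rewrite the destination of frames sent to that address to the MAC
    # address of whichever bridge port received them.
    run ebtables -t broute -A BROUTING -d "$GATEWAY_MAC" \
        -j redirect --redirect-target CONTINUE
}
```

Adding a call to fix_gateway_mac at the end and running the script as root with DRY_RUN=0 applies both changes; the dry-run default makes it safe to check the commands first.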