
Protocol for Dynamic Address Allocation

A project log for Hardware-orientated mesh networks

Our attempts to create a plug-'n'-play mesh network product.

connorwood71 07/18/2014 at 20:50 · 2 Comments

Time to handle task 2 in the sequence of events needed to get address allocation working on the mesh - namely, drawing up a specification for a protocol. We had one suggestion: random address allocation (a node assigns itself an address, checks whether it's free, and repeats if not). We considered this, but decided against it because we want to work with both IPv4 and IPv6 (with IPv4 the priority right now due to its ubiquity, even though IPv6 would be more useful long term), and with IPv4's smaller address space too many collisions would occur, making the approach incredibly inefficient. Furthermore, it would only push the problem elsewhere - the node still needs parameters such as the address range and subnet mask before it can allocate itself an address. If we don't supply those, the node has free rein over the entire address space, meaning the network cannot be linked to other networks!
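To get a feel for why collisions make the random scheme inefficient, here is a quick simulation sketch (not part of the spec, just a back-of-the-envelope illustration): each node picks a random address and re-rolls on collision, and we count the wasted attempts.

```python
import random

def random_allocate(pool_size, n_nodes, seed=0):
    """Count re-rolls when each node picks a random address and
    retries on collision (the rejected scheme described above)."""
    rng = random.Random(seed)
    used = set()
    retries = 0
    for _ in range(n_nodes):
        addr = rng.randrange(pool_size)
        while addr in used:
            retries += 1
            addr = rng.randrange(pool_size)
        used.add(addr)
    return retries

# A /24 gives 254 usable host addresses; once the pool starts to
# fill, the retry count climbs steeply (birthday-problem territory).
print(random_allocate(254, 200))
```

With a huge IPv6-style pool the retry count stays near zero, which is why the approach only really falls over for IPv4-sized pools.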

So, with that done, we set about coming up with a way of dynamically allocating IP addresses that would keep the network reliable and fault tolerant. We quickly converged on the idea of a heartbeat signal on top of vanilla DHCP - the node currently acting as the DHCP server emits a regular heartbeat, and if it dies, somebody else takes over the role. But how to decide who gets this role? If we don't decide it, there's a race condition, and we could end up with two or more DHCP servers active at the same time, which can cause major issues on a network. We toyed with the idea of going in IP address order, but then decided it would be easier to use a predetermined order: the order in which the nodes connected to the network (or rather, got their IP addresses).
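The succession rule above can be sketched as a simple ordered list - head of the list is the active DHCP server, and nodes are appended in the order they obtained their addresses. Names here are illustrative, not taken from the actual specification:

```python
# Succession list: node IDs in the order they got their addresses.
# The head of the list is the current active DHCP server.
succession = []

def node_joined(node_id):
    succession.append(node_id)

def node_died(node_id):
    succession.remove(node_id)

def active_dhcp_server():
    return succession[0] if succession else None

node_joined("A"); node_joined("B"); node_joined("C")
node_died("A")               # active server dies...
print(active_dhcp_server())  # ...and "B", next in join order, takes over
```

Because every node agrees on the join order, there is no election and no race: the successor is determined before the failure ever happens.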

With that done, there was a new issue - if a node somewhere in the middle of this list goes down, the other nodes need informing, and the list needs to be kept up to date. We decided to put that information in the heartbeat itself. Another issue, solved the same way, was synchronising the lease file that every DHCP server keeps - if we don't, the new server could allocate addresses the first one already had. So, put it in the heartbeat too.
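Putting both pieces of state in the heartbeat means each beat might look something like the following sketch. The field names are illustrative - the real wire format lives in the specification:

```python
from dataclasses import dataclass, field

# A sketch of what each heartbeat could carry, per the description
# above: the current succession list plus the lease file contents.
@dataclass
class Heartbeat:
    sequence: int                                   # detect missed beats
    succession: list = field(default_factory=list)  # server takeover order
    leases: dict = field(default_factory=dict)      # MAC -> allocated IP

hb = Heartbeat(sequence=42,
               succession=["B", "C", "D"],
               leases={"aa:bb:cc:dd:ee:01": "10.0.0.2"})
```

Any node that has seen the latest heartbeat therefore has everything it needs to take over as DHCP server immediately.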

But how to detect when a node that isn't the active DHCP server goes down? Why, have a circular buffer of nodes, of course. Each node sends a heartbeat signal to the next node in the ring, so when a node goes down, the node after it detects the loss and informs the active DHCP server, while in the meantime starting to receive heartbeats from the node before the dead one. Simple, right?
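The ring-repair step can be sketched like so (purely illustrative - node IDs stand in for addresses): when a node dies, its successor simply starts expecting heartbeats from the dead node's predecessor, closing the ring again.

```python
def repair_ring(ring, dead):
    """Remove a dead node from the ring of heartbeating nodes."""
    i = ring.index(dead)
    return ring[:i] + ring[i + 1:]

def heartbeat_source(ring, node):
    """Which node this node expects heartbeats from (its predecessor)."""
    i = ring.index(node)
    return ring[i - 1]  # index -1 wraps around, closing the circle

ring = ["A", "B", "C", "D"]
ring = repair_ring(ring, "C")
print(heartbeat_source(ring, "D"))  # "D" now hears from "B", not "C"
```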

But what about when two networks merge? This was a bit of a deeper issue. It was obvious that one node needed to shut down its DHCP server, so the other could take over. But how to decide which one does, in a way that is reliable every time? We toyed with the idea of a coin toss (or its digital counterpart), but if both decide to shut down, or both decide the other should, then what? Then we realised that, in all likelihood, one would receive a heartbeat from the other before the reverse happened - the chance of both heartbeats arriving at their destinations at exactly the same time can be considered pretty much zero. So, the first to receive a heartbeat from the other should shut down, right? Wrong. There's a further optimisation, thought of while writing the formal specification: the server with the fewest nodes should shut down, and if they have the same number of nodes, fall back to randomness.
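A sketch of that tie-break rule follows. One caveat the coin-toss discussion above hints at: a naive independent coin toss on a tie risks both servers shutting down, so this sketch derives the same pseudo-random bit on both sides from data they already share (here, the two server IDs - an assumption of mine, not necessarily what the spec does):

```python
import hashlib

def should_shut_down(my_count, their_count, my_id, their_id):
    """Decide whether this server yields when two networks merge."""
    if my_count != their_count:
        return my_count < their_count  # smaller network yields
    # Tie: derive one shared coin flip from the (order-normalised)
    # pair of IDs, so both sides compute the same result and exactly
    # one of them shuts down.
    pair = "|".join(sorted([my_id, their_id])).encode()
    coin = hashlib.sha256(pair).digest()[0] & 1
    return (coin == 0) == (my_id == min(my_id, their_id))

print(should_shut_down(3, 10, "server-a", "server-b"))  # fewer nodes: True
```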

With the above in place, it is now possible to start a network automatically - a node simply brings its DHCP server online, starts a heartbeat, and enters the loop above. If another DHCP server is already on the network, the new one shuts down, and that's the end of the matter. If not, the network now has an active DHCP server.

There is a little more to it than that, and the specification in its current form can be found here.

Discussions

PointyOintment wrote 07/20/2014 at 08:39 point
>With that done, there was a new issue - if a node goes down in the middle of this list somewhere, other nodes need informing, and the list needs to be kept up. We decided to put that information in the heartbeat itself. Another issue, solved in the same way, was synchronising the lease file, which every DHCP server has - if we don't, the new server could be allocating addresses the first one already did. So, put it in the heartbeat.

What happens if the DHCP server goes down right after it allocates an address, but before sending the next heartbeat?

>But how to detect if a node goes down, that isn't the active DHCP server? Why, have a circular buffer of nodes, of course. Each node sends a heartbeat signal to its next node, and when a node goes down, the next node can detect this, and inform the active DHCP server, while in the meantime starting to receive heartbeats from the node before it in the buffer. Simple, right?

So you're going to implement a virtual token ring on a network with a mesh topology, where the nodes can't be guaranteed to be next to each other in ring order? Sounds very inefficient. Also, what happens if a bunch of nodes join at some point in the ring? Then the heartbeat is delayed by that many nodes in getting to the last node in the ring (i.e. the one right before the one the heartbeat was at when they joined), and it might time out even though there’s no break before it.

———

I have another suggestion that I believe would accomplish everything your proposal does and more (though it’s much more complex).

The network must always have a certain number (greater than one) of active DHCP servers. Rather than allocating addresses directly out of the pool of available addresses, they each maintain a short list (~5?) of reserved addresses to allocate from, which the other DHCP servers will not allocate from, because their short lists will not contain those addresses. This guarantees that no address will be allocated to multiple devices by different DHCP servers.

Every DHCP server knows the addresses of the others and exchanges heartbeats with them, synchronizing the lease file and refilling their short lists from the general pool of available addresses. They also need to sync their short lists to keep them mutually exclusive (or just sync the pool, but conflict resolution might be harder that way, and it’s more data to sync if synced in full). There will have to be a conflict resolution method for refilling the short lists, but I think it would be a lot easier to do that than try to resolve conflicts between addresses after they've been allocated. It could be as simple as both DHCP servers that claimed an address from the pool waiting a random amount of time, and whichever one reacts first gets the contested address to add to its short list, and the other one claims a different address.

If there are no addresses left in the pool to add to the short lists, the DHCP servers keep allocating addresses from their short lists until they run out. When they run out, they will no longer respond with address offers to joining devices’ discovery messages. In this way, the joining device is guaranteed to get an address even if the nearest DHCP server to it doesn’t have one to allocate.
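(Editor's sketch of the short-list scheme described in this comment, to make the invariant concrete - class and field names are illustrative, not from any spec: each server reserves a handful of addresses out of the shared pool and only ever allocates from its own reservation, so two servers can never hand out the same address.)

```python
class ShortListServer:
    SHORT_LIST_SIZE = 5

    def __init__(self, name, pool):
        self.name = name
        self.pool = pool          # shared free-address pool
        self.short_list = []      # addresses reserved by this server only
        self.leases = {}          # device -> allocated address
        self.refill()

    def refill(self):
        """Top the short list back up from the shared pool."""
        while len(self.short_list) < self.SHORT_LIST_SIZE and self.pool:
            self.short_list.append(self.pool.pop())

    def allocate(self, device):
        if not self.short_list:
            return None           # out of addresses: stay silent on discovery
        addr = self.short_list.pop()
        self.leases[device] = addr
        self.refill()
        return addr

pool = [f"10.0.0.{i}" for i in range(2, 20)]
a = ShortListServer("a", pool)
b = ShortListServer("b", pool)   # same pool object, disjoint short lists
assert not set(a.short_list) & set(b.short_list)
```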

If there are too few DHCP servers on the network, the existing ones will ask another device to become a DHCP server. They should probably pick a device as far away from them on the mesh as possible so that the distribution of DHCP servers on the mesh is more uniform. If there are too many, then they pick one to deactivate, at which point it becomes just another device on the network. In this case, they should probably pick the one nearest to the rest, for the same reason. If using a routing algorithm that doesn’t allow full knowledge of the network, such as BATMAN, they can each pick a device to activate/deactivate as a DHCP server and decide by a vote. (BATMAN hardly allows any knowledge of the layout of the network, so latency should probably be used as the metric in this case.)

A device joining the network will, as usual, send out a discovery message looking for DHCP servers. At least one will respond, and the joining device will then have to choose which one to use - the easiest thing to do would be to simply choose the first one to respond, because it’s probably closest. The others can be ignored, because if they don’t receive address requests back from the joining device, they won’t allocate the addresses they offered. The device that just joined will then remember which DHCP server gave it its address and exchange heartbeats with that DHCP server. These heartbeats include a list of the other DHCP servers' addresses. If a device notices a lack of heartbeat from its associated DHCP server, it will assume that DHCP server has gone offline suddenly and reassociate with a different one (because it knows all of their addresses), keeping the same address it already had. If a DHCP server notices a lack of heartbeat from any of its associated devices, it will first ask the other DHCP servers if that device is now associated with any of them, and, if not, add that address back to its short list (if the short list isn’t full) or to the general pool. If any device cannot contact any DHCP server, it assumes there are none left on the network (or that it’s founding a network) and becomes one. This, according to what has been discussed so far, would result in every remaining device becoming a DHCP server. They can then decide which ones to deactivate. With network IDs and the resulting merging process, described below, each device would still become a DHCP server, but would create its own network fragment. These fragments would then re-merge. This results in an extremely resilient network.

If a DHCP server goes down from the perspective of the other DHCP servers, they should first attempt to contact the devices it allocated addresses to, to see if any of them can still contact it. If so, then messages can just be routed via them. (If the mesh routing works well enough, this would happen without them even noticing.) If it can't be contacted, the other DHCP servers ask all of the devices that previously belonged to the downed one to reassociate with another DHCP server, again, using their existing IP addresses. (If some of them are not contactable, but continue to be in contact with their DHCP server, this just becomes a separate network.) If necessary, the remaining DHCP servers then ask another device on the network to become a DHCP server, as described above, to replace the lost one. Also, for optimization, it might be good to have devices automatically reassociate with different DHCP servers if the network topology changes to bring them closer together. There should be some hysteresis, though, to keep them from switching back and forth quickly.

What about merging networks? Well, what triggers a merger? Obviously, two distinct networks coming into range of each other. There are two cases for this: case 1, where some IP addresses are allocated to a device on each network, and case 2, where the networks don’t have any allocated IP addresses that overlap. In case 2, the most obvious way to detect proximity of another network is to detect that the destination IP address on a received packet isn’t allocated on the current network. However, this won’t necessarily work for case 1, and it isn’t reliable even in case 2. An analogous detection method for case 1 is receiving a packet from one’s own IP address, or receiving replies from two other devices in response to only one message being sent. This, however, is also unreliable. Therefore, we can discount this detection method. In case 2, it might be possible for the networks to merge passively, using their other synchronization and conflict resolution methods to bring all of the DHCP servers into agreement. However, the situation could easily transition to case 1 during the merger, causing more difficult conflicts.

Therefore, I think it’s necessary for each network to have an ID of some sort, agreed upon and distributed to the other devices by the DHCP servers. This could be a UUID or a traditional SSID. This ID would facilitate reliable detection of two networks coming into range of each other.

In the case of using an SSID, any device receiving a packet with an SSID on it that doesn’t match that of the current network would notify its DHCP server, which would initiate a merger. However, if the SSID is the same, as would be the case if the network broke in two and then came back together, no such notification would be sent. In this case, there could be IP address conflicts, so passive merging might not be reliable. Therefore, an SSID alone is not sufficient to avoid conflicts. However, an SSID has the advantage of being human-readable.

The advantage of using a UUID is that there could not be a case where the two networks think they’re one, so there could be no passive mergers, and so no unnoticed IP address conflicts. With UUIDs, a breakaway network’s DHCP server(s) would create and propagate a new UUID for the new network fragment as soon as it notices the break. (Only a fragment composed of a majority of the previous network could keep its current UUID, because only it would be able to know that there was no larger fragment resulting from the break.) Therefore, whenever a network loses a DHCP server (without that DHCP server telling the others it’s about to go offline and telling its associated devices to reassociate), a new UUID might need to be created for the remainder of the network. This leads to the idea of using not a UUID but instead some kind of hash of the active DHCP servers’ MAC addresses. (Such a hash might also be useful to enable joining devices to authenticate the DHCP server that offers them an address. I haven’t really thought about that.)

Therefore, I think it’s necessary to use both types of ID. The SSID would be the human-readable network name, enabling users to choose a network to join, and enabling multiple distinct networks to operate in the same geographical area without attempting to merge. The UUID or DHCP server MAC address hash would identify every network fragment uniquely to facilitate reliable detection of the necessity to merge. With both types of ID being used together, networks would merge when their SSIDs are the same but their UUIDs/hashes are different.
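(Editor's note: the two-ID rule in this paragraph reduces to a one-line predicate - same SSID but a different UUID/hash means "same logical network, different fragment: merge"; a different SSID means a distinct network to leave alone. A sketch, with illustrative names:)

```python
def needs_merge(my_ssid, my_uuid, their_ssid, their_uuid):
    """Merge only fragments of the same named network."""
    return my_ssid == their_ssid and my_uuid != their_uuid

assert needs_merge("mesh", "frag-1", "mesh", "frag-2")       # merge
assert not needs_merge("mesh", "frag-1", "other", "frag-2")  # distinct network
assert not needs_merge("mesh", "frag-1", "mesh", "frag-1")   # already one fragment
```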

Now we come to the actual process of merging. Once the two fragments discover that they share an SSID but not a UUID/hash, they will know that they need to merge. Their DHCP servers will set up provisional communications via whatever edge nodes happened to come into range of the other network. Once this is in place, their DHCP servers can negotiate which DHCP servers to deactivate and which to leave active, and then create and propagate a new UUID/hash for the new combined fragment, and synchronize the DHCP leases and short lists. To resolve address conflicts, they will choose one device to keep its address and one to switch to a different address based on some factor such as relative traffic volumes. DHCP servers should probably get priority over other devices to keep their addresses. If there are too many devices to merge the fragments without address conflicts, they will not merge, or some kind of subnet thing will happen. If the UUID or hash of either merging fragment happens to change during the merger process (e.g. by one of the fragments breaking or merging with a third fragment), then the merger should be aborted and retried. In this way, all fragments in the area with the same SSID will eventually merge.


connorwood71 wrote 07/20/2014 at 21:02 point
And this is exactly why I love openness. I've read through your entire proposal, but I haven't yet had a chance to go through it in any detail, so I won't comment on its technical merits until tomorrow morning (BST). I can make one comment, though: to the best of my knowledge, 802.11s (the routing protocol we use), because it is built upon 802.11, already has SSIDs in place.

I love the fact that we're not the only ones thinking about this, and I'm toying with the idea (time is a constraint, but at the moment not much of one) of trying both out and comparing the two. I cannot, obviously, battle-test them to any major degree on a student budget, but it will be an interesting endeavor nonetheless. Indeed, it is likely that each protocol will out-perform the other in different scenarios.

Thanks for the proposal, I'll take a more serious look in the morning :)
