This change makes it much easier to reason about ndhc's behavior
and properly handle errors.
It is a very large changeset, but there is no way to make this
sort of change incrementally. Lease acquisition is tested to
work.
It is highly likely that some bugs were both introduced and
squashed here. Some obvious code cleanups will quickly follow.
If a packet send failed because the carrier went down without a
netlink notification, then assume the hardware carrier was lost while
the machine was suspended (eg, ethernet cable pulled during suspend).
Simulate a netlink carrier down event and freeze the dhcp state
machine until a netlink carrier up event is received.
The ARP code is not yet handling this issue everywhere, but the
window of opportunity for it to happen there is much shorter.
Mostly reverts the previous commit and instead teaches ndhc to properly
handle the case when it is communicating with a DHCP relay agent on
its local segment rather than directly with a DHCP server.
different segment.
The network fingerprinting would never complete if the DHCP server was
on a different segment before this change, since it would be impossible
for the ARP messages sent by ndhc to ever reach the DHCP server
(and vice-versa).
Now just give up trying to find the hardware address after two tries
and assume that the DHCP server cannot be reached by ARP.
An alternative would be to fingerprint the relay agent instead, but
to do so would require a lot more work as the giaddr field is only
meaningful in the client->server message path, not in the
server->client path. Thus it would require gathering the source IP
for DHCP replies sent by unicast or broadcast and ferrying along
this information to the ARP checking code where it would be used
in place of the DHCP server address.
This is entirely possible to do, but is quite a bit more work.