This is done by performing one synchronous query for carrier state at
the start of the program; after that, we just monitor the nlsocket for
carrier state changes and update the cached state accordingly.
The benefit is that ifchd needs to do a lot less work and this should
reduce the CPU cycle consumption; prior to this commit, the CPU time
ends up being a few CPU-minutes per month.
Before the handling would constantly acculmulate a prefix of previous
incomplete commands. Now it still has a latent defect where the entire
buffer will be discarded given a spurious command, but ndhc shouldn't
generate such commands so it shouldn't matter.
I only built with this flag to mitigate accidental UB. Now that
UBSan exists, there's no point as UBSan does better and actually
allows offending code to be located and fixed easily.
The existing code tracked the fingerprinting completion state with
a bool, which is insufficient; what is required is a tristate
(none|inprogress|done) where the inprogress state reverts to
none on carrier down, but done state stays done on carrier down.
We could use 32-bit fetches with the same technique on 64-bit
architectures, or SIMD could be used to do very fast 128-bit
fetches, but this isn't a performance bottleneck and this
method is very simple and relatively fast.
That trivial warning fix causes warnings with -Wcast-qual, which
is a more valid warning than whatever was causing the warnings
under -Wall -pedantic with the normal Makefile.
There's really no advantage to using signalfd in ndhc, particularly
since the normal POSIX signal API is now used for handling SIGCHLD in
ndhc-master. So just use the tried and true volatile sig_atomic_t set
and check approach.
The only intended behavior change is in the dhcp RELEASE state --
before there would be a spurious attempt at renewing a nonexistent
lease when the RENEW signal was received.
Suppose a system call such as bind() fails in the sockd subprocess in
request_sockd_fd(). sockd will suicide(). This will send a SIGCHLD to
the master process, which the master process should respond to by
calling suicide(), forcing a process supervisor to respawn the entire
ndhc program.
But, this doesn't reliably happen prior to this commit because of the
interaction between request_sock_fd() and signalfd() [or equivalently
self-pipe-trick] signal handling.
request_sock_fd() makes ndhc-master synchronously wait for a response
from sockd via safe_recvmsg(). The normal goto-like signal handling
path is suppressed when using signalfd() , so when SIGCHLD is received,
it will not be handled until io is dispatched for the signalfd or pipe.
But such code will never be reached because ndhc-master is waiting in
safe_recvmsg() and thus never polls signal fd status.
So, revert to using traditional POSIX sigaction() for SIGCHLD, which
provides exactly the required behavior for proper functioning.
Apparently I had forgotten the counter-intuitive semantics of
signalfd(): it's necessary to BLOCK the signals that will be
handled exclusively by signalfd() so that the default POSIX
signal handling mechanism won't intercept the signals first.
The lack of response to ctrl+c is a legitimate bug that is
now properly fixed; ba046c02c729c fixed that issue, but
regressed the handling of other signals.
If we get no response to three renews (unicast), switch to sending
rebinds (broadcast). Servers are supposed to always reply with
a DHCPACK or DHCPNAK even if the server doesn't update its internal
lease duration database, so this behavior should be RFC compliant.
'(c)' may not be a valid substitute for 'Copyright' in some legal
domains/interpretations. So be safe, since I obviously am asserting
copyright on my legal work.
It breaks with the existing whitelists on the latest glibc and is
just too much maintenance burden. It also causes the most questions
for new users.
Something like openbsd's pledge() would be fine, but I have no
intention of maintaining such a thing.
Most of the value-gain would come from disallowing high-risk
syscalls like ptrace() and the perf syscalls, anyway.
ndhc already uses extensive defense-in-depth and wasn't using
seccomp on non-(x86|x86-64) platforms, so it's not a huge loss.