safe_recv(..., len), when used on a blocking fd, will call recv
repeatedly and collect data until EOF, a hard error, or len bytes
have been collected.
The previous commit used safe_recv() in a blocking mode to read
a single byte into a buffer that was larger than a byte. This
would cause ndhc to stall as safe_recv() would try to fill that
buffer when no more data would ever be sent.
This issue would only happen if ndhc is supposed to run a script.
Introduce and use safe_recv_once() that will correct this problem and
fill the semantic gap for blocking fds. I add a new call because in
some cases the above behavior might be required for a blocking fd, too.
Note that the above issue is not a problem for nonblocking fds; the
EAGAIN or EWOULDBLOCK path will return a short read.
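
To illustrate the difference between the two behaviors, here is a
minimal sketch. It is not the ndhc implementation: the names
recv_all_sketch() and recv_once_sketch() are hypothetical, and the
only assumption is that the fd is a socket usable with recv().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical sketch of the "fill the buffer" behavior: keeps calling
 * recv() until len bytes are collected, EOF is seen, or a hard error
 * occurs.  On a blocking fd this stalls if the peer sends fewer than
 * len bytes and keeps the connection open. */
ssize_t recv_all_sketch(int fd, char *buf, size_t len, int flags)
{
    size_t got = 0;
    while (got < len) {
        ssize_t r = recv(fd, buf + got, len - got, flags);
        if (r == 0)
            break;                      /* EOF */
        if (r < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;                  /* nonblocking fd: short read */
            return -1;                  /* hard error */
        }
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Hypothetical sketch of the "once" behavior: returns as soon as a
 * single recv() call completes, retrying only if interrupted. */
ssize_t recv_once_sketch(int fd, char *buf, size_t len, int flags)
{
    for (;;) {
        ssize_t r = recv(fd, buf, len, flags);
        if (r < 0 && errno == EINTR)
            continue;
        return r;
    }
}
```

With a one-byte message and a larger buffer on a blocking fd, the
second function returns 1 immediately, while the first would wait for
more data or EOF.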
The motivation here is to be safe in cases where the script
is setting up firewall rules or tunnels and where subsequent
tasks require these to be complete before starting.
I expect that this is a common case where a script is used.
The implementation behaves almost identically to how ifchd works.
The purpose of changing the xid is to prevent delayed messages from
being misinterpreted; in the rare case where the new random xid
happens to equal the old one, this sequence-id property is broken.
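
One way to guarantee that a fresh xid differs from the previous one is
to redraw until it does. The sketch below is an illustration under
that assumption, not the actual ndhc code; next_xid_sketch() and
demo_rng() are hypothetical names, and demo_rng() is a deterministic
stand-in for a real random source.

```c
#include <stdint.h>

/* Hypothetical sketch: draw from rng until the value differs from the
 * previous xid, so that consecutive xids are never equal. */
uint32_t next_xid_sketch(uint32_t prev_xid, uint32_t (*rng)(void))
{
    uint32_t xid;
    do {
        xid = rng();
    } while (xid == prev_xid);
    return xid;
}

/* Deterministic stand-in for a random source, for demonstration only:
 * successive calls yield 7, 7, 8, 8, 9, 9, ... */
uint32_t demo_rng(void)
{
    static uint32_t calls;
    return 7 + (calls++ / 2);
}
```

With demo_rng(), a first call with prev_xid 0 yields 7; a second call
with prev_xid 7 skips the repeated 7 and yields 8.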
Previously, these were generated from a freshly seeded PRNG, which
reduces the state space of possible DUIDs and skews the distribution of
both DUIDs and IAIDs as a function of the PRNG choice.
None of this really matters much in practice, but do things right.
These are standard, either in POSIX or C23.
The semantics are slightly different as the error path does not
enforce null-termination in the function itself, so enforce that
by hand. As a nice side effect, this makes those error paths
easier to audit.
This change guards against possible regressions introduced by
f3766990f9 if rfkill were to
prevent netlink carrier up events from being delivered while
rfkill is in effect for the interface.
The current code pads with an extra character that is then rewritten
into a null character. This isn't necessary with snprintf
implementations that conform to C99 or later, so get rid of it.
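
As an illustration of why the padding is unnecessary: a conforming
snprintf() always null-terminates when the size is nonzero, and it
reports truncation through its return value. The helper name
format_kv_sketch() is hypothetical and is not taken from the ndhc
sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical sketch: formats "key=val" into dst.  Returns 0 on
 * success, or -1 if formatting failed or the output was truncated.
 * When dstlen > 0, a C99-conforming snprintf() null-terminates dst
 * even when it truncates, so no extra padding byte is needed. */
int format_kv_sketch(char *dst, size_t dstlen,
                     const char *key, const char *val)
{
    int r = snprintf(dst, dstlen, "%s=%s", key, val);
    if (r < 0 || (size_t)r >= dstlen)
        return -1;
    return 0;
}
```

Formatting "HOME" and "/root" into a 4-byte buffer returns -1 and
leaves the null-terminated string "HOM" in the buffer.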
Also add a note warning that nk_generate_env() and nk_execute()
are not async signal safe and are thus unsuitable for use in
multithreaded processes.
nk_execute() could be rewritten to be async signal safe without much
trouble, as the only problem point is snprintf() which is not guaranteed
to be async signal safe by POSIX.
However, nk_generate_env() performs chroot() if a chroot_path is
specified, and chroot() is not async signal safe in POSIX.
Additionally, malloc() can be called in rare cases where user
information fields are very long, and malloc() is obviously not async
signal safe. Finally, snprintf() is used here, too, but it could be
replaced.
Converting to posix_spawn() is a no-go because posix_spawn() has no
facility for changing rlimits or chroot on the spawned process.
In summary, I don't think the gains are worth it. Multithreaded
processes should just not fork().
If no 'script-file = SCRIPTFILE' is specified in the configuration
file and if no '-X SCRIPTFILE' or '--script-file SCRIPTFILE'
command-line argument is provided, then this functionality is entirely
inactive and no associated subprocess is spawned.
Otherwise, ndhc will spawn a subprocess that runs as root that has the
sole job of forking off a subprocess that exec's the specified script in
a sanitized and fixed-state environment whenever a new DHCPv4 lease is
acquired.
Note that this script is provided no information about ndhc or the
DHCP state in the environment or in any argument fields; it is the
responsibility of this script to gather whatever information it needs
from either the filesystem or syscalls. This design is intended to
avoid the historical problems that are associated with dhcp clients
invoking scripts.
The path of the scriptfile cannot be changed after ndhc is initially
run; ndhc forks off the privsep script subprocess that executes scripts
after it has read the configuration file and command arguments, but
before it begins processing network data; thus, it is impossible for the
network-handling process to modify or influence the script assuming
proper OS memory protection.
The privsep channel communicates that the script should be run by simply
writing a newline; anything else will result in ndhc terminating itself.
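
The receiving side of such a channel can be sketched as follows. This
is a sketch only, assuming the channel is a stream socket (for example
one end of a socketpair()); the enum and function names are
hypothetical and not taken from the ndhc sources.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes for one read of the privsep channel. */
enum script_req { SCRIPT_RUN, SCRIPT_CLOSED, SCRIPT_INVALID };

/* Hypothetical sketch: reads exactly one byte from the channel.
 * A newline means "run the script"; EOF means the peer went away;
 * any other byte or a hard error is treated as invalid, which the
 * caller would answer by terminating. */
enum script_req read_script_request_sketch(int fd)
{
    char c;
    ssize_t r;
    do {
        r = recv(fd, &c, 1, 0);
    } while (r < 0 && errno == EINTR);
    if (r == 0)
        return SCRIPT_CLOSED;
    if (r == 1 && c == '\n')
        return SCRIPT_RUN;
    return SCRIPT_INVALID;
}
```

Keeping the protocol down to a single fixed byte means the privileged
side never has to parse data that the network-facing process controls.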
Previously, the recommended way to update system state after a change
in lease information was to run the fcactus program and watch the
associated leasefile for the interface for modification; now no external
program is needed for this job.
This can be enabled via the s6-notify configure option; see
http://www.skarnet.org/software/s6/notifywhenup.html for details.
ndhc will signal that it is ready when the first valid lease is
obtained. Programs dependent on a working network interface
can then simply use s6-wait on the associated ndhc service dir.
A typical command line option assuming the s6 service directory
notification-fd contains '3' would be '--s6-notify 3', and
a typical configuration file option would be 's6-notify 3'.
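
For reference, the s6 readiness protocol described at the URL above
amounts to writing a newline-terminated line to the notification fd
and then closing it. A minimal sketch of the daemon side follows; the
function name notify_ready_sketch() is hypothetical and this is not
necessarily how ndhc implements it.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical sketch: signals readiness on notify_fd (e.g. 3) by
 * writing a single newline, then closes the fd.  Returns 0 if the
 * newline was written, -1 otherwise. */
int notify_ready_sketch(int notify_fd)
{
    ssize_t r;
    do {
        r = write(notify_fd, "\n", 1);
    } while (r < 0 && errno == EINTR);
    close(notify_fd);
    return r == 1 ? 0 : -1;
}
```

In this scheme the call would be made once, at the point where the
first valid lease is obtained.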
This notation originally had the sole advantage of implying
that the pointer was non-NULL, but it has seen little use and
now clang presents warnings about it in certain contexts.
This is done by performing one synchronous query for carrier state at
the start of the program; after that, we just monitor the nlsocket for
carrier state changes and update the cached state accordingly.
The benefit is that ifchd needs to do a lot less work and this should
reduce the CPU cycle consumption; prior to this commit, the CPU time
ended up being a few CPU-minutes per month.
Previously, the handling would constantly accumulate a prefix of previous
incomplete commands. Now it still has a latent defect where the entire
buffer will be discarded given a spurious command, but ndhc shouldn't
generate such commands so it shouldn't matter.
I only built with this flag to mitigate accidental UB. Now that
UBSan exists, there's no point as UBSan does better and actually
allows offending code to be located and fixed easily.