emo/ndhc - ndhc - Project Segfault Git

Author	SHA1	Message	Date
Nicholas J. Kain	4575f74164	Remove legacy support for exiting after obtaining a DHCP lease.	2020-10-20 06:55:04 -04:00
Nicholas J. Kain	ade4e988af	Remove legacy support for forking to background.	2020-10-20 06:55:04 -04:00
Nicholas J. Kain	58067200d6	Remove legacy support for writing a pidfile.	2020-10-20 06:55:04 -04:00
Nicholas J. Kain	4d33c00e04	Use poll() instead of epoll() for ndhc-master.	2020-10-20 05:58:29 -04:00
Nicholas J. Kain	06a541261e	Stop using signalfd and audit signal handling code. There's really no advantage to using signalfd in ndhc, particularly since the normal POSIX signal API is now used for handling SIGCHLD in ndhc-master. So just use the tried and true volatile sig_atomic_t set and check approach. The only intended behavior change is in the dhcp RELEASE state -- before there would be a spurious attempt at renewing a nonexistent lease when the RENEW signal was received.	2020-10-20 04:42:58 -04:00
Nicholas J. Kain	8d89ca9f19	Reliably force restart when a subprocess has a fatal error. Suppose a system call such as bind() fails in the sockd subprocess in request_sockd_fd(). sockd will suicide(). This will send a SIGCHLD to the master process, which the master process should respond to by calling suicide(), forcing a process supervisor to respawn the entire ndhc program. But, this doesn't reliably happen prior to this commit because of the interaction between request_sockd_fd() and signalfd() [or equivalently self-pipe-trick] signal handling. request_sockd_fd() makes ndhc-master synchronously wait for a response from sockd via safe_recvmsg(). The normal goto-like signal handling path is suppressed when using signalfd(), so when SIGCHLD is received, it will not be handled until io is dispatched for the signalfd or pipe. But such code will never be reached because ndhc-master is waiting in safe_recvmsg() and thus never polls signal fd status. So, revert to using traditional POSIX sigaction() for SIGCHLD, which provides exactly the required behavior for proper functioning.	2020-10-20 04:41:51 -04:00
Nicholas J. Kain	f0340b1475	Correct `ba046c02c7`. Apparently I had forgotten the counter-intuitive semantics of signalfd(): it's necessary to BLOCK the signals that will be handled exclusively by signalfd() so that the default POSIX signal handling mechanism won't intercept the signals first. The lack of response to ctrl+c is a legitimate bug that is now properly fixed; `ba046c02c7` fixed that issue, but regressed the handling of other signals.	2020-10-19 13:03:35 -04:00
Nicholas J. Kain	4df035ced3	Make sure xids in packets sent conform to RFC2131 pg36 table5.	2020-10-19 05:48:52 -04:00
Nicholas J. Kain	ba046c02c7	Make custom SIGUSR signals work again. These were broken long ago when converting from signal() to sigprocmask(). This change also makes ctrl+c work again.	2020-09-02 21:04:59 -04:00
Nicholas J. Kain	56b6ae2cd3	Quit using NULL macro.	2018-10-26 07:17:39 -04:00
Nicholas J. Kain	05a075aeb2	Replace '(c)' with 'Copyright'. '(c)' may not be a valid substitute for 'Copyright' in some legal domains/interpretations. So be safe, since I obviously am asserting copyright on my legal work.	2018-10-26 07:11:16 -04:00
Nicholas J. Kain	8983df3c86	Update copyright dates.	2018-02-18 08:25:10 -05:00
Nicholas J. Kain	e08d3b15b5	Remove seccomp support. It breaks with the existing whitelists on the latest glibc and is just too much maintenance burden. It also causes the most questions for new users. Something like openbsd's pledge() would be fine, but I have no intention of maintaining such a thing. Most of the value-gain would come from disallowing high-risk syscalls like ptrace() and the perf syscalls, anyway. ndhc already uses extensive defense-in-depth and wasn't using seccomp on non-(x86|x86-64) platforms, so it's not a huge loss.	2018-02-09 03:33:04 -05:00
Nicholas J. Kain	e8d97205e9	Compile cleanly with -Wsign-conversion. I didn't notice anything that worried me.	2018-02-09 03:16:59 -05:00
Nicholas J. Kain	759b6bd831	Update to the new ncmlib random API.	2017-08-24 02:36:31 -04:00
Nicholas J. Kain	7af3e64a99	arp: Remove reply_offset, and keep previous ARP packet after epoll. ARP packets aren't split across multiple receive events, so reply_offset is pointless, and we implicitly assume that the previous ARP packet data is still available after a forced sleep.	2017-04-10 08:56:11 -04:00
Nicholas J. Kain	4fdde404aa	Remove unused client_config_t foreground variable.	2017-01-19 05:18:04 -05:00
Nicholas J. Kain	c38fd2be9b	Convert logical booleans in client_config_t to bool type.	2017-01-19 05:13:30 -05:00
Nicholas J. Kain	571b22c4b2	Rename client_state_t init variable to program_init. Easier to grep. No functional change.	2017-01-19 05:05:35 -05:00
Nicholas J. Kain	931530786b	Convert logically boolean client_state_t variables from uint8_t to bool.	2017-01-19 05:01:23 -05:00
Nicholas J. Kain	b8ee0bd5c2	Update copyright dates to 2017.	2017-01-13 20:15:27 -05:00
Nicholas J. Kain	29498f5341	Remove ifsPrevState and set non-infinite timeout on a send error. We instead check carrier status as needed. This approach is more robust. For a simple example, imagine link state changes that happen while the machine is suspended.	2017-01-13 20:15:27 -05:00
Nicholas J. Kain	04ec7c8f4b	Update to latest write_pid semantics and don't write pidfile by default. There was no way to disable writing pidfiles before. pidfiles are an unreliable method of tracking processes, anyway; process supervisors are strongly recommended. If a pidfile is really needed, it can be explicitly specified.	2016-05-06 15:00:31 -04:00
Nicholas J. Kain	5ab36719f1	If a fd closes unexpectedly in epoll, print error and exit with failure. Before the exit code would be success and no error message would print, and it required a bit of control flow tracing to determine what would actually happen. No direct functional change (unless the supervising process cares about the return code of ndhc on exit).	2015-12-25 02:44:54 -05:00
Nicholas J. Kain	277f0f67c5	When converting timeout from ll to int, also guard against underflow.	2015-05-27 15:00:02 -04:00
Nicholas J. Kain	abb1b54bfe	Fix an overflow that can cause spuriously short epoll timeouts. Lease times and arp timeouts are all calculated using long long, but epoll takes its timeout argument as an int. Guard against timeouts > INT_MAX but < UINT_MAX wrapping and causing spuriously short timeouts when converted to a signed int. This problem has been observed in the wild. Thanks to thypon for a detailed strace that pointed me towards this issue.	2015-05-27 12:58:42 -04:00
Nicholas J. Kain	ba875d4b2e	Failsafe should only trigger if the new timeout is also zero. This is what I get for rushing!	2015-05-27 12:35:16 -04:00
Nicholas J. Kain	f061a78a18	Fix dumb mistake in patch before last; epoll timeout is in ms, not s.	2015-05-27 12:29:46 -04:00
Nicholas J. Kain	8273b383ab	Add a failsafe to prevent epoll busy-spin.	2015-05-27 12:23:16 -04:00
Nicholas J. Kain	b3bd13d45f	Fix the return values of dhcp_packet_get and arp_packet_get. This corrects a bug where stale dhcp packets would get reprocessed, causing very bad behavior; an issue that was introduced in the coroutine conversion.	2015-02-18 11:02:13 -05:00
Nicholas J. Kain	3ede5fbe33	Handle the release and renew signals again.	2015-02-18 07:31:19 -05:00
Nicholas J. Kain	69cf41f1b1	Only process one epoll event at a time. If ndhc were a high-performance program that handled lots of events, this change would harm performance. But it is not, and it implicitly believes that events come in one at a time. Processing batches would make it harder to assure correctness while also never allocating memory at runtime. The previous structure was fine when everything was handled immediately by callbacks, but it isn't now.	2015-02-18 05:36:13 -05:00
Nicholas J. Kain	99ce918a31	Use a coroutine instead of several callback state machines. This change makes it much easier to reason about ndhc's behavior and properly handle errors. It is a very large changeset, but there is no way to make this sort of change incrementally. Lease acquisition is tested to work. It is highly likely that some bugs were both introduced and squashed here. Some obvious code cleanups will quickly follow.	2015-02-18 05:31:13 -05:00
Nicholas J. Kain	37aa866ae4	Move action dispatch out of main epoll loop.	2015-02-15 06:48:49 -05:00
Nicholas J. Kain	61387408d0	Separate event state gathering from action dispatch in main epoll loop. This is the first step towards using coroutines.	2015-02-15 06:38:03 -05:00
Nicholas J. Kain	5b82be8b00	If ifchd interactions fail, terminate. Ideally we would pause and resume state, but for now just bail out. If ndhc is process-supervised, it will recover to the proper state quickly.	2015-02-14 20:47:14 -05:00
Nicholas J. Kain	170f87c0e7	Propagate returns through ifchange_(deconfig|bind). While doing so remove unnecessary argument null checks and make sure not to alter the stored interface state if the ifch requests failed.	2015-02-14 19:10:23 -05:00
Nicholas J. Kain	44175bd77c	Make ifch requests synchronous just like sockd requests. This change paves the way for allowing ifch to notify the core ndhc about failures. It would be far too difficult to reason about the state machine if the requests to ifch were asynchronous. Currently ndhc assumes that ifch requests never fail, but this is not always true because of eg, rfkill.	2015-02-14 16:49:50 -05:00
Nicholas J. Kain	61a48b0fb6	Fix the rfkill waiting.	2015-02-14 15:33:02 -05:00
Nicholas J. Kain	04840c261d	Fix some c99 struct initializer uninitialized member warnings that clang detects and GCC misses.	2015-02-13 23:25:42 -05:00
Nicholas J. Kain	702d8b0c5b	Mark pointer arguments that cannot ever be null as [static 1]. Also constify some cases, too.	2015-02-13 23:14:08 -05:00
Nicholas J. Kain	911d4cc58e	Fix the dhcp state bootstrapping when rfkill is set #3.	2015-02-13 19:08:50 -05:00
Nicholas J. Kain	2e679ed491	Fix the dhcp state bootstrapping when rfkill is set #2.	2015-02-13 18:35:44 -05:00
Nicholas J. Kain	a8af406307	Fix the dhcp state bootstrapping when rfkill is set.	2015-02-13 18:07:14 -05:00
Nicholas J. Kain	79a97131bc	Handle the case where the rfkill is set when ndhc is initializing.	2015-02-13 17:50:24 -05:00
Nicholas J. Kain	cf81573082	Fix a dumb typo in the previous commit.	2015-02-13 16:56:56 -05:00
Nicholas J. Kain	e3d4d4c1aa	rfkill: Add support for reacting to radio kill switch events. In order for this to work, the correct rfkill index must be specified with the rfkill-idx option. It might be possible to auto-detect the corresponding rfkill-idx option, but I'm not sure if there's a guaranteed mapping between rfkill name and interface name, as it seems that rfkills should represent phy devices and not wlan devices. The rfkill indexes can be found by checking /sys/class/rfkill/rfkill<IDX>.	2015-02-13 16:25:36 -05:00
Nicholas J. Kain	c58a071f52	Update copyright dates.	2015-02-13 01:54:57 -05:00
Nicholas J. Kain	07cbd88049	Just use raw sockets for listening to DHCP requests. A UDP SO_BROADCAST socket was previously used only for receiving RENEWING packets, and it added needless complexity and was somewhat fragile.	2014-04-16 01:00:36 -04:00
Nicholas J. Kain	0884d96d1e	PR_SET_PDEATHSIG is not fully reliable, so instead maintain a pair of AF_UNIX SOCK_STREAM sockets between the master processes and each subprocess, and poll for the HUP event. At the same time, be specific about the events that are checked in epoll when dispatching on an event.	2014-04-15 23:19:24 -04:00
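
Commits abb1b54bfe and 277f0f67c5 describe guarding the conversion of a long long timeout to the int that epoll (and later poll) accepts. The following is a minimal sketch of that kind of guard, not ndhc's actual code: the helper name clamp_timeout_ms and the choice to map every negative value to -1 (an infinite wait) are assumptions made for illustration.

```c
#include <limits.h>

/* Hypothetical helper (not taken from ndhc): clamp a millisecond timeout
 * computed as long long into the int range that poll()/epoll_wait() take.
 * A plain cast of a value above INT_MAX could wrap into a small or
 * negative int and produce a spuriously short or infinite wait. */
static int clamp_timeout_ms(long long t)
{
    if (t > INT_MAX)
        return INT_MAX;   /* guard against overflow */
    if (t < -1)
        return -1;        /* assumption: treat any negative value as "infinite" */
    return (int)t;        /* now known to fit in an int */
}
```

A caller would pass the result directly as the timeout argument, for example poll(fds, nfds, clamp_timeout_ms(next_wakeup_ms)), where next_wakeup_ms is again a hypothetical name.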

1 2

60 Commits
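
Commit 06a541261e mentions replacing signalfd with the "volatile sig_atomic_t set and check approach". A minimal sketch of that general pattern is shown here; it is not ndhc's implementation, and the flag name, the function names, and the use of SIGUSR1 as the renew signal are assumptions for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag: set by the handler, checked by the main loop. */
static volatile sig_atomic_t pending_renew;

/* The handler only stores to a sig_atomic_t, which is async-signal-safe. */
static void on_renew_signal(int signo)
{
    (void)signo;
    pending_renew = 1;
}

/* Install the handler with sigaction(); returns 0 on success, -1 on error.
 * SIGUSR1 as the renew signal is an assumption, not confirmed by the log. */
static int install_renew_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_renew_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from the main loop after poll() returns: reports whether a renew
 * request arrived and clears the flag. */
static int take_pending_renew(void)
{
    if (!pending_renew)
        return 0;
    pending_renew = 0;
    return 1;
}
```

Because a handler installed this way interrupts blocking system calls (they fail with EINTR unless SA_RESTART is set), the main loop gets a chance to check the flag even when no file descriptor became ready.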