In production we've had several incidents over the years where a process
has a signal handler registered for SIGHUP or one of the SIGUSR signals
which can be used to signal a request to reload configs, rotate log
files, and the like. While this may seem harmless enough, what we've
seen happen repeatedly is something like the following:
1. A process is using SIGHUP/SIGUSR[12] to request some
application-handled state change -- reloading configs, rotating a log
file, etc;
2. This kind of request is deprecated and removed, so the signal handler
is removed. However, a site where the signal might be sent from is
missed (often logrotate or a service manager);
3. Because the default disposition of these signals is to terminate the
process, sooner or later these applications are going to be sent SIGHUP
or similar and end up unexpectedly killed.
I know for a fact that we're not the only organisation experiencing
this: in general, signal use is pretty tricky to reason about and safely
remove because of the fairly aggressive SIG_DFL behaviour for some
common signals, especially for SIGHUP which has a particularly ambiguous
meaning. In a large, highly interconnected codebase in particular,
reasoning about signal interactions between system configuration and
applications can be highly complex, and it's inevitable that on occasion
a callsite will be missed.
In some cases the right call to avoid this will be to migrate services
towards other forms of IPC for this purpose, but inevitably there will
be some services which must continue using signals, so we need a safe
way to support them.
This patch adds support for the -H/--require-handler flag, which matches
on processes with a userspace handler present for the signal being sent.
With this flag we can enforce that all SIGHUP reload cases and SIGUSR
equivalents use --require-handler. This effectively mitigates the case
we've seen time and time again where SIGHUP is used to rotate log files
or reload configs, but the sending site is mistakenly left present after
the removal of the signal handler, resulting in unintended termination
of the process.
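As a rough illustration of what "a userspace handler present" means, the
sketch below checks the SigCgt bitmask in /proc/<pid>/status, where bit
(signo - 1) is set when the process catches that signal. This is not the
code from this patch; has_signal_handler() and reload_config() are
hypothetical names used only for this example.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid(), for the self-check described below */

/* Hypothetical application-side handler: only records the request. */
static volatile sig_atomic_t reload_requested;

static void reload_config(int signo)
{
    (void)signo;
    reload_requested = 1;
}

/*
 * Returns 1 if `pid` has a userspace handler for `signo`, 0 if it does
 * not, and -1 on error. Assumes a Linux /proc and a 64-bit SigCgt mask
 * (architectures with more than 64 signals are not handled here).
 */
static int has_signal_handler(pid_t pid, int signo)
{
    char path[64], line[256];
    int result = -1;
    FILE *fp;

    if (signo < 1 || signo > 64)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;

    while (fgets(line, sizeof(line), fp)) {
        if (strncmp(line, "SigCgt:", 7) == 0) {
            /* The field is a hex bitmask; strtoull skips the leading tab. */
            unsigned long long mask = strtoull(line + 7, NULL, 16);
            result = (int)((mask >> (signo - 1)) & 1ULL);
            break;
        }
    }
    fclose(fp);
    return result;
}
```

A process that has called signal(SIGHUP, reload_config) should see
has_signal_handler(getpid(), SIGHUP) return 1, and 0 again once the
handler is reset to SIG_DFL, which is the situation --require-handler
is meant to detect before the signal is sent.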
Signed-off-by: Chris Down <chris@chrisdown.name>
When the p/e-cores support (via the '5' key) was added
in the patch referenced below, I intentionally omitted
that key from the top primary help screen. This seemed
appropriate since it only applied to select Intel cpus
and, besides, that screen was getting kind of crowded.
[ it remains an objective to fit on an 80x24 terminal ]
Upon reflection, I found a way to squeeze it into that
help screen and have decided to include it. Hopefully
its presence will encourage use of top's new provision
on any Intel platforms that distinguish between cores.
Reference(s):
. Sep, 2022 - exploit p/e-cores provision
commit 00f5c74b1b
Signed-off-by: Jim Warner <james.warner@comcast.net>
In the commits referenced below special code was added
to make the bottom window sticky and fix the bug after
'Cap_nl_clreos' was traded for the 'Cap_clr_eol' loop.
However, there's always major overhead associated with
interacting with a terminal. So we'll only abandon the
single 'Cap_nl_clreos' putp in favor of repeated calls
with 'Cap_clr_eol' when a bottom window isn't present.
Reference(s):
. May, 2022 - bottom window batch bug fix
commit 793f3e85ae
. May, 2022 - bottom window made sticky
commit 0f2a755b0b
Signed-off-by: Jim Warner <james.warner@comcast.net>
Please, do not look at the actual changes made by this
commit. Trust me they will vastly improve performance.
Signed-off-by: Jim Warner <james.warner@comcast.net>
vmstat <n> would update most fields, but the memory statistics
were only fetched the first time.
References:
https://bugs.debian.org/1027963
Signed-off-by: Craig Small <csmall@dropbear.xyz>
In procps 3.3.17 the c option changed the command/args field
to cmd, but this was removed as part of newlib.
The functionality is back in, with a test case.
References:
https://bugs.debian.org/1026326
Signed-off-by: Craig Small <csmall@dropbear.xyz>
The man page said ps cannot show changes to comm, such as when you
use prctl(). In fact, ps can see this. The args field may not change
because it's due to the path of the executable, but comm can.
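As a quick illustration of comm changing at run time, the sketch below
renames the calling thread with prctl(PR_SET_NAME) and reads the result
back from /proc/self/comm. The helper name rename_and_read_comm() is
hypothetical, and the sketch assumes it is called from the main thread
of a Linux process.

```c
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Rename the calling thread, then read the name back from procfs.
 * The kernel keeps at most 15 bytes of the name plus a NUL terminator.
 * Returns 0 on success and -1 on failure.
 */
static int rename_and_read_comm(const char *new_name, char *buf, size_t len)
{
    FILE *fp;

    if (len < 2 || prctl(PR_SET_NAME, (unsigned long)new_name) != 0)
        return -1;

    /* /proc/self/comm reports the thread group leader's comm value. */
    fp = fopen("/proc/self/comm", "r");
    if (!fp)
        return -1;
    if (!fgets(buf, (int)len, fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    return 0;
}
```

After such a call the new name is what the comm field reports, while
the args field is unaffected by PR_SET_NAME.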
The field comm no longer shows defunct for zombie processes; use the
state field for this, as the marker could be obscured if comm is not
the last column anyhow.
Signed-off-by: Craig Small <csmall@dropbear.xyz>
When the skill program was ported to the new API the code to filter
on PID, used by the -p option, was missed. It is now restored.
References:
https://bugs.debian.org/1025915
While ps used the correct type for PIDS_VM_RSS, the test
did not. For some reason this only appeared to be an issue
for s390x.
References:
https://bugs.debian.org/1025495
Signed-off-by: Craig Small <csmall@dropbear.xyz>
Put the man-po pot file under version control like its po/*.pot sibling.
Makefile now automatically generates the list of man pages, as they are
all in a single directory. There were some missing!
The pot file target now depends on the untranslated man pages.
When downloading from the translation project, run po4a to sync the new
po files correctly.
Downloaded man po files from the translation project and synced.
Thanks to @gorean for the info.
References:
procps-ng/procps#258
Signed-off-by: Craig Small <csmall@dropbear.xyz>
Commit c8384e682c ("pgrep: add pwait") changed from the old i_am_pkill
logic, but mistakenly missed a break in the pkill case. This results in
showing -e/--echo twice when running `pkill -h'.
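The effect of the missing break can be sketched in a few lines of C.
This is a simplified model, not the actual usage code: the enum and the
two function names are hypothetical, and it assumes the pwait case
(which has its own -e/--echo line) directly follows the pkill case.

```c
enum prog_mode { PGREP, PKILL, PWAIT };

/* Number of "-e, --echo" help lines emitted before the fix. */
static int echo_lines_before_fix(enum prog_mode mode)
{
    int lines = 0;

    switch (mode) {
    case PGREP:
        break;
    case PKILL:
        lines++;    /* pkill's own -e, --echo line */
        /* no break here, so control continues into the PWAIT case */
    case PWAIT:
        lines++;    /* pwait's -e, --echo line */
        break;
    }
    return lines;
}

/* The same switch with the break restored in the pkill case. */
static int echo_lines_after_fix(enum prog_mode mode)
{
    int lines = 0;

    switch (mode) {
    case PGREP:
        break;
    case PKILL:
        lines++;
        break;
    case PWAIT:
        lines++;
        break;
    }
    return lines;
}
```

In this model only the pkill path is affected: pwait still emits a
single line either way.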
Signed-off-by: Chris Down <chris@chrisdown.name>
Replace AC_CHECK_FUNC with AC_CHECK_FUNCS; otherwise HAVE_PIDFD_OPEN will
never be defined, resulting in the following build failure if pidfd_open
is available but __NR_pidfd_open is not:
pgrep.c: In function 'pidfd_open':
pgrep.c:748:17: error: '__NR_pidfd_open' undeclared (first use in this function); did you mean 'pidfd_open'?
  748 |  return syscall(__NR_pidfd_open, pid, flags);
      |                 ^~~~~~~~~~~~~~~
      |                 pidfd_open
This build failure has been present since the addition of pwait in
version 3.3.17 and commit c8384e682c.
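The guard that depends on HAVE_PIDFD_OPEN can be sketched roughly as
below. This is an illustration under assumptions, not the code in
pgrep.c: compat_pidfd_open() is a hypothetical name, HAVE_PIDFD_OPEN is
assumed to come from the config.h that AC_CHECK_FUNCS generates, and
<sys/pidfd.h> is assumed to be where the C library declares the wrapper.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifdef HAVE_PIDFD_OPEN
/* The C library provides the wrapper, so no syscall number is needed. */
#include <sys/pidfd.h>
#define compat_pidfd_open(pid, flags) pidfd_open((pid), (flags))
#else
/* Fallback: only compiled when configure did not find pidfd_open(). */
static int compat_pidfd_open(pid_t pid, unsigned int flags)
{
#ifdef __NR_pidfd_open
    return (int)syscall(__NR_pidfd_open, pid, flags);
#else
    /* Neither the wrapper nor the syscall number is available. */
    (void)pid;
    (void)flags;
    errno = ENOSYS;
    return -1;
#endif
}
#endif
```

If HAVE_PIDFD_OPEN is never defined, the fallback branch is always
compiled, and without the inner __NR_pidfd_open check it fails exactly
as shown in the error above on systems whose kernel headers lack that
syscall number.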
Fixes:
- http://autobuild.buildroot.org/results/f23a5156e641b2ebdd673973dec0f9c87760c688
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Thanks to @kabe-gl for this patch.
The w command shows ????? in the LOGIN@ column when compiled in a 32-bit
environment with -D_TIME_BITS=64.
References:
#256
Signed-off-by: Craig Small <csmall@dropbear.xyz>
Just as our library was made responsive to a potential
missing 'core id', the top program should also change.
That's because he has his own PRETENDECORE #define and
if that was activated on a platform without 'core id',
the 'CpP' notations would have otherwise been omitted.
Reference(s):
. Oct, 2022 - library fix for missing 'core id'
commit b89e3230b2
Signed-off-by: Jim Warner <james.warner@comcast.net>
Tracking what we do to the library so the N:N:N version strings are
updated. This is just a NEWS item for previous commit.
References:
commit b89e3230b2
______________________________ original newlib message
----------------------------------- ( minus git hash )
When long command line options were introduced, in the
patch shown below, the string associated with the enum
'WRONG_switch_fmt' became obsolete. However, that enum
and its string were never removed. Well, now they are.
Reference(s):
. Sep, 2021 - getopt and long cmdline options
commit ........................................
Signed-off-by: Jim Warner <james.warner@comcast.net>
When support for graphs was refactored, in that commit
referenced below, the logic for our 'MEMGRAPH_OLD' was
lost while the #define itself remained in the .h file.
Faced with deleting the #define or restoring the logic
I chose the latter. Thus, if one wanted to be reminded
how overstated 'used' memory once was, it can be done.
Reference(s):
. Sep, 2022 - refactored graph support
commit 2d5b51d1a2
Signed-off-by: Jim Warner <james.warner@comcast.net>
When long command line options were introduced, in the
patch shown below, the string associated with the enum
'WRONG_switch_fmt' became obsolete. However, that enum
and its string were never removed. Well, now they are.
Reference(s):
. Sep, 2021 - getopt and long cmdline options
commit c91b371485
Signed-off-by: Jim Warner <james.warner@comcast.net>
In that commit referenced below, I removed the command
line help text from any translation so the TP wouldn't
delay our 4.0.1 release any further. In looking to the
future, when we might be able to reverse that, I found
gettext tools blocking use of the compile conditional.
They are too primitive for the original approach so we
must modify that exclusion mechanism hack accordingly.
____________________________excerpted program comments
The provision excluding some strings is intended to be
used very sparingly. It exists in case we collide with
some translation project person in a position to delay
a future release over his or her personal preferences.
If it's ever enabled, it will produce a fatal compiler
error as our only option since those gettext tools are
far too primitive to be influenced with a conditional.
They always ignore a '_X()' macro no matter its state.
Reference(s):
commit 8a368bfb05
Signed-off-by: Jim Warner <james.warner@comcast.net>
The provision excluding some strings is intended to be
used very sparingly. It exists in case we collide with
some translation project person in a position to delay
a future release over his or her personal preferences.
(it's currently used only on v4.0.1 command line help)
Signed-off-by: Jim Warner <james.warner@comcast.net>
Prior to this commit, when the '5' key was struck, top
would check for the presence of e-cores just one time.
That meant if some cpu was brought online, and it in
turn exposed a new e-core after top had started, users
needed a top restart to activate the new '5' feature.
So, now we'll check for any e-cores with each '5' key.
Signed-off-by: Jim Warner <james.warner@comcast.net>
Wow, after this we'll eliminate one 'jmp' instruction!
[ plus we can also save a single precious whitespace ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
The commit that changed configure.ac was supposed to check for when
someone removes ncurses using the flag --without-ncurses.
Unfortunately the change didn't check whether the user was specifying
--without or --with, meaning that if they did use --with-ncurses the
configure script would error out.
Signed-off-by: Craig Small <csmall@dropbear.xyz>
References:
commit 8128641814
procps-ng/procps#251