The former variable length structure created potential
problems for library users like that referenced below.
We will now parallel the same approach newlib uses for
the configure options --enable-oomem & --with-systemd.
Thus, the --enable-oomem and OOMEM_ENABLE #define have
been eliminated and the --with-systemd option (#define
WITH_SYSTEMD) will hereafter impact one function only.
The proc_t struct itself will now *never* be impacted.
Reference(s):
https://gitlab.com/procps-ng/procps/issues/31
Signed-off-by: Jim Warner <james.warner@comcast.net>
Since support already exists in the newlib branch this
represents an equivalent master branch implementation,
and this commit message is shared with 2 more patches.
Beginning with linux-4.5, the following new fields are
being added under that /proc/<pid>/status pseudo file:
. RssAnon - size of resident anonymous memory
. RssFile - size of resident file mappings
. RssShmem - size of resident shared memory
p.s. Locked resident memory support was also added but
isn't directly related to the kernel 4.5 enhancements.
p.p.s. Archlinux, Debian-stretch and Fedora-23 already
are currently using a 4.5 linux kernel (as of 6/2/16).
Signed-off-by: Jim Warner <james.warner@comcast.net>
In some environments, 100 * nr_active_objs is calculated at first,
and lower 32bit of the result is divided by nr_objs.
If 100 * nr_active_objs > 42949672, %use will be incorrect.
Signed-off-by: Takayuki Nagata <tnagata@redhat.com>
Multiple scanf()s use the GNU-permitted %Lu. This is not supported in
other libraries and isn't to the POSIX specification. The L modifier
is only used for floats in POSIX.
Replacing %Lu with %llu is the same for GNU libc (scanf(3) says as much)
but means other libraries will work fine.
Closes: #19
References:
http://pubs.opengroup.org/onlinepubs/009695399/functions/fscanf.html
wish folks (craig) would use these in their .gitconfig
[core]
whitespace = trailing-space, space-before-tab, blank-at-eof
[apply]
whitespace = warn
Signed-off-by: Jim Warner <james.warner@comcast.net>
Added function procps_linux_version() which used to be an
exported integer instead. Also changed the method of obtaining
the linux version (more correctly the os release) to use a specific
procfs entry. This works for both Linux and FreeBSD.
This patch will bring three of our man pages into line
with the recent refactor of the libprocps wchan logic.
[ and also eliminates more damn eol whitespace which ]
[ snuck in our repo with the commit referenced below ]
Reference(s):
http://www.freelists.org/post/procps/WCHAN,11
commit cf4788c28d
Signed-off-by: Jim Warner <james.warner@comcast.net>
Several Debian based distributions were recently found
to have omitted a kernel configuration option that had
the effect of rendering /proc/#/stat and /proc/#/wchan
useless for providing any 'sleeping in function' info.
That problem also prompted a reevaluation of the whole
approach to wchan matters which had grown increasingly
complex as our library evolved over the last 13 years.
The net result was a decision to rely on /proc/#/wchan
which arrived along with the 2.5 kernel. This then let
us vastly simplify the internal code plus the external
interface which will benefit both the top and ps pgms.
Reference(s):
http://www.freelists.org/post/procps/WCHAN,11https://lkml.org/lkml/2008/11/6/12https://bugs.debian.org/711592
Signed-off-by: Jim Warner <james.warner@comcast.net>
It doesn't make any sense to have the binary version strings
embedded into the library. The version strings are defined
already either in the Makefile or in include/c.h
This commit adds a lxc container name to every proc_t.
If a process is not running in a container, then a '-'
will be provided, making such a field always sortable.
Unlike other proc_t character pointers, lxc containers
will find many duplicate shared values. So rather than
strdup 'em (with a later free required upon reuse), we
try to keep track of those already seen and share that
address among all tasks running within each container.
We rely on the lines in the task's cgroup subdirectory
which may initially seem somewhat unsophisticated. But
the lxc library itself uses a similar approach when it
is called to list active containers. In that case, the
/proc/net/unix directory is parsed for the '/lxc' eye-
catcher, with potential complications from hashed path
and names that are too long (something we don't face).
[ too bad docker abandoned lxc - our commit won't do ]
[ anything for the users of those kind of containers ]
Reference(s):
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1424253https://bugs.launchpad.net/ubuntu/+source/procps/+bug/1424253
Signed-off-by: Jim Warner <james.warner@comcast.net>
Under a lxc container, the /proc/meminfo 'MemFree' and
'MemAvailable' amounts will be equal, unless memory is
being limited via cgroups in which case 'MemAvailable'
could exceed that for 'MemTotal'. And when a container
has been nested, there exist additional memory quirks.
A program might then display used or available amounts
greater than total memory (assuming unsigned honored),
or negative values (should a signed cast be employed).
This anomaly primarily impacted the top and free pgms.
Thus, two simple sanity checks have been introduced to
avoid any illogical kb_main_available or kb_main_used.
( Busybox top & free also display anomalous although )
( different results when running in a lxc container. )
Reference(s):
https://bugzilla.redhat.com/show_bug.cgi?id=1153817
Signed-off-by: Jim Warner <james.warner@comcast.net>
This will be required for subdir-objects, otherwise automake will have
problems with more than one Makefile.am having rules to build the same
files.
Tested that it builds and both `make check` and `make distcheck` work.
Tested `make install` and compared the tree with the one installed
before this commit, both installed the binaries to the same locations.
The binaries are also in the same location in the build tree (for
instance, ps/pscommand is still there.)
Checked the binaries for the correct libraries linked into them. Binary
sizes matched before and after this change.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
This is required for out-of-tree build to work, since many source files
include e.g. proc/*.h which is not under the include/ directory.
Tested that `make distcheck` starts working after this patch.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Let's not report zero for kb_main_available when older
kernels don't have MemAvailable. Instead, if we simply
duplicate the 'free' amount we can avoid all ancillary
problems, such as those involving top's graphing mode.
Reference(s):
http://www.freelists.org/post/procps/kb-main-available-etc,3
Signed-off-by: Jim Warner <james.warner@comcast.net>
This commit just ensures recalculation of some amounts
for iterative processes, like top. It also trades some
repeated runtime calls to sysconf for a one time cost.
Reference(s):
http://www.freelists.org/post/procps/systemd-support-to-library,7
. fall-back calculations
commit b779855cf1
Signed-off-by: Jim Warner <james.warner@comcast.net>
[ plus remove just a little darn trailing whitespace ]
Reference(s):
. systemd migrated to library
commit 9d8ad6419f
. added library documentation
commit a74fb8fade
Signed-off-by: Jim Warner <james.warner@comcast.net>
The stderr message regarding ELF notes appears on some
systems (openSUSE-13.1 for example) but I have not yet
isolated why. Since at startup we go on to determine a
Hertz value the old fashion way, this patch just turns
off the useless message until the cause is understood.
Signed-off-by: Jim Warner <james.warner@comcast.net>
This commit adds support for fallback calculation
of the MemAvailable field if not exported by the
kernel. The MemAvailable field appeared in kernel
3.14, but it's possible to calculate it from other
fields since 2.6.27 (splitLRU changes).
Nowadays the usernames can be 32 characters long
(typically OpenShift usernames use the whole length)
and the old limit was preventing us from processing
them correctly.
The macro change affects the proc_t structure size.
This commit adds a new switch -a/--available that
appends a new column called 'available' to the
output. The column displays an estimation
of how much memory is available for starting
new applications, without swapping. Unlike the data
provided by the 'cached' or 'free' fields, this
field takes into account page cache and also that
not all reclaimable memory slabs will be reclaimed
due to items being in use.
The subtraction was marked as reinforcing the misconception,
that memory in the page cache can be considered free.
The Cached value is not a sum of page cache and tmpfs,
as the tmpfs memory lives in the page cache and therefore
it's an inseparable part of it.
tmpfs has become much more widely used since distributions use it for
/tmp (Fedora 18+). In /proc/meminfo, memory used by tmpfs is accounted
into "Cached" (aka "NR_FILE_PAGES",
http://lxr.free-electrons.com/source/mm/shmem.c#L301 ).
The tools just pass it on, so what top, free and vmstat report as
"cached" is the sum of page cache and tmpfs.
free has the extremely useful "-/+ buffers/cache" output. However, now
that tmpfs is accounted into "cached", those numbers are way off once
you have big files in /tmp.
Fortunately, kernel 2.6.32 introduces "Shmem", which makes tmpfs memory
usage accessible from userspace (
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4b02108ac1b3354a22b0d83c684797692efdc395 ).
This patch substracts Shmem from Cached to get the actual page cache
memory. This makes both issues mentioned above disappear. For older
kernels, Shmem is not available (hence zero) and this patch is no-op.
Additionally:
* Update the man pages of free and vmstat to explain what is happening
* Finally drop "MemShared" from the /proc/meminfo parser, it has been
dead for 10+ years and is only causing confusion ( removed in kernel
2.5.54, see
https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fe04e9451e5a159247cf9f03c615a4273ac0c571 )
vmstat -d or vmstat -p would crash mysteriously under different
circumstances. The problem was eventually tracked down to /sys not
being mounted which meant is_disk() always returned false.
The partition would then be attempted to be linked to a non-existent
disk causing a segfault.
vmstat will now not link to a disk if none exists.
The change in testing will skip those tests when /sys/block doesn't
exist.
Many thanks to Daniel Schepler for his analysis and suggestions.
Bug-Debian: http://bugs.debian.org/736628
Under some circumstances the ksh shell doesn't fork new processes
when executing scripts and the script is interpreted by the
parent process. That makes the execution faster, but it means
ksh needs to reuse the /proc/PID/cmdline for the new script name
and arguments while the file length needs to stay untouched.
The fork is skipped only when the new cmdline is shorter than
the parent's cmdline and the rest of the file is filled
with '\0'. This is perfectly ok until we try to read the cmdline
of such process. As the read_unvectored() function replaces
all zeros with chosen separator, these trailing zeros are replaced
with spaces in case of the ps tool. Consequently it appends
multiple spaces at the end of the arguments string even when these
zeros do not represent any separators and therefore shouldn't
be replaced.
With this commit the read_unvectored() function skips the
replacement of trailing zeros and separates valid content only.
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1057600
While 'invisible' thread subdirectories are accessible
under /proc/ with stat/opendir calls, they have always
been treated as non-existent, as is true with readdir.
This patch trades the /proc/#/ns access convention for
the more proper /proc/#/task/#/ns approach when thread
access is desired. In addition some namespace code has
been simplified and made slightly more efficient given
the calloc nature of proc_t acquisition and its reuse.
Reference(s):
commit a01ee3c0b3
Signed-off-by: Jim Warner <james.warner@comcast.net>
Previously the shared memory column was always zero
for 2.6 series kernels (and later) due to the fact,
that the value was taken from the MemShared entry
that disappeared with 2.6 series kernels.
Later a new Shmem entry appeared in the /proc/meminfo
file and the 'shared' column now displays either
the MemShared or the Shmem value (depending on their
presence - the presence is mutually exclusive).
If none of the two entries is exported by the kernel,
then the column is zero.
An unpleasant thing happened when I comitted the shmem support
for the 'free' tool. We already had a merge request from
Adrian Brzezinski in the queue, doing exactly the same.
As Adrian deserves credits, I'm reverting the change
and re-applying with the next commit in order to make
him a part of the project history.
Previously the shared memory column was always zero
for 2.6 series kernels (and later) due to the fact,
that the value was taken from the MemShared entry
that disappeared with 2.6 series kernels.
Later a new Shmem entry appeared in the /proc/meminfo
file and the 'shared' column now displays either
the MemShared or the Shmem value (depending on their
presence - the presence is mutually exclusive).
If none of the two entries is exported by the kernel,
then the column is zero.
It seems in some cases procs_running field of /proc/stat can contain 0 even if vmstat itself is running. At least this can be reproduced on Linux 3.9.3 compiled with BFS scheduler.
Since getstat() decrements value of procs_running by 1, we get overflow:
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 667732 918996 57376 911260 21 30 36 40 98 45 14 82 4 1
4294967295 0 667728 916716 57376 911264 8 0 8 0 1958 3733 28 7 65 1
0 0 667700 915996 57376 911416 24 0 152 0 1735 3600 23 5 71 1
4294967295 0 667700 915872 57376 911392 0 0 0 0 1528 3165 21 4 76 0
Function getbtime() currently makes the assumption that btime==0 equals
btime not being present in /proc/stat. This is not quite accurate, as
timestamp 0 is, in fact, also a valid time (Epoch), and /proc/stat may
report it as such.
We introduce a flag to indicate whether btime was found in /proc/stat.
In this way, btime==0 becomes a valid case, provided /proc/stat
actually reports this as the boot time.
procps can still detect the case of btime actually not being reported
by the kernel.
Signed-off-by: Markus Mayer <mmayer@broadcom.com>