This commit adds support for fallback calculation
of the MemAvailable field if not exported by the
kernel. The MemAvailable field appeared in kernel
3.14, but it's possible to calculate it from other
fields since 2.6.27 (splitLRU changes).
Nowadays the usernames can be 32 characters long
(typically OpenShift usernames use the whole length)
and the old limit was preventing us from processing
them correctly.
The macro change affects the proc_t structure size.
This commit adds a new switch -a/--available that
appends a new column called 'available' to the
output. The column displays an estimation
of how much memory is available for starting
new applications, without swapping. Unlike the data
provided by the 'cached' or 'free' fields, this
field takes into account page cache and also that
not all reclaimable memory slabs will be reclaimed
due to items being in use.
The subtraction was marked as reinforcing the misconception,
that memory in the page cache can be considered free.
The Cached value is not a sum of page cache and tmpfs,
as the tmpfs memory lives in the page cache and therefore
it's an inseparable part of it.
tmpfs has become much more widely used since distributions use it for
/tmp (Fedora 18+). In /proc/meminfo, memory used by tmpfs is accounted
into "Cached" (aka "NR_FILE_PAGES",
http://lxr.free-electrons.com/source/mm/shmem.c#L301 ).
The tools just pass it on, so what top, free and vmstat report as
"cached" is the sum of page cache and tmpfs.
free has the extremely useful "-/+ buffers/cache" output. However, now
that tmpfs is accounted into "cached", those numbers are way off once
you have big files in /tmp.
Fortunately, kernel 2.6.32 introduces "Shmem", which makes tmpfs memory
usage accessible from userspace (
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4b02108ac1b3354a22b0d83c684797692efdc395 ).
This patch substracts Shmem from Cached to get the actual page cache
memory. This makes both issues mentioned above disappear. For older
kernels, Shmem is not available (hence zero) and this patch is no-op.
Additionally:
* Update the man pages of free and vmstat to explain what is happening
* Finally drop "MemShared" from the /proc/meminfo parser, it has been
dead for 10+ years and is only causing confusion ( removed in kernel
2.5.54, see
https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fe04e9451e5a159247cf9f03c615a4273ac0c571 )
vmstat -d or vmstat -p would crash mysteriously under different
circumstances. The problem was eventually tracked down to /sys not
being mounted which meant is_disk() always returned false.
The partition would then be attempted to be linked to a non-existent
disk causing a segfault.
vmstat will now not link to a disk if none exists.
The change in testing will skip those tests when /sys/block doesn't
exist.
Many thanks to Daniel Schepler for his analysis and suggestions.
Bug-Debian: http://bugs.debian.org/736628
Under some circumstances the ksh shell doesn't fork new processes
when executing scripts and the script is interpreted by the
parent process. That makes the execution faster, but it means
ksh needs to reuse the /proc/PID/cmdline for the new script name
and arguments while the file length needs to stay untouched.
The fork is skipped only when the new cmdline is shorter than
the parent's cmdline and the rest of the file is filled
with '\0'. This is perfectly ok until we try to read the cmdline
of such process. As the read_unvectored() function replaces
all zeros with chosen separator, these trailing zeros are replaced
with spaces in case of the ps tool. Consequently it appends
multiple spaces at the end of the arguments string even when these
zeros do not represent any separators and therefore shouldn't
be replaced.
With this commit the read_unvectored() function skips the
replacement of trailing zeros and separates valid content only.
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1057600
While 'invisible' thread subdirectories are accessible
under /proc/ with stat/opendir calls, they have always
been treated as non-existent, as is true with readdir.
This patch trades the /proc/#/ns access convention for
the more proper /proc/#/task/#/ns approach when thread
access is desired. In addition some namespace code has
been simplified and made slightly more efficient given
the calloc nature of proc_t acquisition and its reuse.
Reference(s):
commit a01ee3c0b3
Signed-off-by: Jim Warner <james.warner@comcast.net>
Previously the shared memory column was always zero
for 2.6 series kernels (and later) due to the fact,
that the value was taken from the MemShared entry
that disappeared with 2.6 series kernels.
Later a new Shmem entry appeared in the /proc/meminfo
file and the 'shared' column now displays either
the MemShared or the Shmem value (depending on their
presence - the presence is mutually exclusive).
If none of the two entries is exported by the kernel,
then the column is zero.
An unpleasant thing happened when I comitted the shmem support
for the 'free' tool. We already had a merge request from
Adrian Brzezinski in the queue, doing exactly the same.
As Adrian deserves credits, I'm reverting the change
and re-applying with the next commit in order to make
him a part of the project history.
Previously the shared memory column was always zero
for 2.6 series kernels (and later) due to the fact,
that the value was taken from the MemShared entry
that disappeared with 2.6 series kernels.
Later a new Shmem entry appeared in the /proc/meminfo
file and the 'shared' column now displays either
the MemShared or the Shmem value (depending on their
presence - the presence is mutually exclusive).
If none of the two entries is exported by the kernel,
then the column is zero.
It seems in some cases procs_running field of /proc/stat can contain 0 even if vmstat itself is running. At least this can be reproduced on Linux 3.9.3 compiled with BFS scheduler.
Since getstat() decrements value of procs_running by 1, we get overflow:
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 667732 918996 57376 911260 21 30 36 40 98 45 14 82 4 1
4294967295 0 667728 916716 57376 911264 8 0 8 0 1958 3733 28 7 65 1
0 0 667700 915996 57376 911416 24 0 152 0 1735 3600 23 5 71 1
4294967295 0 667700 915872 57376 911392 0 0 0 0 1528 3165 21 4 76 0
Function getbtime() currently makes the assumption that btime==0 equals
btime not being present in /proc/stat. This is not quite accurate, as
timestamp 0 is, in fact, also a valid time (Epoch), and /proc/stat may
report it as such.
We introduce a flag to indicate whether btime was found in /proc/stat.
In this way, btime==0 becomes a valid case, provided /proc/stat
actually reports this as the boot time.
procps can still detect the case of btime actually not being reported
by the kernel.
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
One recent patch to dynamic buffer management involved
over-allocating the buffer increase to lessen calls to
xrealloc. That was successful, but the actual increase
amount did not attempt to optimize size or alignments.
With this commit, we'll copy an approach recently used
by the top program and round up buffer sizes to 1 KiB.
More importantly, while buffers are quickly reaching a
KiB optimum multiple, no memcpy will ever be employed!
To illustrate just how effective top's algorithm would
be, just change the initial and subsequent allocations
from the current 1024 bytes to just a single byte then
add an fprintf. Those one byte reallocations while on
the way to optimum buffer size will be a one-time cost
and won't represent any recurring performance penalty.
( gosh, that top program *must be* one fart smeller, )
( or was that a smart feller, i can't remember which )
Reference)s):
commit 6d605f521c
commit a45dace4b8
Signed-off-by: Jim Warner <james.warner@comcast.net>
Each process in Linux has a /proc/<pid>/ns directory which contains
symbolic links to pipes that identify which namespaces that process
belongs to. This patch adds support for ps to display that information
optionally.
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
An earlier commit attempted to cleanse our environment
of all useless trailing whitespace. But the effort did
not catch 'empty' lines with a single space before ^J.
This commit hopefully finishes off the earlier effort.
In the meantime, let's pray that contributors' editors
are configured so that such wasted crap is disallowed!
Reference(s):
commit fe75e26ab6
Signed-off-by: Jim Warner <james.warner@comcast.net>
When utility buffers were introduced for file2str read
requests, a subtle change was inadvertently introduced
such that a read of zero no longer returns a -1 value.
This commit ensures that zero bytes read returns a -1.
And although the solution differs from a merge request
submitted by sergey.senozhatsky@gmail.com, a thank you
is offered for revealing this potential abend problem.
References(s):
commit a45dace4b8http://gitorious.org/procps/procps/merge_requests/11
Signed-off-by: Jim Warner <james.warner@comcast.net>
Signed-off-by: Craig Small <csmall@enc.com.au>
When dynamic buffers were recently introduced for read
of the status, stat and statm subdirectories one extra
call to read() was required for end-of-file detection.
This patch avoids most all such extra calls to read().
Additionally, the frequency of memory reallocations is
reduced by overallocating each increase more than 25%.
Reference)s):
commit a45dace4b8
Signed-off-by: Jim Warner <james.warner@comcast.net>
Signed-off-by: Craig Small <csmall@enc.com.au>
Internal changes to libproc means the revision number
is incremented. This does not mean an ABI or API change has
occured, we just do the stuff under the covers better or in this
case reduce the compile warnings mainly.
See Jim, I do read the commit messages :)
readproc.c: In function 'stat2proc' :
readproc.c:516: warning: use of assignment suppression and length modifier together in gnu_scanf format
readproc.c:516: warning: use of assignment suppression and length modifier together in gnu_scanf format
Signed-off-by: Gilles Espinasse <g.esp@free.fr>
A recent Debian bug report, dealing with release 3.2.8
and its even more restrictive buffer sizes (1024) used
in stat, statm and status reads via file2str calls, is
a reminder of what could yet happen to procps-ng. Size
needs are determined by kernel evolution and/or config
options so that bug could resurface even though buffer
size is currently 4 times the old procps-3.2.8 limits.
Those sizes were raised from 1024 to 4096 bytes in the
patch submitted by Eric Dumazet, and referenced below.
This patch makes libprocps immune to future changes in
the amount of stuff that is ultimately found in a proc
'stat', 'statm' or 'status' subdirectory. We now trade
the former static buffer of 4096 bytes for dynamically
allocated buffers whose size can be increased by need.
Even though this change is solely an internal one, and
in no way directly affects the API or the ABI, libtool
suggests that the LIBprocps_REVISION be raised. I hope
Craig remembers to do that just before a next release.
We don't want a repeat of the procps-ng-3.3.4 boo-boo,
but with no API/ABI impact that probably can't happen.
p.s. A big thanks to Jaromir Capik <jcapik@redhat.com>
who reviewed my original version and, of course, found
some of my trademark illogic + unnecessary code. After
his coaxing, he helped make this a much better commit.
Reference(s):
. procps-3.2.8
http://bugs.debian.org/702965
. allow large list of groups
commit 7933435584
Signed-off-by: Jim Warner <james.warner@comcast.net>
Reviewed by: Jaromir Capik <jcapik@redhat.com>
The entire tree's polluted with inappropriate trailing
whitespace. This commit rids our environment of all of
those useless keystrokes. Unfortunately, it sure ain't
a permanent solution and requires every contributor to
instruct their editor(s) to prevent or eliminate them.
Plus it's strongly recommended we all insert something
like what's shown below to our '.gitconfig' file so as
to provide at least some warnings when we try to apply
any patches (git am) that do contain the #@!%& things!
References(s):
~/.gitconfig excerpt ---------------------------------
[core]
whitespace = trailing-space, space-before-tab, blank-at-eof
[apply]
whitespace = warn
--------------------------------- ~/.gitconfig excerpt
Signed-off-by: Jim Warner <james.warner@comcast.net>
freeproc was missing from the libproc API which meant while you could
allocate proc structures, you couldn't free them!
Bug-Debian: http://bugs.debian.org/681653
Fixes error which did not happen always. Changes of being affected by
the bug where greater the more there where pids defined as pmap argument.
The debian bug referral can almost certainly reproduce the problem,
especially when tried multiple times in row.
pmap: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr)
(((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct
malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >=
(unsigned long)((((__builtin_offsetof (struct malloc_chunk,
fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
0)' failed.
Reported-by: lee <lee@yun.yagibdah.de>
Bug-Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=688180
Signed-off-by: Sami Kerola <kerolasa@iki.fi>
Current linux kernels output no more than 32 groups
in /proc/{pid}/status.
Plan is to increase this limit.
This patch allows ps to not core dump if the buffer used to read status
file was too small.
# ps aux
Signal 11 (SEGV) caught by ps (procps-ng version 3.3.3).
ps:display.c:59: please report this bug
Also increases the size of the buffer from 1024 to 4096, since even with
32 groups we are close to the limit.
cat /proc/12731/status | wc
39 128 961
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
When the maj_delta and min_delta fields were added to
the proc_t, they necessitated some compiler generated
padding bytes.
With this slight reordering, those padding bytes are
no longer generated. And since the original commit
already broke the library ABI, now is an opportune
time to correct that misalignment.
Reference:
commit 7753bd1004
Signed-off-by: Jim Warner <james.warner@comcast.net>
Both options provide more information about a process using -X and -XX
flags. The data comes from /proc/PID/smaps so it may vary.
Signed-off-by: Craig Small <csmall@enc.com.au>
In preparation for top scrollable environment display,
the new flag PROC_EDITENVRCVT was added to mirror the
existing single vector string handling for cgroup and
cmdline.
Signed-off-by: Jim Warner <james.warner@comcast.net>
The control group hierarchies for any particular task
could conceivably grow quite large. However, the
library might impose an arbitrary limit of 1024 bytes
via fill_cgroup_cvt.
Two utility buffers of 128 KiB each were already
available for command line use. This commit simply
trades the smaller 1024 byte stack based buffers for
those much larger existing ones. Thus, truncation
can be avoided with no additional run-time costs.
Signed-off-by: Jim Warner <james.warner@comcast.net>
Some inconsistencies have emerged during development
of support for these relatively new proc_t fields.
For example, a PROC_FILLCGROUP flag (via file2strvec)
could return NULL in cgroup whereas PROC_EDITCGRPCVT
(via fill_cgroup_cvt) *almost* guaranteed a return
address (as is true for PROC_EDITCMDLCVT and cmdline).
But even PROC_EDITCGRPCVT could return NULL if the
kernel version was less than 2.6.24. Then with NULL
ps would display a "-" while top would show "n/a".
And while unlikely, with the PROC_FILLSTATUS flag (via
status2proc) a NULL supgid address was theoretically
possible and both ps and top would then show "n/a".
This commit standardizes the following usage:
. PROC_FILLSTATUS (via status2proc)
guarantees a valid supgid address
representing either a true comma
delimited list or "-"
. PROC_FILLCGROUP plus
PROC_EDITCGRPCVT (via fill_cgroup_cvt)
guarantees a cgroup single vector
representing either a true control
group hierarchy or "-"
And as was true before, the following remains true:
PROC_FILLCOM or
PROC_FILLARG (via file2strvec)
may return a NULL cmdline pointer
. PROC_FILLCGROUP (via file2strvec)
may return a NULL cgroup pointer
. PROC_FILLCOM or
PROC_FILLARG plus
PROC_EDITCMDLCVT (via fill_cmdline_cvt)
guarantees a cmdline single vector
representing either a true command
line or a bracketed program name
. PROC_FILLSTATUS plus
PROC_FILLSUPGRP (via supgrps_from_supgids)
guarantees a valid supgrp address
representing either a true comma
delimited list or "-"
Signed-off-by: Jim Warner <james.warner@comcast.net>
There soon will be slab types per cgroup meaning the name of the slab
will have the cgroup name in parathensis after the slab name. This
minor change increases the slab name size to cater for this.
Signed-off-by: Craig Small <csmall@enc.com.au>
Fix the build where it seems a code fix for Linux was likely untested
on other systems.
Define SCHED_BATCH in test-schedbatch, for systems that don't have it;
the corresponding RH BZ#741090 patch used the magic value 3 in output.c
anyway.
Bug-Debian: http://bugs.debian.org/677055
The problem is that in ./proc/sysinfo.c uptime(), it is not
considered that the "savelocale" string is overwritten by the
subsequent call to setlocale(). Hence restoring the locale later on
won't work this way. "savelocale" ought to be a copy of the string
pointed to by setlocale()'s return-value.
Bug-Redhat: https://bugzilla.redhat.com/show_bug.cgi?id=548711
Backported-by: Sami Kerola <kerolasa@iki.fi>
There have been some internal changes to the library, so the revision
will be incremented for this release. There is no ABI or API changes
so its no real impact.