library: add item origin (as comments) to header files

A lack of documentation seems to be the major obstacle
to releasing this new library. So, in an effort to get
the ball rolling again, this patch adds the origins of
each item as a comment to six of the new header files.

However, before reviewing how such changes may benefit
that documentation objective, it seemed appropriate to
first reflect on newlib's background & current status.

___________________________________________ BACKGROUND
Discussions about and work on a new library began back
in July 2012 but quickly died. After a lull of 2 years
those discussions were resumed in August 2014 but soon
died also (and no code survived the gitorious demise).

With those early discussions, the recommended approach
was to encapsulate all of the libprocps data offerings
in individual functions. When it came to extensibility
it was suggested we should rely on symbol versioning.

Unfortunately that approach would have made for a huge
Application Programming Interface virtually impossible
to master or even document. And, runtime call overhead
would have been substantial for ps and especially top.

So, an alternative design was sought but there were no
new suggestions/contributions via freelists or gitlab.
Thus, in spite of a lack of library design experience,
the procps-ng team (Craig & Jim) set out to develop an
alternative API, more concise and with lower overhead.

Reference(s):
. 07/01/2012, begin library design discussion
https://www.freelists.org/post/procps/Old-library-calls
. 08/12/2014, revival of library design discussion
https://www.freelists.org/post/procps/libprocs-redesign

_____________________________________ DESIGN EVOLUTION
Our newlib branch first appeared on June 14, 2015. And
our current API actually represents the 4th generation
during the past 3 years of evolution. First, there was
a basic 'new', 'get' and 'unref' approach, using enums
to minimize the proliferation of 'get' function calls.

Then, in anticipation of other programs like ps, where
multiple fields times multiple processes would greatly
increase the number of 'get' function calls, a concept
of 'chains' was introduced. This became generation #2.

Such 'chains' proved unnecessarily complex so 'stacks'
replaced them. This was considered the 3rd generation,
but too many implementation details were still exposed
requiring those users to 'alloc', 'read', 'fill', etc.

Finally, a 4th generation emerged representing several
refinements to standardize and minimize those exported
functions, thus hiding all implementation details from
the users. Lastly, handling of 'errno' was normalized.

Reference(s):
. 06/14/2015, revival of new API discussion
https://www.freelists.org/post/procps/The-library-API-again
. 06/24/2015, birth of the newlib branch
https://www.freelists.org/post/procps/new-library
. 06/29/2015, 2nd generation introduced 'chains'
https://www.freelists.org/post/procps/new-library,8
. 07/22/2015, 3rd generation introduced 'stacks'
https://www.freelists.org/post/procps/newlib-stacks-vs-chains
. 06/18/2016, 4th generation refinements begin
https://www.freelists.org/post/procps/newlib-generation-35
. 11/10/2017, 4th generation standardized 'errno'
https://www.freelists.org/post/procps/some-more-master-newlib-stuff

_______________________________________ CURRENT DESIGN
Central to this new design is a simple 'result' struct
reflecting an item plus its value (thanks to a union).
As a user option, these item structures can be grouped
into 'stacks', yielding many results with just 1 call.
Such a 'stack' can be seen as a variable length record
whose content/order is determined solely by the users.

Within that 'result' structure, the union has standard
C language types so there is never a doubt how a value
should be used in a printf statement. Given that linux
requires a least a 32-bit platform the only difference
in capacity surrounds 'long' integers. And, where such
types might be used, the 32-bit maximums are adequate.

The items themselves are simply enumerators defined in
the respective header files. A user can name any items
of interest then the library magically provides result
structure(s). The approach was proven to be extensible
without breaking the ABI (in commit referenced below).
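
As a rough illustration of the 'result' and 'stack' ideas described
above, here is a minimal sketch. Every name in it (demo_item,
demo_result, demo_stack, demo_format, demo_stack_print) is hypothetical
and chosen for this example only; the real enumerators and structures
are those defined in the library headers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical items; each one maps to exactly one union member. */
enum demo_item {
    DEMO_PID,      /* s_int  */
    DEMO_VSIZE,    /* ul_int */
    DEMO_CMD       /* str    */
};

/* One item plus its value; the union holds only standard C types. */
struct demo_result {
    enum demo_item item;
    union {
        signed int    s_int;
        unsigned long ul_int;
        char         *str;
    } result;
};

/* A 'stack' acts as a variable length record whose content and
 * order are chosen by the user. */
struct demo_stack {
    int count;
    struct demo_result *head;
};

/* Format one result, picking the printf conversion from the item's type.
 * Returns the snprintf result, or -1 for an unknown item. */
int demo_format(const struct demo_result *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    switch (r->item) {
    case DEMO_PID:   return snprintf(buf, len, "%d", r->result.s_int);
    case DEMO_VSIZE: return snprintf(buf, len, "%lu", r->result.ul_int);
    case DEMO_CMD:   return snprintf(buf, len, "%s", r->result.str);
    }
    return -1;
}

/* Print every result in a stack, in the order the user requested. */
void demo_stack_print(const struct demo_stack *s, FILE *out)
{
    char buf[64];
    for (int i = 0; i < s->count; i++)
        if (demo_format(&s->head[i], buf, sizeof buf) >= 0)
            fprintf(out, "%s%s", buf, i + 1 < s->count ? " " : "\n");
}
```

Because each item implies a single union member, a caller never has
to guess which conversion specifier applies to a given value.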

The 6 major APIs each provide for the following calls:
. 'new' ---------> always required as the first call .
. 'ref' -------------------------> strictly optional .
. 'unref' --------> optional, if ill-behaved program .
. 'get' --------------------> retrieve a single item .
. 'select' ----------------> retrieve multiple items .

And the 'get' and 'select' functions provide for delta
results representing the difference between successive
get/select calls (or a 'new' then 'get/select' call).
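
The call sequence and the delta behavior described above might be
pictured with the following self-contained mock. It is only a sketch:
the mock_* names are invented for this example, the 'sample' argument
stands in for whatever the library would read from /proc, and nothing
here should be taken as the actual libproc implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Opaque to callers in a real library; visible here for the sketch. */
struct mock_info {
    int  refcount;
    long prev;     /* value recorded by the previous call */
    long curr;     /* value recorded by the latest call   */
};

/* 'new': always first; the caller supplies the address of a NULL pointer. */
int mock_new(struct mock_info **info)
{
    if (info == NULL || *info != NULL)
        return -EINVAL;
    struct mock_info *p = calloc(1, sizeof *p);
    if (p == NULL)
        return -ENOMEM;
    p->refcount = 1;
    *info = p;
    return 0;
}

/* 'ref': strictly optional; returns the new reference count. */
int mock_ref(struct mock_info *info)
{
    if (info == NULL)
        return -EINVAL;
    return ++info->refcount;
}

/* 'unref': frees at zero and resets the caller's pointer to NULL. */
int mock_unref(struct mock_info **info)
{
    if (info == NULL || *info == NULL)
        return -EINVAL;
    int left = --(*info)->refcount;
    if (left == 0) {
        free(*info);
        *info = NULL;
    }
    return left;
}

/* 'get' with delta: the difference between this sample and the one
 * recorded by the preceding call (zero right after 'new' in this mock). */
long mock_get_delta(struct mock_info *info, long sample)
{
    assert(info != NULL);
    info->prev = info->curr;
    info->curr = sample;
    return info->curr - info->prev;
}
```

A well-behaved caller would pair each 'new' or 'ref' with an 'unref',
although the list above notes that 'unref' is optional in practice.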

For the <diskstats>, <pids>, <slabinfo> & <stat> APIs,
where results are unpredictable, a 'reap' function can
return multiple result structures for multiple stacks.

The <pids> API differs from others in that those items
of interest must be provided at 'new' or 'reset' time,
the latter being a function unique to this API. And the
<pids> 'select' function requires the PIDs or UIDs that
are to be fetched, operating as a subset of 'reap'. The
'get' function, lastly, is an iterator over successive
PIDs/TIDs, returning items identified via 'new/reset'.

To provide assistance to users during development, the
special header 'proc/xtra-procps-debug.h' is available
to check type usage against library expectations. That
check is activated by including this header explicitly
or via build using: ./configure '-DXTRA_PROCPS_DEBUG'.
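
To give a feel for what such a type check could look like, here is a
small sketch. It is not the content of 'proc/xtra-procps-debug.h';
the chk_* names and the CHK_TYPE macro are assumptions made for this
example, showing only the general idea of comparing the union member
a caller names against the member an item expects.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical items, each with one expected union member. */
enum chk_item { CHK_PID, CHK_VSIZE, CHK_CMD, CHK_ITEM_COUNT };

static const char *const chk_expected[CHK_ITEM_COUNT] = {
    "s_int",     /* CHK_PID   */
    "ul_int",    /* CHK_VSIZE */
    "str"        /* CHK_CMD   */
};

/* Return 1 when the requested member matches the item's type;
 * otherwise warn on stderr with the caller's location and return 0. */
int chk_type_ok(enum chk_item item, const char *requested,
                const char *file, int line)
{
    assert((int)item >= 0 && (int)item < CHK_ITEM_COUNT);
    assert(requested != NULL);
    if (strcmp(chk_expected[item], requested) == 0)
        return 1;
    fprintf(stderr, "%s:%d: item %d expects '%s' but '%s' was used\n",
            file, line, (int)item, chk_expected[item], requested);
    return 0;
}

/* Stringify the member name so the check reports the point of use. */
#define CHK_TYPE(item, type) chk_type_ok((item), #type, __FILE__, __LINE__)
```

With this sketch, CHK_TYPE(CHK_PID, s_int) passes silently, while
CHK_TYPE(CHK_CMD, ul_int) reports the mismatch along with the file
and line where it occurred.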

Reference(s):
. 08/05/2016, type validation introduced
https://www.freelists.org/post/procps/newlib-types-validation
commit e3270d463d
. 08/11/2016, extensibility while preserving ABI example
https://www.freelists.org/post/procps/new-meminfo-fields
commit 09e1886c9e

_________________________ INITIAL DOCUMENTATION EFFORT
The initial attempt, referenced below, dealt primarily
with the <pids> interface. Separate man pages for each
exported function were created. Plus there was another
document describing the items, among other miscellany.

Adopting such an approach encounters several problems:

1. In order to use these man pages, users are required
to already know how to use the library. Or alternately
one could randomly search each of them while trying to
ascertain which function call satisfies their need and
what exactly was the proper complement/order required.

2. While we can explain what all of those <pids> items
represent, that certainly isn't true for all the APIs.
See the gaps in kernel documentation for <meminfo> and
complete lack of documentation with that <vmstat> API.

3. Our documentation effort should take pains to avoid
unnecessary implementation details. Here's an example:
. "The pointer to info will have memory"
. "allocated and a structure created."

Alternatively, the following conveys user requirements
while not offering any internal implementation detail:
. "You must provide the address of a NULL"
. "info structure pointer."

Reference(s):
. 01/04/2017, initial documentation offering
https://www.freelists.org/post/procps/Using-reap-and-get
commit 2598e9f2ce

___________________ RECOMMENDED DOCUMENTATION APPROACH
I recommend that the newlib documentation consist of 3
man pages only. The first would cover the 5 major APIs
and their common functions. The second would deal with
the <pids> API exclusively, explaining how it differs.
Any remaining exported libproc functions which are yet
to be included could be represented in a 3rd document.

For these new documents the following are assumed:

1. Since we will not be able to document all items, we
shouldn't try to document any items. We should instead
rely on proc(5) or Documentation/filesystems/proc.txt.

2. Program development often involves referencing some
header file(s). So, make that an absolute requirement.

3. With the addition of item origins, represented with
this commit, and considering that 'types' were already
present, the header file might be all some users need.

4. And who knows, when a user of our libproc complains
about gaps in their documentation, it might prompt the
kernel folks to correct those long standing omissions.

To summarize, I suggest that we replace that libproc.3
document with a more general one explaining the basics
of accessing this new library and the common calls for
most of the major interfaces. We can then create a new
document (libproc-pids.3?), which explains differences
in using the <pids> application programming interface.
A final document (libproc-misc.3?) covers what's left.

Signed-off-by: Jim Warner <james.warner@comcast.net>
Jim Warner
2018-12-20 00:00:00 -06:00
committed by Craig Small
parent d9f88246f6
commit 96d59cbf46
6 changed files with 575 additions and 567 deletions


@@ -30,125 +30,127 @@ extern "C" {
 enum pids_item {
 PIDS_noop, // ( never altered )
 PIDS_extra, // ( reset to zero )
-PIDS_ADDR_END_CODE, // ul_int
-PIDS_ADDR_KSTK_EIP, // ul_int
-PIDS_ADDR_KSTK_ESP, // ul_int
-PIDS_ADDR_START_CODE, // ul_int
-PIDS_ADDR_START_STACK, // ul_int
-PIDS_CGNAME, // str
-PIDS_CGROUP, // str
-PIDS_CGROUP_V, // strv
-PIDS_CMD, // str
-PIDS_CMDLINE, // str
-PIDS_CMDLINE_V, // strv
-PIDS_ENVIRON, // str
-PIDS_ENVIRON_V, // strv
-PIDS_EXE, // str
-PIDS_EXIT_SIGNAL, // s_int
-PIDS_FLAGS, // ul_int
-PIDS_FLT_MAJ, // ul_int
-PIDS_FLT_MAJ_C, // ul_int
-PIDS_FLT_MAJ_DELTA, // s_int
-PIDS_FLT_MIN, // ul_int
-PIDS_FLT_MIN_C, // ul_int
-PIDS_FLT_MIN_DELTA, // s_int
-PIDS_ID_EGID, // u_int
-PIDS_ID_EGROUP, // str
-PIDS_ID_EUID, // u_int
-PIDS_ID_EUSER, // str
-PIDS_ID_FGID, // u_int
-PIDS_ID_FGROUP, // str
-PIDS_ID_FUID, // u_int
-PIDS_ID_FUSER, // str
-PIDS_ID_LOGIN, // s_int
-PIDS_ID_PGRP, // s_int
-PIDS_ID_PID, // s_int
-PIDS_ID_PPID, // s_int
-PIDS_ID_RGID, // u_int
-PIDS_ID_RGROUP, // str
-PIDS_ID_RUID, // u_int
-PIDS_ID_RUSER, // str
-PIDS_ID_SESSION, // s_int
-PIDS_ID_SGID, // u_int
-PIDS_ID_SGROUP, // str
-PIDS_ID_SUID, // u_int
-PIDS_ID_SUSER, // str
-PIDS_ID_TGID, // s_int
-PIDS_ID_TID, // s_int
-PIDS_ID_TPGID, // s_int
-PIDS_LXCNAME, // str
-PIDS_MEM_CODE, // ul_int
-PIDS_MEM_CODE_PGS, // ul_int
-PIDS_MEM_DATA, // ul_int
-PIDS_MEM_DATA_PGS, // ul_int
-PIDS_MEM_RES, // ul_int
-PIDS_MEM_RES_PGS, // ul_int
-PIDS_MEM_SHR, // ul_int
-PIDS_MEM_SHR_PGS, // ul_int
-PIDS_MEM_VIRT, // ul_int
-PIDS_MEM_VIRT_PGS, // ul_int
-PIDS_NICE, // s_int
-PIDS_NLWP, // s_int
-PIDS_NS_IPC, // ul_int
-PIDS_NS_MNT, // ul_int
-PIDS_NS_NET, // ul_int
-PIDS_NS_PID, // ul_int
-PIDS_NS_USER, // ul_int
-PIDS_NS_UTS, // ul_int
-PIDS_OOM_ADJ, // s_int
-PIDS_OOM_SCORE, // s_int
-PIDS_PRIORITY, // s_int
-PIDS_PROCESSOR, // u_int
-PIDS_PROCESSOR_NODE, // s_int
-PIDS_RSS, // ul_int
-PIDS_RSS_RLIM, // ul_int
-PIDS_RTPRIO, // s_int
-PIDS_SCHED_CLASS, // s_int
-PIDS_SD_MACH, // str
-PIDS_SD_OUID, // str
-PIDS_SD_SEAT, // str
-PIDS_SD_SESS, // str
-PIDS_SD_SLICE, // str
-PIDS_SD_UNIT, // str
-PIDS_SD_UUNIT, // str
-PIDS_SIGBLOCKED, // str
-PIDS_SIGCATCH, // str
-PIDS_SIGIGNORE, // str
-PIDS_SIGNALS, // str
-PIDS_SIGPENDING, // str
-PIDS_STATE, // s_ch
-PIDS_SUPGIDS, // str
-PIDS_SUPGROUPS, // str
-PIDS_TICS_ALL, // ull_int
-PIDS_TICS_ALL_C, // ull_int
-PIDS_TICS_ALL_DELTA, // s_int
-PIDS_TICS_BLKIO, // ull_int
-PIDS_TICS_GUEST, // ull_int
-PIDS_TICS_GUEST_C, // ull_int
-PIDS_TICS_SYSTEM, // ull_int
-PIDS_TICS_SYSTEM_C, // ull_int
-PIDS_TICS_USER, // ull_int
-PIDS_TICS_USER_C, // ull_int
-PIDS_TIME_ALL, // ull_int
-PIDS_TIME_ELAPSED, // ull_int
-PIDS_TIME_START, // ull_int
-PIDS_TTY, // s_int
-PIDS_TTY_NAME, // str
-PIDS_TTY_NUMBER, // str
-PIDS_VM_DATA, // ul_int
-PIDS_VM_EXE, // ul_int
-PIDS_VM_LIB, // ul_int
-PIDS_VM_RSS, // ul_int
-PIDS_VM_RSS_ANON, // ul_int
-PIDS_VM_RSS_FILE, // ul_int
-PIDS_VM_RSS_LOCKED, // ul_int
-PIDS_VM_RSS_SHARED, // ul_int
-PIDS_VM_SIZE, // ul_int
-PIDS_VM_STACK, // ul_int
-PIDS_VM_SWAP, // ul_int
-PIDS_VM_USED, // ul_int
-PIDS_VSIZE_PGS, // ul_int
-PIDS_WCHAN_NAME // str
+// returns origin, see proc(5)
+// ------- -------------------
+PIDS_ADDR_END_CODE, // ul_int stat
+PIDS_ADDR_KSTK_EIP, // ul_int stat
+PIDS_ADDR_KSTK_ESP, // ul_int stat
+PIDS_ADDR_START_CODE, // ul_int stat
+PIDS_ADDR_START_STACK, // ul_int stat
+PIDS_CGNAME, // str cgroup
+PIDS_CGROUP, // str cgroup
+PIDS_CGROUP_V, // strv cgroup
+PIDS_CMD, // str stat or status
+PIDS_CMDLINE, // str cmdline
+PIDS_CMDLINE_V, // strv cmdline
+PIDS_ENVIRON, // str environ
+PIDS_ENVIRON_V, // strv environ
+PIDS_EXE, // str exe
+PIDS_EXIT_SIGNAL, // s_int stat
+PIDS_FLAGS, // ul_int stat
+PIDS_FLT_MAJ, // ul_int stat
+PIDS_FLT_MAJ_C, // ul_int stat
+PIDS_FLT_MAJ_DELTA, // s_int stat
+PIDS_FLT_MIN, // ul_int stat
+PIDS_FLT_MIN_C, // ul_int stat
+PIDS_FLT_MIN_DELTA, // s_int stat
+PIDS_ID_EGID, // u_int status
+PIDS_ID_EGROUP, // str [ EGID based, see: getgrgid(3) ]
+PIDS_ID_EUID, // u_int status
+PIDS_ID_EUSER, // str [ EUID based, see: getpwuid(3) ]
+PIDS_ID_FGID, // u_int status
+PIDS_ID_FGROUP, // str [ FGID based, see: getgrgid(3) ]
+PIDS_ID_FUID, // u_int status
+PIDS_ID_FUSER, // str [ FUID based, see: getpwuid(3) ]
+PIDS_ID_LOGIN, // s_int loginuid
+PIDS_ID_PGRP, // s_int stat
+PIDS_ID_PID, // s_int as: /proc/<pid>
+PIDS_ID_PPID, // s_int stat or status
+PIDS_ID_RGID, // u_int status
+PIDS_ID_RGROUP, // str [ RGID based, see: getgrgid(3) ]
+PIDS_ID_RUID, // u_int status
+PIDS_ID_RUSER, // str [ RUID based, see: getpwuid(3) ]
+PIDS_ID_SESSION, // s_int stat
+PIDS_ID_SGID, // u_int status
+PIDS_ID_SGROUP, // str [ SGID based, see: getgrgid(3) ]
+PIDS_ID_SUID, // u_int status
+PIDS_ID_SUSER, // str [ SUID based, see: getpwuid(3) ]
+PIDS_ID_TGID, // s_int status
+PIDS_ID_TID, // s_int as: /proc/<pid>/task/<tid>
+PIDS_ID_TPGID, // s_int stat
+PIDS_LXCNAME, // str cgroup
+PIDS_MEM_CODE, // ul_int statm
+PIDS_MEM_CODE_PGS, // ul_int statm
+PIDS_MEM_DATA, // ul_int statm
+PIDS_MEM_DATA_PGS, // ul_int statm
+PIDS_MEM_RES, // ul_int statm
+PIDS_MEM_RES_PGS, // ul_int statm
+PIDS_MEM_SHR, // ul_int statm
+PIDS_MEM_SHR_PGS, // ul_int statm
+PIDS_MEM_VIRT, // ul_int statm
+PIDS_MEM_VIRT_PGS, // ul_int statm
+PIDS_NICE, // s_int stat
+PIDS_NLWP, // s_int stat or status
+PIDS_NS_IPC, // ul_int ns/
+PIDS_NS_MNT, // ul_int ns/
+PIDS_NS_NET, // ul_int ns/
+PIDS_NS_PID, // ul_int ns/
+PIDS_NS_USER, // ul_int ns/
+PIDS_NS_UTS, // ul_int ns/
+PIDS_OOM_ADJ, // s_int oom_score_adj
+PIDS_OOM_SCORE, // s_int oom_score
+PIDS_PRIORITY, // s_int stat
+PIDS_PROCESSOR, // u_int stat
+PIDS_PROCESSOR_NODE, // s_int stat
+PIDS_RSS, // ul_int stat
+PIDS_RSS_RLIM, // ul_int stat
+PIDS_RTPRIO, // s_int stat
+PIDS_SCHED_CLASS, // s_int stat
+PIDS_SD_MACH, // str [ PID/TID based, see: sd-login(3) ]
+PIDS_SD_OUID, // str "
+PIDS_SD_SEAT, // str "
+PIDS_SD_SESS, // str "
+PIDS_SD_SLICE, // str "
+PIDS_SD_UNIT, // str "
+PIDS_SD_UUNIT, // str "
+PIDS_SIGBLOCKED, // str status
+PIDS_SIGCATCH, // str status
+PIDS_SIGIGNORE, // str status
+PIDS_SIGNALS, // str status
+PIDS_SIGPENDING, // str status
+PIDS_STATE, // s_ch stat or status
+PIDS_SUPGIDS, // str status
+PIDS_SUPGROUPS, // str [ SUPGIDS based, see: getgrgid(3) ]
+PIDS_TICS_ALL, // ull_int stat
+PIDS_TICS_ALL_C, // ull_int stat
+PIDS_TICS_ALL_DELTA, // s_int stat
+PIDS_TICS_BLKIO, // ull_int stat
+PIDS_TICS_GUEST, // ull_int stat
+PIDS_TICS_GUEST_C, // ull_int stat
+PIDS_TICS_SYSTEM, // ull_int stat
+PIDS_TICS_SYSTEM_C, // ull_int stat
+PIDS_TICS_USER, // ull_int stat
+PIDS_TICS_USER_C, // ull_int stat
+PIDS_TIME_ALL, // ull_int stat
+PIDS_TIME_ELAPSED, // ull_int stat
+PIDS_TIME_START, // ull_int stat
+PIDS_TTY, // s_int stat
+PIDS_TTY_NAME, // str stat
+PIDS_TTY_NUMBER, // str stat
+PIDS_VM_DATA, // ul_int status
+PIDS_VM_EXE, // ul_int status
+PIDS_VM_LIB, // ul_int status
+PIDS_VM_RSS, // ul_int status
+PIDS_VM_RSS_ANON, // ul_int status
+PIDS_VM_RSS_FILE, // ul_int status
+PIDS_VM_RSS_LOCKED, // ul_int status
+PIDS_VM_RSS_SHARED, // ul_int status
+PIDS_VM_SIZE, // ul_int status
+PIDS_VM_STACK, // ul_int status
+PIDS_VM_SWAP, // ul_int status
+PIDS_VM_USED, // ul_int status
+PIDS_VSIZE_PGS, // ul_int stat
+PIDS_WCHAN_NAME // str wchan
 };
 enum pids_fetch_type {