Details
Type: Task
Status: In Progress
Priority: Low
Resolution: Unresolved
Affects Version/s: 3.7.8, 3.10.5, 3.12.1, 3.13.0
Fix Version/s: None
Component/s: Promise type: processes
Labels:
Description
At present, we obtain the process information used by process promises by running ps and parsing its output.
Each platform gets a different set of flags, and each has quirks in its output that complicate parsing.
Parsing ps output is unusually error-prone: ps prints its header line before it has the rest of its data, yet tries to produce a tabular (ASCII-art) display, which gets mangled when data from one column overflows under another column's heading. On some platforms, fields in a cramped row may abut their neighbours to reduce overflow, while other platforms always leave at least one space. The STIME field can be a time or a date, and the month and day of a date may be separated by a space or juxtaposed.
We have a long history of little quirky bugs - often hard to reproduce - associated with parsing ps output.
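To illustrate the STIME ambiguity described above, here is a minimal sketch of a helper that guesses how many whitespace-separated tokens the field occupies, judging by its first token. The function name and the sample values ("10:42", "Jan05", "Jan") are illustrative assumptions, not taken from our code or from any particular ps; real platforms may well use further formats that this sketch does not recognise.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given the first token of an STIME field, return how
 * many whitespace-separated tokens the field spans, or 0 if the token is
 * not recognised. */
int StimeTokenCount(const char *tok)
{
    assert(tok != NULL);

    if (strchr(tok, ':') != NULL)
    {
        return 1;               /* time of day, e.g. "10:42" */
    }

    size_t i = 0;
    while (isalpha((unsigned char) tok[i]))
    {
        i++;
    }
    if (i == 0)
    {
        return 0;               /* does not start with a month name */
    }
    if (tok[i] == '\0')
    {
        return 2;               /* "Jan" alone: the day is the next token */
    }

    while (isdigit((unsigned char) tok[i]))
    {
        i++;
    }
    return (tok[i] == '\0') ? 1 : 0;   /* "Jan05": month and day juxtaposed */
}
```

A parser that splits each row on whitespace has to make a guess of this kind before it can tell which token belongs to the column that follows STIME, which is one reason the column count of a row cannot be trusted.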
One recurrent suggestion for solving this problem is to access the /proc filesystem directly; while there are still differences between platforms, there likely aren't more such differences than between versions of ps; and reading /proc directly would eliminate all of the problems with identifying the right part of each line of ps output to associate with each header.
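As a rough sketch of what reading /proc directly could look like, the following assumes a Linux-style layout in which /proc/<pid>/cmdline holds the arguments separated by NUL bytes. Other platforms lay out /proc differently, so this is not a portable implementation. The names ReadProcCmdline and JoinNulSeparated are hypothetical and do not exist in our code base.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: turn "arg0\0arg1\0" (len bytes) into "arg0 arg1".
 * buf must have room for at least len + 1 bytes. */
void JoinNulSeparated(char *buf, size_t len)
{
    assert(buf != NULL);

    /* Drop trailing NULs so the result has no trailing spaces. */
    while (len > 0 && buf[len - 1] == '\0')
    {
        len--;
    }
    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == '\0')
        {
            buf[i] = ' ';
        }
    }
    buf[len] = '\0';
}

/* Hypothetical: read the command line of pid from a Linux-style /proc.
 * Returns the length of the resulting string, or -1 if it cannot be read
 * (no such process, no /proc, or insufficient privilege). Long command
 * lines are truncated to size - 1 bytes. */
long ReadProcCmdline(pid_t pid, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    char path[64];
    snprintf(path, sizeof(path), "/proc/%ld/cmdline", (long) pid);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
    {
        return -1;
    }
    size_t n = fread(buf, 1, size - 1, f);
    fclose(f);

    JoinNulSeparated(buf, n);
    return (long) strlen(buf);
}
```

Each field comes from its own file or record, so there is no header line to align against. The price is one such reader per platform.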
If we support any platform that lacks /proc/ (Windows?), we'll need to decide whether to drop support for process promises on that platform, write some other platform-specific process-information code, or retain ps parsing there.
There are known to be complications (e.g. from our recent work on Solaris to be zone-aware while still getting full command-lines) which can require privilege escalation in order to get the information out of /proc/; these may prove problematic for a /proc/-based implementation.
This change would also require rethinking the promiser in process promises; at present it's a regex matching a line of the ps output. A fairly natural choice would be to make the promiser be the command-line of the process; but there may well be Real World uses in which the promiser is the user or pid. (I recently wrote a test-case that matched on pid, for example; it could be re-written to avoid that, but the point here is that we don't like forcing users to rewrite Real World policies.) So this change could be disruptive, at least for some users.
Internally, it would make sense to replace much of the existing process-information lookup code with an abstract API with an enum for known process fields (not necessarily all supported by all platforms), a const char *GetProcessInfo(pid, field) and an Rlist *GetProcesses(field, value), probably along with some regex-ish variants on the latter. This would isolate the (cross-platform) process promise code from platform-specific details of reading /proc/.
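The shape of such an API might look roughly like the sketch below. Everything in it is an assumption made for illustration: the enum members, the in-memory sample table standing in for a platform backend, and the made-up process data. GetProcesses here fills a caller-supplied array of pids rather than returning an Rlist, purely to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical field identifiers; a platform need not support all of them. */
typedef enum
{
    PROCESS_FIELD_USER,
    PROCESS_FIELD_CMDLINE,
    PROCESS_FIELD_STIME,
    PROCESS_FIELD_COUNT
} ProcessField;

/* Stand-in for a platform backend (e.g. a /proc reader): a fixed table of
 * made-up processes. NULL means the field is not available. */
typedef struct
{
    pid_t pid;
    const char *fields[PROCESS_FIELD_COUNT];
} ProcessEntry;

static const ProcessEntry SAMPLE_TABLE[] = {
    { 1,   { "root",  "/sbin/init",        NULL } },
    { 412, { "root",  "/usr/sbin/sshd -D", NULL } },
    { 977, { "alice", "vim notes.txt",     NULL } },
};
static const size_t SAMPLE_COUNT = sizeof(SAMPLE_TABLE) / sizeof(SAMPLE_TABLE[0]);

/* Returns the value of field for pid, or NULL if unknown or unsupported. */
const char *GetProcessInfo(pid_t pid, ProcessField field)
{
    if ((unsigned) field >= PROCESS_FIELD_COUNT)
    {
        return NULL;
    }
    for (size_t i = 0; i < SAMPLE_COUNT; i++)
    {
        if (SAMPLE_TABLE[i].pid == pid)
        {
            return SAMPLE_TABLE[i].fields[field];
        }
    }
    return NULL;
}

/* Stores up to max pids whose field equals value exactly into out and
 * returns how many were stored. A regex variant would replace strcmp(). */
size_t GetProcesses(ProcessField field, const char *value, pid_t *out, size_t max)
{
    assert(value != NULL && out != NULL);

    size_t found = 0;
    for (size_t i = 0; i < SAMPLE_COUNT && found < max; i++)
    {
        const char *v = GetProcessInfo(SAMPLE_TABLE[i].pid, field);
        if (v != NULL && strcmp(v, value) == 0)
        {
            out[found++] = SAMPLE_TABLE[i].pid;
        }
    }
    return found;
}
```

With this split, the promise code would only ever call GetProcessInfo and GetProcesses; swapping the sample table for a per-platform reader would not change the callers.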
This would be a very significant piece of work, likely requiring a developer or three to work for some months, not least because of the cross-platform complications. It would need extensive testing.
I am filing this issue because this change is frequently asked for, so that we can link all requests for it to this issue and keep the relevant discussion in one place.
Issue Links
- relates to
  - CFE-1536 [freebsd,netbsd,openbsd] processes: promise (Open)
  - CFE-1653 Parsing of ps output lines could be done better (Open)
  - CFE-1075 processes promises fails on Solaris for long command lines (Done)
  - CFE-1548 18_examples/ouputs/check_outputs.cf FAIL (Done)
  - CFE-1573 processes_select complains about Unacceptable model uncertainty examining processes (Done)
  - CFE-2161 Solaris: Processes with large memory usage triggers ps misparsing due to missing spaces between numbers (Done)
  - CFE-1991 sys variable containing default gateway (Done)