CFEngine Community / CFE-1655

Replace parsing of ps output with access to the /proc/ filesystem



    • Type: Task
    • Status: In Progress
    • Priority: Low
    • Resolution: Unresolved
    • Affects Version/s: 3.7.8, 3.10.5, 3.12.1, 3.13.0
    • Fix Version/s: None
    • Labels:


      At present, we get the process information used by process promises by running ps and parsing its output.
      Each platform needs a different set of flags, and each has quirks in its output that complicate parsing.

      Parsing ps output is unusually error-prone. ps prints its header line before it has the rest of its data, yet tries to produce a tabular (ASCII-art) display, so data from one column can overflow under the next column's heading. On some platforms, fields in a cramped row may abut their neighbours to limit the overflow, while other platforms always leave at least one space. The STIME field can be a time or a date, and the month and day of a date may be separated by a space or run together.
      We have a long history of little quirky bugs - often hard to reproduce - associated with parsing ps output.
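
To make the column-overflow problem concrete, here is a minimal sketch in C. It is not CFEngine's actual parser: field_under_heading is a hypothetical helper, and the sample lines in the usage note are made up for illustration rather than captured from any platform's ps.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not CFEngine code): copy the whitespace-delimited
 * token of `row` that starts at the column where `name` begins in `header`.
 * Returns 0 on success, -1 if the heading is absent or the row is too short. */
int field_under_heading(const char *header, const char *row,
                        const char *name, char *out, size_t outsz)
{
    assert(header != NULL && row != NULL && name != NULL);
    assert(out != NULL && outsz > 0);

    const char *h = strstr(header, name);
    if (h == NULL)
    {
        return -1;
    }

    size_t col = (size_t) (h - header);
    if (col >= strlen(row))
    {
        return -1;
    }

    /* Skip padding, then copy up to the next whitespace character. */
    const char *p = row + col;
    while (*p == ' ')
    {
        p++;
    }

    size_t n = 0;
    while (p[n] != '\0' && !isspace((unsigned char) p[n]) && n + 1 < outsz)
    {
        out[n] = p[n];
        n++;
    }
    out[n] = '\0';
    return 0;
}
```

With the header "USER     PID   STIME COMMAND", asking for "PID" in the row "root     1234  10:15 /usr/sbin/cf-agent" yields "1234". In the row "cfengine-user 1234 10:15 /usr/sbin/cf-agent" the long user name pushes every later field to the right, and the same call yields "user", the tail of the user name, instead of the pid.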

      One recurrent suggestion for solving this problem is to read the /proc filesystem directly. There are still differences between platforms, but probably no more than between versions of ps, and reading /proc directly would eliminate the whole problem of working out which part of each line of ps output belongs under which heading.
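
As a rough illustration of what reading /proc directly looks like, the sketch below fetches a process's command line on Linux, where /proc/<pid>/cmdline holds the arguments separated by NUL bytes. Other platforms lay out /proc differently, so this covers one platform only; read_proc_cmdline and join_cmdline are hypothetical names, not existing CFEngine functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Turn `len` bytes of NUL-separated arguments into one space-separated,
 * NUL-terminated string, in place. `buf` must hold at least len + 1 bytes. */
void join_cmdline(char *buf, size_t len)
{
    assert(buf != NULL);

    /* Drop the trailing NUL terminator(s) of the last argument. */
    while (len > 0 && buf[len - 1] == '\0')
    {
        len--;
    }
    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == '\0')
        {
            buf[i] = ' ';
        }
    }
    buf[len] = '\0';
}

/* Linux-only sketch: read /proc/<pid>/cmdline into `buf`.
 * Returns the number of bytes read (0 for a process with no command
 * line, such as a kernel thread), or -1 if the file cannot be opened. */
long read_proc_cmdline(pid_t pid, char *buf, size_t bufsz)
{
    assert(buf != NULL && bufsz > 1);

    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/cmdline", (long) pid);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
    {
        return -1;
    }

    size_t n = fread(buf, 1, bufsz - 1, f);
    fclose(f);

    join_cmdline(buf, n);
    return (long) n;
}
```

There is no header to align against and no column to overflow: each field lives in its own file or at a fixed position within one. The sketch silently truncates command lines longer than the buffer, which a real implementation would need to handle.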

      If we support any platform that lacks /proc/ (Windows?), we will need to decide whether to drop support for process promises on that platform, write some other platform-specific process-information code, or keep parsing ps there.
      There are known to be complications (e.g. from our recent work on Solaris to be zone-aware while still getting full command-lines) which can require privilege escalation in order to get the information out of /proc/; these may prove problematic for a /proc/-based implementation.

      This change would also require rethinking the promiser in process promises; at present it's a regex matching a line of the ps output. A fairly natural choice would be to make the promiser be the command-line of the process; but there may well be Real World uses in which the promiser is the user or pid. (I recently wrote a test-case that matched on pid, for example; it could be re-written to avoid that, but the point here is that we don't like forcing users to rewrite Real World policies.) So this change could be disruptive, at least for some users.
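
The compatibility risk can be sketched as follows. POSIX extended regular expressions stand in here for whatever regex engine the promise code uses, regex_matches is a hypothetical helper, and the sample strings in the usage note are illustrative only.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if `pattern` (a POSIX extended regex) matches anywhere in
 * `text`, 0 if it does not, and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    {
        return -1;
    }

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return (rc == 0) ? 1 : 0;
}
```

A promiser such as "sshd" matches both a ps line like "root      1234  10:15 /usr/sbin/sshd -D" and the bare command line "/usr/sbin/sshd -D", so it would survive the change. A promiser such as "^root +1234 " matches the ps line but not the command line, so a policy relying on it would silently stop matching.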

      Internally, it would make sense to replace much of the existing process-information lookup code with an abstract API with an enum for known process fields (not necessarily all supported by all platforms), a const char *GetProcessInfo(pid, field) and an Rlist *GetProcesses(field, value), probably along with some regex-ish variants on the latter. This would isolate the (cross-platform) process promise code from platform-specific details of reading /proc/.
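
One possible shape for that API is sketched below, under several assumptions. The enum values and the ProcessRecord type are invented for illustration, a fixed in-memory table stands in for the platform-specific /proc backend, and GetProcesses fills a caller-supplied pid array rather than returning an Rlist so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Known process fields; a platform need not support all of them. */
typedef enum
{
    PROC_FIELD_USER,
    PROC_FIELD_CMDLINE,
    PROC_FIELD_COUNT
} ProcessField;

/* Hypothetical record type; a real backend would fill this from /proc. */
typedef struct
{
    pid_t pid;
    const char *fields[PROC_FIELD_COUNT];
} ProcessRecord;

/* Stand-in backend with made-up data, used only to exercise the API. */
static const ProcessRecord PROCESS_TABLE[] = {
    { 1,    { "root",  "/sbin/init" } },
    { 1234, { "root",  "/usr/sbin/sshd -D" } },
    { 5678, { "alice", "vi notes.txt" } },
};
static const size_t PROCESS_TABLE_LEN =
    sizeof PROCESS_TABLE / sizeof PROCESS_TABLE[0];

/* Value of `field` for `pid`, or NULL if the pid or field is unknown. */
const char *GetProcessInfo(pid_t pid, ProcessField field)
{
    if ((int) field < 0 || field >= PROC_FIELD_COUNT)
    {
        return NULL;
    }
    for (size_t i = 0; i < PROCESS_TABLE_LEN; i++)
    {
        if (PROCESS_TABLE[i].pid == pid)
        {
            return PROCESS_TABLE[i].fields[field];
        }
    }
    return NULL;
}

/* Store up to `max` pids whose `field` equals `value` exactly in `out`;
 * returns how many were stored. */
size_t GetProcesses(ProcessField field, const char *value,
                    pid_t *out, size_t max)
{
    assert(value != NULL && (out != NULL || max == 0));

    size_t found = 0;
    for (size_t i = 0; i < PROCESS_TABLE_LEN && found < max; i++)
    {
        const char *v = GetProcessInfo(PROCESS_TABLE[i].pid, field);
        if (v != NULL && strcmp(v, value) == 0)
        {
            out[found++] = PROCESS_TABLE[i].pid;
        }
    }
    return found;
}
```

The promise code would call only GetProcessInfo and GetProcesses, so swapping the table for a per-platform /proc reader would not touch it. The regex-ish variants mentioned above would replace the strcmp comparison with a pattern match.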

      This would be a very significant piece of work, likely requiring a developer or three to work for some months. It would need extensive testing, and the cross-platform complications add significantly to the effort.
      I am filing this issue because this change is frequently asked for, so that we can link all requests for it here and keep the relevant discussion in one place.





              davidlee David Lee
              a10050 Edward Welbourne (Inactive)