Details
- Type: Bug
- Status: Open
- Priority: Medium
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Component/s: cf-execd
- Labels: None
Description
After weeks of searching, I figured out why cf-execd would occasionally stop running on my /laptop/:
At each iteration of @ScheduleRun()@, it would try to reset all classes and then call @DetectEnvironment()@ (cf-execd.c:532), which in turn would call @GetInterfacesInfo(ctx);@ (sysinfo.c:2692). But the latter would simply call @exit(FAILURE)@ if the network interfaces could not be read from the system!
(Actually, this problem seemed to happen when my machine went through suspend/resume, which froze the network services. Linux kernel π = 3,14)
This would completely stop CFEngine from running on the system, and I would have to restart cf-execd manually.
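For illustration only (this is not code from CFEngine or from the patch), the difference between exiting on a failed interface scan and reporting the failure to the caller can be sketched as below. The names @probe_interfaces@ and @refresh_environment@ and the @network_up@ flag are hypothetical stand-ins, and the interface count is a placeholder value.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the interface scan; NOT the real
 * GetInterfacesInfo(). It reports failure to the caller instead of
 * terminating the whole process. */
static bool probe_interfaces(bool network_up, int *iface_count)
{
    if (!network_up)
    {
        return false;       /* let the caller decide what to do */
    }
    *iface_count = 2;       /* placeholder value for this sketch */
    return true;
}

/* One scheduler iteration: if the scan fails, keep the previous result
 * and carry on, so the long-running daemon survives a frozen network. */
static bool refresh_environment(bool network_up, int *iface_count)
{
    int fresh = 0;
    if (!probe_interfaces(network_up, &fresh))
    {
        fprintf(stderr, "interface scan failed, keeping previous data\n");
        return false;
    }
    *iface_count = fresh;
    return true;
}
```

With this shape, a failed scan during suspend/resume costs one stale iteration rather than the whole daemon; the next iteration can simply try again.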
To my horror, I also noticed that this ScheduleRun would in fact run an expensive evaluation of classes both in the RELOAD_FULL case and when no promises have changed at all. Why do we need to load monitor promises in execd?
Why do we need to re-evaluate the system hostname, CPU classes, etc. every 5 minutes, considering that the forked cf-agent will repeat that detection?
Please comment on the patch I have here:
https://github.com/xrg/cfengine-core/tree/xrg-3.6-rfc-execd
Can we keep the classes loaded (not re-evaluate) throughout the life of cf-execd process, like in that patch?
Ideally, I would even remove the "interfaces" classes, but that would break the /subject/ of cf-execd's mails and remove existing functionality. Then again, an IPv4 address has proven inadequate for identifying a host (the PK hash is preferred).
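The idea of keeping the classes loaded for the lifetime of the process, rather than re-detecting them at every iteration, could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in the linked branch: @EnvCache@ and @detect_once@ are hypothetical names, and the counter only stands in for the expensive detection work.

```c
#include <stdbool.h>

/* Hypothetical cache of the detected environment. */
typedef struct
{
    bool loaded;        /* true once detection has run */
    int detect_calls;   /* how many times the expensive path was taken */
} EnvCache;

/* Run the expensive detection only on the first call; later scheduler
 * iterations reuse whatever was detected then. */
static void detect_once(EnvCache *cache)
{
    if (cache->loaded)
    {
        return;
    }
    cache->detect_calls++;  /* stands in for hostname/CPU/class discovery */
    cache->loaded = true;
}
```

The trade-off is that anything that changes while the daemon runs (for example the hostname) is not noticed until restart, which matters less if the forked cf-agent repeats the detection anyway.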