Details
Description
It appears that cf-execd is allowed to run multiple times. There is a policy (cfe_internal_limit_robot_agents_processes_kill_cf_execd, in CFE_cfengine.cf) to kill excess cf-execd processes, but in some situations the kill is not allowed to work, or the agent is not running on Linux and the linux:: class guard means the check never applies:
bundle agent cfe_internal_limit_robot_agents
{
  processes:
    linux::
      "cf-execd"
        process_count => check_execd("2"),
        comment => "Check cf-execd process if exceed the number",
        handle => "cfe_internal_limit_robot_agents_processes_check_cf_execd";
This eventually leads to so many cf-execd processes that the machine runs into resource issues.
The following shell script demonstrates this on a non-Linux machine:
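The script itself did not survive in this ticket. A minimal sketch that would produce the same messages might look like the following; the function name spawn_execd and its three arguments are hypothetical, and it assumes cf-execd daemonizes and returns control to the shell when started without options.

```shell
#!/bin/sh
# Sketch only: spawn_execd and its arguments are hypothetical, not the
# original spawn.sh. Defaults mirror the values visible in the output.
spawn_execd() {
    count=${1:-5}                            # number of spawn attempts
    binary=${2:-/var/cfengine/bin/cf-execd}  # daemon binary to start
    pause=${3:-61}                           # seconds to wait between attempts
    i=1
    while [ "$i" -le "$count" ]; do
        echo "spawning cf-execd ($i/$count)"
        # Assumed to fork into the background and return immediately.
        "$binary"
        echo "sleeping $pause seconds avoid lock...."
        sleep "$pause"
        i=$((i + 1))
    done
}
```

Calling spawn_execd with no arguments uses the defaults above; listing the cf-execd processes afterwards (for example with ps) would then show one entry per spawn in addition to any daemon already running.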
And here is the output:
- ./spawn.sh
spawning cf-execd (1/5)
sleeping 61 seconds avoid lock....
spawning cf-execd (2/5)
sleeping 61 seconds avoid lock....
spawning cf-execd (3/5)
sleeping 61 seconds avoid lock....
spawning cf-execd (4/5)
sleeping 61 seconds avoid lock....
spawning cf-execd (5/5)
sleeping 61 seconds avoid lock....
0 225 1 0 5:50AM ?? 0:00.14 /var/cfengine/bin/cf-execd
0 977 1 0 6:28AM ?? 0:00.02 /var/cfengine/bin/cf-execd
0 1054 1 0 6:30AM ?? 0:00.01 /var/cfengine/bin/cf-execd
0 1081 1 0 6:31AM ?? 0:00.01 /var/cfengine/bin/cf-execd
0 1103 1 0 6:33AM ?? 0:00.01 /var/cfengine/bin/cf-execd
0 1122 1 0 6:34AM ?? 0:00.00 /var/cfengine/bin/cf-execd
0 1141 1 0 6:35AM ?? 0:00.00 /var/cfengine/bin/cf-execd
I believe two changes are needed: one to the policy, so the process check is not limited to Linux, and one to the code, so that cf-execd internally checks whether another cf-execd is already running.
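To illustrate the second change, here is a minimal sketch of a single-instance guard based on a pidfile. It is not CFEngine's actual code: the function name acquire_single_instance and the pidfile path are assumptions made for illustration, and the real daemon would do the equivalent internally.

```shell
#!/bin/sh
# Sketch only: acquire_single_instance is a hypothetical helper.
# Returns 0 if this process may run, 1 if a live instance holds the pidfile.
acquire_single_instance() {
    pidfile=$1
    if [ -f "$pidfile" ]; then
        oldpid=$(cat "$pidfile" 2>/dev/null)
        # kill -0 sends no signal; it only tests whether the PID exists.
        if [ -n "$oldpid" ] && kill -0 "$oldpid" 2>/dev/null; then
            echo "already running as pid $oldpid" >&2
            return 1
        fi
    fi
    # No pidfile, or it was stale: record our own PID and continue.
    echo "$$" > "$pidfile"
    return 0
}
```

A start-up path could call it as `acquire_single_instance /var/cfengine/cf-execd.pid || exit 1` (path assumed). The check-then-write sequence is not race-free; a production implementation would more likely hold an exclusive file lock for the lifetime of the process.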
Issue Links
- relates to CFE-1799 cfe_internal_limit_robot_agents_processes_check_cf_execd problem for Linux Container (Done)