Affects Version/s: 3.7.1
Fix Version/s: None
Opening this bug to collect profiling information on a busy cf-serverd, at Dimitrios's suggestion (see "relevant mailing list thread":https://groups.google.com/forum/#!topic/help-cfengine/gn3e8pw_nU8).
- This system has 8 CPUs / 8GB RAM / 4GB swap (0 swap currently used) and is virtualized on VMware.
- About 5500 clients connect to this server (22000 in total, divided by 4 because of the load balancer)
- 5 minute schedule, 3 minute splaytime, custom failsafe (see comments for details)
- Huge directory tree with ~5 files per host, recursively copied by all clients
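
For a rough sense of the load these figures imply, here is a back-of-the-envelope sketch. It assumes one connection per client per run and evenly spread arrivals, neither of which is confirmed in this ticket; the variable names are illustrative only, and the integer division rounds down.

```shell
#!/bin/sh
# Rough load estimate from the figures in this ticket.
# ASSUMPTIONS (not from the ticket): one connection per client per run,
# arrivals spread evenly over the interval.
total_clients=22000
backends=4        # servers behind the load balancer
schedule_s=300    # 5 minute schedule
splay_s=180       # 3 minute splaytime

clients_per_backend=$(( total_clients / backends ))

# Average over the whole schedule interval (integer division, rounded down).
avg_per_s=$(( clients_per_backend / schedule_s ))

# If all clients arrive within the splay window, the rate is higher.
splay_per_s=$(( clients_per_backend / splay_s ))

echo "clients per backend:            $clients_per_backend"
echo "connections/s over schedule:    $avg_per_s"
echo "connections/s within splaytime: $splay_per_s"
```

Each of those connections recursively copies files, so the real request rate is likely higher than the connection rate.
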
Summary of the profiling data from the comments:
- @vmstat@ shows high @sys@ CPU usage, no swapping, low I/O
- @strace@ shows most CPU time is spent in the @futex()@ system call, which indicates high lock contention
- @gdb@ backtraces show most of the cf-serverd threads waiting on the lastseen db's LMDB mutex for I/O
- This was remedied by touching the @am_policy_hub@ file in the @state/@ directory; see Redmine #7640
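
The data collection and the workaround summarised above could be scripted roughly as follows. This is a sketch, not the exact commands used in the comments: the function names are hypothetical, the sampling intervals are arbitrary, and @/var/cfengine@ is assumed to be the work directory (adjust if yours differs). Nothing runs until a function is called.

```shell
#!/bin/sh
# Hypothetical helpers; defining them has no side effects.

# Collect the three kinds of profiling data for a running cf-serverd.
# $1 = PID of the cf-serverd process (needs permission to attach).
collect_profile() {
    pid=$1
    # CPU split (us/sy), swap activity and I/O: 3 samples, 5 seconds apart.
    vmstat 5 3
    # Per-syscall time summary across all threads; stop after 30 seconds.
    timeout -s INT 30 strace -c -f -p "$pid"
    # One backtrace per thread, then detach.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}

# Apply the workaround: create state/am_policy_hub in the work directory.
# $1 = work directory (ASSUMED default: /var/cfengine).
mark_policy_hub() {
    workdir=${1:-/var/cfengine}
    touch "$workdir/state/am_policy_hub"
}
```

For example, @collect_profile "$(pidof cf-serverd)"@ followed by @mark_policy_hub@ as root on the affected server.
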