The Mender client has a mechanism that counts state transitions and aborts state loading with an error once the number of state loads has exceeded MaximumStateDataStoreCountExceeded. This sends the client straight back to the idle state. The logic exists to detect loops caused either by a faulty ArtifactRollbackVerifyReboot script (the only one that can loop) or by a spontaneous reboot that keeps happening in one of the Update Module steps.
The problem is that this does not account for state transitions that happen because the client is waiting for the server. This was fixed in a very crude way for retry operations, but those are still bounded by a certain number of executions. Ideally, the counter should not increase at all as long as the input is coming from the server. That would allow the client to wait indefinitely.
- Find a way to not increase the state transition counter when a response from the server is what causes the state loop.
- Make server retries not increase the counter, and revert this.