  1. Mender
  2. MEN-2643

Mender client hangs on certain errors from update modules

    Details

    • Type: Bug
    • Status: Open
    • Priority: High
    • Resolution: Unresolved
    • Affects Version/s: 2.0.1
    • Fix Version/s: None
    • Days in progress: 0

      Description

      We're using an update module, one function of which is to cat the rootfs image onto the passive rootfs partition, as follows:

      	Download)
      		# [snip!]
      		rootfs="$(cat stream-next)"
      		cat "$rootfs" > "$passive_rootfs"
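
      For illustration, a minimal sketch of how the write step could check the result of cat explicitly and log which write failed. The helper name write_image is hypothetical and is not taken from the attached va-ota module. It only makes the failure explicit in the module output; it is not claimed to explain or fix the hang reported in this issue.

```shell
#!/bin/sh
# Hypothetical helper, not part of the attached va-ota module.
# Copies the file or stream named by $1 onto the target $2 and
# returns non-zero if the write fails (for example with ENOSPC).
write_image() {
    src=$1
    dst=$2
    if ! cat "$src" > "$dst"; then
        echo "va-ota: writing $src to $dst failed" >&2
        return 1
    fi
    return 0
}
```

      In the Download branch this would be called as: write_image "$rootfs" "$passive_rootfs" || exit 1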
      

      This is partially based on the reimplementation of the default client functionality available in the client repo.

      Due to a bug in our partition layout, the B rootfs partition ended up smaller than the A partition. The update could not succeed, of course, but the Mender client just got stuck instead of failing and notifying the backend.

      Below is a snippet from the mender client output in --debug mode:

      INFO[0062] State transition: update-fetch [Download_Enter] -> update-store [Download_Enter]  module=mender
      DEBU[0062] handle update install state                   module=state
      DEBU[0062] status reported, response 204 No Content      module=client_status
      DEBU[0062] Read data from device manifest file: device_type=pumpkin-mt8516  module=device
      DEBU[0062] Current manifest data: pumpkin-mt8516         module=device
      INFO[0062] no public key was provided for authenticating the artifact  module=installer
      DEBU[0062] checking if device [pumpkin-mt8516] is on compatibile device list: [pumpkin-mt8516]
        module=installer
      DEBU[0062] installer: processing script: ArtifactCommit_Enter_10_uboot-env-set-rw  module=installer
      DEBU[0062] installer: processing script: ArtifactCommit_Error_10_uboot-env-set-ro  module=installer
      DEBU[0062] installer: processing script: ArtifactCommit_Leave_10_uboot-env-set-ro  module=installer
      DEBU[0062] Executing ModuleInstaller.Initialize          module=modules
      DEBU[0062] Returning artifact name from /etc/mender/artifact_info file.  module=device
      DEBU[0062] Read data from device manifest file: artifact_name=pumpkin-brgl-partsize-strace-0.6  module=device
      DEBU[0062] Current manifest data: pumpkin-brgl-partsize-strace-0.6  module=device
      DEBU[0062] Read data from device manifest file: artifact_name=pumpkin-brgl-partsize-strace-0.6  module=device
      DEBU[0062] Read data from device manifest file: device_type=pumpkin-mt8516  module=device
      DEBU[0062] Current manifest data: pumpkin-mt8516         module=device
      DEBU[0062] installer: successfully read artifact [name: pumpkin-brgl-partsize-strace-0.7; version: 3; compatible devices: [pumpkin-mt8516]]  module=installer
      DEBU[0062] Executing ModuleInstaller.PrepareStoreUpdate  module=modules
      DEBU[0062] Calling module: /usr/share/mender/modules/v3/va-ota Download /var/lib/mender/modules/v3/payloads/0000/tree  module=modules
      DEBU[0062] Executing ModuleInstaller.StoreUpdate         module=modules
      INFO[0165] Update module output: cat: write error        module=modules
      INFO[0165] Update module output: : No space left on device  module=modules
      DEBU[0167] Executing ModuleInstaller.FinishStoreUpdate   module=modules
      

      After this, nothing more happens until the device is rebooted.

      While this shouldn't be triggered in a normal situation (the partitions should be the same size), I'm afraid there are other ways the module can fail that the client wouldn't detect, leaving the device non-updatable.

      I used strace to check what the client is doing internally, because I suspected that the module process being killed by SIGPIPE might not be handled correctly. However, the module process exits with exit_group(1), the same as if I were to add an exit 1 line to the scripts (which works fine).
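
      The two cases can also be told apart from a shell without strace. The sketch below assumes a Linux system, where every write to /dev/full fails with ENOSPC, and a shell such as bash or dash that reports a signal death as 128 plus the signal number (141 for SIGPIPE). The helper name describe_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration: describe an exit status using the
# bash/dash convention of 128 + signal number for signal deaths.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "success"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Reproduce the failing write without a mis-sized partition:
# on Linux, every write to /dev/full fails with ENOSPC.
printf 'payload' | cat > /dev/full 2>/dev/null
describe_status $?
```

      A plain exit from cat after the write error shows up as a small non-zero status, whereas a SIGPIPE death would be reported as signal 13.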

      Any ideas on how to proceed? Is this a known bug?

        Attachments

        1. debug-modules-hang.diff
          1 kB
        2. mender-strace.txt
          52 kB
        3. va-ota
          2 kB

            People

            • Assignee: Unassigned
            • Reporter: Bartosz Golaszewski (brgl)
            • Votes: 0
            • Watchers: 6