From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] mount: Cleanup reference counters before build
To: Jan Kiszka, "Moessbauer, Felix", "isar-users@googlegroups.com", Baurzhan Ismagulov
Cc: "henning.schild@siemens.com"
References: <20210623115823.136514-1-amikan@ilbers.de> <36ad6945-e841-c8fb-3779-44768667e7a4@ilbers.de>
From: Anton Mikanovich
Message-ID: <6574c75e-c25b-8060-3f26-681ef842b995@ilbers.de>
Date: Mon, 5 Jul 2021 17:30:35 +0300

29.06.2021 16:49, Jan Kiszka wrote:
> We need a clearer picture what it fixes, what it possibly can't fix (I
> doubt it can fix all error cases), and if we still want it then. Maybe
> start with documenting the API changes and their reasoning.
>
> Also describe more clearly how to reproduce the issues you were seeing
> and trying to fix. That is key to understand if the new solution is
> actually a solution or just shifting the problem from left to right.

To reproduce the original issue (fixed by the `Rebuild mount logic` patchset),
one recipe (linux-mainline in our case) has to be included in two targets, and
its task has to fail on the first build attempt but succeed on the second one.
Moreover, the issue is reproduced only if the second run starts directly after
the failure:

>bitbake mc:de0-nano-soc-buster:isar-image-base mc:stm32mp15x-buster:isar-image-base
>...
>ERROR: Logfile of failure stored in: /opt/isar/build/tmp/work/debian-buster-armhf/linux-mainline/5.4.70-r0/temp/log.do_dpkg_build.12459
>NOTE: recipe linux-mainline-5.4.70-r0: task do_dpkg_build: Failed
>ERROR: Task (mc:stm32mp15x-buster:/opt/isar/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb:do_dpkg_build) failed with exit code '1'
>NOTE: Running task 304 of 350 (mc:de0-nano-soc-buster:/opt/isar/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb:do_dpkg_build)
>NOTE: recipe linux-mainline-5.4.70-r0: task do_dpkg_build: Started
>WARNING: mc:de0-nano-soc-buster:linux-mainline-5.4.70-r0 do_dpkg_build: /opt/isar/build/tmp/deploy/buildchroot-target/debian-buster-armhf//home/builder/linux-mainline: Couldn't unmount, retrying...

In most cases the second task does not start and bitbake exits due to the
error, but when it does start, the build goes into an infinite loop.

In that scenario:

1) mc:stm32mp15x-buster:linux-mainline:do_dpkg_build mounts WORKDIR to
buildchroot-target/WORKDIR, then fails and leaves WORKDIR mounted.
2) mc:de0-nano-soc-buster:linux-mainline:do_dpkg_build performs the same mount
again, which leads to a circular mount
WORKDIR->WORKDIR->buildchroot-target/WORKDIR; the package build itself then
completes without errors.

3) mc:de0-nano-soc-buster:linux-mainline:do_dpkg_build gets stuck inside
dpkg_undo_mounts. When the package build finishes, the
run.dpkg_undo_mounts.PID shell function is put into WORKDIR/temp and executed.
But when run.dpkg_undo_mounts.PID tries to unmount buildchroot-target/WORKDIR,
the kernel also has to unmount the circular mount to WORKDIR, which is locked
by run.dpkg_undo_mounts.PID itself.

This scenario can be demonstrated with the following steps:

$ mkdir dirA
$ mkdir dirB
$ echo "sudo umount /home/user/dirB" > dirA/um
$ sudo mount --bind dirA dirB
$ mount | grep dir
/dev/sdb1 on /home/user/dirB type ext4 (rw,relatime)
$ sudo mount --bind dirA dirB
$ mount | grep dir
/dev/sdb1 on /home/user/dirB type ext4 (rw,relatime)
/dev/sdb1 on /home/user/dirB type ext4 (rw,relatime)
/dev/sdb1 on /home/user/dirA type ext4 (rw,relatime)
$ bash ~/dirA/um
umount: /home/user/dirB: target is busy.

So the mount rebuild patchset makes dpkg_undo_mounts always run, even after a
build failure, and does not allow double mounting. This was done only for
dpkg_do_mounts/dpkg_undo_mounts because only in this scenario is a double mount
critical and able to cause a deadlock. The implementation can be extended to
all the mounts if needed.

We understand that the API change is quite significant. I'm checking whether
the final unmounting could be solved without changing the API.

P.S. in case anyone tries to reproduce the original issue, this patch can help:

diff --git a/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb b/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb
index 209ad9c..49b9739 100644
--- a/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb
+++ b/meta-isar/recipes-kernel/linux/linux-mainline_5.4.70.bb
@@ -32,3 +32,11 @@ dpkg_configure_kernel_append() {
     grep "CONFIG_ROOT_NFS=y" ${S}/${KERNEL_BUILD_DIR}/.config || \
         bbfatal "Self-check failed: CONFIG_ROOT_NFS not enabled"
 }
+
+dpkg_runbuild_prepend() {
+    if [ ! -e "${WORKDIR}/stampfile" ]; then
+        touch "${WORKDIR}/stampfile"
+        sleep 5
+        exit 1
+    fi
+}

-- 
Anton Mikanovich
Promwad Ltd.
External service provider of ilbers GmbH
Maria-Merian-Str. 8
85521 Ottobrunn, Germany
+49 (89) 122 67 24-0
Commercial register Munich, HRB 214197
General Manager: Baurzhan Ismagulov
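
Addendum: the "do not allow double mounting" idea from the message above can be illustrated with a small guard that checks the mount table before bind-mounting. This is only a minimal sketch and not the code from the patchset; the helper names count_mounts and mount_once are hypothetical, and it assumes a Linux system where /proc/self/mountinfo is available (field 5 of each line is the mount point).

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of Isar.

# Print how many times TARGET appears as a mount point.
# The second argument is an optional mount table file, which
# defaults to /proc/self/mountinfo (field 5 = mount point).
count_mounts() {
    target=$1
    table=${2:-/proc/self/mountinfo}
    awk -v t="$target" '$5 == t { n++ } END { print n + 0 }' "$table"
}

# Bind-mount SRC on DST only if DST is not a mount point yet,
# so that a leftover mount from a failed task is not stacked.
mount_once() {
    src=$1
    dst=$2
    if [ "$(count_mounts "$dst")" -gt 0 ]; then
        echo "already mounted, skipping: $dst" >&2
        return 0
    fi
    sudo mount --bind "$src" "$dst"
}
```

With the dirA/dirB demonstration above, calling `mount_once dirA /home/user/dirB` twice would perform the bind mount only on the first call, assuming the path is passed in the same absolute form in which it appears in the mount table. Paths containing spaces are escaped in mountinfo (for example as \040), which this sketch does not handle.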