Subject: Re: [PATCH] isar-bootstrap: Fix and cleanup bind mounting
To: "Maxim Yu. Osipov", "Hombourger, Cedric"
Cc: Henning Schild, isar-users
References: <6f5714bc-d5f5-c08f-c408-b32bab9169fc@siemens.com>
 <20181129193929.61a35056@md1za8fc.ad001.siemens.net>
 <2229f975-0752-ebe3-c165-979e1d5864b2@siemens.com>
 <61a6a17c-06e0-a13b-591e-3ea8bc09632e@ilbers.de>
 <405c22d0-48cd-4ea4-4c1b-c78e6c5570ed@siemens.com>
 <3daa2bd836424990a478b3981f9ca222@svr-ies-mbx-02.mgc.mentorg.com>
From: Jan Kiszka
Message-ID: <66f3fd81-c996-2583-f0d5-ce9db583fe24@siemens.com>
Date: Tue, 4 Dec 2018 18:10:47 +0100

On 04.12.18 17:59, Maxim Yu. Osipov wrote:
> Hi Jan, Cedric,
>
> Another question:
>
> Which debian/kernel do you use inside your VM/docker?

Container is kasproject/kas-isar:latest here, kernel is $random-host.
>
> Is it also stretch?

Yes.

>
> The problem is reproducible at the same point on stretch systems (with
> kernel SMP Debian 4.9.110-3+deb9u6 (2018-10-08) x86_64 GNU/Linux)
>
> when commands are launched by hand:

What was your Isar baseline for that? I tested with this branch:
https://github.com/siemens/isar/commit/03e394b2aa58b5b1404fde76881774baf4a541bc

>
> Command 1)
> bitbake -c cache_base_repo multiconfig:qemuarm-stretch:isar-image-base
> multiconfig:qemuarm64-stretch:isar-image-base
> multiconfig:qemuamd64-stretch:isar-image-base

Did you check if everything was unmounted at this point already?

>
> Command 2)
> sudo rm -rf tmp
>
> Command 3)
> sed -i -e 's/#ISAR_USE_CACHED_BASE_REPO ?= "1"/ISAR_USE_CACHED_BASE_REPO ?= "1"/g' conf/local.conf
>
> No problems detected at this point - the same mounts etc.
>
> The next command hangs on the last task (according to strace, bitbake tries to
> unmount /sys /dev)

And who is holding those mounts (lsof)? What does pstree -apl look like?

BTW, can you recover the build system at this stage? As I said, one of the
symptoms or side effects is the removal of device nodes on the host side when
/dev is mounted at the wrong time.

Jan

>
> Command 4)
> bitbake multiconfig:qemuarm-stretch:isar-image-base
> multiconfig:qemuarm64-stretch:isar-image-base
> multiconfig:qemuamd64-stretch:isar-image-base
>
> Maxim.
>
> On 12/4/18 6:45 PM, Hombourger, Cedric wrote:
>> Good catch & analysis Jan!
>> In our CI, our build script is checking for any mounts relative to the current
>> directory before purging them
>>
>> -----Original Message-----
>> From: Jan Kiszka [mailto:jan.kiszka@siemens.com]
>> Sent: Tuesday, December 4, 2018 6:43 PM
>> To: Maxim Yu. Osipov
>> Cc: Henning Schild; isar-users; Hombourger, Cedric
>> Subject: Re: [PATCH] isar-bootstrap: Fix and cleanup bind mounting
>>
>> On 04.12.18 15:24, [ext] Jan Kiszka wrote:
>>> On 04.12.18 11:49, Maxim Yu. Osipov wrote:
>>>> On 12/3/18 3:59 PM, Jan Kiszka wrote:
>>>>> On 30.11.18 10:20, Maxim Yu. Osipov wrote:
>>>>>> Hi Jan,
>>>>>>
>>>>>> I've just tried this patch (on the 'next' with reverted patch
>>>>>> d40a9ac0) and ran "fast" CI
>>>>>>
>>>>>> isar$mount | wc -l
>>>>>> 34
>>>>>>
>>>>>> isar$./scripts/ci_build.sh -q -f
>>>>>>
>>>>>> CI script hung on CI stage when dpkg-base is modified causing
>>>>>> rebuilding recipes based on dpkg-base.
>>>>>>
>>>>>> The mount reports fewer (!) mount points than before launching the script.
>>>>>>
>>>>>> mount | wc -l
>>>>>> 31
>>>>>
>>>>> Any news on what's different on your side? Where exactly does your
>>>>> build hang? Was your CI environment in a clean state when running
>>>>> this test? Before the commit, lots of things leaked.
>>>>
>>>>
>>>> On my stretch laptop (i7-6820HQ CPU @ 2.70GHz (8 cores) with SSD)
>>>> the reported problem is reproducible (I reran 'ci_build.sh -q -f'
>>>> several times in a clean state): it hung, with fewer mount points
>>>> (the mount points before and after running are attached).
>>>>
>>>> The strange thing is that I observe two bitbake processes:
>>>>
>>>> myo      26373  0.0  0.3 153116 29732 pts/0    Sl+  12:31   0:01
>>>> python3 /home/myo/work/isar/src/trunk/isar/bitbake/bin/bitbake
>>>> multiconfig:qemuarm-stretch:isar-image-base
>>>> multiconfig:qemuarm64-stretch:isar-image-base
>>>> multiconfig:qemuamd64-stretch:isar-image-base
>>>>
>>>> myo      26379  2.5  0.6 328476 50028 ?        Sl   12:31   0:40
>>>> python3 /home/myo/work/isar/src/trunk/isar/bitbake/bin/bitbake
>>>> multiconfig:qemuarm-stretch:isar-image-base
>>>> multiconfig:qemuarm64-stretch:isar-image-base
>>>> multiconfig:qemuamd64-stretch:isar-image-base
>>>>
>>>
>>> We run multiple bitbake sessions one after another. Maybe the first one never
>>> terminates (gets stuck), and that is also why the rm after the first session
>>> fails. You need to stop the build there and analyse what is keeping the mount
>>> points busy.
>>
>> Wait... If I terminate a build from inside the container (i.e. "natively") and
>> then quickly try to delete the build artifacts, I can trigger that infamous
>> empty /dev bug - on the host. That has always been the problem, and that is one
>> reason why we encapsulate things into containers.
>>
>> The reason for this is that bitbake's cooker waits for the last sub-process to
>> finish before it calls the cleanup hook that does all the unmounting. If you
>> delete something before that, you step into the mount point and purge its
>> content. That /may/ be the issue here as well, as we run rm directly after
>> bitbake.
>>
>> IOW: Possibly just a known limitation of the current Isar design /wrt unmounting
>> in isar_handler() that now surfaces in CI. I would not be surprised if you can
>> resolve that by waiting for the last cooker instance to terminate before
>> deleting tmp.
>>
>> Jan
>>
>
>

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
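
PS: A minimal sketch of the guard discussed in this thread: before deleting
the build directory, check whether anything is still mounted below it, and
only remove it once the mount table is clean. This is not Isar or bitbake
code. The function names (mounts_below, remove_when_unmounted), the timeout
values and the polling approach are assumptions made for illustration.
Polling the mount table only approximates "wait for the last cooker instance
to terminate", so treat it as a safety net rather than a fix.

```python
import os
import re
import shutil
import time


def _unescape(field):
    # /proc/<pid>/mounts encodes space, tab, newline and backslash as \ooo octal.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mounts_below(directory, mounts_file="/proc/self/mounts"):
    """Return all mount points located at or below `directory`."""
    root = os.path.realpath(directory)
    found = []
    with open(mounts_file) as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 2:
                continue
            mount_point = _unescape(fields[1])
            # Compare with a trailing separator so "tmp-other" does not match "tmp".
            if mount_point == root or mount_point.startswith(root + os.sep):
                found.append(mount_point)
    return found


def remove_when_unmounted(directory, timeout=60.0, poll=1.0,
                          mounts_file="/proc/self/mounts"):
    """Delete `directory` only once nothing is mounted below it.

    Hypothetical helper: raises RuntimeError instead of deleting if mounts
    are still present after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        busy = mounts_below(directory, mounts_file)
        if not busy:
            shutil.rmtree(directory)
            return
        if time.monotonic() >= deadline:
            raise RuntimeError("still mounted: " + ", ".join(busy))
        time.sleep(poll)
```

In place of the bare "sudo rm -rf tmp" step, one would call
remove_when_unmounted("tmp") with sufficient privileges. If it raises, the
listed mount points are the ones to inspect with lsof before touching
anything, which avoids stepping into a live /dev bind mount.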