Subject: Re: Build can sporadically fail due to dpkg contention in buildchroot
To: Alexander Smirnov, isar-users
From: Jan Kiszka
Message-ID: <1929e7fc-1b8b-bad4-e358-9d2e33e02299@siemens.com>
Date: Wed, 14 Feb 2018 19:04:27 +0100
In-Reply-To: <30016591-1d6b-7a5d-20ad-a8ac30413585@ilbers.de>

On 2018-02-14 18:57, Alexander Smirnov wrote:
> On 02/14/2018 06:02 PM, Jan Kiszka wrote:
>> On 2018-02-12 08:42, [ext] Jan Kiszka wrote:
>>> On 2018-02-11 19:44, Alexander Smirnov wrote:
>>>>
>>>>
>>>> On 02/11/2018 09:18 PM, Jan Kiszka wrote:
>>>>> On 2018-02-11 17:55, Alexander Smirnov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 02/11/2018 06:17 PM, Jan Kiszka wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I got this failure of example-hello:do_build already twice while
>>>>>>> doing rebuild tests with my kernel series (ie. with more
>>>>>>> independent buildchroot users):
>>>>>>>
>>>>>>> DEBUG: Executing shell function do_build
>>>>>>> Get:1 file:/isar-apt isar InRelease
>>>>>>> Ign:1 file:/isar-apt isar InRelease
>>>>>>> Get:2 file:/isar-apt isar Release [2,864 B]
>>>>>>> Get:2 file:/isar-apt isar Release [2,864 B]
>>>>>>> Get:3 file:/isar-apt isar Release.gpg
>>>>>>> Ign:3 file:/isar-apt isar Release.gpg
>>>>>>> Get:4 file:/isar-apt isar/main amd64 Packages [1,135 B]
>>>>>>> Reading package lists...
>>>>>>> W: The repository 'file:/isar-apt isar Release' is not signed.
>>>>>>> hostname: No address associated with hostname
>>>>>>> dh_testdir
>>>>>>> dh_testroot
>>>>>>> dh_prep
>>>>>>> dh_testdir
>>>>>>> dh_testroot
>>>>>>> dh_install
>>>>>>> dh_install: Compatibility levels before 9 are deprecated (level 7 in use)
>>>>>>> dh_installdocs
>>>>>>> dh_installdocs: Compatibility levels before 9 are deprecated (level 7 in use)
>>>>>>> dh_installchangelogs
>>>>>>> dh_compress
>>>>>>> dh_fixperms
>>>>>>> dh_installdeb
>>>>>>> dh_installdeb: Compatibility levels before 9 are deprecated (level 7 in use)
>>>>>>> dh_gencontrol
>>>>>>> dh_md5sums
>>>>>>> dh_builddeb
>>>>>>> dpkg-deb: building package 'hello-build-deps' in '../hello-build-deps_0.2_all.deb'.
>>>>>>>
>>>>>>
>>>>>> Good catch!
>>>>>>
>>>>>>> The package has been created.
>>>>>>> Attention, the package has been created in the current directory,
>>>>>>> not in ".." as indicated by the message above!
>>>>>>> dpkg: error: dpkg status database is locked by another process
>>>>>>> mk-build-deps: dpkg --unpack failed
>>>>>>> [...]
>>>>>>>
>>>>>>> So we have a concurrency problem when building over the same dpkg
>>>>>>> database. Looks like we need to synchronize (lock-protect) the
>>>>>>> access to it, which also means pulling out the dependency
>>>>>>> installation from the regular build step. Is that feasible at all?
>>>>>>> Any alternatives (besides retrying such builds...)?
>>>>>>
>>>>>> In general we could do this easily:
>>>>>>
>>>>>> 1. Split the content of build.sh into two functions, for example:
>>>>>>    - install_build_deps
>>>>>>    - build_package
>>>>>>
>>>>>> 2. Split the bitbake do_build() into two tasks:
>>>>>>
>>>>>> do_install_build_deps() {
>>>>>>       ... build.sh install_build_deps ...
>>>>>> }
>>>>>>
>>>>>> addtask install_build_deps before do_build after do_unpack
>>>>>>
>>>>>> do_build_package() {
>>>>>>       ... build.sh build_package ...
>>>>>> }
>>>>>>
>>>>>> 3. Using bitbake synchronization primitives, protect the first task
>>>>>> from parallel execution.
>>>>>>
>>>>>> If you are OK with this, I could do this tomorrow.
>>>>>
>>>>> I'm still concerned how well this will scale:
>>>>>
>>>>> a) We have additional users we already know of (linux-kernel.bbclass).
>>>>>      We will need to provide them the same means.
>>>>>
>>>>> b) There might be more users hidden in today's or future recipes...
>>>>>
>>>>
>>>> Now we have the pipeline:
>>>>   - do_fetch, do_unpack, do_build.
>>>>
>>>> I propose to extend this pipeline by one extra task:
>>>>   - do_fetch, do_unpack, do_install_build_deps, do_build.
>>>>
>>>> These are the core tasks that have their default payload defined in
>>>> classes, so you should not touch them in custom recipes.
>>>>
>>>> For better scalability, a separate class could be created:
>>>>
>>>> 8<--
>>>>
>>>> dpkg-build-deps.bbclass:
>>>>
>>>> do_install_build_deps() {
>>>>     ... call mk-build-deps
>>>> }
>>>>
>>>> addtask install_build_deps before do_build after do_unpack
>>>>
>>>> 8<--
>>>>
>>>> So if you want this functionality in your class, for example in
>>>> dpkg.bbclass, just include it. I can't imagine that we will have so
>>>> many different classes to build something...
>>>
>>> I'm concerned about all dpkg calls. Are we sure there are no others,
>>> e.g. ones that just query the database, which cannot fail?
>>>
>>> Otherwise, this sounds good to me.
>>
>> Already coded this into a patch, or should I look into it? We desperately
>> need this to restore CI with all its concurrent builds - once they run
>> truly concurrently.
>
> If you mean whether I've done this or not, then no, I haven't. I could
> handle this in the next one or two days; at the moment there are several
> series pending for review/applying.

Yeah, as I said, I'm already looking into this in order to be able to
reproduce your CI setup. I will not model it via locked tasks, that is
too complex, but likely just via flock in build.sh.

Jan
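
For illustration, a minimal sketch of what the flock approach mentioned
above could look like. This is not the actual build.sh change: the lock
file path, the with_dpkg_lock helper and the body of install_build_deps
are assumptions made up for this example. Only flock(1) from util-linux
is a real tool here.

```shell
#!/bin/sh
# Hypothetical sketch: serialize the step that touches the shared dpkg
# database. LOCKFILE, with_dpkg_lock and install_build_deps are
# illustrative names, not taken from Isar.
LOCKFILE="${LOCKFILE:-/tmp/dpkg-builddeps.lock}"

install_build_deps() {
    # Placeholder for the real work (e.g. the mk-build-deps call).
    echo "installing build deps for $1"
}

with_dpkg_lock() {
    # Open the lock file on fd 9 for a subshell, block until the
    # exclusive lock is granted, then run the given command. The lock
    # is dropped when the subshell exits and fd 9 is closed.
    (
        flock 9 || exit 1
        "$@"
    ) 9>"$LOCKFILE"
}

with_dpkg_lock install_build_deps hello
```

With this shape, concurrent builds simply queue up on the lock file
instead of failing with "dpkg status database is locked", and the
actual package build can still run in parallel outside the locked
section.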