X-BeenThere: isar-users@googlegroups.com
Date: Fri, 3 Sep 2021 19:10:57 +0200
From: Henning Schild
Subject: Re: [isar] reproducible build failures
Message-ID: <20210903191057.4eb2394d@md1za8fc.ad001.siemens.net>

Hi there,

On Fri, 3 Sep 2021 15:19:21 +0000, the sender wrote:

> Hi,
>
> I am using isar system in isar-cip-core project [1] where I found
> some reproducible failures, which may be good to fix in the isar
> system. I am not good in modifying the isar system, so could you
> please guide me to fix these problems?

Well ... for Isar, and maybe to some degree also for Debian, a truly
reproducible build is a new topic that has so far been ignored.

> Here are the steps to check the reproducible failures in
> isar-cip-core project:
> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/issues/12
> https://gitlab.com/cip-project/cip-core/isar-cip-core/-/issues/13
>
> I also verified the reproducibility in the isar system and found
> similar failures that are copied below:
> ==============================================
> tmp/gpghomefHv8eRhk43/
> tmp/gpghomefHv8eRhk43/private-keys-v1.d/
> usr/share/doc/hello/changelog.Debian.gz
> var/cache/debconf/config.dat
> var/cache/debconf/config.dat-old
> var/cache/ldconfig/aux-cache
> var/lib/dpkg/info/enable-fsck.md5sums
> var/lib/dpkg/info/example-raw.md5sums
> var/lib/dpkg/info/hello.md5sums
> var/lib/dpkg/info/isar-disable-apt-cache.md5sums
> var/lib/dpkg/info/isar-exclude-docs.md5sums
> var/lib/dpkg/info/sshd-regen-keys.md5sums
> var/lib/initramfs-tools/4.19.0-17-amd64
> var/lib/systemd/catalog/database
> var/log/alternatives.log
> var/log/bootstrap.log
> var/log/dpkg.log
> var/log/apt/history.log
> var/log/apt/term.log
> ==============================================

That said, looking at the list ... it all seems harmless.
Maybe not _all_, but a log file or a date here and there can maybe be
ignored. I never really got the idea ... if one wants "exactly" the
same result, there is no reason to rebuild. You just store/distribute
the binary result. But hey, you might have your reasons and can explain
those.

> Steps to check reproducible failures in isar
> ====================================
> $ . isar-init-build-env ../build1 && bitbake mc:qemuamd64-buster-tgz:isar-image-base
> $ . isar-init-build-env ../build2 && bitbake mc:qemuamd64-buster-tgz:isar-image-base
> $ mkdir -p rootfs1 rootfs2
> $ tar -xzvf ./build1/tmp/deploy/images/qemuamd64/isar-image-base-debian-buster-qemuamd64.tar.gz -C ./rootfs1/
> $ tar -xzvf ./build2/tmp/deploy/images/qemuamd64/isar-image-base-debian-buster-qemuamd64.tar.gz -C ./rootfs2/
> $ rsync -nrclv ./rootfs1/ ./rootfs2/ > difference.txt
> ====================================

This test is not even remotely close to a real one. Here you have been
really lucky: all the "diff" you got was caused by the build itself. If
you introduce a long pause between the builds ... you will actually get
very different results. That is a feature ... because Isar is tracking
Debian. But in scenarios like yours it can be seen as a bug, in which
case you need to build against a custom Debian mirror or
snapshot.debian.org (unfortunately hard because the servers have rate
limits, but isar-image-base could work, or you restart that bitbake a
few times). Snapshot is also a good way to try that ... try a "buster"
from a few months ago for "build1". Or, if you want build1 to track
Debian but build2 _not_ to, you should use
ISAR_USE_CACHED_BASE_REPO = "1" for that offline rebuild.

>
> From the reproducible failures I found there are three different
> areas to fix these problem
>
> 1. Changelog file generation, which is embedding the build time
> date value at here
> (https://github.com/ilbers/isar/blob/master/meta/classes/debianize.bbclass#L34)

That is a good finding if we want to do something about the "problem".
One could maybe derive the "date" from the file-modification time of
the recipe calling deb_debianize. But then you have fun with git and
will need git-restore-mtime. We could also force people to put a fixed
string there and only call date if that string is not in place.

> 2. Log files generated by different application, which are
> adding build time values, I think we can remove these files if it is
> not required after build. ( I tried at here
> https://github.com/ilbers/isar/blob/master/meta/classes/image.bbclass#L183
> but it did not work)

That would be a good place to do it. You could also use
ROOTFS_POSTPROCESS_COMMAND, which allows such things from layers so you
do not need to touch the core. I could envision something like

    for f in "find all files not owned by any package":
        if f starts with /etc:
            continue
        if other funny exception:
            continue
        rm f

In addition to what you want ... this would also shrink that rootfs,
which would be nice even for people who do not care about repro. Logs,
tmpdirs and caches would be nice to get rid of.

> 3. Cache and temporary files, I think we can delete these files
> also.

See previous. Just do it all at once by asking the package manager
which files it does not know. This will also enforce a really nice
discipline on users not to abuse ROOTFS_POSTPROCESS_COMMAND to smuggle
files into the rootfs.

> Please guide me to fix these issues.

So while I am not 100% sold on the whole repro idea ... and on whether
it can really be done in complex layers ... you are really not building
a complicated thing here. More realistic use-cases will contain many
more packages built by Isar, maybe introducing their own share of
"repro" mistakes.

So the "cherry on the cake" would be a helper script that allows anyone
to spot repro diffs. It would run the same build twice, once online and
once with ISAR_USE_CACHED_BASE_REPO, and spit out two folders and a
diff summary. That would give anyone with repro in mind a chance to
check their layer. In fact one has to wonder whether such a script
should be added to OE or already exists there. And one would add it to
CI to find new problems as they are introduced.

I think that allowing people to provide a DEBIAN_CHANGELOG_DATE to
enforce a fixed string instead of calling date could be an interesting
patch. And a "delete everything not owned by any package" step would
make a really nice addition as well. Both are steps that are valuable
on their own and happen to also work in the direction of reproducible
builds.

regards,
Henning

> Thanks,
> Venkata.
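
For anyone who wants to experiment with the "files not owned by any
package" idea, here is a minimal sketch, not an Isar feature. It
assumes the rootfs carries dpkg's per-package file lists under
var/lib/dpkg/info/*.list; the function name list_unowned and the two
exceptions (/etc and dpkg's own state) are made up for illustration. It
only prints candidates and deletes nothing, because on a merged-/usr
rootfs dpkg may record /bin/foo while the file lives at /usr/bin/foo,
so the output needs review before anything is removed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Isar): print files and symlinks in a
# rootfs that appear in none of dpkg's per-package file lists.
list_unowned() {
    rootfs="$1"
    owned=$(mktemp)
    present=$(mktemp)

    # Every path dpkg knows about, one absolute path per line.
    cat "$rootfs"/var/lib/dpkg/info/*.list 2>/dev/null \
        | LC_ALL=C sort -u > "$owned"

    # Every file or symlink actually present, rewritten to the same form.
    (cd "$rootfs" && find . -xdev \( -type f -o -type l \) | sed 's|^\.||') \
        | LC_ALL=C sort -u > "$present"

    # Keep lines found only in "present", then apply the assumed
    # exceptions: configuration under /etc and dpkg's own database.
    LC_ALL=C comm -13 "$owned" "$present" \
        | grep -v -e '^/etc/' -e '^/var/lib/dpkg/' || true

    rm -f "$owned" "$present"
}
```

Calling list_unowned ./rootfs1 on an unpacked image should print paths
such as the logs and caches from the list above. Wiring this into
ROOTFS_POSTPROCESS_COMMAND, and deciding on further exceptions, is left
open here.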