From: Claudius Heine <ch@denx.de>
To: "Maxim Yu. Osipov" <mosipov@ilbers.de>,
Henning Schild <henning.schild@siemens.com>,
"[ext] claudius.heine.ext@siemens.com"
<claudius.heine.ext@siemens.com>
Cc: isar-users@googlegroups.com
Subject: Re: [RFC PATCH 0/3] Reproducible build
Date: Fri, 25 May 2018 19:04:53 +0200
Message-ID: <3a6032fee718de6cf44fff4e8051a8c7a89a6471.camel@denx.de>
In-Reply-To: <a3f5bf99-6023-c254-b781-1ce89ffac0b1@ilbers.de>
Hi Maxim,
On Fri, 2018-05-25 at 13:57 +0200, Maxim Yu. Osipov wrote:
> Hi Claudius,
>
> Let me summarize the patchset status.
>
> 1. This patchset is just a first step towards more generic traceable
> reproducible build - tarball versioning/reproducibility features are
> out
> of scope of this patch set.
It is the first step and an RFC patch. It is mainly meant to support the
discussion about the possible options. So if we are on the same page
about the solution for reproducibility in Isar and how to go forward
from there, I will post a non-RFC patch with documentation.
> 2. Baurzhan always asks to provide some bits of information into the
> documentation when a new feature is added (or changed).
See answer above.
>
> 3. Henning asked about "stealing DL_DIR of bitbake as well" (see his
> email below). What is your opinion?
My opinion is that we should document what files need to be archived
in order to reproduce the complete build. It is also that we (as in
isar) cannot dictate to the user how she has to archive those
artifacts.
There are many different systems for archiving binary files available
and we can offer them some suggestions or even write some example code
for a couple of systems, but we should still be able to support all of
their own ideas. Those code examples can be in the form of some
shell/python scripts, but I would be against binding isar to one
such system at this point.
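To illustrate what such an example script could look like, here is a
minimal Python sketch that records which package files would have to be
archived. It only lists the .deb files found in a package cache
directory (for instance a copy of /var/cache/apt/archives) together with
their size and SHA-256 checksum. The function names and the manifest
format are assumptions made up for this illustration; they are not part
of isar or of the RFC patch.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(cache_dir):
    """Return (name, size, sha256) for every .deb in cache_dir, sorted by name."""
    entries = []
    for deb in sorted(Path(cache_dir).glob("*.deb")):
        entries.append((deb.name, deb.stat().st_size, sha256_of(deb)))
    return entries


def write_manifest(cache_dir, manifest_path):
    """Write one 'sha256  size  name' line per package (hypothetical format)."""
    entries = build_manifest(cache_dir)
    with open(manifest_path, "w", encoding="utf-8") as out:
        for name, size, digest in entries:
            out.write(f"{digest}  {size}  {name}\n")
    return entries
```

A user could store such a manifest next to the archived files in
whatever archival system she prefers and compare it against the cache
of a later build.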
To be honest I didn't 100% get what Henning meant as 'stealing DL_DIR
of bitbake'. I suppose that he meant putting files from there into the
tarball as well? But I currently don't see a good reason for doing so.
My patch was about producing an artifact that isn't covered by the
DL_DIR and the normal reproducibility mechanisms of bitbake, mixing
this just causes redundancy and confusion IMO.
I also skipped answering this question because I thought I had answered
it in the passage before by pointing out that any expansions to this
(like source packages) can be done later.
Just to make it clear: I don't want to shut down discussion about this
by always responding to any argument against this solution that it
can be done later. Henning's and Baurzhan's points of critique are very
welcome, because they ask: "Can your solution really be expanded to
include those use-cases/requirements?"
For instance this partial update feature is something that is not so
easy to do with this simple mechanism. It would be much easier to do
this if aptly had "proxy caching" support [1] and we would use that to
solve reproducibility in isar. Also Henning's point about source
packages could be done more easily with some coding inside an apt repo
proxy/web server. So are we just saving complexity now and getting it later
in heaps, or are we gaining a simple normal case while having some
hurdles in the odd special one? I don't know yet. Please tell me!
What we have now is a solution space, from a simple solution like this
RFC patch to a possibly complex solution with an "apt caching proxy".
Maybe someone can think of a good solution in between, or point out some
important feature or UX concern that requires a more complex approach.
Here are some ideas I have seen mentioned and my opinion on them
including some pros and cons I just came up with. This is from memory,
so please correct me if I remembered something incorrectly:
Idea 0: Store tarball of debootstrap output with filled apt cache and
use that to restore isar-bootstrap.
Critique 0: That's in short my 'simple solution'
Pro: simple to implement
Con: Debootstrap process is not done on every build.
Archival of a binary root file system is strange.
How to archive source packages?
=> add apt-get source to the installation process
How to handle partial update?
=> write a script that generates an isar recipe that deploys
those packages to the isar-apt repo.
Idea 1: Generate a repository from the cache and use that for the next
debootstrap run.
Critique 1: Similar to my 'simple solution' but adds the creation of an
additional repository to it. -> higher complexity
Pro: debootstrap process is done on every build.
Con: Different apt repo urls are used.
For me that is a no-go, because that means the configuration
is different between the initial and subsequent builds.
How to add new packages later? (maybe like partial update?)
How to handle multiple repos?
=> map all repos from initial run to the local one.
And then what? => cannot be reverted, loss of information
How to archive source packages? (same as Idea 0)
How to handle partial update? (same as Idea 0)
Idea 2: Like idea 1 but with aptly. And then use aptly to manage
packages.
Critique 2: I am not that familiar with aptly, so please correct me.
Pro: debootstrap process is done on every build.
Better repo management tools.
Con: Different apt repo urls are used.
Need a whole mirror locally? (See Idea 3 and 4)
Dependency on external tool.
Possibly some roadblocks since aptly isn't really designed for
our use case.
Idea 3: Create a whole repo mirror with aptly or similar and strip
unused packages later.
Critique 3:
Pro: debootstrap process is done on every build.
Better repo management tools.
Con: Need a whole mirror locally.
For me that is a no-go as well, it should only be downloaded
what is necessary for a build, nothing more.
Dependency on external tool.
Adding new packages later is a double step: adding in aptly then to isar.
Possibly some roadblocks since aptly isn't really designed for
our use case.
Idea 4: Create a whole repo mirror with aptly or similar and import
used package into a new repo.
Critique 4:
Pro: debootstrap process is done on every build.
Better repo management tools.
Con: Different apt repo urls are used.
Need a whole mirror locally?
That might be unnecessary. Per aptly documentation it could be
possible to create a mirror with a package filter to
only allow used packages. Then this is similar to idea 2.
Dependency on external tool.
Possibly some roadblocks since aptly isn't really designed for
our use case.
Idea 5: Implementing a 'caching proxy' feature in aptly.
Critique 5:
Pro: debootstrap process is done on every build.
Better repo management tools.
Con: Dependency on external tool.
Needs some implementation in aptly.
Idea 6: Implementing a caching proxy feature in isar.
Critique 6: That was my initial idea way back.
Pro: debootstrap process is done on every build.
Con: Needs a lot of python scripting and code maintenance in isar.
If I missed or misrepresented an idea, please don't hesitate to correct
me or add them.
Ideas 2 to 4 are just slight variations of one another. Those are just
the different ideas I could imagine using aptly for our purposes.
Because of the contra arguments 'whole local mirror' and 'different apt
repo urls are used' I would go for 0 and 5.
Idea 6 I discarded after some experimentation. Writing an async HTTP
proxy with only std-lib python is a pain. However, writing blocking code
with thread pools might be easier.
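For reference, a blocking variant with one thread per request could
look roughly like the following sketch. It uses only the Python standard
library (http.server.ThreadingHTTPServer and urllib.request). The
upstream URL, cache directory, port and all function/class names are
assumptions chosen for illustration; this is not the code I experimented
with, and it ignores details a real apt proxy needs (HEAD and Range
requests, HTTPS, index expiry).

```python
import os
import shutil
import tempfile
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://deb.debian.org"  # assumption: a single upstream mirror
CACHE_DIR = "apt-proxy-cache"       # hypothetical local cache directory


def cache_path(cache_dir, url_path):
    """Map a request path to a file below cache_dir, refusing path traversal."""
    relative = os.path.normpath(url_path.split("?", 1)[0].lstrip("/"))
    if relative == ".." or relative.startswith(".." + os.sep) or os.path.isabs(relative):
        raise ValueError("suspicious path: %r" % url_path)
    return os.path.join(cache_dir, relative)


def fetch(url, destination):
    """Download url into destination via a temp file and an atomic rename."""
    directory = os.path.dirname(destination)
    os.makedirs(directory, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as response:
            shutil.copyfileobj(response, out)
        # Atomic rename: concurrent threads fetching the same file stay safe.
        os.replace(tmp_name, destination)
    except BaseException:
        os.unlink(tmp_name)
        raise


class CachingHandler(BaseHTTPRequestHandler):
    upstream = UPSTREAM
    cache_dir = CACHE_DIR

    def do_GET(self):
        try:
            local = cache_path(self.cache_dir, self.path)
        except ValueError:
            self.send_error(400, "bad path")
            return
        if not os.path.isfile(local):
            # Cache miss: fetch lazily from upstream and keep the file forever.
            try:
                fetch(self.upstream + self.path, local)
            except OSError:  # urllib.error.URLError is a subclass of OSError
                self.send_error(502, "upstream fetch failed")
                return
        self.send_response(200)
        self.send_header("Content-Length", str(os.path.getsize(local)))
        self.end_headers()
        with open(local, "rb") as fh:
            shutil.copyfileobj(fh, self.wfile)


def serve(port=8080):
    """Run the proxy; ThreadingHTTPServer handles each request in its own thread."""
    ThreadingHTTPServer(("127.0.0.1", port), CachingHandler).serve_forever()
```

Calling serve() and pointing the apt sources at http://127.0.0.1:8080/
would then fill the cache directory on the first build. Since index
files are cached as well and never refreshed, later builds would see the
same package versions, which is the reproducibility property we are
after.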
So I think I rambled enough now... Sorry for that.
Cheers for anyone left reading to this point,
Claudius
[1] What I mean by this is that aptly operates as some kind of lazy
fetching http/ftp/rsync/... repo. Any request for an unavailable file
triggers a download from upstream, and the file is then stored on the
build machine. I don't mean that aptly should necessarily be an http
proxy, it could also just be a web server.
>
> Kind regards,
> Maxim.
>
> On 05/25/2018 10:10 AM, Claudius Heine wrote:
> > Hi Henning,
> >
> > On 05/24/2018 06:00 PM, Henning Schild wrote:
> > > Am Wed, 23 May 2018 08:32:03 +0200
> > > schrieb "[ext] claudius.heine.ext@siemens.com"
> > > <claudius.heine.ext@siemens.com>:
> > >
> > > > From: Claudius Heine <ch@denx.de>
> > > >
> > > > Hi,
> > > >
> > > > this patchset contains a implementation of my proposed solution
> > > > for
> > > > reproducible builds.
> > > >
> > > > I am currenlty not quite sure if that is the right approach,
> > > > but it is
> > > > the simplest I can think of currently.
> > >
> > > I did not look at the patches yet. And because it sounds so
> > > simple my
> > > > first reaction is that it can not be complete.
> > > >
> > > > One thing we will
> > > need for sure is the sources that lead to the
> > > packages we built ourselfs, otherwise we can not rebuild them
> > > later on.
> > > And that seems to be a tricky part, not covered by stealing the
> > > cache.
> >
> > You are right, this solution is not complete and Rom was not build
> > on
> > one day. My goal was to improve the situation just one small step
> > and
> > then build on top of it.
> >
> > > Maybe stealing the DLDIR of bitbake as well?
> > >
> > > > As already described in my proposal, this patchset does the
> > > > following:
> > > >
> > > > > 1. Takes care that the package cache in the isar-bootstrap
> > > > > root file system contains all the packages used for this
> > > > > distro/architecture.
> > > > > 2. A tarball is created after the package cache
> > > > > contains all the packages needed by the image.
> > >
> > > Are you sure that "apt-get clean" is the only reason for cache
> > > eviction? What will happen if i install a ton of packages, not
> > > that apt
> > > will want to safe space at some point.
> >
> > Yes, it might be useful to set the apt.conf to disable all
> > autocleaning
> > options.
> > But normally apt removes packages from cache only if they are no
> > longer
> > downloadable and since the local index of the upstream repos are
> > not
> > updated it shouldn't detect if they are no longer downloadable and
> > therefore not remove them. Disabling this completely is still the
> > better
> > option.
> >
> > Claudius
> >
> > >
> > > Henning
> > >
> > > > 3. This tarball can be used as the basis of subsequent
> > > > builds by
> > > > setting a bitbake variable.
> > > >
> > > > This is just a first draft of this feature, maybe we can
> > > > further
> > > > improve some steps and maybe there are better ideas to improve
> > > > the
> > > > usability.
> > > >
> > > > Cheers,
> > > > Claudius
> > > >
> > > > Claudius Heine (3):
> > > > meta/isar-bootstrap-helper+dpkg.bbclass: bind mount
> > > > /var/cache/apt/archives
> > > > meta/classes/image: added isar_bootstrap_tarball task
> > > > meta/isar-bootstrap: add 'do_restore_from_tarball' task
> > > >
> > > > meta/classes/dpkg.bbclass | 5 ++++
> > > > meta/classes/image.bbclass | 10 +++++++
> > > > meta/classes/isar-bootstrap-helper.bbclass | 9 ++++++-
> > > > > .../isar-bootstrap/isar-bootstrap.bb | 27 ++++++++++++++++++-
> > > > > 4 files changed, 49 insertions(+), 2 deletions(-)
> > > >
>
>
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-54 Fax: (+49)-8142-66989-80 Email: ch@denx.de
PGP key: 6FF2 E59F 00C6 BC28 31D8 64C1 1173 CB19 9808 B153
Keyserver: hkp://pool.sks-keyservers.net