Hi Maxim,

On Fri, 2018-05-25 at 13:57 +0200, Maxim Yu. Osipov wrote:
> Hi Claudius,
> 
> Let me summarize the patchset status.
> 
> 1. This patchset is just a first step towards more generic traceable
> reproducible build - tarball versioning/reproducibility features are
> out of scope of this patch set.

It is the first step and an RFC patch. It is mainly meant to support the discussion about the possible options. So if we are on the same page about the solution for reproducibility in Isar and how to go forward from there, I will post a non-RFC patch with documentation.

> 2. Baurzhan always asks to provide some bits of information into the
> documentation when a new feature is added (or changed).

See the answer above.

> 3. Henning asked about "stealing DL_DIR of bitbake as well" (see his
> email below). What is your opinion?

My opinion is that we should document what files need to be archived in order to reproduce the complete build. It is also my opinion that we (as in Isar) cannot dictate to the user how she has to archive those artifacts. There are many different systems for archiving binary files available; we can offer some suggestions or even write some example code for a couple of systems, but we should still be able to support all of their own ideas. Those code examples can be in the form of some shell/python scripts, but I would be against binding Isar to one such system at this point.

To be honest, I didn't 100% get what Henning meant by 'stealing DL_DIR of bitbake'. I suppose he meant putting files from there into the tarball as well? But I currently don't see a good reason for doing so. My patch was about producing an artifact that isn't covered by the DL_DIR and the normal reproducibility mechanisms of bitbake; mixing this just causes redundancy and confusion IMO. I also skipped answering this question because I thought I had answered it in the passage before by pointing out that any expansions to this (like source packages) can be done later.
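As an illustration of the kind of example code mentioned above, here is a minimal sketch of an archiving helper in Python. Everything in it (the content-addressed store layout, the MANIFEST file, the function name) is a hypothetical suggestion for discussion, not part of the patchset or of Isar:

```python
#!/usr/bin/env python3
"""Hypothetical sketch: archive build artifacts content-addressed.

This is only an example of the sort of helper script Isar could suggest
to users; the store layout and names are made up for illustration.
"""
import hashlib
import shutil
from pathlib import Path


def archive_artifact(artifact: Path, store: Path) -> str:
    """Copy `artifact` into `store` under its SHA-256 hash and record it
    in a manifest, so a build can later be reproduced from exactly the
    same inputs. Returns the hash."""
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / digest
    if not target.exists():  # content-addressed: identical files stored once
        shutil.copy2(artifact, target)
    with open(store / "MANIFEST", "a") as manifest:
        manifest.write(f"{digest}  {artifact.name}\n")
    return digest


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(archive_artifact(Path(path), Path("artifact-store")))
```

A user could point this at the bootstrap tarball and whatever else her setup needs, or replace it entirely with her own archiving system; the point is only that Isar documents *what* to archive, not *how*.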
Just to make it clear: I don't want to shut down discussion by replying to every argument against this solution with "it can be done later". Henning's and Baurzhan's critique points are very welcome, because they ask: "Can your solution really be expanded to include those use cases/requirements?" For instance, the partial update feature is something that is not so easy to do with this simple mechanism. It would be much easier if aptly had "proxy caching" support [1] and we used that to solve reproducibility in Isar. Also, Henning's point about source packages could be handled more easily with some coding inside an apt repo proxy/web server. So are we just saving complexity now, only to get it later in heaps, or are we gaining a simple normal case while having some hurdles in the odd special one? I don't know yet. Please tell me!

What we have now is a solution space, from a simple solution like this RFC patch to a possibly complex solution with an "apt caching proxy". Maybe someone can think of a good solution in between, or can tell me if some important feature or UX concern requires a more complex approach.

Here are some ideas I have seen mentioned and my opinion on them, including some pros and cons I just came up with. This is from memory, so please correct me if I remembered something incorrectly.

Idea 0: Store a tarball of the debootstrap output with a filled apt cache and use that to restore isar-bootstrap.

Critique 0: That's, in short, my 'simple solution'.

Pro:
 - Simple to implement.

Con:
 - The debootstrap process is not done on every build.
 - Archiving a binary root file system is strange.
 - How to archive source packages?
   => add apt-get source to the installation process
 - How to handle partial update?
   => write a script that generates an Isar recipe that deploys those packages to the isar-apt repo

Idea 1: Generate a repository from the cache and use that for the next debootstrap run.
Critique 1: Similar to my 'simple solution', but adds the creation of an additional repository. -> higher complexity

Pro:
 - The debootstrap process is done on every build.

Con:
 - Different apt repo URLs are used. For me that is a no-go, because it means the configuration differs between the initial and subsequent builds.
 - How to add new packages later? (maybe like partial update?)
 - How to handle multiple repos?
   => map all repos from the initial run to the local one. And then what?
   => cannot be reverted, loss of information
 - How to archive source packages? (same as Idea 0)
 - How to handle partial update? (same as Idea 0)

Idea 2: Like Idea 1, but with aptly, and then use aptly to manage packages.

Critique 2: I am not that familiar with aptly, so please correct me.

Pro:
 - The debootstrap process is done on every build.
 - Better repo management tools.

Con:
 - Different apt repo URLs are used.
 - Need a whole mirror locally? (see Ideas 3 and 4)
 - Dependency on an external tool.
 - Possibly some roadblocks, since aptly isn't really designed for our use case.

Idea 3: Create a whole repo mirror with aptly or similar and strip unused packages later.

Critique 3:

Pro:
 - The debootstrap process is done on every build.
 - Better repo management tools.

Con:
 - Need a whole mirror locally. For me that is a no-go as well; only what is necessary for a build should be downloaded, nothing more.
 - Dependency on an external tool.
 - Adding new packages later is a double step: add them in aptly, then in Isar.
 - Possibly some roadblocks, since aptly isn't really designed for our use case.

Idea 4: Create a whole repo mirror with aptly or similar and import the used packages into a new repo.

Critique 4:

Pro:
 - The debootstrap process is done on every build.
 - Better repo management tools.

Con:
 - Different apt repo URLs are used.
 - Need a whole mirror locally? That might be unnecessary: per the aptly documentation it could be possible to create a mirror with a package filter that only allows the used packages. Then this is similar to Idea 2.
 - Dependency on an external tool.
 - Possibly some roadblocks, since aptly isn't really designed for our use case.

Idea 5: Implement a 'caching proxy' feature in aptly.

Critique 5:

Pro:
 - The debootstrap process is done on every build.
 - Better repo management tools.

Con:
 - Dependency on an external tool.
 - Needs some implementation work in aptly.

Idea 6: Implement a caching proxy feature in Isar.

Critique 6: That was my initial idea way back.

Pro:
 - The debootstrap process is done on every build.

Con:
 - Needs a lot of python scripting and code maintenance in Isar.

If I missed or misrepresented an idea, please don't hesitate to correct me or add it. Ideas 2 to 4 are just slight variations of one another; those are just the different ways I could imagine using aptly for our purposes.

Because of the contra arguments 'whole local mirror' and 'different apt repo URLs are used', I would go for 0 and 5. Idea 6 I discarded after some experimentation: writing an async HTTP proxy with only std-lib python is a pain. However, writing blocking code with thread pools might be easier.

So I think I rambled enough now... Sorry for that.

Cheers to anyone left reading to this point,
Claudius

[1] What I mean by this is that aptly would operate as some kind of lazily fetching http/ftp/rsync/... repo. Any request for an unavailable file is served by downloading the file from upstream and then storing it on the build machine. I don't mean that aptly should necessarily be an HTTP proxy; it could also just be a web server.

> 
> Kind regards,
> Maxim.
> 
> On 05/25/2018 10:10 AM, Claudius Heine wrote:
> > Hi Henning,
> > 
> > On 05/24/2018 06:00 PM, Henning Schild wrote:
> > > Am Wed, 23 May 2018 08:32:03 +0200
> > > schrieb "[ext] claudius.heine.ext@siemens.com"
> > > :
> > > 
> > > > From: Claudius Heine
> > > > 
> > > > Hi,
> > > > 
> > > > this patchset contains a implementation of my proposed solution
> > > > for reproducible builds.
> > > > 
> > > > I am currenlty not quite sure if that is the right approach,
> > > > but it is the simplest I can think of currently.
> > > 
> > > I did not look at the patches yet. And because it sounds so
> > > simple my first reaction is that it can not be complete.
> > > One thing we will need for sure is the sources that lead to the
> > > packages we built ourselfs, otherwise we can not rebuild them
> > > later on.
> > > And that seems to be a tricky part, not covered by stealing the
> > > cache.
> > 
> > You are right, this solution is not complete and Rom was not build
> > on one day. My goal was to improve the situation just one small
> > step and then build on top of it.
> > 
> > > Maybe stealing the DLDIR of bitbake as well?
> > > 
> > > > As already described in my proposal, this patchset does the
> > > > following:
> > > > 
> > > > 1. Takes care that the package cache in the isar-bootstrap
> > > > root file system contains all the packages used for this
> > > > distro/architecture.
> > > > 2. A tarball is created after the package cache contains all
> > > > the packages needed by the image.
> > > 
> > > Are you sure that "apt-get clean" is the only reason for cache
> > > eviction? What will happen if i install a ton of packages, not
> > > that apt will want to safe space at some point.
> > 
> > Yes, I might be useful to set the apt.conf to disable all
> > autocleaning options.
> > But normally apt removes packages from cache only if they are no
> > longer downloadable and since the local index of the upstream
> > repos are not updated it shouldn't detect if they are no longer
> > downloadable and therefore not remove them. Disabling this
> > completely is still the better option.
> > 
> > Claudius
> > 
> > > 
> > > Henning
> > > 
> > > > 3. This tarball can be used as the basis of subsequent
> > > > builds by setting a bitbake variable.
> > > > 
> > > > This is just a first draft of this feature, maybe we can
> > > > further improve some steps and maybe there are better ideas
> > > > to improve the usability.
> > > > 
> > > > Cheers,
> > > > Claudius
> > > > 
> > > > Claudius Heine (3):
> > > >   meta/isar-bootstrap-helper+dpkg.bbclass: bind mount
> > > >     /var/cache/apt/archives
> > > >   meta/classes/image: added isar_bootstrap_tarball task
> > > >   meta/isar-bootstrap: add 'do_restore_from_tarball' task
> > > > 
> > > >  meta/classes/dpkg.bbclass                  |  5 ++++
> > > >  meta/classes/image.bbclass                 | 10 +++++++
> > > >  meta/classes/isar-bootstrap-helper.bbclass |  9 ++++++-
> > > >  .../isar-bootstrap/isar-bootstrap.bb       | 27 ++++++++++++++++++-
> > > >  4 files changed, 49 insertions(+), 2 deletions(-)
> > > 

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-54 Fax: (+49)-8142-66989-80 Email: ch@denx.de
PGP key: 6FF2 E59F 00C6 BC28 31D8 64C1 1173 CB19 9808 B153
Keyserver: hkp://pool.sks-keyservers.net