From: Claudius Heine <claudius.heine.ext@siemens.com>
To: "Maxim Yu. Osipov" <mosipov@ilbers.de>, isar-users@googlegroups.com
Cc: Silvano Cirujano Cuesta <silvano.cirujano-cuesta@siemens.com>
Subject: Re: [RFC PATCH 0/3] Reproducible build
Date: Wed, 20 Jun 2018 10:12:14 +0200
Message-ID: <b2cec89a-90bb-e8c2-e60c-68ccebce8b05@siemens.com>
In-Reply-To: <383e7f38-82f9-cc3c-dfec-f974d0e0bfc0@ilbers.de>
Hi Maxim,
On 2018-06-20 06:20, Maxim Yu. Osipov wrote:
> Hi Claudius,
>
> On 06/14/2018 10:50 AM, Claudius Heine wrote:
>> Hi,
>>
>> On 2018-06-11 15:51, [ext] Claudius Heine wrote:
>>> Hi,
>>>
>>> On 2018-06-11 10:45, [ext] Claudius Heine wrote:
>>>> Hi Maxim,
>>>>
>>>> On 2018-06-07 10:08, Maxim Yu. Osipov wrote:
>>>>> Hi Claudius,
>>>>>
>>>>> As far as I understood, 'apt-move' doesn't fit your requirements.
>>>>
>>>> The documented functionality of apt-move would fit the requirements,
>>>> but since it's no longer maintained and has bugs that make it
>>>> incompatible with debootstrap, it cannot be used.
>>>>
>>>>> Nevertheless, a diagram illustrating the approach you propose would
>>>>> be very helpful.
>>>>>
>>>>> As for diagram you attached to email below, honestly, it's rather
>>>>> difficult to understand it - too many arrows :(. It's worth to
>>>>> additionally describe the steps in text (and put corresponding
>>>>> numbers on the picture).
>>>>
>>>> Yes, I know it's a bit overwhelming with all those arrows. It was much
>>>> worse before and this is the result of much simplification.
>>>>
>>>> But either way, since we cannot use apt-move I have to investigate
>>>> what is possible with reprepro and change the diagram accordingly.
>>>
>>> reprepro does provide many nice features that might be interesting to
>>> use. For instance the 'gensnapshot' command. This caused me to
>>> further redesign this approach as the diagram attached shows.
>>>
>>> I know this kind of diagram does not 100% follow the UML meaning, but
>>> I could not find a diagram type that really fits what I want to show
>>> here.
>>>
>>> This time I try to explain a bit more about what those arrows mean
>>> and how to read this diagram.
>>>
>>> Dashed lines symbolize dependencies between steps or 'packages' (in
>>> this case I mean 'isar-bootstrap', 'buildchroot', 'recipes' and
>>> 'image'). So if they are followed from the 'debootstrap download' in
>>> the 'isar-bootstrap' package upwards, then this is the execution
>>> order. When the 'package import' component is reached, the next
>>> 'package' in the dependency graph is started. This is 'buildchroot'
>>> or 'image'. Normally that is the buildchroot. Only if there are no
>>> recipes around that need a buildchroot could it be skipped. That
>>> means the last step to be executed is the 'finished' one.
>>>
>>> The arrows going out from those database things mean that packages
>>> are used from there. Those arrows have the same arrow head as the
>>> dependency arrows. In this diagram I made the lines of the arrows
>>> that are going out from 'local partial upstream mirror' a bit thicker
>>> to better differentiate them from the other ones.
>>>
>>> The arrows that are going to the 'to local mirror', 'to isar
>>> repository' or 'create local snapshot' nodes have a slightly different
>>> arrow head. These arrows mean that there are debian packages that are
>>> added to repos.
>>>
>>> The basic idea is that after every step that installs or upgrades
>>> packages to the rootfs those packages are added to the local partial
>>> mirror. This can be done by first doing a download-only step adding
>>> it to the repo and then doing an install from the repo or the cache.
>>>
>>> I also changed how the repo is built. The current idea is to have one
>>> repository with 2 components. One component for all upstream packages
>>> and one for all isar built packages. When the snapshot is generated
>>> in the end it will create one containing both components.
>>>
>>> Maybe we should start adding the build time to all files in the
>>> deploy directory. This way we could add this to the name of the
>>> snapshot as well, so the association between those is made clear.
>>>
>>> What do you think?
>>
>> This solution has one issue I can currently think of. It doesn't solve
>> this:
>>
>> >>> U3.4. Remove packages not used in any previous commit.
>> >>
>> >> I am currently not sure what you mean by that. Why would there be
>> >> packages that aren't used in any previous commits?
>> >
>> > Bad wording, I meant just "remove unused packages".
>>
>> We could solve that by always creating a fresh repository and add the
>> repository from the old build as primary source for the current one.
>>
>> However, this is getting even more complex and I might need more
>> arrows... :/
>>
>> It would be simple if we didn't need to combine this with updatability
>> of selected packages and automated fetching of new packages.
>
> I agree with you - we may drop this use case as this is a kind of
> overkill which makes design more complex.
>
> I need some clarification on your diagram on the box 'apt upgrade'. The
> comment to 'apt upgrade' states "prefers local mirror over upstream".
>
> Do I understand correctly that as soon as we created our local partial
> mirror it will be only updated as a side effect of installation of build
> dependencies (buildchroot or package) or if the package listed in
> image's IMAGE_PREINSTALL is not in the local repo?
>
> So we don't call 'apt upgrade' over upstream apt repos anymore, right?
So the basic idea is that the local partial mirror is given a priority
of 1001 via apt preferences. So calling an unrestricted `apt update` and
`apt upgrade` can be done anytime in the build. `apt upgrade` would
prefer installing any package from the local partial mirror over any
upstream distribution, even if upstream has a more recent version of that
package.

Only packages that are not available in the local partial mirror would
be fetched from upstream, since it has to get them from somewhere. After
that, though, the new package is added to the local partial mirror, so
any builds after that would fetch the same version of the package from
the local partial mirror instead of from upstream.
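
To make the selection rule concrete, here is a small Python model of it.
This is only a sketch of the behaviour described above, not apt's real
resolver: versions are simplified to tuples of integers, the repository
names are made up, and 500 is assumed as the usual default priority of
an unpinned source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source: str      # repository offering this version (hypothetical names)
    priority: int    # pin priority of that repository
    version: tuple   # simplified version, e.g. (1, 2); not a Debian version


def pick(candidates):
    """Highest pin priority wins; the version only breaks ties."""
    return max(candidates, key=lambda c: (c.priority, c.version))


LOCAL, UPSTREAM = 1001, 500  # 500: assumed default for unpinned sources

# Package exists in both places, upstream is newer: local mirror still wins.
both = [
    Candidate("local-mirror", LOCAL, (1, 0)),
    Candidate("upstream", UPSTREAM, (1, 2)),
]
# Package is missing from the local mirror: upstream is the only choice.
only_upstream = [Candidate("upstream", UPSTREAM, (2, 0))]

print(pick(both).source)
print(pick(only_upstream).source)
```

The first call models an old build being repeated, the second one models
a package that has to come from upstream once before it is added to the
local partial mirror.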
> It would help in understanding how local mirroring works if you
> describe the case when we need a package which is not present in our
> local mirror (just imagine the case that we stick for a long time to our
> local mirror and we add a new package to IMAGE_PREINSTALL which is not
> in our local mirror and this package depends on the updated versions of
> upstream packages).
OK, I will try to describe some scenarios that might help to understand
what I have in mind:

1. Adding a package to an old build

   - Add the package to the IMAGE_PREINSTALL list.
   - Start the build.
   - Everything up to the image recipe runs the same as before, since
     all packages that are used by isar-bootstrap and buildchroot are
     the same and can be fetched from the local partial mirror.
     The only difference is that the image contains an updated index of
     the upstream repositories.
   - In the image recipe a new package is installed that is not
     available in the local partial mirror. That package is then fetched
     from the upstream mirror and installed.
     - That could cause a conflict with the old versions of the rest of
       the system and hopefully lead to an error, requiring manual
       steps to add an old but compatible package to the local partial
       mirror by hand.
       I don't think this kind of stuff can be prevented or automated
       by isar reasonably.
   - The newly installed package is added to the local partial
     mirror, making it available for subsequent builds.

2. Removing a package from an old build

   This requires some changes to the current system. A new 'local partial
   mirror repo' needs to be created at every build and filled with the
   packages needed for that specific build. This way no unused packages
   find their way into the repo.
   The old local partial mirror repo will be used as an immutable package
   repository with the 1001 priority.
   This approach borrows from concepts like copy-on-write and functional
   programming.

   - Remove the package from the IMAGE_PREINSTALL list.
   - Start the build.
   - Everything is done as in the previous build, using the local
     partial mirror repository from that build as a base.
     The removed package is neither installed to the image nor added to
     the fresh local partial mirror repo.
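
The first two scenarios can be sketched in a few lines of Python. This
is a toy model under simplifying assumptions: repositories are plain
dicts mapping package name to version, dependency resolution is ignored,
and the function name and version strings are purely illustrative.

```python
def run_build(wanted, previous_mirror, upstream):
    """Return the fresh mirror for this build.

    previous_mirror is only read, never modified (the immutable base).
    """
    fresh = {}
    for name in wanted:
        if name in previous_mirror:
            # Known package: reuse exactly the version of the last build.
            fresh[name] = previous_mirror[name]
        else:
            # New package: fetched from upstream once, then kept.
            fresh[name] = upstream[name]
    return fresh


previous = {"bash": "4.4-5", "vim": "8.0-1"}
upstream = {"bash": "4.4-6", "vim": "8.1-1", "htop": "2.2-1"}

# 'vim' was removed from the package list and 'htop' was added.
mirror = run_build(["bash", "htop"], previous, upstream)
print(mirror)
```

'bash' keeps its old version, 'htop' comes from upstream, and 'vim' never
reaches the fresh mirror, while the mirror of the previous build stays
untouched.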

3. Updating a package from an old build

   The goal here is to get an updated version of a package into the image
   and, by extension, into the local partial mirror repo.
   How to solve this depends on where the package originates.

   If it's a specific version the developer wants, it might be better to
   download that package and add it to isar via a recipe. Since the
   apt-isar repo, containing the packages built by isar, should have
   a higher priority than the local partial mirror repository, the
   specific version would be preferred over any version from upstream
   or the local partial mirror repo.
   To solve this, we might need to implement some helper scripts outside
   of the bitbake build, similar to the `bitbake-layers` script.

   If the developer generally wants to use *some* version of the package
   that is currently used by upstream, pinning rules could be used for
   one run and then disabled again, just to make sure that those
   packages are now in the local partial mirror repo.

   The other solution for both of those is to modify the local partial
   mirror outside of the bitbake build. This is the way I showed in the
   diagram as well. I am a bit hesitant to do that though, since it
   modifies something that is used for reproducible builds outside of
   the build.

   Maybe if we have the changes I described in point 2, we could just
   add those packages to the newly created local partial mirror cache,
   but then this needs to be part of the build as well. This would
   result in these priorities:

     1. local partial mirror repo from previous build: 1001
     2. local partial mirror repo from current build: 1002
     3. isar apt repo for current isar built packages: 1003

   So what I meant was adding the updated packages to 2 before the
   build is started. And I am hesitant about adding packages to 1,
   because that modifies stuff from a previous build.
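
As a rough illustration, the three priorities could end up in an apt
preferences file like the one this Python snippet renders. The stanza
layout follows apt_preferences(5); the origin names are hypothetical
and would have to match the `Origin` field of each repository's Release
file, so treat this as a sketch rather than what isar would generate.

```python
# (origin, priority) pairs; the origin names are assumptions for this sketch.
PINS = [
    ("isar-mirror-previous", 1001),  # local partial mirror, previous build
    ("isar-mirror-current", 1002),   # local partial mirror, current build
    ("isar", 1003),                  # packages built by isar itself
]


def render_preferences(pins):
    """Render one 'Package / Pin / Pin-Priority' stanza per repository."""
    stanzas = [
        f"Package: *\nPin: release o={origin}\nPin-Priority: {priority}\n"
        for origin, priority in pins
    ]
    # Stanzas in a preferences file are separated by blank lines.
    return "\n".join(stanzas)


PREFERENCES = render_preferences(PINS)
print(PREFERENCES)
```

All three values are above 1000, so a version from any of these repos
would be chosen even if that means a downgrade relative to upstream.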

   If updating certain packages from upstream should be done within the
   build, then it could be done between setting the sources.list entry
   and the apt preferences. This way those packages will be fetched from
   upstream if it is more up-to-date there and then added to the local
   partial mirror cache.

   In any case, though, this could lead to a 'local partial mirror repo'
   that uses a mix of many different upstream repo snapshots, where it
   can be obscure as to *why* specific versions are used.

The first two points behave similarly when adding or removing build or
runtime dependencies; they then just apply to the buildchroot as well.

regards,
Claudius
>
> Kind regards,
> Maxim.
>
>> Cheers,
>> Claudius
>>
>>>
>>> best regards,
>>> Claudius
>>>
>>>>
>>>> Claudius
>>>>
>>>>>
>>>>> Kind regards,
>>>>> Maxim.
>>>>>
>>>>> On 06/05/2018 12:42 PM, Claudius Heine wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 2018-06-04 18:05, [ext] Claudius Heine wrote:
>>>>>>> I will try to post a small graphic about this soon.
>>>>>>
>>>>>> I attached the design Jan and me came up with yesterday.
>>>>>>
>>>>>> Hopefully its understandable enough even with all those arrows
>>>>>> pointing around. I tried to vary the arrow lines and heads to
>>>>>> signify dependencies/execution order, using of repositories and
>>>>>> deploying packages to the repositories.
>>>>>>
>>>>>> The main point about this is that it will work with apt
>>>>>> preferences repository pinning and that we will try to move all
>>>>>> installed packages to the local cache after every step.
>>>>>>
>>>>>> I highlighted the new components in the diagram as green and the
>>>>>> changed components as light green.
>>>>>>
>>>>>> If this is the way we want to go then I would try to get a RFC
>>>>>> patchset started.
>>>>>>
>>>>>> Cheers,
>>>>>> Claudius
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-54 Fax: (+49)-8142-66989-80 Email: ch@denx.de