* RFC: base-apt caching improvements
@ 2019-02-27 9:05 Henning Schild
2019-02-28 9:40 ` Henning Schild
2019-03-01 17:38 ` Baurzhan Ismagulov
0 siblings, 2 replies; 4+ messages in thread
From: Henning Schild @ 2019-02-27 9:05 UTC (permalink / raw)
To: isar-users, Maxim Yu. Osipov, Alexander Smirnov, Kiszka,
Jan (CT RDA IOT SES-DE)
Hi,
I did not really like the current approach to how we cache, because I
thought we can and should do that transparently and route all
"apt" downloads through a proxy that will reprepro all files as they are
requested.
I did not find a proxy that can do that; maybe I did not look hard
enough.
So I came back to the currently implemented model: collect the files
from the rootfs-es after we are all done. Currently that is done by
taking them from the apt cache. Unfortunately that is not
guaranteed to work. Any package we install could mess with the apt
configuration and therefore with the caching in the rootfs-es (e.g. something
like /etc/apt/apt.conf.d/docker-clean, which you will find in container images).
So that cache cannot be trusted. It also does not work for
source packages and cannot fulfill our needs with regard to full
caching/archiving.
So instead we should go and download all debs and sources of the
currently installed packages of all rootfs-es (target and its
buildchroots) explicitly and reprepro from there.
Here is what this could look like:
rm -rf /tmp/foo
mkdir -p /tmp/foo
cd /tmp/foo
dpkg -l | grep "^ii" | awk '{print $2"="$3}' | xargs apt-get -y download
### reprepro *.deb
rm -rf /tmp/foo-src
mkdir -p /tmp/foo-src
cd /tmp/foo-src
dpkg -l | grep "^ii" | awk '{print $2"="$3}' | xargs apt-get -y source --download-only
### reprepro *.dsc
This still lacks filtering out all the packages from isar-apt, but it could
be an improvement over our current approach. We do not need to trust those fragile
caches because we are explicit. We can do sources ... and should probably
just do all of them anyway.
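
A minimal sketch of the download step above, wrapped in functions. It assumes that dpkg-query supports the ${db:Status-Abbrev} and ${binary:Package} fields. The names installed_pins and fetch_debs and the dldir argument are made up for illustration, and the reprepro import is left out.

```shell
#!/bin/sh
# Print "package=version" for every fully installed package.
# Reads lines of "<status-abbrev> <package> <version>" on stdin, the format
# produced by the dpkg-query call in fetch_debs below.
installed_pins() {
    awk '$1 == "ii" { print $2 "=" $3 }'
}

# fetch_debs DLDIR (hypothetical helper)
# Recreate DLDIR and download the .deb of every installed package into it.
fetch_debs() {
    dldir="$1"
    [ -n "$dldir" ] || return 1
    rm -rf "$dldir"
    mkdir -p "$dldir" || return 1
    (
        # Subshell, so the caller's working directory is left alone.
        cd "$dldir" || exit 1
        dpkg-query --show \
            --showformat '${db:Status-Abbrev} ${binary:Package} ${Version}\n' |
            installed_pins | xargs apt-get -y download
    )
}
```

Something like "fetch_debs /tmp/foo" would then run inside each rootfs, followed by the reprepro step.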
For selective sources we could "apt-get source" into a central dldir and
copy the sources to the recipe workdir. That would be a matter of mounting the
shared dir and changing do_apt_fetch to use it as a staging dir and leave a copy there.
In fact we (Siemens) want all sources, and it might be a sane default for
others as well.
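
A rough sketch of the staging idea, not of how do_apt_fetch works today. It assumes the central dldir is already mounted and writable. stage_source, dsc_files and all argument names are hypothetical.

```shell
#!/bin/sh
# Print the file names listed in the "Files:" section of a .dsc read on stdin.
dsc_files() {
    awk '
        /^Files:/        { in_files = 1; next }
        in_files && /^ / { print $3; next }
                         { in_files = 0 }
    '
}

# stage_source DLDIR WORKDIR SRC VERSION (hypothetical helper)
# Download the source package into DLDIR unless its .dsc is already there,
# then copy the .dsc and the files it lists into WORKDIR.
stage_source() {
    dldir="$1"; workdir="$2"; src="$3"; ver="$4"
    dsc="${src}_${ver#*:}.dsc"    # .dsc file names omit the epoch
    mkdir -p "$dldir" "$workdir" || return 1
    if [ ! -e "$dldir/$dsc" ]; then
        ( cd "$dldir" && apt-get -y source --download-only "$src=$ver" ) || return 1
    fi
    cp "$dldir/$dsc" "$workdir/" || return 1
    dsc_files < "$dldir/$dsc" | while read -r f; do
        cp "$dldir/$f" "$workdir/" || exit 1
    done
}
```

The central dldir keeps every downloaded source for archiving, while the recipe only sees its own copy.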
Henning
* Re: RFC: base-apt caching improvements
From: Henning Schild @ 2019-02-28 9:40 UTC (permalink / raw)
To: isar-users, Maxim Yu. Osipov, Alexander Smirnov, Kiszka,
Jan (CT RDA IOT SES-DE)
On Wed, 27 Feb 2019 10:05:36 +0100,
"[ext] Henning Schild" <henning.schild@siemens.com> wrote:
> Hi,
>
> I did not really like the current approach to how we cache, because I
> thought we can and should do that transparently and route all
> "apt" downloads through a proxy that will reprepro all files as they
> are requested.
> I did not find a proxy that can do that; maybe I did not look hard
> enough.
>
> So I came back to the currently implemented model: collect the files
> from the rootfs-es after we are all done. Currently that is done by
> taking them from the apt cache. Unfortunately that is not
> guaranteed to work. Any package we install could mess with the apt
> configuration and therefore with the caching in the rootfs-es (e.g.
> something like /etc/apt/apt.conf.d/docker-clean, which you will find
> in container images).
>
> So that cache cannot be trusted. It also does not work for
> source packages and cannot fulfill our needs with regard to full
> caching/archiving.
>
> So instead we should go and download all debs and sources of the
> currently installed packages of all rootfs-es (target and its
> buildchroots) explicitly and reprepro from there.
>
> Here is what this could look like:
>
> rm -rf /tmp/foo
> mkdir -p /tmp/foo
> cd /tmp/foo
> dpkg -l | grep "^ii" | awk '{print $2"="$3}' | xargs apt-get -y download
> ### reprepro *.deb
> rm -rf /tmp/foo-src
> mkdir -p /tmp/foo-src
> cd /tmp/foo-src
> dpkg -l | grep "^ii" | awk '{print $2"="$3}' | xargs apt-get -y source --download-only
> ### reprepro *.dsc
>
> This still lacks filtering out all the packages from isar-apt but
Here are some snippets to filter out isar-apt. It seems non-trivial
and I ended up parsing the output. It might actually require some perl to dig into
the guts of apt.
apt-cache policy sed | grep -A 1 "\*\*\* $( dpkg-query --show --showformat '${Version}' sed )" | grep -q "file:/isar-apt"
or
grep -e "^Package: hello$" /var/lib/apt/lists/_isar-apt_dists_isar_main*_Packages -A 1 | grep "Version: $( dpkg-query --show --showformat '${Version}' hello )"
The way we currently filter out isar-apt is pretty fragile and should
be improved. We find packages by matching file names. Instead we should
read the .debs with dpkg-deb and read the repo with reprepro.
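
A small sketch of the first check, assuming the usual layout of "apt-cache policy" output: the installed version is marked with "***" and its sources follow on lines indented by eight spaces. Unlike "grep -A 1", it looks at every source line of the installed version, not only the first one. The function name is made up.

```shell
#!/bin/sh
# Reads "apt-cache policy <pkg>" output on stdin.
# Succeeds if the installed version (marked "***") is offered by isar-apt.
installed_from_isar_apt() {
    awk '
        /^ \*\*\* /         { inst = 1; next }    # installed version entry
        inst && /^        / {                     # one of its source lines
            if ($2 ~ /^file:\/isar-apt/) found = 1
            next
        }
                            { inst = 0 }          # any other line ends the entry
        END                 { exit !found }
    '
}
```

It would be used as "apt-cache policy sed | installed_from_isar_apt".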
Henning
> could be an improvement to our current way. We do not need to trust
> those fragile caches because we are explicit. We can do sources ...
> and should probably just do all of them anyways.
> For selective sources we could "apt-get source" into a central dldir
> and copy the sources to the recipe workdir. Would be a matter of
> mounting that shared dir and changing do_apt_fetch to use that
> staging dir to leave a copy. In fact we (Siemens) want all sources,
> and it might be a sane default for others as well.
>
> Henning
>
* Re: RFC: base-apt caching improvements
From: Baurzhan Ismagulov @ 2019-03-01 17:38 UTC (permalink / raw)
To: isar-users
On Wed, Feb 27, 2019 at 10:05:36AM +0100, Henning Schild wrote:
> So instead we should go and download all debs and sources of the
> currently installed packages of all rootfs-es (target and its
> buildchroots) explicitly and reprepro from there.
Thanks for the clear description. I agree this approach is better. Should we
look at that?
> This still lacks filtering out all the packages from isar-apt but could
> be an improvement to our current way.
We have isar-apt filtering in place, which could be adapted to the new
solution. Or does it have problems?
With kind regards,
Baurzhan.
* Re: RFC: base-apt caching improvements
From: Henning Schild @ 2019-03-05 9:53 UTC (permalink / raw)
To: Baurzhan Ismagulov; +Cc: isar-users
On Fri, 1 Mar 2019 18:38:41 +0100,
Baurzhan Ismagulov <ibr@radix50.net> wrote:
> On Wed, Feb 27, 2019 at 10:05:36AM +0100, Henning Schild wrote:
> > So instead we should go and download all debs and sources of the
> > currently installed packages of all rootfs-es (target and its
> > buildchroots) explicitly and reprepro from there.
>
> Thanks for the clear description. I agree this approach is better.
> Should we look at that?
I currently do not have the time to work on it. So I decided to write
down my findings and document them, for anyone to implement. Or for
myself to implement eventually.
If you want to work on it, go ahead!
> > This still lacks filtering out all the packages from isar-apt but
> > could be an improvement to our current way.
>
> We have isar-apt filtering in place, which could be adapted to the new
> solution. Or does it have problems?
I wrote something about that in my reply. The current implementation can be
improved by reading deb meta-data with dpkg-deb.
When looping over all installed packages and fetching sources, the
isar-apt packages will fail. So you need to detect their
"isar-apt"-ness before you start downloading; the current approach
detects it only before reprepro.
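
A sketch of that ordering, assuming the apt-cache policy check from my earlier reply is good enough as a predicate. from_isar_apt and upstream_only are hypothetical names.

```shell
#!/bin/sh
# Succeeds if the installed version of package $1 comes from isar-apt.
# Same check as the apt-cache policy one-liner earlier in the thread.
from_isar_apt() {
    apt-cache policy "$1" |
        grep -A 1 "\*\*\* $(dpkg-query --show --showformat '${Version}' "$1")" |
        grep -q "file:/isar-apt"
}

# Read package names on stdin and print only those that do not come from
# isar-apt, i.e. the ones that are safe to hand to "apt-get download" or
# "apt-get source".
upstream_only() {
    while read -r pkg; do
        from_isar_apt "$pkg" || printf '%s\n' "$pkg"
    done
}
```

The filtered list would then feed the download loop, so isar-apt packages never reach apt-get at all.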
Henning
> With kind regards,
> Baurzhan.
>