From: "Roberto A. Foglietta" <roberto.foglietta@gmail.com>
To: "Moessbauer, Felix" <felix.moessbauer@siemens.com>
Cc: "ubely@ilbers.de" <ubely@ilbers.de>,
"isar-users@googlegroups.com" <isar-users@googlegroups.com>,
"ibr@radix50.net" <ibr@radix50.net>,
"Schild, Henning" <henning.schild@siemens.com>
Subject: Re: Better way to handle apt cache needed
Date: Fri, 30 Dec 2022 00:15:33 +0100 [thread overview]
Message-ID: <CAJGKYO4LnC-Yomn2cVRibAdh0VaYL7b10_F_kaa4e0PWF7Bm5w@mail.gmail.com> (raw)
In-Reply-To: <c2cfeb38d7da65f4b968d28dff5ab2489eac0cac.camel@siemens.com>
On Wed, 28 Dec 2022 at 12:04, Moessbauer, Felix
<felix.moessbauer@siemens.com> wrote:
>
> On Wed, 2022-12-28 at 13:23 +0300, Uladzimir Bely wrote:
> > In a mail from Wednesday, 28 December 2022, 12:45:07 +03, user
> > Moessbauer, Felix wrote:
> > > On Wed, 2022-12-28 at 10:21 +0100, Baurzhan Ismagulov wrote:
> > > > On Wed, Dec 28, 2022 at 09:02:13AM +0000, Moessbauer, Felix
> > > > wrote:
> > > > > The root cause of that behavior is the apt cache
> > > > > (deb_dl_dir_(import|export)), which copies all previously
> > > > > downloaded apt packages into the WORKDIR of each (bitbake)
> > > > > package. Given that a common apt cache is around 2GB and 8
> > > > > tasks run in parallel, this already gives 16GB for the tasks,
> > > > > plus 7 * 2GB for the buildchroots (host and target), in total
> > > > > ~30GB.
> > > >
> > > > Thanks Felix for the report. IIRC, it was previously mounted and
> > > > was supposed to be converted to per-package hardlinks to
> > > > parallelize sbuild instances and ease debugging (by knowing later
> > > > which exact snapshot was fed to a specific build). We use small
> > > > (1- / 2-TB) SSDs as job storage and a huge increase would have
> > > > been noticeable... We'll check.
> > >
> > > Thanks!
> > >
> > > I just noticed that we even make an additional copy to have the
> > > packages inside the sbuild chroot.
> > >
> > > This behavior is hard to notice on small and medium-sized projects
> > > (and given that IOPS are not an issue there). But any quadratic
> > > behavior will eventually make the build impossible. And as Florian
> > > said, many of our ISAR users build in VMs on shared filesystems,
> > > where IO is extremely expensive / slow. If we could optimize that,
> > > it would be a huge benefit for a lot of users.
> >
> > I've just done some measurements:
> >
> > - ran an Isar build for qemuarm64 (8 cores = max 8 build tasks in
> > parallel)
> > - measured system disk consumption every 5 seconds
> >
> > Results:
> > - the 'downloads/deb' directory finally takes 480MiB
> > - after the build finished, free disk space had decreased by ~9GiB
> > - during the build, the maximum free disk space decrease was ~16GiB
> >
> > It means that we really had about 16 - 9 = 7GiB of space temporarily
> > used by parallel builds, and it indeed corresponds to 8 * 2 * 480MiB.
> >
> > So, the main goal now is to minimize this value.
>
> Even the 7GiB should be further reduced, as a major contributor to it
> is the apt cache for the buildchroots and the images themselves, which
> is never cleaned up. While this is not so much of an issue for CI
> systems, it is definitely annoying on local builds, especially when
> working on multiple ISAR-based projects.
>
> I would recommend having a look at the build/tmp folder using ncdu -x
> (-x to not cross filesystems / temporary mounts).
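As a minimal sketch of that kind of inspection (the path is an example; build/tmp is the usual ISAR build output directory, adjust to your setup), the biggest consumers can also be listed non-interactively:

```shell
# List the ten biggest entries under the build tree.
# -x keeps du on one filesystem, like ncdu -x, so temporary
# bind mounts inside the build tree are not counted.
du -xsh build/tmp/* 2>/dev/null | sort -rh | head -n 10
```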
Hi all,

I made some changes to reduce disk usage and to speed up the build.
The results are quite impressive, so before anything else I am going
to show you the numbers:
results before and after the changes:

    before            |  after
    43954 Mb (max)    |  8657 Mb (max)
    26548 Mb (rest)   |  4118 Mb (rest)
     3741 Mb (deb)    |  3741 Mb (deb)
      820 Mb (wic)    |   820 Mb (wic)
    11789 Mb (cache)  |   579 Mb (cache)
    time: 8m33s       |  time: 4m34s
The changes have been committed on the default branch (evo) of my ISAR
fork: https://github.com/robang74/isar
* 9fd5282 - changes for a faster build using less disk space, p.1
* 1b6ac1e - bugfix: no sstate archive obtainable, will run full task instead
* 2cc1854 - bugfix: do_copy_boot_files was able to fail but then set -e
The two bug fixes are necessary to test for regressions, because the
top commit changes the sstate cache creation. The top commit uses
physical hard links to obtain these results, so it may not work in
some cases, such as distributed or networked filesystems which do not
support hard links (they might be smart enough to fall back to a plain
copy, but I cannot guarantee that).

This is also the reason why I titled the patch "part one". I am
thinking of addressing the issue in a more radical way, but for the
moment this seems quite interesting.
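The hard-link approach can be sketched roughly like this (function and path names are illustrative, not the actual ISAR shell functions):

```shell
# Populate a per-task download dir from the shared apt cache.
# cp -l creates hard links instead of copying data; it fails when
# source and destination are on different filesystems (or when the
# filesystem lacks hard-link support), so fall back to a real copy.
import_debs() {
    src="$1"; dst="$2"
    mkdir -p "$dst"
    cp -l "$src"/*.deb "$dst"/ 2>/dev/null || cp "$src"/*.deb "$dst"/
}
```

With hard links, each of the 8 parallel tasks shares the same on-disk blocks with the download cache, which is where most of the measured savings would come from.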
I would like some feedback about this limitation with uncommon
filesystems, and about which of them provide a nice fallback for
unsupported features.

Looking forward to your opinions, R-
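For what it's worth, a build could also detect the limitation up front instead of relying on error fallbacks; a hypothetical probe (not part of the patch) might look like:

```shell
# Return 0 if the filesystem holding directory "$1" supports hard
# links, by creating and linking a throwaway probe file there.
supports_hardlinks() {
    probe="$1/.hardlink-probe.$$"
    ( touch "$probe" && ln "$probe" "$probe.ln" ) 2>/dev/null
    rc=$?
    rm -f "$probe" "$probe.ln"
    return $rc
}
```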