From: Uladzimir Bely <ubely@ilbers.de>
To: "Roberto A. Foglietta" <roberto.foglietta@gmail.com>
Cc: "isar-users@googlegroups.com" <isar-users@googlegroups.com>
Subject: Re: Better way to handle apt cache needed
Date: Fri, 30 Dec 2022 07:38:57 +0300
Message-ID: <3494812.dWV9SEqChM@hp>
In-Reply-To: <CAJGKYO4LnC-Yomn2cVRibAdh0VaYL7b10_F_kaa4e0PWF7Bm5w@mail.gmail.com>

In mail from Friday, December 30, 2022, 02:15:33 +03 user Roberto A.
Foglietta wrote:
> On Wed, 28 Dec 2022 at 12:04, Moessbauer, Felix
> 
> <felix.moessbauer@siemens.com> wrote:
> > On Wed, 2022-12-28 at 13:23 +0300, Uladzimir Bely wrote:
> > > In mail from Wednesday, December 28, 2022, 12:45:07 +03 user
> > > Moessbauer, Felix wrote:
> > > > On Wed, 2022-12-28 at 10:21 +0100, Baurzhan Ismagulov wrote:
> > > > > On Wed, Dec 28, 2022 at 09:02:13AM +0000, Moessbauer, Felix
> > > > > 
> > > > > wrote:
> > > > > > The root cause for that behavior is the apt cache
> > > > > > (deb_dl_dir_(import|export)), which copies all previously
> > > > > > downloaded apt packages into the WORKDIR of each (bitbake)
> > > > > > package. Given that a common apt cache is around 2GB and 8
> > > > > > tasks run in parallel, this already gives 16GB for the tasks,
> > > > > > plus 7 * 2GB for the buildchroots (host and target), in total
> > > > > > ~30GB.
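
For illustration, the import step amounts to something like the
following (a loose sketch with assumed paths, not ISAR's actual
implementation of deb_dl_dir_import):

    # each package task imports the shared apt cache by copying it in
    # full into its own WORKDIR -- roughly 2GB duplicated per task
    mkdir -p "$WORKDIR/apt-cache"
    cp -a "$DL_DIR/deb/." "$WORKDIR/apt-cache/"
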
> > > > > 
> > > > > Thanks Felix for the report. IIRC, it was previously mounted
> > > > > and was supposed to be converted to per-package hardlinks to
> > > > > parallelize sbuild instances and ease debugging (by knowing
> > > > > later which exact snapshot was fed to a specific build). We use
> > > > > small (1- / 2-TB) SSDs as job storage and a huge increase would
> > > > > have been noticeable... We'll check.
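
The per-package hardlink variant would keep the per-task snapshot
semantics at almost no extra space cost, e.g. (a minimal sketch,
assuming the cache and the WORKDIR live on the same filesystem):

    # hardlink instead of copy: each task still gets its own snapshot,
    # but the package data exists on disk only once
    cp -al "$DL_DIR/deb/." "$WORKDIR/apt-cache/"
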
> > > > 
> > > > Thanks!
> > > > 
> > > > I just noticed that we even make an additional copy to have the
> > > > packages inside the sbuild chroot.
> > > > 
> > > > This behavior is hard to notice on small and medium-sized projects
> > > > (and when IOPS are not an issue). But any quadratic behavior will
> > > > eventually make the build impossible. And as Florian said, many of
> > > > our ISAR users build in VMs on shared filesystems, where IO is
> > > > extremely expensive / slow. If we could optimize that, it would be
> > > > a huge benefit for a lot of users.
> > > 
> > > I've just done some measurements:
> > > 
> > > - ran an Isar build for qemuarm64 (8 cores = max 8 build tasks in
> > >   parallel)
> > > - measured system disk consumption every 5 seconds
> > > 
> > > Results:
> > > - the 'downloads/deb' directory finally takes 480MiB
> > > - after the build finished, free disk space had decreased by ~9GiB
> > > - during the build, the maximum decrease was ~16GiB
> > > 
> > > It means that we really had about 16 - 9 = 7GiB of space temporarily
> > > used by parallel builds, which roughly corresponds to 8 * 2 * 480MiB
> > > (= 7.5GiB).
> > > 
> > > So, the main goal now is to minimize this value.
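
For reference, such a measurement can be done with a simple sampling
loop (a sketch; the mount point and log file name are assumptions):

    # log used disk space (MiB) every 5 seconds while the build runs
    while sleep 5; do
        df -BM --output=used /build | tail -n1
    done >> disk-usage.log
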
> > 
> > Even the 7GiB should be further reduced, as a major contributor to it
> > is the apt cache for the buildchroots and the images themselves, which
> > is never cleaned up. While this is not so much of an issue for CI
> > systems, it is definitely annoying on local builds, especially when
> > working on multiple ISAR-based projects.
> > 
> > I would recommend having a look at the build/tmp folder using ncdu -x
> > (-x to not cross filesystems / temporary mounts).
> 
> Hi all,
> 
> I made some changes to reduce disk usage and to speed up the build.
> The results are quite impressive, so before anything else I am going
> to show you the numbers:
> 
> results before and after the changes:
> 
> before           | after
> -----------------+-----------------
> 43954 Mb (max)   |  8657 Mb (max)
> 26548 Mb (rest)  |  4118 Mb (rest)
>  3741 Mb (deb)   |  3741 Mb (deb)
>   820 Mb (wic)   |   820 Mb (wic)
> 11789 Mb (cache) |   579 Mb (cache)
> time: 8m33s      | time: 4m34s
> 
> The changes have been committed on the default branch (evo) of my ISAR fork:
> 
>  https://github.com/robang74/isar
> 
> * 9fd5282 - changes for a faster build using less disk space, p.1
> * 1b6ac1e - bugfix: no sstate archive obtainable, will run full task instead
> * 2cc1854 - bugfix: do_copy_boot_files was able to fail but then set -e
> 
> The two bug fixes are necessary to test for regressions, because the
> top commit changes the sstate cache creation. The top commit uses
> physical hard links to obtain these results, so it may not work in
> some cases, like distributed or networked filesystems that do not
> support hard links (however, they might be smart enough to fall back
> to a plain copy, but I cannot guarantee that).
> 
> That is why I titled the patch "part one". I am thinking of addressing
> the issue in a more radical way, but for the moment this approach seems
> quite interesting.
> I would like some feedback about the limitations of using uncommon
> filesystems, and about which ones offer a nice fallback for unsupported
> features.
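
One possible shape for such a fallback (a minimal sketch; the helper
name is hypothetical) is to try the hard link first and degrade to a
plain copy where the filesystem refuses it:

    # hardlink when supported, otherwise fall back to a real copy
    link_or_copy() {
        ln -f "$1" "$2" 2>/dev/null || cp -a "$1" "$2"
    }
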
> 
> Looking forward to your opinions, R-

Hello Roberto.

I also did some similar improvements. They are not as radical as yours,
and the benefit is not as big, but anyway, I'm sending the 1st version
of the patchset. For the 2nd version, I plan to play with your idea of
CACHEDIR.TAG and the `tar --exclude-caches` option, and to improve the
additional internal sbuild copying that you've already implemented.
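
For the record, the CACHEDIR.TAG convention is simple: a directory
containing a CACHEDIR.TAG file that starts with the standard signature
is skipped by `tar --exclude-caches`. A minimal sketch (the directory
layout is an assumption):

    # mark the apt cache so archivers can skip it
    printf 'Signature: 8a477f597d28d172789f06886806bc55\n' \
        > "$WORKDIR/apt-cache/CACHEDIR.TAG"
    # the tagged directory's contents are then left out of the archive
    tar --exclude-caches -cf sstate.tar -C "$WORKDIR" .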
