public inbox for isar-users@googlegroups.com
From: Uladzimir Bely <ubely@ilbers.de>
To: "isar-users@googlegroups.com" <isar-users@googlegroups.com>
Subject: Re: Better way to handle apt cache needed
Date: Thu, 05 Jan 2023 09:31:36 +0300
Message-ID: <25742987.1r3eYUQgxm@home>
In-Reply-To: <371e4d826cca6aaba11a4222fef547b134ed6ce7.camel@siemens.com>

On Wednesday, 28 December 2022 12:02:13 +03, Moessbauer, Felix wrote:
> Hi,
> 
> when working with builds that have both many recipes, as well as many
> build dependencies, disk usage explodes during the build. As both
> preconditions somehow correspond, this results in a quadratic disc
> consumption in the number of tasks during the build.
> 
> The root cause for that behavior is the apt cache
> (deb_dl_dir_(import|export)), that copies all previously downloaded apt
> packages into the WORKDIR of each (bitbake) package.
> Given, that a common apt-cache is around 2GB and 8 tasks are run in
> parallel, this gives already 16GB for the tasks, and 7 * 2GB for the
> buildchroots (host and target), in total ~30GB.
> 
> In one of my projects, we have to work with huge debian packages,
> leading to apt-cache sizes around 20GB. As these projects usually also
> have to be built on big machines with many cores, you easily get 500GB
> of required scratch disk space + a lot of disc accesses for the copy,
> making it basically impossible to build the project except by limiting
> the number of tasks that run in parallel.
> 
> Given that, we should really think about a way to get the disc
> consumption back to a linear level. Ideally, we would only use symlinks
> or maybe hardlinks to deduplicate. Another option would be to use the
> POSIX atomicity guarantees by just renaming packages when inserting
> into the cache.
> 
> Anyways, we need a better solution.
> Putting Henning as the author of that logic in CC.
> 
> Best regards,
> Felix

Hi all

I'd just like to mention an unfinished patchset that I was working on earlier.
It was last sent to the mailing list as
`[PATCH v3 0/5] Improving base-apt usage PoC`.

The idea was to pre-download all possible package dependencies to the
'base-apt' repo first and then use it (as "file:///path/to/base-apt") in the
sources list.

In this case (i.e. a "file://..." source), as far as I remember, apt doesn't
need the packages to be "downloaded" to /var/cache/apt and just uses them
directly from the repo.
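
To illustrate what such a sources list entry could look like, here is a
minimal sketch in Python. The helper name `local_repo_source`, the suite
"bullseye", the component "main" and the `[trusted=yes]` option (which assumes
the local repo is not signed) are my assumptions for illustration only; they
are not taken from the patchset.

```python
from pathlib import PurePosixPath


def local_repo_source(repo_dir, suite, components=("main",)):
    """Build a one-line sources.list entry pointing at a local repo.

    Hypothetical helper: repo_dir must be an absolute path, otherwise
    PurePosixPath.as_uri() raises ValueError.
    """
    uri = PurePosixPath(repo_dir).as_uri()  # e.g. file:///path/to/base-apt
    # Assumption: the local repo is unsigned, hence [trusted=yes].
    return f"deb [trusted=yes] {uri} {suite} {' '.join(components)}"


if __name__ == "__main__":
    print(local_repo_source("/path/to/base-apt", "bullseye"))
```

With an entry like this, apt should read the .deb files from the repo
directory itself, so no per-task copy of the cache is required.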

Since then, of course, Isar has changed (for example, base-apt is now split
into host and target), so the patchset needs to be updated. It also requires
some cleanup and improvements (technically, we don't need $DL_DIR/deb at all,
since we use the local base-apt repo).

Anyway, this kind of local-repo-based approach would be a good solution to the
high disk usage problem.
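
As a rough sanity check of the numbers quoted above, the following Python
sketch compares the "one cache copy per task and per buildchroot" model with a
single shared local repo. The function names are hypothetical, and the model
is a simplification: it assumes every copy is a full copy of the cache and
that a shared repo needs no copies at all.

```python
def copied_cache_gb(cache_gb, parallel_tasks, buildchroots):
    """Scratch space if every task and every buildchroot holds a full copy."""
    return cache_gb * (parallel_tasks + buildchroots)


def shared_repo_gb(cache_gb):
    """Scratch space if all consumers read one shared local repo."""
    return cache_gb


if __name__ == "__main__":
    # Figures from the quoted mail: 2GB cache, 8 parallel tasks, 7 buildchroots.
    print(copied_cache_gb(2, 8, 7))  # 16GB for tasks + 14GB for buildchroots
    print(shared_repo_gb(2))
```

In the copy-based model the space grows with the number of parallel consumers,
while in the shared-repo model it depends only on the size of the repo.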

-- 
Uladzimir Bely



