From: Uladzimir Bely <ubely@ilbers.de>
To: "isar-users@googlegroups.com" <isar-users@googlegroups.com>
Subject: Re: Better way to handle apt cache needed
Date: Thu, 05 Jan 2023 09:31:36 +0300
Message-ID: <25742987.1r3eYUQgxm@home>
In-Reply-To: <371e4d826cca6aaba11a4222fef547b134ed6ce7.camel@siemens.com>
In the email from Wednesday, 28 December 2022 12:02:13 +03 user Moessbauer,
Felix wrote:
> Hi,
>
> when working with builds that have both many recipes, as well as many
> build dependencies, disk usage explodes during the build. As both
> preconditions somehow correspond, this results in a quadratic disc
> consumption in the number of tasks during the build.
>
> The root cause for that behavior is the apt cache
> (deb_dl_dir_(import|export)), that copies all previously downloaded apt
> packages into the WORKDIR of each (bitbake) package.
> Given, that a common apt-cache is around 2GB and 8 tasks are run in
> parallel, this gives already 16GB for the tasks, and 7 * 2GB for the
> buildchroots (host and target), in total ~30GB.
>
> In one of my projects, we have to work with huge debian packages,
> leading to apt-cache sizes around 20GB. As these projects usually also
> have to be built on big machines with many cores, you easily get 500GB
> of required scratch disk space + a lot of disc accesses for the copy,
> making it basically impossible to build the project except by limiting
> the number of tasks that run in parallel.
>
> Given that, we should really think about a way to get the disc
> consumption back to a linear level. Ideally, we would only use symlinks
> or maybe hardlinks to deduplicate. Another option would be to use the
> POSIX atomicity guarantees by just renaming packages when inserting
> into the cache.
>
> Anyways, we need a better solution.
> Putting Henning as the author of that logic in CC.
>
> Best regards,
> Felix
Hi all,

I'd just like to mention an unfinished patchset that I was working on
earlier. It was last sent to the mailing list as
`[PATCH v3 0/5] Improving base-apt usage PoC`.

The idea was to pre-download all possible package dependencies to the
'base-apt' repo first and then use it (as "file:///path/to/base-apt") in the
sources list. In this case (i.e. a "file://..." source), as far as I remember,
apt doesn't need the packages to be "downloaded" to /var/cache/apt and just
uses them directly.
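
A minimal sketch of what such a sources list entry could look like, written
as a small Python helper. The function name, the suite "bullseye" and the
component "main" are assumptions made for illustration; they are not taken
from the patchset, and the actual layout of base-apt may differ.

```python
from pathlib import PurePosixPath


def local_repo_entry(repo_dir, suite, components=("main",)):
    """Build a one-line apt sources list entry for a local file:// repo.

    Hypothetical helper for illustration only, not part of Isar.
    """
    # as_uri() raises ValueError for relative paths, which is what we want:
    # a file:// source must point at an absolute location.
    uri = PurePosixPath(repo_dir).as_uri()
    return f"deb {uri} {suite} {' '.join(components)}"


print(local_repo_entry("/path/to/base-apt", "bullseye"))
# prints: deb file:///path/to/base-apt bullseye main
```

Whether apt really skips the copy into its own cache for such a source is
the behaviour recalled above and is worth verifying before relying on it.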

Since then, of course, Isar has changed (for example, base-apt is now split
into host and target), so the patchset needs to be updated. It also requires
some cleanup and improvements (technically, we don't need $DL_DIR/deb at all,
since we use the local base-apt repo).

Anyway, this kind of local-repo-based approach would be a good solution to
the high disk usage problem.
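
To put rough numbers on this, the sketch below reproduces the estimate from
the quoted message (a 2GB cache, 8 parallel tasks, 7 chroot copies) and
compares it with a single shared local repo. The function and parameter names
are hypothetical, and the shared case assumes that every consumer reads the
packages in place, which is a simplification rather than a measurement.

```python
def scratch_space_gb(cache_gb, parallel_tasks, chroot_copies, shared_repo=False):
    """Rough scratch-space estimate in GB (illustrative model only)."""
    if shared_repo:
        # Assumption: all tasks and chroots read one local repo directly,
        # so the packages exist on disk only once.
        return cache_gb
    # Copy-based cache: each running task and each chroot holds a full copy.
    return cache_gb * (parallel_tasks + chroot_copies)


print(scratch_space_gb(2, 8, 7))                    # prints 30
print(scratch_space_gb(2, 8, 7, shared_repo=True))  # prints 2
```

With a 20GB cache, the copy-based figure grows in proportion to the number of
parallel tasks, while the shared figure stays at the size of the repo itself.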
--
Uladzimir Bely