From: Uladzimir Bely
To: "isar-users@googlegroups.com"
Subject: Re: Better way to handle apt cache needed
Date: Thu, 05 Jan 2023 09:31:36 +0300
Message-ID: <25742987.1r3eYUQgxm@home>
In-Reply-To: <371e4d826cca6aaba11a4222fef547b134ed6ce7.camel@siemens.com>
References: <371e4d826cca6aaba11a4222fef547b134ed6ce7.camel@siemens.com>

In the email from Wednesday, 28 December 2022 12:02:13 +03 user Moessbauer, Felix wrote:
> Hi,
>
> when working with builds that have both many recipes, as well as many
> build dependencies, disk usage explodes during the build. As both
> preconditions somehow correspond, this results in a quadratic disc
> consumption in the number of tasks during the build.
>
> The root cause for that behavior is the apt cache
> (deb_dl_dir_(import|export)), that copies all previously downloaded apt
> packages into the WORKDIR of each (bitbake) package.
> Given, that a common apt-cache is around 2GB and 8 tasks are run in
> parallel, this gives already 16GB for the tasks, and 7 * 2GB for the
> buildchroots (host and target), in total ~30GB.
>
> In one of my projects, we have to work with huge debian packages,
> leading to apt-cache sizes around 20GB. As these projects usually also
> have to be built on big machines with many cores, you easily get 500GB
> of required scratch disk space + a lot of disc accesses for the copy,
> making it basically impossible to build the project except by limiting
> the number of tasks that run in parallel.
>
> Given that, we should really think about a way to get the disc
> consumption back to a linear level. Ideally, we would only use symlinks
> or maybe hardlinks to deduplicate. Another option would be to use the
> POSIX atomicity guarantees by just renaming packages when inserting
> into the cache.
>
> Anyways, we need a better solution.
> Putting Henning as the author of that logic in CC.
>
> Best regards,
> Felix

Hi all,

I'd just like to mention an unfinished patchset that I was working on
earlier. It was last sent to the mailing list as
`[PATCH v3 0/5] Improving base-apt usage PoC`.

The idea was to pre-download all possible package dependencies into the
'base-apt' repo first and use it (as "file:///path/to/base-apt") in the
sources list. In this case (i.e. a "file://..." source), as far as I
remember, apt doesn't need the packages to be "downloaded" to
/var/cache/apt and just uses them directly.

Since then, of course, Isar has changed (for example, base-apt is now
split into host and target), so the patchset needs to be updated. It also
requires some cleanup and improvements (technically, we don't need
$DL_DIR/deb at all, since we use the local base-apt repo).
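
To illustrate the "file://" idea, here is a minimal sketch of how such a
sources list entry could be generated. It is not code from the patchset:
the function name, the suite ("isar") and the component ("main") are made
up for illustration, and the real repo layout may differ. The
`[trusted=yes]` option is only relevant if the local repo is unsigned.
According to sources.list(5), the file: scheme uses packages directly at
their location, while copy: would copy them into the cache directory
first.

```python
from pathlib import PurePosixPath


def local_repo_source_line(repo_dir, suite, components=("main",), trusted=False):
    """Build a one-line sources.list entry pointing at a local repo.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    path = PurePosixPath(repo_dir)
    if not path.is_absolute():
        # A file:// URI needs an absolute path.
        raise ValueError("repo_dir must be an absolute path")
    # Optional apt source option for repos without a signed Release file.
    options = " [trusted=yes]" if trusted else ""
    return f"deb{options} {path.as_uri()} {suite} {' '.join(components)}"


if __name__ == "__main__":
    # Placeholder path, same as in the text above.
    print(local_repo_source_line("/path/to/base-apt", "isar"))
```

With a line like this in the sources list, apt should read the .deb files
from the repo directory itself, so no per-task copy of the cache would be
needed.
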
Anyway, this kind of local-repo-based approach would be a good solution
to the high disk usage problem.

-- 
Uladzimir Bely