From: Srinuvasan Arjunan
To: isar-users
Date: Mon, 10 Mar 2025 04:06:55 -0700 (PDT)
Subject: Re: [PATCH] deb-dl-dir: remove excessive calls to dpkg-deb in debsrc_download
Message-Id: <671d33d8-1a0f-402d-8b6a-f8c56d6c3e30n@googlegroups.com>
In-Reply-To: <20250305131142.2717692-1-cedric.hombourger@siemens.com>
References: <20250305131142.2717692-1-cedric.hombourger@siemens.com>

On Wednesday, March 5, 2025 at 6:42:05 PM UTC+5:30 Cedric Hombourger wrote:
Several calls to dpkg-deb are made for each single .deb file found in
downloads to parse individual fields. This approach is terribly slow
when a large amount of .deb files are found. Use apt-ftparchive to
produce an index of packages that were found and a simple awk script
to produce a (sorted) list of source package names and their versions.
Also avoid using sed to remove Epoch from the version when we are
trying to determine the name of the .dsc file: we instead use a simple
POSIX parameter expansion to remove everything up to the first colon

Signed-off-by: Cedric Hombourger <cedric.h...@siemens.com>
---
meta/classes/deb-dl-dir.bbclass | 62 +++++++++++++++++++--------------
1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/meta/classes/deb-dl-dir.bbclass b/meta/classes/deb-dl-dir.bbclass
index 7ebd057e..53ce4538 100644
--- a/meta/classes/deb-dl-dir.bbclass
+++ b/meta/classes/deb-dl-dir.bbclass
@@ -5,23 +5,6 @@

inherit repository

-is_not_part_of_current_build() {
- local package="$( dpkg-deb --show --showformat '${Package}' "${1}" )"
- local arch="$( dpkg-deb --show --showformat '${Architecture}' "${1}" )"
- local version="$( dpkg-deb --show --showformat '${Version}' "${1}" )"
- # Since we are parsing all the debs in DEBDIR, we can to some extend
- # try to eliminate some debs that are not part of the current multiconfig
- # build using the below method.
- local output="$( grep -xhs ".* status installed ${package}:${arch} ${version}" \
- "${IMAGE_ROOTFS}"/var/log/dpkg.log \
- "${SCHROOT_HOST_DIR}"/var/log/dpkg.log \
- "${SCHROOT_TARGET_DIR}"/var/log/dpkg.log \
- "${SCHROOT_HOST_DIR}"/tmp/dpkg_common.log \
- "${SCHROOT_TARGET_DIR}"/tmp/dpkg_common.log | head -1 )"
-
- [ -z "${output}" ]
-}
-
debsrc_do_mounts() {
sudo -s <<EOSUDO
set -e
@@ -54,16 +37,41 @@ debsrc_download() {
( flock 9
set -e
printenv | grep -q BB_VERBOSE_LOGS && set -x
- find "${rootfs}/var/cache/apt/archives/" -maxdepth 1 -type f -iname '*\.deb' | while read package; do
- is_not_part_of_current_build "${package}" && continue
- local src="$( dpkg-deb --show --showformat '${source:Package}' "${package}" )"
- local version="$( dpkg-deb --show --showformat '${source:Version}' "${package}" )"
- local dscname="$(echo ${src}_${version} | sed -e 's/_[0-9]\+:/_/')"
- local dscfile=$(find "${DEBSRCDIR}"/"${rootfs_distro}" -name "${dscname}.dsc")
- [ -n "$dscfile" ] && continue
-
- sudo -E chroot --userspec=$( id -u ):$( id -g ) ${rootfs} \
- sh -c ' mkdir -p "/deb-src/${1}/${2}" && cd "/deb-src/${1}/${2}" && apt-get -y --download-only --only-source source "$2"="$3" ' download-src "${rootfs_distro}" "${src}" "${version}"
+
+ # Use apt-ftparchive to scan all .deb files found in the download directory
+ # and produce an index that we can "parse" with awk. This is much faster
+ # than parsing each .deb file individually using dpkg-deb. Lines from the
+ # index we need are:
+ #
+ # Package: <binary-name>
+ # Version: <binary-version>
+ # Source: <source-name> (<source-version>)
+ #
+ # If Source is omitted, then <source-name>=<binary-name> and
+ # if <source-version> is not specified then it is <binary-version>.
+ # The awk script handles these optional fields. It looks for Size: as a
+ # trigger to print the source,version tupple
+
+ apt-ftparchive --md5=no --sha1=no --sha256=no --sha512=no \
+ -a "${DISTRO_ARCH}" packages \

Hi Cedric,

I applied this patch for my deb-src caching issue [1]. I can now download the deb-src for the bootstrap and image-related packages. The only missing part is the imager_install-related packages; I am going to send patches based on your patch.

However, I found one issue with armhf base-apt builds in ISAR: the help2man and texinfo deb-src packages are missing. The index is generated with
apt-ftparchive --md5=no --sha1=no --sha256=no --sha512=no -a "${DISTRO_ARCH}"
and DISTRO_ARCH is armhf in this case, but the help2man and texinfo packages are only available for the amd64 arch (possibly because of the ISAR_CROSS_COMPILE configuration), not for armhf. The index therefore does not contain those packages, so we are not able to download their source packages.

I would suggest removing the -a "${DISTRO_ARCH}" option; the final list is de-duplicated by sort -u anyway.
I validated the change without the -a option and it works as expected.

[1]: https://groups.google.com/g/isar-users/c/8QstIaudyts

Please share your thoughts.
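
To illustrate why dropping -a should not produce duplicate entries, here is a minimal sketch that feeds a made-up index through the awk script from the patch. The package names, versions and sizes are hypothetical, and the field order simply follows the order given in the patch comment; the real apt-ftparchive output contains more fields. list_sources is a helper name introduced only for this example.

```shell
#!/bin/sh
# Hypothetical index, roughly what an index built without -a could contain:
# one package that exists only for amd64 and one that exists for two arches.
index='Package: help2man
Architecture: amd64
Version: 1.0
Size: 100

Package: libfoo1
Architecture: armhf
Version: 2:1.0-1+b1
Source: foo (2:1.0-1)
Size: 200

Package: libfoo1
Architecture: amd64
Version: 2:1.0-1+b1
Source: foo (2:1.0-1)
Size: 200
'

# Same awk script as in the patch, followed by sort -u so that a source
# package seen for several architectures is listed only once.
list_sources() {
    awk '/^Package:/ { s=$2; }
         /^Version:/ { v=$2; next }
         /^Source:/ { s=$2; if ($3 ~ /^\(/) v=substr($3, 2, length($3)-2) }
         /^Size:/ { print s, v }' | sort -u
}

printf '%s\n' "$index" | list_sources
```

With this input the amd64-only package is kept and foo appears once, with its source version rather than the binary version.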

+ "${rootfs}/var/cache/apt/archives" \
+ | awk '/^Package:/ { s=$2; }
+ /^Version:/ { v=$2; next }
+ /^Source:/ { s=$2; if ($3 ~ /^\(/) v=substr($3, 2, length($3)-2) }
+ /^Size:/ { print s, v}' \
+ | sort -u \
+ | while read src version; do
+ # Name of the .dsc file does not include Epoch, remove it before checking
+ # if sources were already downloaded. Avoid using sed here to reduce the
+ # number of processes being spawned by this function: we assume that the
+ # version is correctly formatted and simply strip everything up to the
+ # first colon
+ dscname="${src}_${version#*:}.dsc"
+ [ -f "${DEBSRCDIR}"/"${rootfs_distro}"/"${src}"/"${dscname}" ] || {
+ # use apt-get source to download sources in DEBSRCDIR
+ sudo -E chroot --userspec=$( id -u ):$( id -g ) ${rootfs} \
+ sh -c ' mkdir -p "/deb-src/${1}/${2}" && cd "/deb-src/${1}/${2}" && apt-get -y --download-only --only-source source "$2"="$3" ' download-src "${rootfs_distro}" "${src}" "${version}"
+ }
done
) 9>"${DEBSRCDIR}/${rootfs_distro}.lock"

-- 
2.39.5
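
The epoch handling used for the .dsc name in the patch above can be tried on its own. This is only a sketch: dsc_name is a hypothetical helper that is not part of the patch, and the package names and versions are made up.

```shell
#!/bin/sh
# Build the .dsc file name for a source package, dropping any epoch.
dsc_name() {
    src="$1"
    version="$2"
    # ${version#*:} removes the shortest prefix matching "*:", that is the
    # epoch and its colon; a version without a colon is left unchanged.
    printf '%s_%s.dsc\n' "$src" "${version#*:}"
}

dsc_name foo 2:1.0-1   # prints foo_1.0-1.dsc
dsc_name bar 1.0-1     # prints bar_1.0-1.dsc
```

The expansion is done by the shell itself, so no extra process is spawned per package, which is the point made in the patch comment.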
