public inbox for isar-users@googlegroups.com
* [PATCH 00/11] Make rootfs build reproducible
@ 2023-01-11  4:11 Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 01/11] fix rebuild of rootfs_finalize task Felix Moessbauer
                   ` (12 more replies)
  0 siblings, 13 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This series finally makes the rootfs generation bit-reproducible
from debian bullseye on. Parts of it have already been sent
as individual patches. However, image reproducibility can only
be achieved once all parts are reproducible themselves. Therefore,
these patches are included in this series as well.

With this series, the following parts are now fully reproducible.
This has been tested on the isar-image-base target.

- custom initramfs (creation and updates)
- debian initramfs (only updates are relevant)
- custom kernel (debian kernel is reproducible itself)
- rootfs itself
- tar file generation (<image>.tar)
- ext4 generation (only from bookworm on, more tests needed)

Other parts that are still not reproducible are:

- WIC (should be solved in OE already)
- containers (not yet tested)

Best regards,
Felix Moessbauer
Siemens AG

Felix Moessbauer (10):
  fix rebuild of rootfs_finalize task
  rootfs postprocess: clean python cache
  remove non-portable ldconfig aux-cache
  generate deterministic clear-text password hash
  update debian initramfs in deterministic mode
  create custom initramfs in deterministic mode
  make deb_add_changelog idempotent
  deb_add_changelog: set timestamp to valid epoch
  deb_add_changelog: use SOURCE_DATE_EPOCH
  make custom linux-image bit-by-bit reproducible

venkata pyla (1):
  image.bbclass: fix non-reproducible file time-stamps inside rootfs

 meta-isar/conf/local.conf.sample              | 10 +++++++++
 meta/classes/debianize.bbclass                | 22 +++++++++++++------
 meta/classes/image-account-extension.bbclass  | 10 ++++++++-
 meta/classes/image.bbclass                    | 21 ++++++++++++++++--
 meta/classes/initramfs.bbclass                |  5 +++++
 meta/classes/rootfs.bbclass                   | 13 +++++++++++
 .../linux/files/debian/isar/build.tmpl        |  1 +
 .../linux/files/debian/rules.tmpl             | 14 +++++++++++-
 meta/recipes-kernel/linux/linux-custom.inc    |  2 ++
 9 files changed, 87 insertions(+), 11 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 01/11] fix rebuild of rootfs_finalize task
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 02/11] image.bbclass: fix non-reproducible file time-stamps inside rootfs Felix Moessbauer
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

The rootfs_finalize task currently cannot be re-executed, as it moves
the sources-list into bootstrap.list. As this only has to be done once,
it is not required on subsequent executions. To fix the rebuild issue,
we simply ignore the return code of the mv statement.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/image.bbclass | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
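
For illustration only: the patch makes the task re-runnable by ignoring
the exit code of mv. An alternative way to get the same effect is an
explicit existence check, sketched below. The function name
finalize_sources_list and the scratch directory are hypothetical and not
part of the patch.

```shell
# Sketch: move sources-list only if it still exists, so a second
# invocation becomes a no-op instead of failing.
# finalize_sources_list is a hypothetical helper, not an Isar function.
finalize_sources_list() {
    rootfs="$1"
    if [ -e "$rootfs/etc/apt/sources-list" ]; then
        mv "$rootfs/etc/apt/sources-list" \
            "$rootfs/etc/apt/sources.list.d/bootstrap.list"
    fi
}

# Demo in a scratch directory (placeholder content).
demo=$(mktemp -d)
mkdir -p "$demo/etc/apt/sources.list.d"
echo "deb http://example.invalid stable main" > "$demo/etc/apt/sources-list"

finalize_sources_list "$demo"   # first run moves the file
finalize_sources_list "$demo"   # second run does nothing and succeeds
```

Compared to "|| true", the explicit check would still report other mv
failures (e.g. a missing target directory).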

diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index 629a0c1..6f0607e 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -425,7 +425,7 @@ do_rootfs_finalize() {
         rm -f "${ROOTFSDIR}/etc/apt/apt.conf.d/50isar"
 
         mv "${ROOTFSDIR}/etc/apt/sources-list" \
-            "${ROOTFSDIR}/etc/apt/sources.list.d/bootstrap.list"
+            "${ROOTFSDIR}/etc/apt/sources.list.d/bootstrap.list" || true
 
         rm -f "${ROOTFSDIR}/etc/apt/sources-list"
 EOSUDO
-- 
2.34.1



* [PATCH 02/11] image.bbclass: fix non-reproducible file time-stamps inside rootfs
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 01/11] fix rebuild of rootfs_finalize task Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 03/11] rootfs postprocess: clean python cache Felix Moessbauer
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

From: venkata pyla <venkata.pyla@toshiba-tsip.com>

As part of the reproducible-build work, rootfs images generated from the
same source should be identical between two builds.

This commit addresses one source of non-reproducibility: the time-stamps
of files generated at build time differ between builds. It adopts a
solution from the Debian live-build project (see [1]): find all files
and folders that are newer than `SOURCE_DATE_EPOCH` and set their
time-stamps to the value of that environment variable.

[1] https://salsa.debian.org/live-team/live-build/-/merge_requests/218

Acked-by: Felix Moessbauer <felix.moessbauer@siemens.com>
Signed-off-by: venkata pyla <venkata.pyla@toshiba-tsip.com>
---
 meta-isar/conf/local.conf.sample | 10 ++++++++++
 meta/classes/image.bbclass       | 10 ++++++++++
 2 files changed, 20 insertions(+)
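
The clamping technique described above can be tried outside of Isar.
The following is a minimal sketch that assumes GNU find, touch, date and
stat; the epoch value and the scratch directory are arbitrary examples.

```shell
# Example value only; in a real build this comes from the configuration.
SOURCE_DATE_EPOCH=1673409600

demo=$(mktemp -d)
touch -d @1000000000 "$demo/old"   # older than the epoch: must stay as-is
touch "$demo/new"                  # created "now": must be clamped

# Reset the mtime of everything newer than SOURCE_DATE_EPOCH.
# -h makes touch act on symlinks themselves instead of their targets.
find "$demo" -newermt \
    "$(date -d "@${SOURCE_DATE_EPOCH}" '+%Y-%m-%d %H:%M:%S')" \
    -exec touch -h -d "@${SOURCE_DATE_EPOCH}" '{}' ';'

stat -c '%Y %n' "$demo/old" "$demo/new"
```

Files older than the epoch keep their original time-stamp, so only
content generated during the build is modified.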

diff --git a/meta-isar/conf/local.conf.sample b/meta-isar/conf/local.conf.sample
index e1cd66e..6208623 100644
--- a/meta-isar/conf/local.conf.sample
+++ b/meta-isar/conf/local.conf.sample
@@ -248,3 +248,13 @@ USER_isar[flags] += "clear-text-password"
 #CCACHE_TOP_DIR ?= "${TMPDIR}/ccache"
 # Enable ccache debug mode
 #CCACHE_DEBUG = "1"
+
+# Uncomment and add a value to build images reproducibly
+#
+# The value for `SOURCE_DATE_EPOCH` should be the latest source change time in
+# seconds since the Epoch.
+# Git repository users can use the value from 'git log -1 --pretty=%ct'
+# Non-git repository users can use the value from 'stat -c%Y ChangeLog'
+# For more details about this variable and how to set its value, see
+# https://reproducible-builds.org/docs/source-date-epoch/
+#SOURCE_DATE_EPOCH =
diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index 6f0607e..519a2e5 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -429,6 +429,16 @@ do_rootfs_finalize() {
 
         rm -f "${ROOTFSDIR}/etc/apt/sources-list"
 EOSUDO
+
+    # Set same time-stamps to the newly generated file/folders in the
+    # rootfs image for the purpose of reproducible builds.
+    test ! -z "${SOURCE_DATE_EPOCH}" && \
+        sudo find ${ROOTFSDIR} -newermt \
+            "$(date -d@${SOURCE_DATE_EPOCH} '+%Y-%m-%d %H:%M:%S')" \
+            -printf "%y %p\n" \
+            -exec touch '{}' -h -d@${SOURCE_DATE_EPOCH} ';' > ${DEPLOY_DIR_IMAGE}/files.modified_timestamps && \
+            bbwarn "$(grep ^f ${DEPLOY_DIR_IMAGE}/files.modified_timestamps) \nModified above file timestamps to build image reproducibly"
+
 }
 addtask rootfs_finalize before do_rootfs after do_rootfs_postprocess
 
-- 
2.34.1



* [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 01/11] fix rebuild of rootfs_finalize task Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 02/11] image.bbclass: fix non-reproducible file time-stamps inside rootfs Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  8:06   ` Henning Schild
  2023-01-11  4:11 ` [PATCH 04/11] remove non-portable ldconfig aux-cache Felix Moessbauer
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

When calling python scripts, python automatically creates cache files
to speed up future invocations of the same sources. This often happens
in postinst scripts that run directly in the image chroot. The
created debian packages do not ship these files, as the debhelper
scripts remove them before installing.

For the rootfs part, we have to remove them manually so that they are
not included in the final image. This patch implements this logic in
a custom cleanup postprocess step. As there might be situations where
shipping a subset of the caches is desirable (e.g. read-only rootfs
images), we add support to control this logic using ROOTFS_FEATURES.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/image.bbclass  | 2 +-
 meta/classes/rootfs.bbclass | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)
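
A minimal sketch of the cleanup step, run against a scratch directory
instead of a real rootfs. It assumes GNU find; the package name foo and
the file names are made up for the demo.

```shell
demo=$(mktemp -d)
pkg="$demo/usr/lib/python3/dist-packages/foo"   # hypothetical package
mkdir -p "$pkg/__pycache__"
touch "$pkg/__init__.py"
touch "$pkg/__pycache__/__init__.cpython-39.pyc"

# Remove the byte-code files first, then the (now empty) cache folders.
# -delete implies -depth, so a directory is processed after its content.
find "$demo/usr" -type f -name '*.pyc'       -delete
find "$demo/usr" -type d -name '__pycache__' -delete
```

Note that the second find can only remove a __pycache__ directory that
is empty after the first pass; a directory holding other files would be
left in place and reported as an error.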

diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index 519a2e5..b86a428 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -80,7 +80,7 @@ image_do_mounts() {
 }
 
 ROOTFSDIR = "${IMAGE_ROOTFS}"
-ROOTFS_FEATURES += "clean-package-cache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
+ROOTFS_FEATURES += "clean-package-cache clean-pycache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
 ROOTFS_PACKAGES += "${IMAGE_PREINSTALL} ${IMAGE_INSTALL}"
 ROOTFS_MANIFEST_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
 ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
index 786682d..325e7ae 100644
--- a/meta/classes/rootfs.bbclass
+++ b/meta/classes/rootfs.bbclass
@@ -252,6 +252,12 @@ rootfs_postprocess_clean_debconf_cache() {
     sudo rm -rf "${ROOTFSDIR}/var/cache/debconf/"*
 }
 
+ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache', 'rootfs_postprocess_clean_pycache', '', d)}"
+rootfs_postprocess_clean_pycache() {
+    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc'       -delete -print
+    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
+}
+
 ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
 rootfs_generate_manifest () {
     mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
-- 
2.34.1



* [PATCH 04/11] remove non-portable ldconfig aux-cache
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (2 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 03/11] rootfs postprocess: clean python cache Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  8:19   ` Henning Schild
  2023-01-11  4:11 ` [PATCH 05/11] generate deterministic clear-text password hash Felix Moessbauer
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This patch removes the ldconfig aux-cache from the final rootfs.
The cache is neither portable across file systems nor
reproducible.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/rootfs.bbclass | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
index 325e7ae..226fa8b 100644
--- a/meta/classes/rootfs.bbclass
+++ b/meta/classes/rootfs.bbclass
@@ -258,6 +258,13 @@ rootfs_postprocess_clean_pycache() {
     sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
 }
 
+ROOTFS_POSTPROCESS_COMMAND += "rootfs_postprocess_clean_ldconfig_cache"
+rootfs_postprocess_clean_ldconfig_cache() {
+    # the ldconfig aux-cache is not portable and breaks reproducibility
+    # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=845034#49
+    sudo rm -f ${ROOTFSDIR}/var/cache/ldconfig/aux-cache
+}
+
 ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
 rootfs_generate_manifest () {
     mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
-- 
2.34.1



* [PATCH 05/11] generate deterministic clear-text password hash
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (3 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 04/11] remove non-portable ldconfig aux-cache Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  8:21   ` Henning Schild
  2023-01-11  4:11 ` [PATCH 06/11] update debian initramfs in deterministic mode Felix Moessbauer
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This patch changes how we derive the hashed password of a user that is
created using the clear-text-password flag. Previously, the clear-text
password was passed directly to chpasswd. However, chpasswd internally
creates a 16-character random salt. This breaks reproducibility.

Instead of letting chpasswd create the hashed password string, we now
create it manually by deriving the salt from the SOURCE_DATE_EPOCH
variable. This is technically done using the host openssl tool. As
openssl is a transitive dependency of sbuild, we do not need to add
it explicitly to the host-tools.

In case SOURCE_DATE_EPOCH is not set, chpasswd is used
directly to create the salt.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/image-account-extension.bbclass | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
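
The salt derivation can be checked in isolation. The sketch below
assumes an openssl version that supports "passwd -6" (SHA-512 crypt);
the epoch value and the password are placeholders.

```shell
# Example values only.
SOURCE_DATE_EPOCH=1673409600
password='example-password'

# Derive a fixed salt from the epoch: first 15 hex digits of its SHA-256.
salt="$(echo "${SOURCE_DATE_EPOCH}" | sha256sum | cut -c 1-15)"

# Hash twice with the same salt; both results must be identical.
hash1="$(openssl passwd -6 -salt "$salt" "$password")"
hash2="$(openssl passwd -6 -salt "$salt" "$password")"

echo "$hash1"
```

The result has the form $6$<salt>$<hash> and can be fed to chpasswd -e.
As the salt is now a function of SOURCE_DATE_EPOCH, it is predictable
for anyone who knows the epoch, which is the price for reproducibility.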

diff --git a/meta/classes/image-account-extension.bbclass b/meta/classes/image-account-extension.bbclass
index 70950a7..bcaa9c3 100644
--- a/meta/classes/image-account-extension.bbclass
+++ b/meta/classes/image-account-extension.bbclass
@@ -253,7 +253,15 @@ image_postprocess_accounts() {
         if [ -n "$password" -o "${flags}" != "${flags%*,allow-empty-password,*}" ]; then
             chpasswd_args="-e"
             if [ "${flags}" != "${flags%*,clear-text-password,*}" ]; then
-                chpasswd_args=""
+                # chpasswd adds a random salt when running against a clear-text password.
+                # For reproducible images, we manually generate the password and use the
+                # SOURCE_DATE_EPOCH to generate the salt in a deterministic way.
+                if [ -z "${SOURCE_DATE_EPOCH}" ]; then
+                    chpasswd_args=""
+                else
+                    salt="$(echo "${SOURCE_DATE_EPOCH}" | sha256sum -z | cut -c 1-15)"
+                    password="$(openssl passwd -6 -salt $salt "$password")"
+                fi
             fi
             printf '%s:%s' "$name" "$password" | sudo chroot '${ROOTFSDIR}' \
                 /usr/sbin/chpasswd $chpasswd_args
-- 
2.34.1



* [PATCH 06/11] update debian initramfs in deterministic mode
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (4 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 05/11] generate deterministic clear-text password hash Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  8:23   ` Henning Schild
  2023-01-11  4:11 ` [PATCH 07/11] create custom " Felix Moessbauer
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This patch exports the SOURCE_DATE_EPOCH variable in the image install
task. As a result, update-initramfs is switched into reproducible mode.
Before this patch, each trigger of update-initramfs created a new
non-deterministic version of the initramfs.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/image.bbclass | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
index b86a428..c981c7a 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -304,6 +304,13 @@ python() {
 }
 
 
+# make generation of initramfs reproducible
+rootfs_install_pkgs_install_prepend() {
+    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
+        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
+    fi
+}
+
 # here we call a command that should describe your whole build system,
 # this could be "git describe" or something similar.
 # set ISAR_RELEASE_CMD to customize, or override do_mark_rootfs to do something
-- 
2.34.1



* [PATCH 07/11] create custom initramfs in deterministic mode
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (5 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 06/11] update debian initramfs in deterministic mode Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 08/11] make deb_add_changelog idempotent Felix Moessbauer
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This patch enables the deterministic mode of update-initramfs for custom
initramfs versions, in case SOURCE_DATE_EPOCH is defined.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/initramfs.bbclass | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/meta/classes/initramfs.bbclass b/meta/classes/initramfs.bbclass
index 2cec85d..db28334 100644
--- a/meta/classes/initramfs.bbclass
+++ b/meta/classes/initramfs.bbclass
@@ -32,6 +32,11 @@ do_generate_initramfs() {
     rootfs_do_mounts
     rootfs_do_qemu
 
+    # generate reproducible initrd if requested
+    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
+        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
+    fi
+
     sudo -E chroot "${INITRAMFS_ROOTFS}" \
         update-initramfs -u -v
 
-- 
2.34.1



* [PATCH 08/11] make deb_add_changelog idempotent
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (6 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 07/11] create custom " Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 09/11] deb_add_changelog: set timestamp to valid epoch Felix Moessbauer
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

Previously, the deb_add_changelog function considered an auto-generated
changelog as a base to add changes on top. This behavior is not
idempotent on subsequent invocations of the function (e.g. on partial
rebuilds). This led to both reproducibility issues and unclean
changelog files having multiple "generated by ISAR" entries.

This patch changes the implementation to always create a
(possibly empty) orig changelog on the first invocation. On subsequent
invocations, the orig changelog is only considered as provided by the
user if it is not empty.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/debianize.bbclass | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)
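
The idempotency scheme can be illustrated with a reduced stand-in for
deb_add_changelog. The function add_changelog, its arguments and the
entry text are hypothetical; only the handling of changelog.orig follows
the patch.

```shell
# Simplified stand-in: $1 plays the role of WORKDIR, $2 of S/debian.
add_changelog() {
    workdir="$1"; debdir="$2"
    # Preserve a pre-existing changelog exactly once.
    if [ -f "$debdir/changelog" ] && [ ! -f "$workdir/changelog.orig" ]; then
        cp "$debdir/changelog" "$workdir/changelog.orig"
    fi
    echo "generated entry" > "$debdir/changelog"
    # An empty changelog.orig records "no user-provided changelog", so the
    # generated file is never mistaken for an original on later runs.
    touch "$workdir/changelog.orig"
    if [ -s "$workdir/changelog.orig" ]; then
        echo >> "$debdir/changelog"
        cat "$workdir/changelog.orig" >> "$debdir/changelog"
    fi
}

demo=$(mktemp -d)
mkdir "$demo/work" "$demo/debian"

add_changelog "$demo/work" "$demo/debian"
first=$(cat "$demo/debian/changelog")
add_changelog "$demo/work" "$demo/debian"   # e.g. a partial rebuild
second=$(cat "$demo/debian/changelog")
```

Without the touch and the -s check, the second run would copy the
generated changelog to changelog.orig and append it again.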

diff --git a/meta/classes/debianize.bbclass b/meta/classes/debianize.bbclass
index d125256..ca7b520 100644
--- a/meta/classes/debianize.bbclass
+++ b/meta/classes/debianize.bbclass
@@ -19,11 +19,14 @@ deb_add_changelog() {
 		if [ ! -f ${WORKDIR}/changelog.orig ]; then
 			cp ${S}/debian/changelog ${WORKDIR}/changelog.orig
 		fi
-		orig_version=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Version)
-		changelog_v=$(echo "${changelog_v}" | sed 's/<orig-version>/'${orig_version}'/')
-		orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
-		orig_seconds=$(date --date="${orig_date}" +'%s')
-		timestamp=$(expr ${orig_seconds} + 42)
+		# we have a non auto-generated original changelog
+		if [ -s ${WORKDIR}/changelog.orig ]; then
+			orig_version=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Version)
+			changelog_v=$(echo "${changelog_v}" | sed 's/<orig-version>/'${orig_version}'/')
+			orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
+			orig_seconds=$(date --date="${orig_date}" +'%s')
+			timestamp=$(expr ${orig_seconds} + 42)
+		fi
 	fi
 
 	date=$(LANG=C date -R -d @${timestamp})
@@ -34,7 +37,10 @@ ${PN} (${changelog_v}) UNRELEASED; urgency=low
 
  -- ${MAINTAINER}  ${date}
 EOF
-	if [ -f ${WORKDIR}/changelog.orig ]; then
+	# ensure that we always start with the orig version of the
+	# changelog on repeated invocations (e.g. on partial rebuilds)
+	touch ${WORKDIR}/changelog.orig
+	if [ -s ${WORKDIR}/changelog.orig ]; then
 		# prepend our entry to the original changelog
 		echo >> ${S}/debian/changelog
 		cat ${WORKDIR}/changelog.orig >> ${S}/debian/changelog
-- 
2.34.1



* [PATCH 09/11] deb_add_changelog: set timestamp to valid epoch
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (7 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 08/11] make deb_add_changelog idempotent Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  4:11 ` [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH Felix Moessbauer
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

A changelog date of 0 (unix timestamp) is not considered a valid
timestamp for the SOURCE_DATE_EPOCH. As a result, the debhelper scripts
set the SOURCE_DATE_EPOCH variable to the current time of the build,
breaking reproducibility. This creates an inconsistency between the
debian changelog timestamp and the timestamp that the build tools encode
into the binary and the file timestamps.

Without support to control the SOURCE_DATE_EPOCH variable
externally via bitbake, this always led to non-reproducible packages.
To fix this, we simply set the default timestamp to 1h later.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/debianize.bbclass | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meta/classes/debianize.bbclass b/meta/classes/debianize.bbclass
index ca7b520..a6694a0 100644
--- a/meta/classes/debianize.bbclass
+++ b/meta/classes/debianize.bbclass
@@ -14,7 +14,7 @@ MAINTAINER ??= "Unknown maintainer <unknown@example.com>"
 
 deb_add_changelog() {
 	changelog_v="${CHANGELOG_V}"
-	timestamp=0
+	timestamp=3600
 	if [ -f ${S}/debian/changelog ]; then
 		if [ ! -f ${WORKDIR}/changelog.orig ]; then
 			cp ${S}/debian/changelog ${WORKDIR}/changelog.orig
-- 
2.34.1



* [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (8 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 09/11] deb_add_changelog: set timestamp to valid epoch Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  8:49   ` Henning Schild
  2023-01-11  4:11 ` [PATCH 11/11] make custom linux-image bit-by-bit reproducible Felix Moessbauer
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

In case the SOURCE_DATE_EPOCH bb variable is set, use that value
both for the auto-generated changelog and when appending to
an existing changelog.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 meta/classes/debianize.bbclass | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
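
The resulting precedence of the timestamp sources, written out as a
small stand-alone sketch. The helper pick_timestamp is hypothetical; it
takes the epoch and the already parsed date of the original changelog
(in seconds) as arguments instead of calling dpkg-parsechangelog.

```shell
# $1: SOURCE_DATE_EPOCH (may be empty)
# $2: date of the original changelog in seconds (may be empty)
pick_timestamp() {
    source_date_epoch="$1"; orig_seconds="$2"
    if [ -n "$source_date_epoch" ]; then
        # An explicitly configured epoch always wins.
        echo "$source_date_epoch"
    elif [ -n "$orig_seconds" ]; then
        # Otherwise place the new entry shortly after the original one.
        echo $((orig_seconds + 42))
    else
        # Auto-generated changelog without a configured epoch.
        echo 3600
    fi
}

pick_timestamp 1673409600 1000
pick_timestamp "" 1000
pick_timestamp "" ""
```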

diff --git a/meta/classes/debianize.bbclass b/meta/classes/debianize.bbclass
index a6694a0..3d5d934 100644
--- a/meta/classes/debianize.bbclass
+++ b/meta/classes/debianize.bbclass
@@ -14,7 +14,7 @@ MAINTAINER ??= "Unknown maintainer <unknown@example.com>"
 
 deb_add_changelog() {
 	changelog_v="${CHANGELOG_V}"
-	timestamp=3600
+	timestamp=${@ d.getVar('SOURCE_DATE_EPOCH', True) or '3600' }
 	if [ -f ${S}/debian/changelog ]; then
 		if [ ! -f ${WORKDIR}/changelog.orig ]; then
 			cp ${S}/debian/changelog ${WORKDIR}/changelog.orig
@@ -23,9 +23,11 @@ deb_add_changelog() {
 		if [ -s ${WORKDIR}/changelog.orig ]; then
 			orig_version=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Version)
 			changelog_v=$(echo "${changelog_v}" | sed 's/<orig-version>/'${orig_version}'/')
-			orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
-			orig_seconds=$(date --date="${orig_date}" +'%s')
-			timestamp=$(expr ${orig_seconds} + 42)
+			if [ -z "${SOURCE_DATE_EPOCH}" ]; then
+				orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
+				orig_seconds=$(date --date="${orig_date}" +'%s')
+				timestamp=$(expr ${orig_seconds} + 42)
+			fi
 		fi
 	fi
 
-- 
2.34.1



* [PATCH 11/11] make custom linux-image bit-by-bit reproducible
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (9 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH Felix Moessbauer
@ 2023-01-11  4:11 ` Felix Moessbauer
  2023-01-11  6:51 ` [PATCH 00/11] Make rootfs build reproducible Jan Kiszka
  2023-01-11  9:04 ` Venkata.Pyla
  12 siblings, 0 replies; 28+ messages in thread
From: Felix Moessbauer @ 2023-01-11  4:11 UTC (permalink / raw)
  To: isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild, venkata.pyla,
	Felix Moessbauer

This patch makes the build of custom linux kernels bit-by-bit
reproducible. This allows us to skip the dh_strip_nondeterminism step,
which significantly reduces the kernel build time.

The implementation is similar to how upstream debian builds their kernel
images and extracts all information from the changelog. As the
DISTRIBUTOR field is not part of the changelog, we inject it via a bb
variable which defaults to ISAR.

Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
 .../linux/files/debian/isar/build.tmpl             |  1 +
 meta/recipes-kernel/linux/files/debian/rules.tmpl  | 14 +++++++++++++-
 meta/recipes-kernel/linux/linux-custom.inc         |  2 ++
 3 files changed, 16 insertions(+), 1 deletion(-)
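
How the KBUILD_* values are derived, shown as plain shell instead of
make functions. This is a sketch assuming GNU date; the maintainer
address is the default from linux-custom.inc and the changelog date is a
placeholder.

```shell
# Placeholder inputs; in the patch they come from debian/control and
# from dpkg-parsechangelog -SDate.
maintainer_email='isar-users@googlegroups.com'
SOURCE_DATE='Wed, 11 Jan 2023 04:11:00 +0000'

# Split the address at '@' into build user and build host.
KBUILD_BUILD_USER="${maintainer_email%%@*}"
KBUILD_BUILD_HOST="${maintainer_email#*@}"

# The kernel banner only carries the UTC calendar date.
SOURCE_DATE_UTC_ISO="$(date -u -d "$SOURCE_DATE" +%Y-%m-%d)"
KBUILD_BUILD_TIMESTAMP="$SOURCE_DATE"

echo "$KBUILD_BUILD_USER@$KBUILD_BUILD_HOST $SOURCE_DATE_UTC_ISO"
```

As all values depend only on the packaging metadata, two builds of the
same sources embed the same user, host and timestamp strings.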

diff --git a/meta/recipes-kernel/linux/files/debian/isar/build.tmpl b/meta/recipes-kernel/linux/files/debian/isar/build.tmpl
index 94cfbe0..e7e0479 100644
--- a/meta/recipes-kernel/linux/files/debian/isar/build.tmpl
+++ b/meta/recipes-kernel/linux/files/debian/isar/build.tmpl
@@ -34,6 +34,7 @@ print_settings() {
 # ---------------
 # ARCH=${ARCH}
 # CROSS_COMPILE=${CROSS_COMPILE}
+# KBUILD_BUILD_TIMESTAMP=${KBUILD_BUILD_TIMESTAMP}
 EOF
 }
 
diff --git a/meta/recipes-kernel/linux/files/debian/rules.tmpl b/meta/recipes-kernel/linux/files/debian/rules.tmpl
index 8063c49..e8ae3da 100755
--- a/meta/recipes-kernel/linux/files/debian/rules.tmpl
+++ b/meta/recipes-kernel/linux/files/debian/rules.tmpl
@@ -2,6 +2,11 @@
 
 CROSS_COMPILE:=$(DEB_HOST_GNU_TYPE)-
 
+MAINTAINER := $(shell sed -ne 's,^Maintainer: .[^<]*<\([^>]*\)>,\1,p' debian/control)
+DISTRIBUTOR := ${DISTRIBUTOR}
+SOURCE_DATE := $(shell dpkg-parsechangelog -SDate)
+SOURCE_DATE_UTC_ISO := $(shell date -u -d '$(SOURCE_DATE)' +%Y-%m-%d)
+
 O:=$(CURDIR)/${KERNEL_BUILD_DIR}
 S:=$(CURDIR)
 deb_top_dir:=$(S)/debian
@@ -14,7 +19,11 @@ isar_env=$(strip \
 	export MAKE='$(MAKE)' && \
 	export O='${O}' && \
 	export S='${S}' && \
-	export CURDIR='$(CURDIR)' \
+	export CURDIR='$(CURDIR)' && \
+	export KBUILD_BUILD_TIMESTAMP='$(SOURCE_DATE)' && \
+	export KBUILD_BUILD_VERSION_TIMESTAMP='$(DISTRIBUTOR) $(DEB_VERSION_UPSTREAM) ($(SOURCE_DATE_UTC_ISO))' && \
+	export KBUILD_BUILD_USER='$(word 1,$(subst @, ,$(MAINTAINER)))' && \
+	export KBUILD_BUILD_HOST='$(word 2,$(subst @, ,$(MAINTAINER)))' \
 )
 
 %:
@@ -35,5 +44,8 @@ override_dh_auto_install:
 override_dh_auto_test:
 	true
 
+override_dh_strip_nondeterminism:
+	true
+
 override_dh_strip:
 	unset DEB_HOST_GNU_TYPE && dh_strip -Xvmlinu --no-automatic-dbgsym
diff --git a/meta/recipes-kernel/linux/linux-custom.inc b/meta/recipes-kernel/linux/linux-custom.inc
index 447d4e8..6c539c0 100644
--- a/meta/recipes-kernel/linux/linux-custom.inc
+++ b/meta/recipes-kernel/linux/linux-custom.inc
@@ -12,6 +12,7 @@
 CHANGELOG_V = "${PV}+${PR}"
 DESCRIPTION ?= "Custom kernel"
 MAINTAINER ?= "isar-users <isar-users@googlegroups.com>"
+DISTRIBUTOR ?= "ISAR"
 
 KBUILD_DEPENDS ?= "build-essential:native, \
                    libelf-dev:native, \
@@ -79,6 +80,7 @@ TEMPLATE_VARS += "                \
     KERNEL_NAME_PROVIDED          \
     KERNEL_CONFIG_FRAGMENTS       \
     KCFLAGS                       \
+    DISTRIBUTOR                   \
 "
 
 inherit dpkg
-- 
2.34.1



* Re: [PATCH 00/11] Make rootfs build reproducible
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (10 preceding siblings ...)
  2023-01-11  4:11 ` [PATCH 11/11] make custom linux-image bit-by-bit reproducible Felix Moessbauer
@ 2023-01-11  6:51 ` Jan Kiszka
  2023-01-11  9:04 ` Venkata.Pyla
  12 siblings, 0 replies; 28+ messages in thread
From: Jan Kiszka @ 2023-01-11  6:51 UTC (permalink / raw)
  To: Felix Moessbauer, isar-users
  Cc: daniel.bovensiepen, henning.schild, venkata.pyla

On 11.01.23 05:11, Felix Moessbauer wrote:
> This series finally makes the rootfs generation bit-reproducible
> from debian bullseye on. Parts of it have already been sent
> as individual patches. However, image reproducibility can only
> be achieved once all parts are reproducible themselves. Therefore,
> these patches are included in this series as well.
> 
> With this series, the following parts are now fully reproducible.
> This has been tested on the isar-image-base target.
> 
> - custom initramfs (creation and updates)
> - debian initramfs (only updates are relevant)
> - custom kernel (debian kernel is reproducible itself)
> - rootfs itself
> - tar file generation (<image>.tar)
> - ext4 generation (only from bookworm on, more tests needed)
> 
> Other parts that are still not reproducible are:
> 
> - WIC (should be solved in OE already)
> - containers (untested yet)
> 
> Best regards,
> Felix Moessbauer
> Siemens AG
> 
> Felix Moessbauer (10):
>   fix rebuild of rootfs_finalize task
>   rootfs postprocess: clean python cache
>   remove non-portable ldconfig aux-cache
>   generate deterministic clear-text password hash
>   update debian initramfs in deterministic mode
>   create custom initramfs in deterministic mode
>   make deb_add_changelog idempotent
>   deb_add_changelog: set timestamp to valid epoch
>   deb_add_changelog: use SOURCE_DATE_EPOCH
>   make custom linux-image bit-by-bit reproducible
> 
> venkata pyla (1):
>   image.bbclass: fix non-reproducible file time-stamps inside rootfs
> 
>  meta-isar/conf/local.conf.sample              | 10 +++++++++
>  meta/classes/debianize.bbclass                | 22 +++++++++++++------
>  meta/classes/image-account-extension.bbclass  | 10 ++++++++-
>  meta/classes/image.bbclass                    | 21 ++++++++++++++++--
>  meta/classes/initramfs.bbclass                |  5 +++++
>  meta/classes/rootfs.bbclass                   | 13 +++++++++++
>  .../linux/files/debian/isar/build.tmpl        |  1 +
>  .../linux/files/debian/rules.tmpl             | 14 +++++++++++-
>  meta/recipes-kernel/linux/linux-custom.inc    |  2 ++
>  9 files changed, 87 insertions(+), 11 deletions(-)
> 

Cool!

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11  4:11 ` [PATCH 03/11] rootfs postprocess: clean python cache Felix Moessbauer
@ 2023-01-11  8:06   ` Henning Schild
  2023-01-11  8:23     ` Moessbauer, Felix
  0 siblings, 1 reply; 28+ messages in thread
From: Henning Schild @ 2023-01-11  8:06 UTC (permalink / raw)
  To: Felix Moessbauer; +Cc: isar-users, jan.kiszka, daniel.bovensiepen, venkata.pyla

Am Wed, 11 Jan 2023 04:11:32 +0000
schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:

> When calling python scripts, python automatically creates cache files
> to speedup future invocations of the same sources. This often happens
> in postinst scripts, that directly run in the image chroot. The
> created debian packages do not ship these files, as the debhelper
> scripts remove them before installing.
> 
> For the rootfs part, we manually have to do it to also not
> include these in the final image. This patch implements this logic in
> a custom cleanup postprocess step. As there might be situations where
> shipping of a subset of the caches is desirable (e.g. readonly rootfs
> images), we add support to control this logic using ROOTFS_FEATURES.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>  meta/classes/image.bbclass  | 2 +-
>  meta/classes/rootfs.bbclass | 6 ++++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
> index 519a2e5..b86a428 100644
> --- a/meta/classes/image.bbclass
> +++ b/meta/classes/image.bbclass
> @@ -80,7 +80,7 @@ image_do_mounts() {
>  }
>  
>  ROOTFSDIR = "${IMAGE_ROOTFS}"
> -ROOTFS_FEATURES += "clean-package-cache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
> +ROOTFS_FEATURES += "clean-package-cache clean-pycache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
>  ROOTFS_PACKAGES += "${IMAGE_PREINSTALL} ${IMAGE_INSTALL}"
>  ROOTFS_MANIFEST_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
>  ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
> diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
> index 786682d..325e7ae 100644
> --- a/meta/classes/rootfs.bbclass
> +++ b/meta/classes/rootfs.bbclass
> @@ -252,6 +252,12 @@ rootfs_postprocess_clean_debconf_cache() {
>      sudo rm -rf "${ROOTFSDIR}/var/cache/debconf/"*
>  }
>  
> +ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache', 'rootfs_postprocess_clean_pycache', '', d)}"
> +rootfs_postprocess_clean_pycache() {
> +    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc'       -delete -print
> +    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
> +}

Are we sure that this can never be valid content of any package? I
suggest we double check with dpkg.

Henning

>  ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
>  rootfs_generate_manifest () {
>      mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/11] remove non-portable ldconfig aux-cache
  2023-01-11  4:11 ` [PATCH 04/11] remove non-portable ldconfig aux-cache Felix Moessbauer
@ 2023-01-11  8:19   ` Henning Schild
  2023-01-11  8:31     ` Moessbauer, Felix
  0 siblings, 1 reply; 28+ messages in thread
From: Henning Schild @ 2023-01-11  8:19 UTC (permalink / raw)
  To: Felix Moessbauer; +Cc: isar-users, jan.kiszka, daniel.bovensiepen, venkata.pyla

Am Wed, 11 Jan 2023 04:11:33 +0000
schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:

> This patch removes the ldconfig aux-cache from the final rootfs.
> The cache is both not portable across file systems, as well as non
> reproducible.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>  meta/classes/rootfs.bbclass | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
> index 325e7ae..226fa8b 100644
> --- a/meta/classes/rootfs.bbclass
> +++ b/meta/classes/rootfs.bbclass
> @@ -258,6 +258,13 @@ rootfs_postprocess_clean_pycache() {
>      sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
>  }
>  
> +ROOTFS_POSTPROCESS_COMMAND += "rootfs_postprocess_clean_ldconfig_cache"
> +rootfs_postprocess_clean_ldconfig_cache() {
> +    # the ldconfig aux-cache is not portable and breaks reproducability
> +    # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=845034#49
> +    sudo rm -f ${ROOTFSDIR}/var/cache/ldconfig/aux-cache
> +}

Should this not be enabled by default?

Henning

>  ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
>  rootfs_generate_manifest () {
>      mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 05/11] generate deterministic clear-text password hash
  2023-01-11  4:11 ` [PATCH 05/11] generate deterministic clear-text password hash Felix Moessbauer
@ 2023-01-11  8:21   ` Henning Schild
  0 siblings, 0 replies; 28+ messages in thread
From: Henning Schild @ 2023-01-11  8:21 UTC (permalink / raw)
  To: Felix Moessbauer; +Cc: isar-users, jan.kiszka, daniel.bovensiepen, venkata.pyla

Am Wed, 11 Jan 2023 04:11:34 +0000
schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:

> This patch changes how we derive the hashed password of a user that is
> created using the clear-text-password flag. Previously, the clear-text
> password was directly input into chpasswd. However, chpasswd
> internally creates a 16-character random salt. This breaks the
> reproducibility.
> 
> Instead of letting chpasswd create the hashed password string, we now
> create it manually by deriving the salt from the SOURCE_DATE_EPOCH
> variable. This is technically done using the host openssl tool. As
> openssl is a transitive dependency of sbuild, we do not need to add
> it explicitly to the host-tools.
> 
> In case SOURCE_DATE_EPOCH is not set, chpasswd is used
> directly to create the salt.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>  meta/classes/image-account-extension.bbclass | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/meta/classes/image-account-extension.bbclass b/meta/classes/image-account-extension.bbclass
> index 70950a7..bcaa9c3 100644
> --- a/meta/classes/image-account-extension.bbclass
> +++ b/meta/classes/image-account-extension.bbclass
> @@ -253,7 +253,15 @@ image_postprocess_accounts() {
>          if [ -n "$password" -o "${flags}" != "${flags%*,allow-empty-password,*}" ]; then
>              chpasswd_args="-e"
>              if [ "${flags}" != "${flags%*,clear-text-password,*}" ]; then
> -                chpasswd_args=""
> +                # chpasswd adds a random salt when running against a clean-text password.

clear-text

> +                # For reproducible images, we manually generate the password and use the
> +                # SOURCE_DATE_EPOCH to generate the salt in a deterministic way.
> +                if [ -z "${SOURCE_DATE_EPOCH}"]; then
> +                    chpasswd_args=""
> +                else
> +                    salt="$(echo "${SOURCE_DATE_EPOCH}" | sha256sum -z | cut -c 1-15)"
> +                    password="$(openssl passwd -6 -salt $salt "$password")"
> +                fi
>              fi
>              printf '%s:%s' "$name" "$password" | sudo chroot '${ROOTFSDIR}' \
>                  /usr/sbin/chpasswd $chpasswd_args
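
The salt derivation described above can be tried outside of the build. This is only a minimal sketch: the function names `derive_salt` and `hash_password` are made up for illustration, the epoch value in the usage note is arbitrary, and it assumes coreutils `sha256sum` plus an `openssl` that supports `passwd -6`. The emptiness test is written with a space before the closing `]`, which `[` requires.

```shell
# Derive a 15-character salt from an epoch value.
# Same pipeline as in the hunk, minus "-z": the command substitution in the
# caller strips the trailing newline anyway.
derive_salt() {
    printf '%s\n' "$1" | sha256sum | cut -c 1-15
}

# Print a SHA-512 crypt hash of password "$2", salted from epoch "$1".
# Returns non-zero without output if no epoch is given, so that the caller
# can fall back to passing the clear text to chpasswd.
hash_password() {
    if [ -z "$1" ]; then
        return 1
    fi
    openssl passwd -6 -salt "$(derive_salt "$1")" "$2"
}
```

Calling `hash_password 1673410291 secret` twice should give the same string, which can then be passed to `chpasswd -e`.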


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11  8:06   ` Henning Schild
@ 2023-01-11  8:23     ` Moessbauer, Felix
  2023-01-11 12:47       ` Henning Schild
  0 siblings, 1 reply; 28+ messages in thread
From: Moessbauer, Felix @ 2023-01-11  8:23 UTC (permalink / raw)
  To: Schild, Henning
  Cc: Bovensiepen, Daniel (bovi), isar-users, Kiszka, Jan, venkata.pyla

On Wed, 2023-01-11 at 09:06 +0100, Henning Schild wrote:
> Am Wed, 11 Jan 2023 04:11:32 +0000
> schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> 
> > When calling python scripts, python automatically creates cache
> > files
> > to speedup future invocations of the same sources. This often
> > happens
> > in postinst scripts, that directly run in the image chroot. The
> > created debian packages do not ship these files, as the debhelper
> > scripts remove them before installing.
> > 
> > For the rootfs part, we manually have to do it to also not
> > include these in the final image. This patch implements this logic
> > in
> > a custom cleanup postprocess step. As there might be situations
> > where
> > shipping of a subset of the caches is desireable (e.g. readonly
> > rootfs
> > images), we add support to control this logic using
> > ROOTFS_FEATURES.
> > 
> > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > ---
> >  meta/classes/image.bbclass  | 2 +-
> >  meta/classes/rootfs.bbclass | 6 ++++++
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/meta/classes/image.bbclass
> > b/meta/classes/image.bbclass
> > index 519a2e5..b86a428 100644
> > --- a/meta/classes/image.bbclass
> > +++ b/meta/classes/image.bbclass
> > @@ -80,7 +80,7 @@ image_do_mounts() {
> >  }
> >  
> >  ROOTFSDIR = "${IMAGE_ROOTFS}"
> > -ROOTFS_FEATURES += "clean-package-cache generate-manifest
> > export-dpkg-status clean-log-files clean-debconf-cache"
> > +ROOTFS_FEATURES += "clean-package-cache clean-pycache
> > generate-manifest export-dpkg-status clean-log-files
> > clean-debconf-cache" ROOTFS_PACKAGES += "${IMAGE_PREINSTALL}
> > ${IMAGE_INSTALL}" ROOTFS_MANIFEST_DEPLOY_DIR ?=
> > "${DEPLOY_DIR_IMAGE}"
> > ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}" diff --git
> > a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass index
> > 786682d..325e7ae 100644 --- a/meta/classes/rootfs.bbclass +++
> > b/meta/classes/rootfs.bbclass @@ -252,6 +252,12 @@
> > rootfs_postprocess_clean_debconf_cache() { sudo rm -rf
> > "${ROOTFSDIR}/var/cache/debconf/"* }
> >  
> > +ROOTFS_POSTPROCESS_COMMAND +=
> > "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache',
> > 'rootfs_postprocess_clean_pycache', '', d)}"
> > +rootfs_postprocess_clean_pycache() {
> > +    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc'       -delete
> > -print
> > +    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete
> > -print +}
> 
> Are we sure that this can never be valid content of any package? I
> suggest we double check with dpkg.

I already checked this. Shipping the __pycache__ folder is a lintian
error [1], shipping any .pyc files is a lintian warning [2].

Adding bbwarn here does not make sense either, as we cannot distinguish
between pycache entries from a broken package and ones created by
postinst scripts. Anyways, pyc files are just cache files and these
should not be part of any package or image.

In case a user really wants to ship .pyc files, he can still disable
this rootfs feature. But the debian ruleset should be our baseline, not
some erroneous behavior that somebody could implement.

[1] https://lintian.debian.org/tags/package-installs-python-pycache-dir
[2]
https://lintian.debian.org/tags/source-contains-prebuilt-python-object

Felix

> 
> Henning
> 
> >  ROOTFS_POSTPROCESS_COMMAND +=
> > "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest',
> > 'rootfs_generate_manifest', '', d)}" rootfs_generate_manifest () {
> > mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 06/11] update debian initramfs in deterministic mode
  2023-01-11  4:11 ` [PATCH 06/11] update debian initramfs in deterministic mode Felix Moessbauer
@ 2023-01-11  8:23   ` Henning Schild
  2023-01-11  8:39     ` Moessbauer, Felix
  0 siblings, 1 reply; 28+ messages in thread
From: Henning Schild @ 2023-01-11  8:23 UTC (permalink / raw)
  To: Felix Moessbauer; +Cc: isar-users, jan.kiszka, daniel.bovensiepen, venkata.pyla

Am Wed, 11 Jan 2023 04:11:35 +0000
schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:

> This patch exports the SOURCE_DATE_EPOCH variable in the image install
> task. By that, update-initramfs is switched into reproducible mode.
> Before this patch, each trigger of update-initramfs created a new
> non-deterministic version of the initramfs.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>  meta/classes/image.bbclass | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
> index b86a428..c981c7a 100644
> --- a/meta/classes/image.bbclass
> +++ b/meta/classes/image.bbclass
> @@ -304,6 +304,13 @@ python() {
>  }
>  
>  
> +# make generation of initramfs reproducible
> +rootfs_install_pkgs_install_prepend() {
> +    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
> +        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
> +    fi
> +}

Why prepend and not put this right into the task? This will be hard to
maintain.

Henning

>  # here we call a command that should describe your whole build system,
>  # this could be "git describe" or something similar.
>  # set ISAR_RELEASE_CMD to customize, or override do_mark_rootfs to do something


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 04/11] remove non-portable ldconfig aux-cache
  2023-01-11  8:19   ` Henning Schild
@ 2023-01-11  8:31     ` Moessbauer, Felix
  2023-01-11 12:52       ` Henning Schild
  0 siblings, 1 reply; 28+ messages in thread
From: Moessbauer, Felix @ 2023-01-11  8:31 UTC (permalink / raw)
  To: Schild, Henning
  Cc: Bovensiepen, Daniel (bovi), isar-users, Kiszka, Jan, venkata.pyla

On Wed, 2023-01-11 at 09:19 +0100, Henning Schild wrote:
> Am Wed, 11 Jan 2023 04:11:33 +0000
> schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> 
> > This patch removes the ldconfig aux-cache from the final rootfs.
> > The cache is both not portable across file systems, as well as non
> > reproducible.
> > 
> > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > ---
> >  meta/classes/rootfs.bbclass | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/meta/classes/rootfs.bbclass
> > b/meta/classes/rootfs.bbclass
> > index 325e7ae..226fa8b 100644
> > --- a/meta/classes/rootfs.bbclass
> > +++ b/meta/classes/rootfs.bbclass
> > @@ -258,6 +258,13 @@ rootfs_postprocess_clean_pycache() {
> >      sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete
> > -print }
> >  
> > +ROOTFS_POSTPROCESS_COMMAND +=
> > "rootfs_postprocess_clean_ldconfig_cache"
> > +rootfs_postprocess_clean_ldconfig_cache() {
> > +    # the ldconfig aux-cache is not portable and breaks
> > reproducability
> > +    # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=845034#49
> > +    sudo rm -f ${ROOTFSDIR}/var/cache/ldconfig/aux-cache
> > +}
> 
> Should this not be enabled by default?

Yes, this is always enabled.
But it did not really fit into any other postprocess command, so I
decided to add a dedicated one.

Felix

> 
> Henning
> 
> >  ROOTFS_POSTPROCESS_COMMAND +=
> > "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest',
> > 'rootfs_generate_manifest', '', d)}" rootfs_generate_manifest () {
> > mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 06/11] update debian initramfs in deterministic mode
  2023-01-11  8:23   ` Henning Schild
@ 2023-01-11  8:39     ` Moessbauer, Felix
  2023-01-11 12:55       ` Henning Schild
  0 siblings, 1 reply; 28+ messages in thread
From: Moessbauer, Felix @ 2023-01-11  8:39 UTC (permalink / raw)
  To: Schild, Henning
  Cc: Bovensiepen, Daniel (bovi), isar-users, Kiszka, Jan, venkata.pyla

On Wed, 2023-01-11 at 09:23 +0100, Henning Schild wrote:
> Am Wed, 11 Jan 2023 04:11:35 +0000
> schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> 
> > This patch exports the SOURCE_DATE_EPOCH variable in the image
> > install
> > task. By that, update-initramfs is switched into reproducible mode.
> > Before this patch, each trigger of update-initramfs created a new
> > non-deterministic version of the initramfs.
> > 
> > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > ---
> >  meta/classes/image.bbclass | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/meta/classes/image.bbclass
> > b/meta/classes/image.bbclass
> > index b86a428..c981c7a 100644
> > --- a/meta/classes/image.bbclass
> > +++ b/meta/classes/image.bbclass
> > @@ -304,6 +304,13 @@ python() {
> >  }
> >  
> >  
> > +# make generation of initramfs reproducible
> > +rootfs_install_pkgs_install_prepend() {
> > +    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
> > +        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
> > +    fi
> > +}
> 
> Why prepend and not put this right into the task? This will be hard
> to
> maintain.

Yes, true. However, the rootfs_install_pkgs_install is shared across
all rootfs, but we really only want to set the SOURCE_DATE_EPOCH
variable for the final target image install. If we added it
globally, this would break SSTATE caching all over the place, as it
would also influence the sbuild chroots.

On the other hand, we also cannot whitelist the variable, as it
internally changes the logic of many tools so that they run in
deterministic mode. And we also have to rebuild parts that depend on
the value of the variable.
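
The scoping argument can be illustrated in plain shell. A minimal sketch, with a made-up function name `run_install_step`: the variable is exported only inside the subshell that stands in for the image install task, so other steps in the same script do not inherit it.

```shell
# Run a stand-in for the install step with SOURCE_DATE_EPOCH exported only
# when a value is given. The subshell keeps the export local to this step.
run_install_step() {
    (
        unset SOURCE_DATE_EPOCH
        if [ -n "$1" ]; then
            export SOURCE_DATE_EPOCH="$1"
        fi
        # stand-in for a tool such as update-initramfs reading the environment
        sh -c 'echo "${SOURCE_DATE_EPOCH:-unset}"'
    )
}
```

`run_install_step 1673410291` prints the epoch as seen by the child process, while `run_install_step ""` prints `unset`.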

Felix

> 
> Henning
> 
> >  # here we call a command that should describe your whole build
> > system, # this could be "git describe" or something similar.
> >  # set ISAR_RELEASE_CMD to customize, or override do_mark_rootfs to
> > do something
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH
  2023-01-11  4:11 ` [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH Felix Moessbauer
@ 2023-01-11  8:49   ` Henning Schild
  2023-01-11  9:06     ` Moessbauer, Felix
  0 siblings, 1 reply; 28+ messages in thread
From: Henning Schild @ 2023-01-11  8:49 UTC (permalink / raw)
  To: Felix Moessbauer; +Cc: isar-users, jan.kiszka, daniel.bovensiepen, venkata.pyla

Am Wed, 11 Jan 2023 04:11:39 +0000
schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:

> In case the SOURCE_DATE_EPOCH bb variable is set, use that value
> both for the auto-generated changelog as well as when appending to
> an existing changelog.
> 
> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> ---
>  meta/classes/debianize.bbclass | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/meta/classes/debianize.bbclass b/meta/classes/debianize.bbclass
> index a6694a0..3d5d934 100644
> --- a/meta/classes/debianize.bbclass
> +++ b/meta/classes/debianize.bbclass
> @@ -14,7 +14,7 @@ MAINTAINER ??= "Unknown maintainer <unknown@example.com>"
>  
>  deb_add_changelog() {
>  	changelog_v="${CHANGELOG_V}"
> -	timestamp=3600
> +	timestamp=${@ d.getVar('SOURCE_DATE_EPOCH', True) or '3600' }
>  	if [ -f ${S}/debian/changelog ]; then
>  		if [ ! -f ${WORKDIR}/changelog.orig ]; then
>  			cp ${S}/debian/changelog ${WORKDIR}/changelog.orig
> @@ -23,9 +23,11 @@ deb_add_changelog() {
>  		if [ -s ${WORKDIR}/changelog.orig ]; then
>  			orig_version=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Version)
>  			changelog_v=$(echo "${changelog_v}" | sed 's/<orig-version>/'${orig_version}'/')
> -			orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
> -			orig_seconds=$(date --date="${orig_date}" +'%s')
> -			timestamp=$(expr ${orig_seconds} + 42)
> +			if [ -z "${SOURCE_DATE_EPOCH}" ]; then
> +				orig_date=$(dpkg-parsechangelog -l ${WORKDIR}/changelog.orig -S Date)
> +				orig_seconds=$(date --date="${orig_date}" +'%s')
> +				timestamp=$(expr ${orig_seconds} + 42)

What would happen if we prepended an entry older than orig-date? I hope
this would trigger some sort of warning or maybe even package build
failure.

I still think the image's SOURCE_DATE_EPOCH should not be used for
individual components. The value of doing so remains unclear, while it
is very risky to use global scope variables to construct package
content. Correct me if I am wrong, but without this patch everything
will be reproducible just fine.

Henning

> +			fi
>  		fi
>  	fi
>  


^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 00/11] Make rootfs build reproducible
  2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
                   ` (11 preceding siblings ...)
  2023-01-11  6:51 ` [PATCH 00/11] Make rootfs build reproducible Jan Kiszka
@ 2023-01-11  9:04 ` Venkata.Pyla
  12 siblings, 0 replies; 28+ messages in thread
From: Venkata.Pyla @ 2023-01-11  9:04 UTC (permalink / raw)
  To: felix.moessbauer, isar-users
  Cc: jan.kiszka, daniel.bovensiepen, henning.schild



>-----Original Message-----
>From: isar-users@googlegroups.com <isar-users@googlegroups.com> On
>Behalf Of Felix Moessbauer
>Sent: 11 January 2023 09:41
>To: isar-users@googlegroups.com
>Cc: jan.kiszka@siemens.com; daniel.bovensiepen@siemens.com;
>henning.schild@siemens.com; pyla venkata(TSIP TMIEC ODG Porting)
><Venkata.Pyla@toshiba-tsip.com>; Felix Moessbauer
><felix.moessbauer@siemens.com>
>Subject: [PATCH 00/11] Make rootfs build reproducible
>
>This series finally makes the rootfs generation bit-reproducible from debian
>bullseye on. Parts of it have already been sent as individual patches. However,
>image reproducibility can only be achieved once all parts are reproducible themselves.
>By that, these patches are included in this series as well.
>
>With this series, the following parts are now fully reproducible.
>This has been tested on the isar-image-base target.
>
>- custom initramfs (creation and updates)
>- debian initramfs (only updates are relevant)
>- custom kernel (debian kernel is reproducible itself)
>- rootfs itself
>- tar file generation (<image>.tar)
>- ext4 generation (only from bookworm on, more tests needed)
>
>Other parts that are still not reproducible are:
>
>- WIC (should be solved in OE already)
>- containers (untested yet)

This is great work, thank you.

>
>Best regards,
>Felix Moessbauer
>Siemens AG
>
>Felix Moessbauer (10):
>  fix rebuild of rootfs_finalize task
>  rootfs postprocess: clean python cache
>  remove non-portable ldconfig aux-cache
>  generate deterministic clear-text password hash
>  update debian initramfs in deterministic mode
>  create custom initramfs in deterministic mode
>  make deb_add_changelog idempotent
>  deb_add_changelog: set timestamp to valid epoch
>  deb_add_changelog: use SOURCE_DATE_EPOCH
>  make custom linux-image bit-by-bit reproducible
>
>venkata pyla (1):
>  image.bbclass: fix non-reproducible file time-stamps inside rootfs
>
> meta-isar/conf/local.conf.sample              | 10 +++++++++
> meta/classes/debianize.bbclass                | 22 +++++++++++++------
> meta/classes/image-account-extension.bbclass  | 10 ++++++++-
> meta/classes/image.bbclass                    | 21 ++++++++++++++++--
> meta/classes/initramfs.bbclass                |  5 +++++
> meta/classes/rootfs.bbclass                   | 13 +++++++++++
> .../linux/files/debian/isar/build.tmpl        |  1 +
> .../linux/files/debian/rules.tmpl             | 14 +++++++++++-
> meta/recipes-kernel/linux/linux-custom.inc    |  2 ++
> 9 files changed, 87 insertions(+), 11 deletions(-)
>
>--
>2.34.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH
  2023-01-11  8:49   ` Henning Schild
@ 2023-01-11  9:06     ` Moessbauer, Felix
  0 siblings, 0 replies; 28+ messages in thread
From: Moessbauer, Felix @ 2023-01-11  9:06 UTC (permalink / raw)
  To: Schild, Henning
  Cc: Bovensiepen, Daniel (bovi), isar-users, Kiszka, Jan, venkata.pyla

On Wed, 2023-01-11 at 09:49 +0100, Henning Schild wrote:
> Am Wed, 11 Jan 2023 04:11:39 +0000
> schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> 
> > In case the SOURCE_DATE_EPOCH bb variable is set, use that value
> > both for the auto-generated changelog as well as when appending to
> > an existing changelog.
> > 
> > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > ---
> >  meta/classes/debianize.bbclass | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/meta/classes/debianize.bbclass
> > b/meta/classes/debianize.bbclass index a6694a0..3d5d934 100644
> > --- a/meta/classes/debianize.bbclass
> > +++ b/meta/classes/debianize.bbclass
> > @@ -14,7 +14,7 @@ MAINTAINER ??= "Unknown maintainer
> > <unknown@example.com>" 
> >  deb_add_changelog() {
> >         changelog_v="${CHANGELOG_V}"
> > -       timestamp=3600
> > +       timestamp=${@ d.getVar('SOURCE_DATE_EPOCH', True) or '3600'
> > }
> >         if [ -f ${S}/debian/changelog ]; then
> >                 if [ ! -f ${WORKDIR}/changelog.orig ]; then
> >                         cp ${S}/debian/changelog
> > ${WORKDIR}/changelog.orig @@ -23,9 +23,11 @@ deb_add_changelog() {
> >                 if [ -s ${WORKDIR}/changelog.orig ]; then
> >                         orig_version=$(dpkg-parsechangelog -l
> > ${WORKDIR}/changelog.orig -S Version) changelog_v=$(echo
> > "${changelog_v}" | sed 's/<orig-version>/'${orig_version}'/')
> > -                       orig_date=$(dpkg-parsechangelog -l
> > ${WORKDIR}/changelog.orig -S Date)
> > -                       orig_seconds=$(date --date="${orig_date}"
> > +'%s')
> > -                       timestamp=$(expr ${orig_seconds} + 42)
> > +                       if [ -z "${SOURCE_DATE_EPOCH}" ]; then
> > +                               orig_date=$(dpkg-parsechangelog -l
> > ${WORKDIR}/changelog.orig -S Date)
> > +                               orig_seconds=$(date
> > --date="${orig_date}" +'%s')
> > +                               timestamp=$(expr ${orig_seconds} +
> > 42)
> 
> What would happen if we prepended an entry older than orig-date? I
> hope
> this would trigger some sort of warning or maybe even package build
> failure.

Valid point.

Actually, this works perfectly fine. But I'm not sure if this is
undefined behavior or actually allowed by Debian standards.
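
If an entry older than the original one turns out to be a problem, one possible guard (not part of this patch, only a sketch with a made-up helper name) is to take SOURCE_DATE_EPOCH only when it is newer than the last changelog entry, and to fall back to the existing "orig + 42" rule otherwise:

```shell
# Pick the timestamp for the prepended changelog entry.
#   $1: seconds since epoch of the newest original entry
#   $2: SOURCE_DATE_EPOCH, may be empty
pick_changelog_timestamp() {
    orig_seconds="$1"
    epoch="$2"
    if [ -n "$epoch" ] && [ "$epoch" -gt "$orig_seconds" ]; then
        # epoch is set and newer than the original entry: use it as is
        echo "$epoch"
    else
        # unset or too old: keep the entry strictly newer than the original
        echo $((orig_seconds + 42))
    fi
}
```

In deb_add_changelog, `pick_changelog_timestamp "$orig_seconds" "${SOURCE_DATE_EPOCH}"` would then replace the plain assignment.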

> 
> I still think the images SOURCE_DATE_EPOCH should not be used for
> individual components. The value of doing so remains unclear, while
> it
> is very risky to use global scope variables to construct package
> content. Correct me if i am wrong, but without this patch everything
> will be reproducible just fine.

The value is indeed very small, and it will probably need a more
fundamental discussion about auto-generated parts. Let's drop this
patch for now, so we can integrate the rest in a timely manner.

Felix

> 
> Henning
> 
> > +                       fi
> >                 fi
> >         fi
> >  
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11  8:23     ` Moessbauer, Felix
@ 2023-01-11 12:47       ` Henning Schild
  2023-01-11 13:18         ` Moessbauer, Felix
  0 siblings, 1 reply; 28+ messages in thread
From: Henning Schild @ 2023-01-11 12:47 UTC (permalink / raw)
  To: Moessbauer, Felix (T CED INW-CN)
  Cc: Bovensiepen, Daniel (bovi) (T CED INW-CN),
	isar-users, Kiszka, Jan (T CED),
	venkata.pyla

Am Wed, 11 Jan 2023 09:23:01 +0100
schrieb "Moessbauer, Felix (T CED INW-CN)"
<felix.moessbauer@siemens.com>:

> On Wed, 2023-01-11 at 09:06 +0100, Henning Schild wrote:
> > Am Wed, 11 Jan 2023 04:11:32 +0000
> > schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> >  
> > > When calling python scripts, python automatically creates cache
> > > files
> > > to speedup future invocations of the same sources. This often
> > > happens
> > > in postinst scripts, that directly run in the image chroot. The
> > > created debian packages do not ship these files, as the debhelper
> > > scripts remove them before installing.
> > >
> > > For the rootfs part, we manually have to do it to also not
> > > include these in the final image. This patch implements this logic
> > > in
> > > a custom cleanup postprocess step. As there might be situations
> > > where
> > > shipping of a subset of the caches is desireable (e.g. readonly
> > > rootfs
> > > images), we add support to control this logic using
> > > ROOTFS_FEATURES.
> > >
> > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > ---
> > >  meta/classes/image.bbclass  | 2 +-
> > >  meta/classes/rootfs.bbclass | 6 ++++++
> > >  2 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/meta/classes/image.bbclass
> > > b/meta/classes/image.bbclass
> > > index 519a2e5..b86a428 100644
> > > --- a/meta/classes/image.bbclass
> > > +++ b/meta/classes/image.bbclass
> > > @@ -80,7 +80,7 @@ image_do_mounts() {
> > >  }
> > >
> > >  ROOTFSDIR = "${IMAGE_ROOTFS}"
> > > -ROOTFS_FEATURES += "clean-package-cache generate-manifest
> > > export-dpkg-status clean-log-files clean-debconf-cache"
> > > +ROOTFS_FEATURES += "clean-package-cache clean-pycache
> > > generate-manifest export-dpkg-status clean-log-files
> > > clean-debconf-cache" ROOTFS_PACKAGES += "${IMAGE_PREINSTALL}
> > > ${IMAGE_INSTALL}" ROOTFS_MANIFEST_DEPLOY_DIR ?=
> > > "${DEPLOY_DIR_IMAGE}"
> > > ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}" diff --git
> > > a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass index
> > > 786682d..325e7ae 100644 --- a/meta/classes/rootfs.bbclass +++
> > > b/meta/classes/rootfs.bbclass @@ -252,6 +252,12 @@
> > > rootfs_postprocess_clean_debconf_cache() { sudo rm -rf
> > > "${ROOTFSDIR}/var/cache/debconf/"* }
> > >
> > > +ROOTFS_POSTPROCESS_COMMAND +=
> > > "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache',
> > > 'rootfs_postprocess_clean_pycache', '', d)}"
> > > +rootfs_postprocess_clean_pycache() {
> > > +    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc'
> > > -delete -print
> > > +    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__'
> > > -delete -print +}  
> >
> > Are we sure that this can never be valid content of any package? I
> > suggest we double check with dpkg.  
> 
> I already checked this. Shipping the __pycache__ folder is a lintian
> error [1], shipping any .pyc files is a lintian warning [2].
> 
> Adding bbwarn here does not make sense either, as we cannot
> distinguish between pycache entries from a broken package and ones
> created by postinst scripts. Anyways, pyc files are just cache files
> and these should not be part of any package or image.

Can we not ask dpkg -S for every file before we delete it? Removing
files owned by a package would likely be wrong, no matter what you might
think of the quality of such a package and how many Debian rules you
cite. We have these kinds of packages, coming from funny vendors and
maybe also from weird recipes.

I am not worried about packages coming from debian and built with
debian tooling.
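
As a rough sketch of such a check (not part of the patch): instead of
calling dpkg -S once per file, the cached files could be compared in one
pass against the file lists dpkg keeps under var/lib/dpkg/info. The helper
name list_unowned_pyc is made up for this illustration, and the sketch
assumes that the paths in the *.list files match the on-disk paths below
/usr and that no path contains a newline.

```shell
# Hypothetical helper: print the *.pyc files below <rootfs>/usr that no
# installed package lists as its own.
list_unowned_pyc() {
    rootfs="$1"
    owned=$(mktemp)
    found=$(mktemp)
    # dpkg records the paths shipped by each package in info/<pkg>.list
    cat "$rootfs"/var/lib/dpkg/info/*.list 2>/dev/null \
        | LC_ALL=C sort -u > "$owned"
    # collect the cache files with the same absolute paths dpkg uses
    (cd "$rootfs" && find usr -type f -name '*.pyc' | sed 's|^|/|') \
        | LC_ALL=C sort -u > "$found"
    # comm -13 keeps only lines that appear in the second file alone,
    # i.e. cache files that are not claimed by any package
    LC_ALL=C comm -13 "$owned" "$found"
    rm -f "$owned" "$found"
}
```

Only the files printed by this helper would then be candidates for
deletion; whether the extra step is worth it is exactly the open question.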

Henning

> In case a user really wants to ship .pyc files, he can still disable
> this rootfs feature. But the debian ruleset should be our baseline,
> not some erroneous behavior that somebody could implement.
> 
> [1]
> https://lintian.debian.org/tags/package-installs-python-pycache-dir
> [2]
> https://lintian.debian.org/tags/source-contains-prebuilt-python-object
>
> Felix
> 
> >
> > Henning
> >  
> > >  ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
> > >  rootfs_generate_manifest () {
> > >      mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
> >  
> 



* Re: [PATCH 04/11] remove non-portable ldconfig aux-cache
  2023-01-11  8:31     ` Moessbauer, Felix
@ 2023-01-11 12:52       ` Henning Schild
  0 siblings, 0 replies; 28+ messages in thread
From: Henning Schild @ 2023-01-11 12:52 UTC (permalink / raw)
  To: Moessbauer, Felix (T CED INW-CN)
  Cc: Bovensiepen, Daniel (bovi) (T CED INW-CN),
	isar-users, Kiszka, Jan (T CED),
	venkata.pyla

Am Wed, 11 Jan 2023 09:31:00 +0100
schrieb "Moessbauer, Felix (T CED INW-CN)"
<felix.moessbauer@siemens.com>:

> On Wed, 2023-01-11 at 09:19 +0100, Henning Schild wrote:
> > Am Wed, 11 Jan 2023 04:11:33 +0000
> > schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> >  
> > > This patch removes the ldconfig aux-cache from the final rootfs.
> > > The cache is neither portable across file systems nor
> > > reproducible.
> > >
> > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > ---
> > >  meta/classes/rootfs.bbclass | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
> > > index 325e7ae..226fa8b 100644
> > > --- a/meta/classes/rootfs.bbclass
> > > +++ b/meta/classes/rootfs.bbclass
> > > @@ -258,6 +258,13 @@ rootfs_postprocess_clean_pycache() {
> > >      sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
> > >  }
> > >
> > > +ROOTFS_POSTPROCESS_COMMAND += "rootfs_postprocess_clean_ldconfig_cache"
> > > +rootfs_postprocess_clean_ldconfig_cache() {
> > > +    # the ldconfig aux-cache is not portable and breaks reproducibility
> > > +    # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=845034#49
> > > +    sudo rm -f ${ROOTFSDIR}/var/cache/ldconfig/aux-cache
> > > +}
> >
> > Should this not be enabled by default?  
> 
> Yes, this is always enabled.

Ah, right. I somehow missed the first line, where it gets enabled.

Henning

> But it did not really fit into any other postprocess command, so I
> decided to add a dedicated one.
> 
> Felix
> 
> >
> > Henning
> >  
> > >  ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
> > >  rootfs_generate_manifest () {
> > >      mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
> >  
> 



* Re: [PATCH 06/11] update debian initramfs in deterministic mode
  2023-01-11  8:39     ` Moessbauer, Felix
@ 2023-01-11 12:55       ` Henning Schild
  0 siblings, 0 replies; 28+ messages in thread
From: Henning Schild @ 2023-01-11 12:55 UTC (permalink / raw)
  To: Moessbauer, Felix (T CED INW-CN)
  Cc: Bovensiepen, Daniel (bovi) (T CED INW-CN),
	isar-users, Kiszka, Jan (T CED),
	venkata.pyla

Am Wed, 11 Jan 2023 09:39:34 +0100
schrieb "Moessbauer, Felix (T CED INW-CN)"
<felix.moessbauer@siemens.com>:

> On Wed, 2023-01-11 at 09:23 +0100, Henning Schild wrote:
> > Am Wed, 11 Jan 2023 04:11:35 +0000
> > schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> >  
> > > This patch exports the SOURCE_DATE_EPOCH variable in the image
> > > install task. This switches update-initramfs into reproducible
> > > mode. Before this patch, each trigger of update-initramfs created
> > > a new non-deterministic version of the initramfs.
> > >
> > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > ---
> > >  meta/classes/image.bbclass | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
> > > index b86a428..c981c7a 100644
> > > --- a/meta/classes/image.bbclass
> > > +++ b/meta/classes/image.bbclass
> > > @@ -304,6 +304,13 @@ python() {
> > >  }
> > >
> > >
> > > +# make generation of initramfs reproducible
> > > +rootfs_install_pkgs_install_prepend() {
> > > +    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
> > > +        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
> > > +    fi
> > > +}  
> >
> > > Why prepend and not put this right into the task? This will be hard
> > > to maintain.
> 
> Yes, true. However, rootfs_install_pkgs_install is shared across
> all rootfs, and we really only want to set the SOURCE_DATE_EPOCH
> variable for the final target image install. If we added it
> globally, this would break SSTATE caching all over the place, as it
> would also influence the sbuild chroots.
> 
> On the other hand, we also cannot whitelist the variable, as it
> internally changes the logic of many tools so that they run in
> deterministic mode. And we also have to rebuild parts that depend on
> the value of the variable.

Makes sense, but this has to be documented: maybe partly in the
commit message, but mostly as comments in the code.
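
A commented version could look roughly like the sketch below. It repeats
the function body from the patch; only the comments are new, and they
merely paraphrase the rationale quoted above. The function header is also
valid POSIX shell, so the fragment can be tried outside of BitBake, where
${SOURCE_DATE_EPOCH} is then an ordinary shell variable instead of a
BitBake expansion (an assumption of this sketch).

```shell
# Make generation of the initramfs reproducible.
#
# rootfs_install_pkgs_install is shared by all rootfs types, but
# SOURCE_DATE_EPOCH must only reach the install step of the final target
# image. Setting it globally would also affect the sbuild chroots and
# thereby break sstate caching for them. Whitelisting the variable is
# not an option either: many tools switch to a deterministic mode when
# it is set, so everything that depends on its value has to be rebuilt
# when the value changes.
rootfs_install_pkgs_install_prepend() {
    # export only when a value is configured, so that unset stays unset
    if [ ! -z "${SOURCE_DATE_EPOCH}" ]; then
        export SOURCE_DATE_EPOCH="${SOURCE_DATE_EPOCH}"
    fi
}
```

The export is what makes the value visible to child processes such as
update-initramfs; without it the variable would stay local to the task's
shell.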

Henning

> Felix
> 
> >
> > Henning
> >  
> > >  # here we call a command that should describe your whole build system,
> > >  # this could be "git describe" or something similar.
> > >  # set ISAR_RELEASE_CMD to customize, or override do_mark_rootfs to do something
> >  
> 



* Re: [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11 12:47       ` Henning Schild
@ 2023-01-11 13:18         ` Moessbauer, Felix
  2023-01-11 13:23           ` Jan Kiszka
  0 siblings, 1 reply; 28+ messages in thread
From: Moessbauer, Felix @ 2023-01-11 13:18 UTC (permalink / raw)
  To: Schild, Henning
  Cc: Bovensiepen, Daniel (bovi), isar-users, Kiszka, Jan, venkata.pyla

On Wed, 2023-01-11 at 13:47 +0100, Henning Schild wrote:
> Am Wed, 11 Jan 2023 09:23:01 +0100
> schrieb "Moessbauer, Felix (T CED INW-CN)"
> <felix.moessbauer@siemens.com>:
> 
> > On Wed, 2023-01-11 at 09:06 +0100, Henning Schild wrote:
> > > Am Wed, 11 Jan 2023 04:11:32 +0000
> > > schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
> > >  
> > > > When calling python scripts, python automatically creates cache
> > > > files to speed up future invocations of the same sources. This
> > > > often happens in postinst scripts that run directly in the image
> > > > chroot. The created debian packages do not ship these files, as
> > > > the debhelper scripts remove them before installing.
> > > > 
> > > > For the rootfs part, we have to do this manually so that they are
> > > > also not included in the final image. This patch implements this
> > > > logic in a custom cleanup postprocess step. As there might be
> > > > situations where shipping a subset of the caches is desirable
> > > > (e.g. read-only rootfs images), we add support to control this
> > > > logic using ROOTFS_FEATURES.
> > > > 
> > > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > > ---
> > > >  meta/classes/image.bbclass  | 2 +-
> > > >  meta/classes/rootfs.bbclass | 6 ++++++
> > > >  2 files changed, 7 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
> > > > index 519a2e5..b86a428 100644
> > > > --- a/meta/classes/image.bbclass
> > > > +++ b/meta/classes/image.bbclass
> > > > @@ -80,7 +80,7 @@ image_do_mounts() {
> > > >  }
> > > > 
> > > >  ROOTFSDIR = "${IMAGE_ROOTFS}"
> > > > -ROOTFS_FEATURES += "clean-package-cache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
> > > > +ROOTFS_FEATURES += "clean-package-cache clean-pycache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
> > > >  ROOTFS_PACKAGES += "${IMAGE_PREINSTALL} ${IMAGE_INSTALL}"
> > > >  ROOTFS_MANIFEST_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
> > > >  ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
> > > > diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
> > > > index 786682d..325e7ae 100644
> > > > --- a/meta/classes/rootfs.bbclass
> > > > +++ b/meta/classes/rootfs.bbclass
> > > > @@ -252,6 +252,12 @@ rootfs_postprocess_clean_debconf_cache() {
> > > >      sudo rm -rf "${ROOTFSDIR}/var/cache/debconf/"*
> > > >  }
> > > > 
> > > > +ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache', 'rootfs_postprocess_clean_pycache', '', d)}"
> > > > +rootfs_postprocess_clean_pycache() {
> > > > +    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc' -delete -print
> > > > +    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
> > > > +}
> > > 
> > > > Are we sure that this can never be valid content of any package?
> > > > I suggest we double check with dpkg.
> > 
> > > I already checked this. Shipping the __pycache__ folder is a
> > > lintian error [1], shipping any .pyc files is a lintian warning [2].
> > > 
> > > Adding bbwarn here does not make sense either, as we cannot
> > > distinguish between pycache entries from a broken package and ones
> > > created by postinst scripts. Anyways, pyc files are just cache
> > > files and these should not be part of any package or image.
> 
> Can we not ask dpkg -S for every file before we delete it? Removing
> files owned by package would likely be wrong. No matter what you
> might

This does not scale. We are talking about potentially thousands of pyc
files (e.g. for tensorflow or pytorch).

> think of the quality of such a package and how many debian rules you
> cite. We have these kinds of packages, coming from funny vendors and
> maybe also from weird recipes.

I know. Anyway, the Python code will very likely break if only the
.pyc files are on the system, as these files depend on many conditions
that differ between the buildchroot and the target. If any of these
conditions is not met, the cache is regenerated from the .py file. Let
us please not try to create overly complex solutions for use cases
that are broken / invalid in the first place.

In short: I am strongly in favor of removing these files. I even thought
about running this cleanup command unconditionally.
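
For reference, the cleanup itself is small. The following sketch runs the
two find calls from the patch against an arbitrary directory; the wrapper
name clean_pycache and the directory argument are assumptions of this
example, and sudo and -print are left out so that it can be tried on a
scratch tree.

```shell
# Hypothetical stand-alone wrapper around the two find calls of the patch.
clean_pycache() {
    root="$1"
    # remove the byte-code files first ...
    find "$root/usr" -type f -name '*.pyc' -delete
    # ... then the cache directories. -delete only removes a directory
    # that is already empty; a non-empty one is reported as an error
    # and kept.
    find "$root/usr" -type d -name '__pycache__' -delete
}
```

The .py sources are left untouched, so Python can recreate the caches on
the target where the file system is writable.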

I would also appreciate it if we did not delay the whole reproducibility
story just because there might be some exotic and invalid use cases
that break.

Felix

> 
> I am not worried about packages coming from debian and built with
> debian tooling.
> 
> Henning
> 
> > In case a user really wants to ship .pyc files, he can still
> > disable this rootfs feature. But the debian ruleset should be our
> > baseline, not some erroneous behavior that somebody could implement.
> > 
> > [1]
> > https://lintian.debian.org/tags/package-installs-python-pycache-dir
> > [2]
> > https://lintian.debian.org/tags/source-contains-prebuilt-python-object
> > 
> > Felix
> > 
> > > 
> > > Henning
> > >  
> > > >  ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'generate-manifest', 'rootfs_generate_manifest', '', d)}"
> > > >  rootfs_generate_manifest () {
> > > >      mkdir -p ${ROOTFS_MANIFEST_DEPLOY_DIR}
> > >  
> > 
> 



* Re: [PATCH 03/11] rootfs postprocess: clean python cache
  2023-01-11 13:18         ` Moessbauer, Felix
@ 2023-01-11 13:23           ` Jan Kiszka
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Kiszka @ 2023-01-11 13:23 UTC (permalink / raw)
  To: Moessbauer, Felix (T CED INW-CN), Schild, Henning (T CED SES-DE)
  Cc: Bovensiepen, Daniel (bovi) (T CED INW-CN), isar-users, venkata.pyla

On 11.01.23 14:18, Moessbauer, Felix (T CED INW-CN) wrote:
> On Wed, 2023-01-11 at 13:47 +0100, Henning Schild wrote:
>> Am Wed, 11 Jan 2023 09:23:01 +0100
>> schrieb "Moessbauer, Felix (T CED INW-CN)"
>> <felix.moessbauer@siemens.com>:
>>
>>> On Wed, 2023-01-11 at 09:06 +0100, Henning Schild wrote:
>>>> Am Wed, 11 Jan 2023 04:11:32 +0000
>>>> schrieb Felix Moessbauer <felix.moessbauer@siemens.com>:
>>>>
>>>>> When calling python scripts, python automatically creates cache
>>>>> files to speed up future invocations of the same sources. This
>>>>> often happens in postinst scripts that run directly in the image
>>>>> chroot. The created debian packages do not ship these files, as
>>>>> the debhelper scripts remove them before installing.
>>>>>
>>>>> For the rootfs part, we have to do this manually so that they are
>>>>> also not included in the final image. This patch implements this
>>>>> logic in a custom cleanup postprocess step. As there might be
>>>>> situations where shipping a subset of the caches is desirable
>>>>> (e.g. read-only rootfs images), we add support to control this
>>>>> logic using ROOTFS_FEATURES.
>>>>>
>>>>> Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
>>>>> ---
>>>>>  meta/classes/image.bbclass  | 2 +-
>>>>>  meta/classes/rootfs.bbclass | 6 ++++++
>>>>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/meta/classes/image.bbclass b/meta/classes/image.bbclass
>>>>> index 519a2e5..b86a428 100644
>>>>> --- a/meta/classes/image.bbclass
>>>>> +++ b/meta/classes/image.bbclass
>>>>> @@ -80,7 +80,7 @@ image_do_mounts() {
>>>>>  }
>>>>>
>>>>>  ROOTFSDIR = "${IMAGE_ROOTFS}"
>>>>> -ROOTFS_FEATURES += "clean-package-cache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
>>>>> +ROOTFS_FEATURES += "clean-package-cache clean-pycache generate-manifest export-dpkg-status clean-log-files clean-debconf-cache"
>>>>>  ROOTFS_PACKAGES += "${IMAGE_PREINSTALL} ${IMAGE_INSTALL}"
>>>>>  ROOTFS_MANIFEST_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
>>>>>  ROOTFS_DPKGSTATUS_DEPLOY_DIR ?= "${DEPLOY_DIR_IMAGE}"
>>>>> diff --git a/meta/classes/rootfs.bbclass b/meta/classes/rootfs.bbclass
>>>>> index 786682d..325e7ae 100644
>>>>> --- a/meta/classes/rootfs.bbclass
>>>>> +++ b/meta/classes/rootfs.bbclass
>>>>> @@ -252,6 +252,12 @@ rootfs_postprocess_clean_debconf_cache() {
>>>>>      sudo rm -rf "${ROOTFSDIR}/var/cache/debconf/"*
>>>>>  }
>>>>>
>>>>> +ROOTFS_POSTPROCESS_COMMAND += "${@bb.utils.contains('ROOTFS_FEATURES', 'clean-pycache', 'rootfs_postprocess_clean_pycache', '', d)}"
>>>>> +rootfs_postprocess_clean_pycache() {
>>>>> +    sudo find ${ROOTFSDIR}/usr -type f -name '*.pyc' -delete -print
>>>>> +    sudo find ${ROOTFSDIR}/usr -type d -name '__pycache__' -delete -print
>>>>> +}
>>>>
>>>> Are we sure that this can never be valid content of any package?
>>>> I suggest we double check with dpkg.
>>>
>>> I already checked this. Shipping the __pycache__ folder is a
>>> lintian error [1], shipping any .pyc files is a lintian warning [2].
>>>
>>> Adding bbwarn here does not make sense either, as we cannot
>>> distinguish between pycache entries from a broken package and ones
>>> created by postinst scripts. Anyways, pyc files are just cache
>>> files and these should not be part of any package or image.
>>
>> Can we not ask dpkg -S for every file before we delete it? Removing
>> files owned by package would likely be wrong. No matter what you
>> might
> 
> This does not scale. We are talking about potentially thousands of pyc
> files (e.g. for tensorflow or pytorch).
> 
>> think of the quality of such a package and how many debian rules you
>> cite. We have these kinds of packages, coming from funny vendors and
>> maybe also from weird recipes.
> 
> I know. Anyway, the Python code will very likely break if only the
> .pyc files are on the system, as these files depend on many conditions
> that differ between the buildchroot and the target. If any of these
> conditions is not met, the cache is regenerated from the .py file. Let
> us please not try to create overly complex solutions for use cases
> that are broken / invalid in the first place.
> 
> In short: I am strongly in favor of removing these files. I even thought
> about running this cleanup command unconditionally.
> 
> I would also appreciate if we do not delay the whole reproducibility
> story just because there might be some exotic and invalid use-cases
> that break.

Yes, the focus should be on clean Debian packages first. If broken
downstream packages trip over this too often, we can still take measures.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux



Thread overview: 28+ messages
2023-01-11  4:11 [PATCH 00/11] Make rootfs build reproducible Felix Moessbauer
2023-01-11  4:11 ` [PATCH 01/11] fix rebuild of rootfs_finalize task Felix Moessbauer
2023-01-11  4:11 ` [PATCH 02/11] image.bbclass: fix non-reproducible file time-stamps inside rootfs Felix Moessbauer
2023-01-11  4:11 ` [PATCH 03/11] rootfs postprocess: clean python cache Felix Moessbauer
2023-01-11  8:06   ` Henning Schild
2023-01-11  8:23     ` Moessbauer, Felix
2023-01-11 12:47       ` Henning Schild
2023-01-11 13:18         ` Moessbauer, Felix
2023-01-11 13:23           ` Jan Kiszka
2023-01-11  4:11 ` [PATCH 04/11] remove non-portable ldconfig aux-cache Felix Moessbauer
2023-01-11  8:19   ` Henning Schild
2023-01-11  8:31     ` Moessbauer, Felix
2023-01-11 12:52       ` Henning Schild
2023-01-11  4:11 ` [PATCH 05/11] generate deterministic clear-text password hash Felix Moessbauer
2023-01-11  8:21   ` Henning Schild
2023-01-11  4:11 ` [PATCH 06/11] update debian initramfs in deterministic mode Felix Moessbauer
2023-01-11  8:23   ` Henning Schild
2023-01-11  8:39     ` Moessbauer, Felix
2023-01-11 12:55       ` Henning Schild
2023-01-11  4:11 ` [PATCH 07/11] create custom " Felix Moessbauer
2023-01-11  4:11 ` [PATCH 08/11] make deb_add_changelog idempotent Felix Moessbauer
2023-01-11  4:11 ` [PATCH 09/11] deb_add_changelog: set timestamp to valid epoch Felix Moessbauer
2023-01-11  4:11 ` [PATCH 10/11] deb_add_changelog: use SOURCE_DATE_EPOCH Felix Moessbauer
2023-01-11  8:49   ` Henning Schild
2023-01-11  9:06     ` Moessbauer, Felix
2023-01-11  4:11 ` [PATCH 11/11] make custom linux-image bit-by-bit reproducible Felix Moessbauer
2023-01-11  6:51 ` [PATCH 00/11] Make rootfs build reproducible Jan Kiszka
2023-01-11  9:04 ` Venkata.Pyla
