* [PATCH v2 0/7] Sstate maintenance script
@ 2022-05-09 10:15 Adriaan Schmidt
2022-05-09 10:15 ` [PATCH v2 1/7] meta-isar: improve cachability Adriaan Schmidt
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:15 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
We have been running CI with shared sstate caches for some months now, in
several downstream projects. This is the cache maintenance script that has
evolved during that time. Detailed documentation is in the script itself.
Main features:
- upload cache artifacts to shared caches on filesystem, http, or s3
- clean old artifacts from shared caches
- analyze in detail why cache misses happen (what has changed in the signatures)
- check the sstate signatures for absolute paths pointing to the build host
The last two are especially interesting, and have already yielded some
improvements to the cacheability of Isar, some already merged, and some
more in "[PATCH 0/7] Further improve cachability of ISAR".
p1 handles another absolute path in a variable (LAYERDIR_isar).
p2..3 are minor patches to bitbake (both already upstream) that greatly
improve accuracy and performance of the sstate analysis.
p4 refactors handling of the apt_* tasks. This was motivated by the sstate
analysis, but I think it also makes the code cleaner.
p5..6 add the sstate maintenance script (2 authors, hence 2 patches).
p7 adds a signature check to the sstate test case. This requires the
changes from "[PATCH 0/7] Further improve cachability of ISAR" for the
test to pass.
One issue: testing!
This is not easy, because it involves infrastructure, and artificial tests
that provide decent coverage would be quite complex to design.
If we declare that we sufficiently trust the sstate code, we could add a
shared/persistent cache to the Isar CI infrastructure. This would further test
the sstate feature and all steps involved in maintaining such a setup.
In addition, it would significantly speed up CI builds.
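As a rough illustration, such a setup might boil down to the following CI steps
(the server URL, paths, and retention values here are hypothetical):
  # publish new artifacts after a successful build
  scripts/isar-sstate upload build/sstate-cache http://example.com/sstate/
  # expire old artifacts in a periodic job
  scripts/isar-sstate clean http://example.com/sstate/ --max-age 2w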
changes since v1:
- generally improved script
- analysis and cachability improvements in bitbake, dpkg-base, and meta-isar
- added "sstate linting" to the testsuite
Adriaan Schmidt (6):
meta-isar: improve cachability
bitbake-diffsigs: make finding of changed signatures more robust
bitbake-diffsigs: break on first dependent task difference
dpkg-base: refactor dependencies of apt_* tasks
scripts: add isar-sstate
testsuite: add cachability analysis to sstate test
Felix Moessbauer (1):
isar-sstate: add tool to check for caching issues
bitbake/lib/bb/siggen.py | 11 +-
meta-isar/conf/layer.conf | 1 +
meta/classes/dpkg-base.bbclass | 14 +-
meta/classes/dpkg.bbclass | 2 +-
scripts/isar-sstate | 863 +++++++++++++++++++++++++++++++++
testsuite/cibase.py | 8 +-
6 files changed, 884 insertions(+), 15 deletions(-)
create mode 100755 scripts/isar-sstate
--
2.30.2
* [PATCH v2 1/7] meta-isar: improve cachability
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
@ 2022-05-09 10:15 ` Adriaan Schmidt
2022-05-09 10:15 ` [PATCH v2 2/7] bitbake-diffsigs: make finding of changed signatures more robust Adriaan Schmidt
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:15 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
LAYERDIR_isar contains the absolute path of `meta-isar`, which breaks
cachability. To resolve this, we override the value that goes into
the signatures.
This also demonstrates the recommended way of dealing with this in
downstream layers that have `LAYERDIR_*`.
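For a hypothetical downstream layer `meta-mylayer`, the equivalent lines in its
layer.conf would be:
  LAYERDIR_mylayer = "${LAYERDIR}"
  LAYERDIR_mylayer[vardepvalue] = "meta-mylayer"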
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
meta-isar/conf/layer.conf | 1 +
1 file changed, 1 insertion(+)
diff --git a/meta-isar/conf/layer.conf b/meta-isar/conf/layer.conf
index 9939bdc1..dec2658f 100644
--- a/meta-isar/conf/layer.conf
+++ b/meta-isar/conf/layer.conf
@@ -17,3 +17,4 @@ LAYERVERSION_isar = "3"
LAYERSERIES_COMPAT_isar = "v0.6"
LAYERDIR_isar = "${LAYERDIR}"
+LAYERDIR_isar[vardepvalue] = "meta-isar"
--
2.30.2
* [PATCH v2 2/7] bitbake-diffsigs: make finding of changed signatures more robust
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
2022-05-09 10:15 ` [PATCH v2 1/7] meta-isar: improve cachability Adriaan Schmidt
@ 2022-05-09 10:15 ` Adriaan Schmidt
2022-05-09 10:16 ` [PATCH v2 3/7] bitbake-diffsigs: break on first dependent task difference Adriaan Schmidt
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:15 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
In `runtaskhashes`, the keys contain the absolute paths to the recipe. When
working with shared sstate caches (where these absolute paths can be different)
we see that compare_sigfiles does not identify a changed hash of a dependent
task as "changed", but instead as "removed" and "added", preventing the function
from recursing and continuing the comparison.
By calling `clean_basepaths` before comparing the `runtaskhashes` dicts, we
avoid this: the host-specific path prefix is stripped from the keys on both
sides, so the same dependent task is compared under the same key and correctly
reported as "changed".
Backported from upstream:
https://git.openembedded.org/bitbake/commit/?id=7358378b90b68111779e6ae72948e5e7a3de00a9
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
bitbake/lib/bb/siggen.py | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
index 0d88c6ec..8b23fd04 100644
--- a/bitbake/lib/bb/siggen.py
+++ b/bitbake/lib/bb/siggen.py
@@ -944,8 +944,8 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
if 'runtaskhashes' in a_data and 'runtaskhashes' in b_data:
- a = a_data['runtaskhashes']
- b = b_data['runtaskhashes']
+ a = clean_basepaths(a_data['runtaskhashes'])
+ b = clean_basepaths(b_data['runtaskhashes'])
changed, added, removed = dict_diff(a, b)
if added:
for dep in added:
@@ -956,7 +956,7 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
#output.append("Dependency on task %s was replaced by %s with same hash" % (dep, bdep))
bdep_found = True
if not bdep_found:
- output.append(color_format("{color_title}Dependency on task %s was added{color_default} with hash %s") % (clean_basepath(dep), b[dep]))
+ output.append(color_format("{color_title}Dependency on task %s was added{color_default} with hash %s") % (dep, b[dep]))
if removed:
for dep in removed:
adep_found = False
@@ -966,11 +966,11 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
#output.append("Dependency on task %s was replaced by %s with same hash" % (adep, dep))
adep_found = True
if not adep_found:
- output.append(color_format("{color_title}Dependency on task %s was removed{color_default} with hash %s") % (clean_basepath(dep), a[dep]))
+ output.append(color_format("{color_title}Dependency on task %s was removed{color_default} with hash %s") % (dep, a[dep]))
if changed:
for dep in changed:
if not collapsed:
- output.append(color_format("{color_title}Hash for dependent task %s changed{color_default} from %s to %s") % (clean_basepath(dep), a[dep], b[dep]))
+ output.append(color_format("{color_title}Hash for dependent task %s changed{color_default} from %s to %s") % (dep, a[dep], b[dep]))
if callable(recursecb):
recout = recursecb(dep, a[dep], b[dep])
if recout:
--
2.30.2
* [PATCH v2 3/7] bitbake-diffsigs: break on first dependent task difference
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
2022-05-09 10:15 ` [PATCH v2 1/7] meta-isar: improve cachability Adriaan Schmidt
2022-05-09 10:15 ` [PATCH v2 2/7] bitbake-diffsigs: make finding of changed signatures more robust Adriaan Schmidt
@ 2022-05-09 10:16 ` Adriaan Schmidt
2022-05-09 10:16 ` [PATCH v2 4/7] dpkg-base: refactor dependencies of apt_* tasks Adriaan Schmidt
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:16 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
Currently compare_sigfiles() recursively calculates differences on all
dependent tasks with changed hashes. This is done in arbitrary order, and
only the last of those results is returned while everything else is discarded.
This change instead returns the first difference found and stops there, which
significantly speeds up diffs of tasks with many dependencies.
backported from upstream:
https://git.openembedded.org/bitbake/commit/?id=8ed7722865d2dcfda1697aaf4e12b828934bf527
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
bitbake/lib/bb/siggen.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
index 8b23fd04..767aeb0a 100644
--- a/bitbake/lib/bb/siggen.py
+++ b/bitbake/lib/bb/siggen.py
@@ -980,6 +980,7 @@ def compare_sigfiles(a, b, recursecb=None, color=False, collapsed=False):
# If a dependent hash changed, might as well print the line above and then defer to the changes in
# that hash since in all likelyhood, they're the same changes this task also saw.
output = [output[-1]] + recout
+ break
a_taint = a_data.get('taint', None)
b_taint = b_data.get('taint', None)
--
2.30.2
* [PATCH v2 4/7] dpkg-base: refactor dependencies of apt_* tasks
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
` (2 preceding siblings ...)
2022-05-09 10:16 ` [PATCH v2 3/7] bitbake-diffsigs: break on first dependent task difference Adriaan Schmidt
@ 2022-05-09 10:16 ` Adriaan Schmidt
2022-05-09 10:16 ` [PATCH v2 5/7] scripts: add isar-sstate Adriaan Schmidt
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:16 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
Only recipes with `apt://` sources in their SRC_URI run the three tasks
`apt_fetch`, `apt_unpack`, and `cleanall_apt`. The current implementation
creates the tasks (and dependencies) for all recipes and sets them to
`noexec` if they are not needed.
It turns out that bitbake doesn't generate sstate signatures for `noexec`
tasks, but carries them as dependencies to other tasks, which can break
analysis with bitbake-diffsigs. Also, I suspect that `noexec` was never
intended to mark tasks as "optional" in this way (OE never does this).
The new implementation only adds the three tasks when they are required.
It also:
- removes the dependency of `apt_fetch after do_unpack`
- makes `install_builddeps` depend explicitly on `${BUILDCHROOT_DEP}`,
a dependency which was previously only given via the `apt_fetch` task.
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
meta/classes/dpkg-base.bbclass | 14 ++++++--------
meta/classes/dpkg.bbclass | 2 +-
2 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/meta/classes/dpkg-base.bbclass b/meta/classes/dpkg-base.bbclass
index 86933c57..fe6d46dd 100644
--- a/meta/classes/dpkg-base.bbclass
+++ b/meta/classes/dpkg-base.bbclass
@@ -95,10 +95,9 @@ python() {
d.setVar('SRC_URI', ' '.join(new_src_uri))
d.prependVar('SRC_APT', ' '.join(src_apt))
- if d.getVar('SRC_APT').strip() == '':
- d.setVarFlag('do_apt_fetch', 'noexec', '1')
- d.setVarFlag('do_apt_unpack', 'noexec', '1')
- d.setVarFlag('do_cleanall_apt', 'noexec', '1')
+ if len(d.getVar('SRC_APT').strip()) > 0:
+ bb.build.addtask('apt_unpack', 'do_patch', '', d)
+ bb.build.addtask('cleanall_apt', 'do_cleanall', '', d)
}
do_apt_fetch() {
@@ -117,11 +116,11 @@ do_apt_fetch() {
dpkg_undo_mounts
}
-addtask apt_fetch after do_unpack before do_apt_unpack
+addtask apt_fetch
do_apt_fetch[lockfiles] += "${REPO_ISAR_DIR}/isar.lock"
# Add dependency from the correct buildchroot: host or target
-do_apt_fetch[depends] = "${BUILDCHROOT_DEP}"
+do_apt_fetch[depends] += "${BUILDCHROOT_DEP}"
do_apt_unpack() {
rm -rf ${S}
@@ -142,9 +141,8 @@ do_apt_unpack() {
dpkg_undo_mounts
}
-addtask apt_unpack after do_apt_fetch before do_patch
+addtask apt_unpack after do_apt_fetch
-addtask cleanall_apt before do_cleanall
do_cleanall_apt[nostamp] = "1"
do_cleanall_apt() {
for uri in "${SRC_APT}"; do
diff --git a/meta/classes/dpkg.bbclass b/meta/classes/dpkg.bbclass
index 320102ba..af833536 100644
--- a/meta/classes/dpkg.bbclass
+++ b/meta/classes/dpkg.bbclass
@@ -26,7 +26,7 @@ do_install_builddeps() {
}
addtask install_builddeps after do_prepare_build before do_dpkg_build
-do_install_builddeps[depends] += "isar-apt:do_cache_config"
+do_install_builddeps[depends] += "${BUILDCHROOT_DEP} isar-apt:do_cache_config"
# apt and reprepro may not run in parallel, acquire the Isar lock
do_install_builddeps[lockfiles] += "${REPO_ISAR_DIR}/isar.lock"
--
2.30.2
* [PATCH v2 5/7] scripts: add isar-sstate
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
` (3 preceding siblings ...)
2022-05-09 10:16 ` [PATCH v2 4/7] dpkg-base: refactor dependencies of apt_* tasks Adriaan Schmidt
@ 2022-05-09 10:16 ` Adriaan Schmidt
2022-05-09 10:16 ` [PATCH v2 6/7] isar-sstate: add tool to check for caching issues Adriaan Schmidt
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:16 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
This adds a maintenance helper script to work with remote/shared
sstate caches.
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
scripts/isar-sstate | 794 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 794 insertions(+)
create mode 100755 scripts/isar-sstate
diff --git a/scripts/isar-sstate b/scripts/isar-sstate
new file mode 100755
index 00000000..8b541cf4
--- /dev/null
+++ b/scripts/isar-sstate
@@ -0,0 +1,794 @@
+#!/usr/bin/env python3
+"""
+This software is part of Isar
+Copyright (c) Siemens AG, 2022
+
+# isar-sstate: Helper for management of shared sstate caches
+
+Isar uses the sstate cache feature of bitbake to cache the output of certain
+build tasks, potentially speeding up builds significantly. This script is
+meant to help manage shared sstate caches, speeding up builds using cache
+artifacts created elsewhere. There are two main ways of accessing a shared
+sstate cache:
+ - Point `SSTATE_DIR` to a persistent location that is used by multiple
+ builds. bitbake will read artifacts from there, and also immediately
+ store generated cache artifacts in this location. This speeds up local
+ builds, and if `SSTATE_DIR` is located on a shared filesystem, it can
+ also benefit others.
+ - Point `SSTATE_DIR` to a local directory (e.g., simply use the default
+ value `${TOPDIR}/sstate-cache`), and additionally set `SSTATE_MIRRORS`
+ to a remote sstate cache. bitbake will use artifacts from both locations,
+ but will write newly created artifacts only to the local folder
+ `SSTATE_DIR`. To share them, you need to explicitly upload them to
+ the shared location, which is what isar-sstate is for.
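+
+For the second (mirror) setup, the build configuration could, for example,
+contain (the server URL is hypothetical):
+```
+SSTATE_MIRRORS ?= "file://.* http://example.com/sstate/PATH;downloadfilename=PATH"
+```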
+
+isar-sstate implements four commands (upload, clean, info, analyze),
+and supports three remote backends (filesystem, http/webdav, AWS S3).
+
+## Commands
+
+### upload
+
+The `upload` command pushes the contents of a local sstate cache to the
+remote location, uploading all files that don't already exist on the remote.
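+
+For example (the local path and remote URL are hypothetical):
+```
+isar-sstate upload /build/sstate-cache http://example.com/sstate/
+```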
+
+### clean
+
+The `clean` command deletes old artifacts from the remote cache. It takes two
+arguments, `--max-age` and `--max-sig-age`, each of which must be a number,
+followed by one of `w`, `d`, `h`, `m`, or `s` (for weeks, days, hours, minutes,
+seconds, respectively).
+
+`--max-age` specifies up to which age artifacts should be kept in the cache.
+Anything older will be removed. Note that this only applies to the `.tgz` files
+containing the actual cached items, not the `.siginfo` files containing the
+cache metadata (signatures and hashes).
+To permit analysis of caching details using the `analyze` command, the siginfo
+files can be kept longer, as indicated by `--max-sig-age`. If not set explicitly,
+this defaults to `max_age`, and any explicitly given value can't be smaller
+than `max_age`.
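+
+For example, to keep archives for two weeks and signatures for eight weeks
+(the remote URL is hypothetical):
+```
+isar-sstate clean http://example.com/sstate/ --max-age 2w --max-sig-age 8w
+```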
+
+### info
+
+The `info` command scans the remote cache and displays some basic statistics.
+The argument `--verbose` increases the amount of information displayed.
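+
+For example (the remote URL is hypothetical):
+```
+isar-sstate info http://example.com/sstate/ --verbose
+```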
+
+### analyze
+
+The `analyze` command iterates over all artifacts in the local sstate cache,
+and compares them to the contents of the remote cache. If an item is not
+present in the remote cache, the signature of the local item is compared
+to all potential matches in the remote cache, identified by matching
+architecture, recipe (`PN`), and task. This analysis has the same output
+format as `bitbake-diffsigs`.
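+
+For example (the local path and remote URL are hypothetical):
+```
+isar-sstate analyze /build/sstate-cache http://example.com/sstate/
+```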
+
+## Backends
+
+### Filesystem backend
+
+This uses a filesystem location as the remote cache. In case you can access
+your remote cache this way, you could also have bitbake write to the cache
+directly, by setting `SSTATE_DIR`. However, using `isar-sstate` gives
+you a uniform interface, and lets you use the same code/CI scripts across
+heterogeneous setups. Also, it gives you the `analyze` command.
+
+### http backend
+
+An http server with a webdav extension can be used as a remote cache.
+Apache can easily be configured to function as a remote sstate cache, e.g.:
+```
+<VirtualHost *:80>
+ Alias /sstate/ /path/to/sstate/location/
+ <Location /sstate/>
+ Dav on
+ Options Indexes
+ Require all granted
+ </Location>
+</VirtualHost>
+```
+In addition you need to load Apache's dav module:
+```
+a2enmod dav
+```
+
+To use the http backend, you need to install the Python webdavclient library.
+On Debian you would:
+```
+apt-get install python3-webdavclient
+```
+
+### S3 backend
+
+An AWS S3 bucket can be used as a remote cache. You need to ensure that AWS
+credentials are present (e.g., in your AWS config file or as environment
+variables).
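+
+For example, credentials can be provided via environment variables (the values
+here are placeholders):
+```
+export AWS_ACCESS_KEY_ID=AKIA...
+export AWS_SECRET_ACCESS_KEY=...
+export AWS_DEFAULT_REGION=eu-central-1
+```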
+
+To use the S3 backend you need to install the Python botocore library.
+On Debian you would:
+```
+apt-get install python3-botocore
+```
+"""
+
+import argparse
+from collections import namedtuple
+import datetime
+import os
+import re
+import shutil
+import sys
+from tempfile import NamedTemporaryFile
+import time
+
+sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), '..', 'bitbake', 'lib'))
+from bb.siggen import compare_sigfiles
+
+# runtime detection of supported targets
+webdav_supported = True
+try:
+ import webdav3.client
+ import webdav3.exceptions
+except ModuleNotFoundError:
+ webdav_supported = False
+
+s3_supported = True
+try:
+ import botocore.exceptions
+ import botocore.session
+except ModuleNotFoundError:
+ s3_supported = False
+
+SstateCacheEntry = namedtuple(
+ 'SstateCacheEntry', 'hash path arch pn task suffix islink age size'.split())
+
+# The filename of sstate items is defined in Isar:
+# SSTATE_PKGSPEC = "sstate:${PN}:${PACKAGE_ARCH}${TARGET_VENDOR}-${TARGET_OS}:"
+# "${PV}:${PR}:${SSTATE_PKGARCH}:${SSTATE_VERSION}:"
+
+# This regex extracts relevant fields:
+SstateRegex = re.compile(r'sstate:(?P<pn>[^:]*):[^:]*:[^:]*:[^:]*:'
+ r'(?P<arch>[^:]*):[^:]*:(?P<hash>[0-9a-f]*)_'
+ r'(?P<task>[^\.]*)\.(?P<suffix>.*)')
+
+
+class SstateTargetBase(object):
+ def __init__(self, path, cached=False):
+ """Constructor
+
+ :param path: URI of the remote (without leading 'protocol://')
+ """
+ self.use_cache = False
+ if cached:
+ self.enable_cache()
+
+ def __del__(self):
+ if self.use_cache:
+ self.cleanup_cache()
+
+ def __repr__(self):
+ """Format remote for printing
+
+ :returns: URI string, including 'protocol://'
+ """
+ pass
+
+ def exists(self, path=''):
+ """Check if a remote path exists
+
+ :param path: path (file or directory) to check
+ :returns: True if path exists, False otherwise
+ """
+ pass
+
+ def create(self):
+ """Try to create the remote
+
+ :returns: True if remote could be created, False otherwise
+ """
+ pass
+
+ def mkdir(self, path):
+ """Create a directory on the remote
+
+ :param path: path to create
+ :returns: True on success, False on failure
+ """
+ pass
+
+ def upload(self, path, filename):
+ """Uploads a local file to the remote
+
+ :param path: remote path to upload to
+ :param filename: local file to upload
+ """
+ pass
+
+ def delete(self, path):
+ """Delete remote file and remove potential empty directories
+
+ :param path: remote file to delete
+ """
+ pass
+
+ def list_all(self):
+ """List all sstate files in the remote
+
+ :returns: list of SstateCacheEntry objects
+ """
+ pass
+
+ def download(self, path):
+ """Prepare to temporarily access a remote file for reading
+
+ This is meant to provide access to siginfo files during analysis. Files
+ must not be modified, and should be released using release() once they
+ are no longer used.
+
+ :param path: remote path
+ :returns: local path to file
+ """
+ pass
+
+ def release(self, download_path):
+ """Release a temporary file
+
+ :param download_path: local file
+ """
+ pass
+
+ def enable_cache(self):
+ """Enable caching of downloads
+
+ This is a separate function, so you can decide after creation
+ if you want to enable caching.
+ """
+ self.use_cache = True
+ self.cache = {}
+ self.real_download = self.download
+ self.real_release = self.release
+ self.download = self.download_cached
+ self.release = self.release_cached
+
+ def download_cached(self, path):
+ """Download using cache
+
+ This function replaces download() when using the cache.
+ DO NOT OVERRIDE.
+ """
+ if path in self.cache:
+ return self.cache[path]
+ data = self.real_download(path)
+ self.cache[path] = data
+ return data
+
+ def release_cached(self, download_path):
+ """Release when using cache
+
+ This function replaces release() when using the cache.
+ DO NOT OVERRIDE.
+ """
+ pass
+
+ def cleanup_cache(self):
+ """Clean up all cached downloads.
+
+ Called by destructor.
+ """
+ for k, v in list(self.cache.items()):
+ self.real_release(v)
+ del(self.cache[k])
+
+
+class SstateFileTarget(SstateTargetBase):
+ def __init__(self, path, **kwargs):
+ super().__init__(path, **kwargs)
+ if path.startswith('file://'):
+ path = path[len('file://'):]
+ self.path = path
+ self.basepath = os.path.abspath(path)
+
+ def __repr__(self):
+ return f"file://{self.path}"
+
+ def exists(self, path=''):
+ return os.path.exists(os.path.join(self.basepath, path))
+
+ def create(self):
+ return self.mkdir('')
+
+ def mkdir(self, path):
+ try:
+ os.makedirs(os.path.join(self.basepath, path), exist_ok=True)
+ except OSError:
+ return False
+ return True
+
+ def upload(self, path, filename):
+ shutil.copy(filename, os.path.join(self.basepath, path))
+
+ def delete(self, path):
+ try:
+ os.remove(os.path.join(self.basepath, path))
+ except FileNotFoundError:
+ pass
+ dirs = path.split('/')[:-1]
+ for d in [dirs[:i] for i in range(len(dirs), 0, -1)]:
+ try:
+ os.rmdir(os.path.join(self.basepath, '/'.join(d)))
+ except FileNotFoundError:
+ pass
+ except OSError: # directory is not empty
+ break
+
+ def list_all(self):
+ all_files = []
+ now = time.time()
+ for subdir, dirs, files in os.walk(self.basepath):
+ reldir = subdir[(len(self.basepath)+1):]
+ for f in files:
+ m = SstateRegex.match(f)
+ if m is not None:
+ islink = os.path.islink(os.path.join(subdir, f))
+ age = int(now - os.path.getmtime(os.path.join(subdir, f)))
+ all_files.append(SstateCacheEntry(
+ path=os.path.join(reldir, f),
+ size=os.path.getsize(os.path.join(subdir, f)),
+ islink=islink,
+ age=age,
+ **(m.groupdict())))
+ return all_files
+
+ def download(self, path):
+ # we don't actually download, but instead just pass the local path
+ if not self.exists(path):
+ return None
+ return os.path.join(self.basepath, path)
+
+ def release(self, download_path):
+ # as we didn't download, there is nothing to clean up
+ pass
+
+
+class SstateDavTarget(SstateTargetBase):
+ def __init__(self, url, **kwargs):
+ if not webdav_supported:
+ print("ERROR: No webdav support. Please install the webdav3 Python module.")
+ print("INFO: on Debian: 'apt-get install python3-webdavclient'")
+ sys.exit(1)
+ super().__init__(url, **kwargs)
+ m = re.match('^([^:]+://[^/]+)/(.*)', url)
+ if not m:
+ print(f"Cannot parse target path: {url}")
+ sys.exit(1)
+ self.host = m.group(1)
+ self.basepath = m.group(2)
+ if not self.basepath.endswith('/'):
+ self.basepath += '/'
+ self.dav = webdav3.client.Client({'webdav_hostname': self.host})
+ self.tmpfiles = []
+
+ def __repr__(self):
+ return f"{self.host}/{self.basepath}"
+
+ def exists(self, path=''):
+ return self.dav.check(self.basepath + path)
+
+ def create(self):
+ return self.mkdir('')
+
+ def mkdir(self, path):
+ dirs = (self.basepath + path).split('/')
+
+ for i in range(len(dirs)):
+ d = '/'.join(dirs[:(i+1)]) + '/'
+ if not self.dav.check(d):
+ if not self.dav.mkdir(d):
+ return False
+ return True
+
+ def upload(self, path, filename):
+ return self.dav.upload_sync(remote_path=self.basepath + path, local_path=filename)
+
+ def delete(self, path):
+ self.dav.clean(self.basepath + path)
+ dirs = path.split('/')[1:-1]
+ for d in [dirs[:i] for i in range(len(dirs), 0, -1)]:
+ items = self.dav.list(self.basepath + '/'.join(d), get_info=True)
+ if len(items) > 0:
+ # collection is not empty
+ break
+ self.dav.clean(self.basepath + '/'.join(d))
+
+ def list_all(self):
+ now = time.time()
+
+ def recurse_dir(path):
+ files = []
+ for item in self.dav.list(path, get_info=True):
+ if item['isdir'] and not item['path'] == path:
+ files.extend(recurse_dir(item['path']))
+ elif not item['isdir']:
+ m = SstateRegex.match(item['path'][len(path):])
+ if m is not None:
+ modified = time.mktime(
+ datetime.datetime.strptime(
+ item['created'],
+ '%Y-%m-%dT%H:%M:%SZ').timetuple())
+ age = int(now - modified)
+ files.append(SstateCacheEntry(
+ path=item['path'][len(self.basepath):],
+ size=int(item['size']),
+ islink=False,
+ age=age,
+ **(m.groupdict())))
+ return files
+ return recurse_dir(self.basepath)
+
+ def download(self, path):
+ # download to a temporary file
+ tmp = NamedTemporaryFile(prefix='isar-sstate-', delete=False)
+ tmp.close()
+ try:
+ self.dav.download_sync(remote_path=self.basepath + path, local_path=tmp.name)
+ except webdav3.exceptions.RemoteResourceNotFound:
+ return None
+ self.tmpfiles.append(tmp.name)
+ return tmp.name
+
+ def release(self, download_path):
+ # remove the temporary download
+ if download_path is not None and download_path in self.tmpfiles:
+ os.remove(download_path)
+ self.tmpfiles = [f for f in self.tmpfiles if not f == download_path]
+
+
+class SstateS3Target(SstateTargetBase):
+ def __init__(self, path, **kwargs):
+ if not s3_supported:
+ print("ERROR: No S3 support. Please install the botocore Python module.")
+ print("INFO: on Debian: 'apt-get install python3-botocore'")
+ sys.exit(1)
+ super().__init__(path, **kwargs)
+ session = botocore.session.get_session()
+ self.s3 = session.create_client('s3')
+ if path.startswith('s3://'):
+ path = path[len('s3://'):]
+ m = re.match('^([^/]+)(?:/(.+)?)?$', path)
+ self.bucket = m.group(1)
+ if m.group(2):
+ self.basepath = m.group(2)
+ if not self.basepath.endswith('/'):
+ self.basepath += '/'
+ else:
+ self.basepath = ''
+ self.tmpfiles = []
+
+ def __repr__(self):
+ return f"s3://{self.bucket}/{self.basepath}"
+
+ def exists(self, path=''):
+ if path == '':
+ # check if the bucket exists
+ try:
+ self.s3.head_bucket(Bucket=self.bucket)
+ except botocore.exceptions.ClientError as e:
+ print(e)
+ print(e.response['Error']['Message'])
+ return False
+ return True
+ try:
+ self.s3.head_object(Bucket=self.bucket, Key=self.basepath + path)
+ except botocore.exceptions.ClientError as e:
+ if e.response['ResponseMetadata']['HTTPStatusCode'] != 404:
+ print(e)
+ print(e.response['Error']['Message'])
+ return False
+ return True
+
+ def create(self):
+ return self.exists()
+
+ def mkdir(self, path):
+ # in S3, folders are implicit and don't need to be created
+ return True
+
+ def upload(self, path, filename):
+ try:
+ self.s3.put_object(Body=open(filename, 'rb'), Bucket=self.bucket, Key=self.basepath + path)
+ except botocore.exceptions.ClientError as e:
+ print(e)
+ print(e.response['Error']['Message'])
+
+ def delete(self, path):
+ try:
+ self.s3.delete_object(Bucket=self.bucket, Key=self.basepath + path)
+ except botocore.exceptions.ClientError as e:
+ print(e)
+ print(e.response['Error']['Message'])
+
+ def list_all(self):
+ now = time.time()
+
+ def recurse_dir(path):
+ files = []
+ try:
+ result = self.s3.list_objects(Bucket=self.bucket, Prefix=path, Delimiter='/')
+ except botocore.exceptions.ClientError as e:
+ print(e)
+ print(e.response['Error']['Message'])
+ return []
+ for f in result.get('Contents', []):
+ m = SstateRegex.match(f['Key'][len(path):])
+ if m is not None:
+ modified = time.mktime(f['LastModified'].timetuple())
+ age = int(now - modified)
+ files.append(SstateCacheEntry(
+ path=f['Key'][len(self.basepath):],
+ size=f['Size'],
+ islink=False,
+ age=age,
+ **(m.groupdict())))
+ for p in result.get('CommonPrefixes', []):
+ files.extend(recurse_dir(p['Prefix']))
+ return files
+ return recurse_dir(self.basepath)
+
+ def download(self, path):
+ # download to a temporary file
+ tmp = NamedTemporaryFile(prefix='isar-sstate-', delete=False)
+ try:
+ result = self.s3.get_object(Bucket=self.bucket, Key=self.basepath + path)
+ except botocore.exceptions.ClientError:
+ return None
+ tmp.write(result['Body'].read())
+ tmp.close()
+ self.tmpfiles.append(tmp.name)
+ return tmp.name
+
+ def release(self, download_path):
+ # remove the temporary download
+ if download_path is not None and download_path in self.tmpfiles:
+ os.remove(download_path)
+ self.tmpfiles = [f for f in self.tmpfiles if not f == download_path]
+
+
+def arguments():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ 'command', type=str, metavar='command',
+ choices='info upload clean analyze'.split(),
+ help="command to execute (info, upload, clean, analyze)")
+ parser.add_argument(
+ 'source', type=str, nargs='?',
+ help="local sstate dir (for uploads or analysis)")
+ parser.add_argument(
+ 'target', type=str,
+ help="remote sstate location (a file://, http://, or s3:// URI)")
+ parser.add_argument(
+ '-v', '--verbose', default=False, action='store_true')
+ parser.add_argument(
+ '--max-age', type=str, default='1d',
+ help="clean: remove tgz files older than MAX_AGE (a number followed by w|d|h|m|s)")
+ parser.add_argument(
+ '--max-sig-age', type=str, default=None,
+ help="clean: remove siginfo files older than MAX_SIG_AGE (defaults to MAX_AGE)")
+
+ args = parser.parse_args()
+ if args.command in 'upload analyze'.split() and args.source is None:
+ print(f"ERROR: '{args.command}' needs a source and target")
+ sys.exit(1)
+ elif args.command in 'info clean'.split() and args.source is not None:
+ print(f"ERROR: '{args.command}' must not have a source (only a target)")
+ sys.exit(1)
+ return args
+
+
+def sstate_upload(source, target, verbose, **kwargs):
+ if not os.path.isdir(source):
+ print(f"WARNING: source {source} does not exist. Not uploading.")
+ return 0
+
+ if not target.exists() and not target.create():
+ print(f"ERROR: target {target} does not exist and could not be created.")
+ return -1
+
+ print(f"INFO: uploading {source} to {target}")
+ os.chdir(source)
+ upload, exists = [], []
+ for subdir, dirs, files in os.walk('.'):
+ target_dirs = subdir.split('/')[1:]
+ for f in files:
+ file_path = (('/'.join(target_dirs) + '/') if len(target_dirs) > 0 else '') + f
+ if target.exists(file_path):
+ if verbose:
+ print(f"[EXISTS] {file_path}")
+ exists.append(file_path)
+ else:
+ upload.append((file_path, target_dirs))
+ upload_gb = (sum([os.path.getsize(f[0]) for f in upload]) / 1024.0 / 1024.0 / 1024.0)
+ print(f"INFO: uploading {len(upload)} files ({upload_gb:.02f} GB)")
+ print(f"INFO: {len(exists)} files already present on target")
+ for file_path, target_dirs in upload:
+ if verbose:
+ print(f"[UPLOAD] {file_path}")
+ target.mkdir('/'.join(target_dirs))
+ target.upload(file_path, file_path)
+ return 0
+
+
+def sstate_clean(target, max_age, max_sig_age, verbose, **kwargs):
+ def convert_to_seconds(x):
+ seconds_per_unit = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400, 'w': 604800}
+ m = re.match(r'^(\d+)(w|d|h|m|s)?', x)
+ if m is None:
+ print(f"ERROR: cannot parse age value '{x}', needs to be a number followed by w|d|h|m|s")
+ sys.exit(-1)
+ unit = m.group(2)
+ if unit is None:
+ print("WARNING: MAX_AGE without unit, assuming 'days'")
+ unit = 'd'
+ return int(m.group(1)) * seconds_per_unit[unit]
+
+ max_age_seconds = convert_to_seconds(max_age)
+ if max_sig_age is None:
+ max_sig_age = max_age
+ max_sig_age_seconds = max(max_age_seconds, convert_to_seconds(max_sig_age))
+
+ if not target.exists():
+ print(f"INFO: cannot access target {target}. Nothing to clean.")
+ return 0
+
+ print(f"INFO: scanning {target}")
+ all_files = target.list_all()
+ links = [f for f in all_files if f.islink]
+ if links:
+ print(f"NOTE: we have links: {links}")
+ tgz_files = [f for f in all_files if f.suffix == 'tgz']
+ siginfo_files = [f for f in all_files if f.suffix == 'tgz.siginfo']
+ del_tgz_files = [f for f in tgz_files if f.age >= max_age_seconds]
+ del_tgz_hashes = [f.hash for f in del_tgz_files]
+ del_siginfo_files = [f for f in siginfo_files if
+ f.age >= max_sig_age_seconds or f.hash in del_tgz_hashes]
+ print(f"INFO: found {len(tgz_files)} tgz files, {len(del_tgz_files)} of which are older than {max_age}")
+ print(f"INFO: found {len(siginfo_files)} siginfo files, {len(del_siginfo_files)} of which "
+ f"correspond to old tgz files or are older than {max_sig_age}")
+
+ for f in del_tgz_files + del_siginfo_files:
+ if verbose:
+ print(f"[DELETE] {f.path}")
+ target.delete(f.path)
+ freed_gb = sum([x.size for x in del_tgz_files + del_siginfo_files]) / 1024.0 / 1024.0 / 1024.0
+ print(f"INFO: freed {freed_gb:.02f} GB")
+ return 0
+
+
+def sstate_info(target, verbose, **kwargs):
+ if not target.exists():
+ print(f"INFO: cannot access target {target}. No info to show.")
+ return 0
+
+ print(f"INFO: scanning {target}")
+ all_files = target.list_all()
+ size_gb = sum([x.size for x in all_files]) / 1024.0 / 1024.0 / 1024.0
+ print(f"INFO: found {len(all_files)} files ({size_gb:0.2f} GB)")
+
+ if not verbose:
+ return 0
+
+ archs = list(set([f.arch for f in all_files]))
+ print(f"INFO: found the following archs: {archs}")
+
+ key_task = {'deb': 'dpkg_build',
+ 'rootfs': 'rootfs_install',
+ 'bootstrap': 'bootstrap'}
+ recipes = {k: [] for k in key_task.keys()}
+ others = []
+ for pn in set([f.pn for f in all_files]):
+ tasks = set([f.task for f in all_files if f.pn == pn])
+ ks = [k for k, v in key_task.items() if v in tasks]
+ if len(ks) == 1:
+ recipes[ks[0]].append(pn)
+ elif len(ks) == 0:
+ others.append(pn)
+ else:
+ print(f"WARNING: {pn} could be any of {ks}")
+ for k, entries in recipes.items():
+ print(f"Cache hits for {k}:")
+ for pn in entries:
+ hits = [f for f in all_files if f.pn == pn and f.task == key_task[k] and f.suffix == 'tgz']
+ print(f" - {pn}: {len(hits)} hits")
+ print("Other cache hits:")
+ for pn in others:
+ print(f" - {pn}")
+ return 0
+
+
+def sstate_analyze(source, target, **kwargs):
+ if not os.path.isdir(source):
+ print(f"ERROR: source {source} does not exist. Nothing to analyze.")
+ return -1
+ if not target.exists():
+ print(f"ERROR: target {target} does not exist. Nothing to analyze.")
+ return -1
+
+ source = SstateFileTarget(source)
+ target.enable_cache()
+ local_sigs = {s.hash: s for s in source.list_all() if s.suffix.endswith('.siginfo')}
+ remote_sigs = {s.hash: s for s in target.list_all() if s.suffix.endswith('.siginfo')}
+
+ key_tasks = 'dpkg_build rootfs_install bootstrap'.split()
+
+ check = [k for k, v in local_sigs.items() if v.task in key_tasks]
+ for local_hash in check:
+ s = local_sigs[local_hash]
+ print(f"\033[1;33m==== checking local item {s.arch}:{s.pn}:{s.task} ({s.hash[:8]}) ====\033[0m")
+ if local_hash in remote_sigs:
+ print(" -> found hit in remote cache")
+ continue
+ remote_matches = [k for k, v in remote_sigs.items() if s.arch == v.arch and s.pn == v.pn and s.task == v.task]
+ if len(remote_matches) == 0:
+ print(" -> found no hit, and no potential remote matches")
+ else:
+ print(f" -> found no hit, but {len(remote_matches)} potential remote matches")
+ for r in remote_matches:
+ t = remote_sigs[r]
+ print(f"\033[0;33m**** comparing to {r[:8]} ****\033[0m")
+
+ def recursecb(key, remote_hash, local_hash):
+ recout = []
+ if remote_hash in remote_sigs.keys():
+ remote_file = target.download(remote_sigs[remote_hash].path)
+ elif remote_hash in local_sigs.keys():
+ recout.append(f"found remote hash in local signatures ({key})!?! (please implement that case!)")
+ return recout
+ else:
+ recout.append(f"could not find remote signature {remote_hash[:8]} for job {key}")
+ return recout
+ if local_hash in local_sigs.keys():
+ local_file = source.download(local_sigs[local_hash].path)
+ elif local_hash in remote_sigs.keys():
+ local_file = target.download(remote_sigs[local_hash].path)
+ else:
+ recout.append(f"could not find local signature {local_hash[:8]} for job {key}")
+ return recout
+ if local_file is None or remote_file is None:
+ out = ["Aborting analysis because siginfo files disappeared unexpectedly"]
+ else:
+ out = compare_sigfiles(remote_file, local_file, recursecb, color=True)
+ if local_hash in local_sigs.keys():
+ source.release(local_file)
+ else:
+ target.release(local_file)
+ target.release(remote_file)
+ for change in out:
+ recout.extend([' ' + line for line in change.splitlines()])
+ return recout
+
+ local_file = source.download(s.path)
+ remote_file = target.download(t.path)
+ out = compare_sigfiles(remote_file, local_file, recursecb, color=True)
+ source.release(local_file)
+ target.release(remote_file)
+ # shorten hashes from 64 to 8 characters for better readability
+ out = [re.sub(r'([0-9a-f]{8})[0-9a-f]{56}', r'\1', line) for line in out]
+ print('\n'.join(out))
+
+
+def main():
+ args = arguments()
+
+ if args.target.startswith('http://'):
+ target = SstateDavTarget(args.target)
+ elif args.target.startswith('s3://'):
+ target = SstateS3Target(args.target)
+ elif args.target.startswith('file://'):
+ target = SstateFileTarget(args.target)
+ else: # no protocol given, assume file://
+ target = SstateFileTarget(args.target)
+
+ args.target = target
+ return globals()[f'sstate_{args.command}'](**vars(args))
+
+
+if __name__ == '__main__':
+ sys.exit(main())
--
2.30.2
* [PATCH v2 6/7] isar-sstate: add tool to check for caching issues
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
` (4 preceding siblings ...)
2022-05-09 10:16 ` [PATCH v2 5/7] scripts: add isar-sstate Adriaan Schmidt
@ 2022-05-09 10:16 ` Adriaan Schmidt
2022-05-09 10:16 ` [PATCH v2 7/7] testsuite: add cachability analysis to sstate test Adriaan Schmidt
2022-05-18 11:02 ` [PATCH v2 0/7] Sstate maintenance script Anton Mikanovich
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:16 UTC (permalink / raw)
To: isar-users; +Cc: Felix Moessbauer
From: Felix Moessbauer <felix.moessbauer@siemens.com>
This patch adds the 'lint' command to the isar-sstate
script that helps in finding cachability issues.
Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
---
scripts/isar-sstate | 73 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 71 insertions(+), 2 deletions(-)
diff --git a/scripts/isar-sstate b/scripts/isar-sstate
index 8b541cf4..8ea85edc 100755
--- a/scripts/isar-sstate
+++ b/scripts/isar-sstate
@@ -62,6 +62,11 @@ to all potential matches in the remote cache, identified by matching
architecture, recipe (`PN`), and task. This analysis has the same output
format as `bitbake-diffsigs`.
+### lint
+
+The `lint` command searches for common flaws that reduce the
+cachability of a layer.
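+
+For example (the cache path and directories are hypothetical; the option
+defaults are `/work/` and `/build/tmp/`):
+```
+isar-sstate lint /build/sstate-cache --sources-dir /work --build-dir /build/tmp
+```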
+
## Backends
### Filesystem backend
@@ -119,6 +124,7 @@ import shutil
import sys
from tempfile import NamedTemporaryFile
import time
+import pickle
sys.path.insert(0, os.path.join(os.path.dirname(os.path.realpath(__file__)), '..', 'bitbake', 'lib'))
from bb.siggen import compare_sigfiles
@@ -556,8 +562,8 @@ def arguments():
parser = argparse.ArgumentParser()
parser.add_argument(
'command', type=str, metavar='command',
- choices='info upload clean analyze'.split(),
- help="command to execute (info, upload, clean, analyze)")
+ choices='info upload clean analyze lint'.split(),
+ help="command to execute (info, upload, clean, analyze, lint)")
parser.add_argument(
'source', type=str, nargs='?',
help="local sstate dir (for uploads or analysis)")
@@ -572,6 +578,15 @@ def arguments():
parser.add_argument(
'--max-sig-age', type=str, default=None,
help="clean: remove siginfo files older than MAX_SIG_AGE (defaults to MAX_AGE)")
+ parser.add_argument(
+ '--sources-dir', type=str, default='/work/',
+ help="lint: absolute path to sources folder (e.g. layerbase)")
+ parser.add_argument(
+ '--build-dir', type=str, default='/build/tmp/',
+ help="lint: absolute path to build folder")
+ parser.add_argument(
+ '--exit-code', type=int, default=None,
+ help="lint: return this instead of number of found issues")
args = parser.parse_args()
if args.command in 'upload analyze'.split() and args.source is None:
@@ -774,6 +789,60 @@ def sstate_analyze(source, target, **kwargs):
print('\n'.join(out))
+def sstate_lint(target, verbose, sources_dir, build_dir, exit_code, **kwargs):
+ ADDITIONAL_IGNORED_VARNAMES = 'PP'.split()
+ if not target.exists():
+ print(f"ERROR: target {target} does not exist. Nothing to analyze.")
+ return -1
+
+ cache_sigs = {s.hash: s for s in target.list_all() if s.suffix.endswith('.siginfo')}
+
+ hits_srcdir = 0
+ hits_builddir = 0
+ hits_other = 0
+ for sig in cache_sigs.values():
+ sig_file = target.download(sig.path)
+ with open(sig_file, 'rb') as f:
+ sigdata_raw = pickle.Unpickler(f)
+ sigdata = sigdata_raw.load()
+
+ pn_issues = []
+ for name, val in sigdata['varvals'].items():
+ if not name[0].isupper():
+ continue
+ if sigdata['basewhitelist'] and name in sigdata['basewhitelist'] or \
+ sigdata['taskwhitelist'] and name in sigdata['taskwhitelist'] or \
+ name in ADDITIONAL_IGNORED_VARNAMES:
+ continue
+ if not val or not val[0] == '/':
+ continue
+ task = sigdata['task']
+ if val.startswith(build_dir):
+ pn_issues.append(f'\033[0;31m-> path in build-dir: {name} = "{val}"\033[0m')
+ hits_builddir += 1
+ elif val.startswith(sources_dir):
+ pn_issues.append(f'\033[0;31m-> path in sources-dir: {name} = "{val}"\033[0m')
+ hits_srcdir += 1
+ else:
+ hits_other += 1
+ if verbose:
+ pn_issues.append(f'\033[0;34m-> other absolute path: {name} = "{val}"\033[0m')
+ if len(pn_issues) > 0:
+ print(f"\033[1;33m==== issues found in {sig.arch}:{sig.pn}:{sig.task} ({sig.hash[:8]}) ====\033[0m")
+ print('\n'.join(pn_issues))
+ target.release(sig_file)
+
+ sum_hits = hits_srcdir + hits_builddir
+ if sum_hits == 0:
+ print(f'no cachability issues found (scanned {len(cache_sigs)} signatures)')
+ else:
+ print(f'warning: found cachability issues (scanned {len(cache_sigs)} signatures)')
+ print(f'-> absolute paths: sources-dir {hits_srcdir}, build-dir {hits_builddir}, other {hits_other}')
+ if exit_code is not None:
+ return exit_code
+ return hits_srcdir + hits_builddir
+
+
def main():
args = arguments()
--
2.30.2
* [PATCH v2 7/7] testsuite: add cachability analysis to sstate test
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
` (5 preceding siblings ...)
2022-05-09 10:16 ` [PATCH v2 6/7] isar-sstate: add tool to check for caching issues Adriaan Schmidt
@ 2022-05-09 10:16 ` Adriaan Schmidt
2022-05-18 11:02 ` [PATCH v2 0/7] Sstate maintenance script Anton Mikanovich
7 siblings, 0 replies; 9+ messages in thread
From: Adriaan Schmidt @ 2022-05-09 10:16 UTC (permalink / raw)
To: isar-users; +Cc: Adriaan Schmidt
Call `isar-sstate lint` on the populated sstate cache to ensure
that no undesired absolute paths make it into the sstate signatures.
Signed-off-by: Adriaan Schmidt <adriaan.schmidt@siemens.com>
---
testsuite/cibase.py | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/testsuite/cibase.py b/testsuite/cibase.py
index a1494ce6..2ffb8191 100755
--- a/testsuite/cibase.py
+++ b/testsuite/cibase.py
@@ -6,7 +6,7 @@ import re
import tempfile
import time
-from cibuilder import CIBuilder
+from cibuilder import CIBuilder, isar_root
from avocado.utils import process
class CIBaseTest(CIBuilder):
@@ -103,6 +103,12 @@ class CIBaseTest(CIBuilder):
# Populate cache
self.bitbake(image_target, **kwargs)
+ # Check signature files for cachability issues like absolute paths in signatures
+ result = process.run(f'{isar_root}/scripts/isar-sstate lint {self.build_dir}/sstate-cache '
+ f'--build-dir {self.build_dir} --sources-dir {isar_root}')
+ if result.exit_status > 0:
+ self.fail("Detected cachability issues")
+
# Save contents of image deploy dir
expected_files = set(glob.glob(f'{self.build_dir}/tmp/deploy/images/*/*'))
--
2.30.2
* Re: [PATCH v2 0/7] Sstate maintenance script
2022-05-09 10:15 [PATCH v2 0/7] Sstate maintenance script Adriaan Schmidt
` (6 preceding siblings ...)
2022-05-09 10:16 ` [PATCH v2 7/7] testsuite: add cachability analysis to sstate test Adriaan Schmidt
@ 2022-05-18 11:02 ` Anton Mikanovich
7 siblings, 0 replies; 9+ messages in thread
From: Anton Mikanovich @ 2022-05-18 11:02 UTC (permalink / raw)
To: Adriaan Schmidt, isar-users
09.05.2022 13:15, Adriaan Schmidt wrote:
> We have been running CI with shared sstate caches for some months now, in
> several downstream projects. This is the cache maintenance script that has
> evolved during that time. Detailed documentation is in the script itself.
> Main features:
> - upload cache artifacts to shared caches on filesystem, http, or s3
> - clean old artifacts from shared caches
> - analyze in detail why cache misses happen (what has changed in the signatures)
> - check the sstate signatures for absolute paths pointing to the build host
>
> The last two are especially interesting, and have already yielded some
> improvements to the cacheability of Isar, some already merged, and some
> more in "[PATCH 0/7] Further improve cachability of ISAR".
>
> p1 handles another absolute path in a variable (LAYERDIR_isar).
> p2..3 are minor patches to bitbake (both already upstream) that greatly
> improve accuracy and performance of the sstate analysis.
> p4 refactors handling of the apt_* tasks. This was motivated by the sstate
> analysis, but I think it also makes the code cleaner.
> p5..6 add the sstate maintenance script (2 authors, hence 2 patches).
> p7 includes a signature check into the sstate test case. This requires the
> changes from "[PATCH 0/7] Further improve cachability of ISAR" for the
> test to pass.
>
> One issue: testing!
> This is not easy, because it involves infrastructure, and artificial tests
> that provide decent coverage would be quite complex to design.
>
> If we declare that we sufficiently trust the sstate code, we could add a
> shared/persistent cache to the Isar CI infrastructure. This would further test
> the sstate feature and all steps involved in maintaining such a setup.
> In addition, it would significantly speed up CI builds.
>
> changes since v1:
> - generally improved script
> - analysis and cachability improvements in bitbake, dpkg-base, and meta-isar
> - added "sstate linting" to the testsuite
>
> Adriaan Schmidt (6):
> meta-isar: improve cachability
> bitbake-diffsigs: make finding of changed signatures more robust
> bitbake-diffsigs: break on first dependent task difference
> dpkg-base: refactor dependencies of apt_* tasks
> scripts: add isar-sstate
> testsuite: add cachability analysis to sstate test
>
> Felix Moessbauer (1):
> isar-sstate: add tool to check for caching issues
>
> bitbake/lib/bb/siggen.py | 11 +-
> meta-isar/conf/layer.conf | 1 +
> meta/classes/dpkg-base.bbclass | 14 +-
> meta/classes/dpkg.bbclass | 2 +-
> scripts/isar-sstate | 863 +++++++++++++++++++++++++++++++++
> testsuite/cibase.py | 8 +-
> 6 files changed, 884 insertions(+), 15 deletions(-)
> create mode 100755 scripts/isar-sstate
>
Applied to next, thanks.