Hi Christian,

On 20.11.2017 09:33, [ext] Christian Storm wrote:
>>> [...]
>>>> With my suggestion of using a caching proxy, this could be solved
>>>> without any additional overhead.
>>>
>>> Could be the case, what are the drawbacks?
>>
>> More complexity and stuff to implement. Also maybe download speed.
>>
>>> What proxy do you propose to use?
>>
>> I was at first going with my own standalone proxy implementation in
>> pure stdlib python, so that it could be completely integrated into
>> isar.
>
> Why not hook this into the fetcher(s) so that it's integrated rather
> than being a standalone thing?

The bitbake fetcher is not the only step that downloads stuff in isar.
There are also multistrap and possible 'apt-get install' calls within
a chroot environment.

I was going to integrate it into isar at some point, but first I
wanted a working proof of concept without bitbake in between, so that
it is easily testable, and then integrate it tightly into isar later.

> As a bonus, you'll have full control over this from the Isar
> core/code. I think the main invention here is the code that provides
> the consistent version/epoch guarantee anyway...

Hmm... My hope is that this will be solved by itself by splitting
'dists' and 'pool'.

>> I had a very simple solution ready rather quickly, but it was only
>> synchronous and as such could only handle one connection at a time.
>> Instead of just throwing more threads at it, I wanted to go the
>> asyncio route. Sadly, the python stdlib does not provide an HTTP
>> implementation for asyncio. I wasn't clear on how to proceed from
>> there (an aiohttp dependency or a minimal own HTTP implementation).
>
> Ah, OK. Wouldn't this amount to premature optimization? :)

Handling more than one connection in parallel should be possible IMO.
Going from one to two is harder than from two to n (n>2). So I was
lucky, in a sense, to discover at that early point in the
implementation that this is harder to do than expected.

>> The other idea is to just use a ready-made apt caching proxy like
>> apt-cacher-ng. But here I am unsure whether it's flexible enough for
>> our case. Starting it multiple times in parallel, with different
>> ports for different caches and only user privileges, might be
>> possible, but I suspect that separating the pool and the dists
>> folders (pool should go to DL_DIR while dists is part of TMP_DIR)
>> could be more difficult.
>
> On the bonus side, we wouldn't have to develop/maintain a custom
> solution, given that it suits our purposes of course...

Agreed. But if it only 'sort of' suits our purpose, we might need to
write wrapper code around its shortcomings and maintain that.

>>> Maybe I missed something on the proxy suggestion... Could you
>>> please elaborate on this?
>>
>> As for the integration, the basic idea was that for tagged bitbake
>> tasks the proxy is started and the *_PROXY environment variables are
>> set. This should be doable with some mods to base.bbclass and some
>> external python scripts.
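To make the environment variable part a bit more concrete, here is a
minimal, untested sketch in plain python. Nothing isar-specific yet;
the proxy itself (own implementation or apt-cacher-ng) is assumed to
be listening on the given address already:

    import contextlib
    import os

    # Temporarily route child processes (multistrap, apt-get in a
    # chroot, ...) through a local caching proxy by setting the
    # conventional *_PROXY variables, restoring them afterwards.
    @contextlib.contextmanager
    def proxy_env(proxy_url):
        names = ('http_proxy', 'https_proxy',
                 'HTTP_PROXY', 'HTTPS_PROXY')
        saved = {n: os.environ.get(n) for n in names}
        os.environ.update({n: proxy_url for n in names})
        try:
            yield
        finally:
            for n, old in saved.items():
                if old is None:
                    os.environ.pop(n, None)
                else:
                    os.environ[n] = old

    # e.g. with apt-cacher-ng on its default port 3142:
    # with proxy_env('http://127.0.0.1:3142'):
    #     subprocess.check_call(['multistrap', '-f', 'multistrap.conf'])

The same wrapper could then be called from a python snippet in
base.bbclass around the tagged tasks.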
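And coming back to the asyncio question above: the concurrency part
itself is only a few lines with the stdlib; the open question really
is just the missing HTTP layer. A bare-bones sketch (python 3.5+, no
HTTP handling at all):

    import asyncio

    # One handler coroutine per connection; asyncio schedules them
    # concurrently without any manual thread management.
    async def handle_client(reader, writer):
        request_line = await reader.readline()
        # ... parse request_line, serve from the cache or forward
        # the request upstream; this is the part that is missing ...
        writer.write(b'HTTP/1.1 501 Not Implemented\r\n\r\n')
        await writer.drain()
        writer.close()

    loop = asyncio.get_event_loop()
    server = loop.run_until_complete(
        asyncio.start_server(handle_client, '127.0.0.1', 8080))
    try:
        loop.run_forever()
    finally:
        server.close()
        loop.run_until_complete(server.wait_closed())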
>>>
>>>> I do have other ideas to do this, but that would restructure most
>>>> of isar.
>>>
>>> Well, at least speaking for myself, I'd like to hear those as I
>>> consider this feature to be essential. Choice in solutions is
>>> always good :)
>>
>> One idea I got when I first investigated isar was to be as
>> OE-compatible as possible. Using this idea would solve reproducible
>> builds as well:
>>
>> Basically: implement debootstrap with bitbake recipes that are
>> created virtually at runtime by downloading and parsing the
>> 'dists/*/*/*/Packages.gz' file.
>
> Those virtual recipes then will have to be serialized, as they
> contain the version number of the package, right?

I'm not sure I understand your point correctly. I don't think the
recipes need to be written down as files somewhere. We might have to
take a look at the parsing part of bitbake, where the recipe data
store is filled, i.e. where the deserialization from '*.bb' to entries
in the ds happens. There we would just take one or more Debian package
lists, plus some additional information like the repo URL, and fill
the ds with generated recipes.

>> I suppose it should be possible to fetch the Packages file at an
>> early parsing step in a bitbake build, if it's not already present,
>> and fill the bitbake data store with recipe definitions that fetch
>> those binary deb packages, have the appropriate dependencies and
>> install them into the root file system.
>
> Yes, or do a 'download-only' step prior to building, as it's
> available on Yocto.

Not sure if that is possible. Task execution starts only after all
recipes are parsed and dependencies are resolved. To add virtual
packages ourselves, we need to do that before any task is triggered.
So fetching the 'Packages.gz' file needs to happen very early, outside
of what recipes normally do. I suspect that this is possible by using
bitbake event handlers [1].

>> However, this idea is still in the brainstorming phase.
>>
>> Since that would involve a very big redesign, I don't think it's
>> currently feasible.
>
> Sounds interesting, at least to me...

Thanks,
Claudius

[1] https://www.yoctoproject.org/docs/latest/bitbake-user-manual/bitbake-user-manual.html#events

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-54 Fax: (+49)-8142-66989-80 Email: ch@denx.de
PGP key: 6FF2 E59F 00C6 BC28 31D8 64C1 1173 CB19 9808 B153
Keyserver: hkp://pool.sks-keyservers.net
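P.S.: To sketch the virtual recipe idea a little: hooked into an early
event handler (something like bb.event.ConfigParsed), the parsing side
could boil down to roughly the following. The field names are from the
Debian Packages format; everything around them is made up:

    import gzip

    # One dict per binary package, carrying roughly what a generated
    # recipe would need: exact version, dependencies, and the pool
    # path to fetch from. Assumes the usual stanza format where an
    # empty line separates packages and the file ends with one.
    def parse_packages_gz(path, repo_url):
        recipes = {}
        fields = {}
        with gzip.open(path, 'rt', errors='replace') as f:
            for line in f:
                line = line.rstrip('\n')
                if not line:
                    if 'Package' in fields:
                        recipes[fields['Package']] = {
                            'pv': fields.get('Version'),
                            'depends': fields.get('Depends', ''),
                            'src_uri': repo_url + '/'
                                       + fields.get('Filename', ''),
                        }
                    fields = {}
                elif not line[0].isspace():  # skip continuation lines
                    key, _, value = line.partition(':')
                    fields[key] = value.strip()
        return recipes

Each entry would then be turned into an in-memory recipe in the data
store, with PV pinned to the exact version from the package list.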