public inbox for isar-users@googlegroups.com
* Handling of additional python dependencies
@ 2017-09-25 12:44 Claudius Heine
  2017-09-27  7:06 ` Henning Schild
  0 siblings, 1 reply; 5+ messages in thread
From: Claudius Heine @ 2017-09-25 12:44 UTC (permalink / raw)
  To: isar-users

Hi,

I am currently creating a proof of concept implementation for the 
caching apt repo proxy for isar.

My goal was to build this with asyncio, but the Python standard library
lacks an async HTTP protocol implementation. I tried to reuse as much as
I could from the synchronous HTTP implementation in the standard
library, but that is not trivial. I now have to decide whether to use an
asyncio HTTP library from outside the standard library or to try another
route. Maybe just use the sync version and slap more threads on it.
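
For illustration, the "sync version plus threads" fallback could look
roughly like the sketch below. It only uses the standard library
(http.server combined with socketserver.ThreadingMixIn). The handler,
which simply echoes the request path, and the names ThreadedHTTPServer,
EchoPathHandler and fetch_once are hypothetical and are not taken from
the proof of concept.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Synchronous HTTP server that handles each request in its own thread."""

    daemon_threads = True  # do not block interpreter exit on open connections


class EchoPathHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with the requested path."""

    def do_GET(self):
        body = self.path.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def fetch_once(path):
    """Start the server on an OS-chosen port, fetch `path` once, shut down."""
    server = ThreadedHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        body = conn.getresponse().read().decode("utf-8")
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

A real proxy would replace the echo handler with cache lookup and
upstream fetching; the threading part stays the same.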

What is the policy on external Python dependencies in isar?
Is it possible to just copy those libraries into the scripts/lib/
directory or to specify them as a host dependency, or am I forced to use
only the Python standard library?

Thanks,
Claudius

-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-54 Fax: (+49)-8142-66989-80 Email: ch@denx.de


* Re: Handling of additional python dependencies
  2017-09-25 12:44 Handling of additional python dependencies Claudius Heine
@ 2017-09-27  7:06 ` Henning Schild
  2017-09-27  7:44   ` Claudius Heine
  0 siblings, 1 reply; 5+ messages in thread
From: Henning Schild @ 2017-09-27  7:06 UTC (permalink / raw)
  To: [ext] Claudius Heine; +Cc: isar-users

On Mon, 25 Sep 2017 14:44:13 +0200,
"[ext] Claudius Heine" <claudius.heine.ext@siemens.com> wrote:

> Hi,
> 
> I am currently creating a proof of concept implementation for the 
> caching apt repo proxy for isar.

Can't you just use an existing implementation, like apt-cacher or
apt-cacher-ng?

> My goal was to create this using asyncio, but the python std lacks a 
> async http protocol implementation. I tried using as much as I can
> from the sync version of the http protocol that is available the
> python std lib, but that is not that trivial to do. I am now at the
> point where I have to decide if I just used some http asyncio library
> outside of the std or try another route with this. Maybe just use the
> sync version and slap more threads on it.
> 
> How is the policy concerning external python dependencies and isar?
> Is it possible to just copy those libraries into the scripts/lib/ 
> directory, specify it as a host dependency or am I forced to only use 
> the python std?

I would go for a host dependency or a git submodule; more copies/forks
of stuff are not a good idea. Having bitbake and wic in there already
seems problematic.

Henning


* Re: Handling of additional python dependencies
  2017-09-27  7:06 ` Henning Schild
@ 2017-09-27  7:44   ` Claudius Heine
  2017-09-27  8:00     ` Henning Schild
  0 siblings, 1 reply; 5+ messages in thread
From: Claudius Heine @ 2017-09-27  7:44 UTC (permalink / raw)
  To: Henning Schild; +Cc: isar-users

Hi,

On 09/27/2017 09:06 AM, Henning Schild wrote:
> Am Mon, 25 Sep 2017 14:44:13 +0200
> schrieb "[ext] Claudius Heine" <claudius.heine.ext@siemens.com>:
> 
>> Hi,
>>
>> I am currently creating a proof of concept implementation for the
>> caching apt repo proxy for isar.
> 
> Cant you just use some existing implementation, like apt-cacher or
> apt-cacher-ng?

I would love to, but as far as I know those solutions would still
require some work.

I only have some experience with apt-cacher-ng. I think it is overkill
for our use, and it is missing some features that would be useful:

  * Ability to specify separate caching paths for the 'dists' and 'pool'
    directories. In our case the 'dists' directory of the repo would be
    part of the TMP_DIR while the 'pool' directory would be stored in the
    DL_DIR. Maybe this can be done with symlinks.

  * Just using the first available port and communicating it to the
    calling process. I could imagine that this can be done with some
    shell and netstat programming.
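
As a rough sketch of the port handling in a self-written proxy: binding
to port 0 asks the kernel for any free port (not literally the "first"
one), and the chosen number can then be written to stdout for the
calling process to read. The function names bind_free_port and
announce_port are made up for this example.

```python
import socket
import sys


def bind_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to a port chosen by the kernel.

    Port 0 means "pick any unused port"; getsockname() reveals which one
    was assigned. The caller owns the returned socket and must close it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(5)
    return sock, sock.getsockname()[1]


def announce_port(port, stream=sys.stdout):
    """Write the port number on a line of its own so the parent can parse it."""
    stream.write("%d\n" % port)
    stream.flush()


if __name__ == "__main__":
    listener, port = bind_free_port()
    announce_port(port)
    listener.close()
```

Because the socket is already bound when the port is announced, there is
no window in which another process could grab the same port.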

I decided to implement my own proxy because this is not that difficult
with the Python standard library, and I got a first implementation
working rather quickly. But I had not anticipated that the Python
standard library does not provide an HTTP implementation for asyncio.
Implementing that myself is a bit out of scope IMO.

> 
>> My goal was to create this using asyncio, but the python std lacks a
>> async http protocol implementation. I tried using as much as I can
>> from the sync version of the http protocol that is available the
>> python std lib, but that is not that trivial to do. I am now at the
>> point where I have to decide if I just used some http asyncio library
>> outside of the std or try another route with this. Maybe just use the
>> sync version and slap more threads on it.
>>
>> How is the policy concerning external python dependencies and isar?
>> Is it possible to just copy those libraries into the scripts/lib/
>> directory, specify it as a host dependency or am I forced to only use
>> the python std?
> 
> I would go for a host dependency or a git-submodule, more copies/forks
> of stuff are not a good idea. Having bitbake and wic in there already
> seems problematic.

Host dependencies in that area are not optimal because, in my
experience, the APIs of those libraries are not stable, and keeping our
code compatible with a whole range of versions is a pain.

When developing other Python applications I normally write a
'requirements.txt' file that lists the names and versions of the
required Python packages, use virtualenv to install those packages into
an isolated directory, and then start all Python tools from within that
environment. Maybe something similar could be done here as well.
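
A minimal sketch of that workflow, assuming the standard library's venv
module as a stand-in for virtualenv and a requirements.txt with pinned
entries of the form "somepackage==1.2.3" (a placeholder, not a real
dependency). The helper names venv_commands and bootstrap are
hypothetical.

```python
import os
import subprocess
import sys


def venv_commands(env_dir, requirements="requirements.txt",
                  python=sys.executable):
    """Return the commands that create the environment and fill it.

    Nothing is executed here, so the command lines can be inspected
    before they are run.
    """
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    env_python = os.path.join(env_dir, bin_dir, "python")
    return [
        # 1. create an isolated environment in env_dir
        [python, "-m", "venv", env_dir],
        # 2. install the pinned packages with the environment's own pip
        [env_python, "-m", "pip", "install", "-r", requirements],
    ]


def bootstrap(env_dir, requirements="requirements.txt"):
    """Run the commands in order; raises CalledProcessError on failure."""
    for command in venv_commands(env_dir, requirements):
        subprocess.check_call(command)
```

Tools would then be started with the interpreter inside env_dir instead
of the host's Python, so only the pinned versions are visible to them.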

Thanks,
Claudius


* Re: Handling of additional python dependencies
  2017-09-27  7:44   ` Claudius Heine
@ 2017-09-27  8:00     ` Henning Schild
  2017-09-27 12:13       ` Claudius Heine
  0 siblings, 1 reply; 5+ messages in thread
From: Henning Schild @ 2017-09-27  8:00 UTC (permalink / raw)
  To: Claudius Heine; +Cc: isar-users

On Wed, 27 Sep 2017 09:44:42 +0200,
Claudius Heine <claudius.heine.ext@siemens.com> wrote:

> Hi,
> 
> On 09/27/2017 09:06 AM, Henning Schild wrote:
> > Am Mon, 25 Sep 2017 14:44:13 +0200
> > schrieb "[ext] Claudius Heine" <claudius.heine.ext@siemens.com>:
> >   
> >> Hi,
> >>
> >> I am currently creating a proof of concept implementation for the
> >> caching apt repo proxy for isar.  
> > 
> > Cant you just use some existing implementation, like apt-cacher or
> > apt-cacher-ng?  
> 
> I would love to. But those solutions would still require some work
> AFAIK.
> 
> I only have some experience with apt-cacher-ng and I think that its a 
> bit of an overkill for our use and is missing some features that
> would be useful:
> 
>   * Ability to specify separate caching paths for the 'dists' and
> 'pool' directory. In our case the 'dists' directory of the repo would
> be part of the TMP_DIR while the 'pool' directory would be stored in
> the DL_DIR. Maybe this can be done with symlinks.
> 
>   * Just using the first available port and communicating it to the 
> calling process. I could imagine that this can be done with some
> shell and netstat programming.
> 
> I decided to implement my own proxy, because this is not that
> difficult with the python stl and got the implementation down rather
> quickly. But I had not anticipated that the python stl does not
> provide a http implementation for asyncio. Implementing that myself
> is a bit out of scope IMO.

As far as I understand your statements, there might be ways to use an
existing tool. The fact that we are talking about additional Python
deps and how to handle them suggests that writing your own is
"difficult" after all. A few symlinks or upstream patches are IMHO
much easier to maintain than yet another proxy. And a first prototype
that seems to work might still be far from something that actually
works.

We are talking about a cache, so you need to think about eviction,
consistency, time to live ... What do you do with all the .debs when
Packages.gz changes?

Henning


* Re: Handling of additional python dependencies
  2017-09-27  8:00     ` Henning Schild
@ 2017-09-27 12:13       ` Claudius Heine
  0 siblings, 0 replies; 5+ messages in thread
From: Claudius Heine @ 2017-09-27 12:13 UTC (permalink / raw)
  To: Henning Schild; +Cc: isar-users

Hi,

On 09/27/2017 10:00 AM, Henning Schild wrote:
> Am Wed, 27 Sep 2017 09:44:42 +0200
> schrieb Claudius Heine <claudius.heine.ext@siemens.com>:
> 
>> Hi,
>>
>> On 09/27/2017 09:06 AM, Henning Schild wrote:
>>> Am Mon, 25 Sep 2017 14:44:13 +0200
>>> schrieb "[ext] Claudius Heine" <claudius.heine.ext@siemens.com>:
>>>    
>>>> Hi,
>>>>
>>>> I am currently creating a proof of concept implementation for the
>>>> caching apt repo proxy for isar.
>>>
>>> Cant you just use some existing implementation, like apt-cacher or
>>> apt-cacher-ng?
>>
>> I would love to. But those solutions would still require some work
>> AFAIK.
>>
>> I only have some experience with apt-cacher-ng and I think that its a
>> bit of an overkill for our use and is missing some features that
>> would be useful:
>>
>>    * Ability to specify separate caching paths for the 'dists' and
>> 'pool' directory. In our case the 'dists' directory of the repo would
>> be part of the TMP_DIR while the 'pool' directory would be stored in
>> the DL_DIR. Maybe this can be done with symlinks.
>>
>>    * Just using the first available port and communicating it to the
>> calling process. I could imagine that this can be done with some
>> shell and netstat programming.
>>
>> I decided to implement my own proxy, because this is not that
>> difficult with the python stl and got the implementation down rather
>> quickly. But I had not anticipated that the python stl does not
>> provide a http implementation for asyncio. Implementing that myself
>> is a bit out of scope IMO.
> 
> As far as i understand your statements there might be ways to use an
> existing tool. The fact that we are talking about additional python
> deps and how to handle them suggests that writing your own is
> "difficult" after all. A few symlinks or upstream patches are IMHO
> much easier to maintain than yet another proxy. And a first prototype
> that seems to work might still be far from something that actually
> works.
> We are talking about a cache, so you need to think about eviction,
> consistency, time to live ...

All of those should be done by a manual task, because otherwise we
could not build the same image again and again, day after day, for the
same product. Updating the package index has to be an explicit step.
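
Expressed as code, that policy might look like the sketch below: a
cached file is never refetched on its own, and index files are only
refetched when a refresh was explicitly requested. The function name and
parameters are hypothetical.

```python
def needs_download(is_cached, is_index_file, refresh_requested=False):
    """Decide whether the proxy should contact the upstream mirror.

    - Anything not yet cached is downloaded.
    - Cached packages are never downloaded again.
    - Cached index files are downloaded again only on explicit refresh.
    """
    if not is_cached:
        return True
    return bool(is_index_file and refresh_requested)
```

There is deliberately no time-to-live and no automatic eviction in this
sketch.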

> What do you do with all the .debs when
> Packages.gz changes?

Normally I would just leave them be, because some other project might
need them. My idea is that the debs are shared between multiple builds,
similar to how it's done with the DL_DIR in OE.

Deleting them without thought could break reproducibility.

One problem we have, which apt-cacher-ng etc. do not have to deal with,
is that we want one package pool shared by multiple different package
dists. So I don't know whether it would start messing with the package
pool by removing packages, breaking builds for other projects.
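
To make the "one pool, several dists" layout concrete, here is a small
sketch of how a request path could be mapped to one of two cache roots.
It assumes the usual Debian repository layout with 'dists' and 'pool'
path segments; cache_path and the two root arguments (for example a
directory below TMP_DIR and one below DL_DIR) are hypothetical.

```python
import os
import posixpath


def cache_path(url_path, dists_root, pool_root):
    """Map a repository URL path to a file below the matching cache root.

    Paths containing a 'dists' segment go to the per-build root, paths
    containing a 'pool' segment go to the shared root. Returns None for
    anything else. normpath() collapses '..' so the result cannot escape
    the chosen root.
    """
    clean = posixpath.normpath("/" + url_path).lstrip("/")
    parts = clean.split("/")
    for segment in parts:
        if segment == "dists":
            return os.path.join(dists_root, *parts)
        if segment == "pool":
            return os.path.join(pool_root, *parts)
    return None
```

With this split, each build can bring its own 'dists' tree while all
builds read and write the same 'pool' tree.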

Claudius


end of thread, other threads:[~2017-09-27 12:13 UTC | newest]

Thread overview: 5+ messages
2017-09-25 12:44 Handling of additional python dependencies Claudius Heine
2017-09-27  7:06 ` Henning Schild
2017-09-27  7:44   ` Claudius Heine
2017-09-27  8:00     ` Henning Schild
2017-09-27 12:13       ` Claudius Heine
