From: Uladzimir Bely <ubely@ilbers.de>
To: Henning Schild <henning.schild@siemens.com>
Cc: isar-users@googlegroups.com
Subject: Re: [PATCH] testsuite: Improve SSH ping
Date: Mon, 20 Mar 2023 11:51:57 +0300
Message-ID: <2470465.XAFRqVoOGU@hp>
In-Reply-To: <20230317095358.6a73547f@md1za8fc.ad001.siemens.net>

On Friday, 17 March 2023 11:53:58 +03, Henning Schild wrote:
> On Fri, 17 Mar 2023 05:11:30 +0100, Uladzimir Bely <ubely@ilbers.de> wrote:
> > When qemu machine boots, it may happen that consecutive SSH connection
> > fails right after the previous good one. So, we get a situation when
> > the command/script fails after we consider SSH is ready.
> > 
> > This patch improves detection of SSH server ready status by making at
> > least three good consecutive SSH pings.
> > 
> > Example of debug output that shows the case:
> > 
> > ```
> > 
> > | Waiting for SSH server ready...
> > | SSH ping result: 255, left: 300s # <== machine is booting
> > | SSH ping result: 255, left: 294s
> > | SSH ping result: 255, left: 288s
> > | SSH ping result: 255, left: 282s
> > | SSH ping result: 255, left: 276s
> > | SSH ping result: 255, left: 270s
> > | SSH ping result: 255, left: 264s
> > | SSH ping result: 255, left: 258s
> > | SSH ping result: 0, left: 253s   # <== SSH server is up...
> > | SSH ping result: 0, left: 251s
> > | SSH ping result: 255, left: 250s # <== but one ping failed again
> > | SSH ping result: 0, left: 248s
> > | SSH ping result: 0, left: 245s
> > | SSH ping result: 0, left: 243s
> > | SSH server is ready
> > | `lsmod | grep example_module` returned 0
> > 
> > ```
> > 
> > Signed-off-by: Uladzimir Bely <ubely@ilbers.de>
> > ---
> > 
> >  testsuite/cibuilder.py | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/testsuite/cibuilder.py b/testsuite/cibuilder.py
> > index 9e84c3a3..4e568b8e 100755
> > --- a/testsuite/cibuilder.py
> > +++ b/testsuite/cibuilder.py
> > 
> > @@ -257,17 +257,25 @@ class CIBuilder(Test):
> >          self.log.debug('Waiting for SSH server ready...')
> >          
> >          rc = None
> > 
> > +        goodcnt = 0
> > 
> >          while time.time() < timeout:
> >              if proc.poll() is not None:
> >                  self.log.error('Machine is not running')
> >                  return rc
> >              
> >              rc = self.exec_cmd('/bin/true', cmd_prefix)
> > 
> > +            time_left = timeout - time.time()
> > +            self.log.debug('SSH ping result: %d, left: %.fs' % (rc, time_left))
> >              time.sleep(1)
> > 
> >              if rc == 0:
> > -                self.log.debug('SSH server is ready')
> > -                break
> > +                goodcnt += 1
> > +                # Let 3 good SSH pings to make sure SSH connection is stable
> > +                if goodcnt >= 3:
> > +                    self.log.debug('SSH server is ready')
> > +                    break
> > +            else:
> > +                goodcnt = 0
> 
> This looks like an endless loop should ssh never come up. Not sure what
> would break that loop. In the worst case a test-timeout and everything
> being stuck because we might not execute these things in parallel.
> 

If SSH never comes up, the loop still exits once the timeout expires (600 s by
default); goodcnt simply stays at 0 the whole time.

Something more readable like "for i in range(3)", which you suggested in the
previous comment, won't work, because we need to reset the counter and start
again whenever one of the pings fails. E.g., if the second ping fails, we need
`goodcnt` to change like 0-0-0-1-0-1-2-3. With a "range" loop it would go like
0-0-0-1-0-3.

But you are right, a condition like "while time.time() < timeout and
goodcnt < 3" might be a bit more readable. I'll look into it.
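
A minimal standalone sketch of that variant, only to illustrate the counter
reset and the timeout exit. The names wait_for_ssh, ping, required, delay_s,
clock and sleep are hypothetical and not part of cibuilder.py; ping stands in
for self.exec_cmd('/bin/true', cmd_prefix), and the proc.poll() check is left
out for brevity.

```python
import time


def wait_for_ssh(ping, timeout_s=600, required=3, delay_s=1,
                 clock=time.time, sleep=time.sleep):
    """Poll `ping` until `required` consecutive calls return 0 or time runs out.

    Returns (ready, history), where history records goodcnt after each ping.
    """
    deadline = clock() + timeout_s
    goodcnt = 0
    history = []
    # Both exit conditions live in the loop header: timeout or enough good pings.
    while clock() < deadline and goodcnt < required:
        rc = ping()
        # A failed ping resets the streak, so only consecutive successes count.
        goodcnt = goodcnt + 1 if rc == 0 else 0
        history.append(goodcnt)
        sleep(delay_s)
    return goodcnt >= required, history


# Simulated ping results: two failures, one success, a failure, then three
# successes in a row.
results = iter([255, 255, 0, 255, 0, 0, 0])
ready, history = wait_for_ssh(lambda: next(results), sleep=lambda s: None)
# ready is True and history is [0, 0, 1, 0, 1, 2, 3]
```

If ping never returns 0, goodcnt stays at 0 and the loop ends as soon as
clock() reaches the deadline, so the function returns ready == False.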

> Henning
> 
> >          if rc != 0:
> >              self.log.error('SSH server is not ready')




