Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

baseline.login result doesn't match job log #244

@gctucker

Description

@gctucker

This is an intermittent issue, in most cases the baseline.login test case result is correct. It should be PASS when login succeeded. However, it is sometimes reported as PASS even though the login did not succeed.

For example, this job:
https://kernelci.org/test/plan/id/5f02ef215c51b2c46485bb2d/
shows only the baseline-uefi.login test case, others are missing because the kernel didn't reach user-space.

It should have been set to FAILED since there was a kernel panic and it didn't reach login as can be seen in the full job log:
https://storage.kernelci.org/next/master/next-20200706/arm/multi_v7_defconfig/gcc-8/lab-collabora/baseline-uefi-qemu_arm-virt-gicv2.html

09:25:04.711763  <1>[    1.813286] 8<--- cut here ---
09:25:04.711951  <1>[    1.813418] Unhandled fault: page domain fault (0x01b) at 0x00000000
09:25:04.712059  <1>[    1.813637] pgd = (ptrval)
09:25:04.712545  Matched prompt kernelci/kernelci-frontend#1: (Unhandled fault.*)\r\n
09:25:04.712836  Setting prompt string to ['/ #']
09:25:04.712987  end: 2.2 auto-login-action (duration 00:00:03) [common]
09:25:04.713306  start: 2.3 expect-shell-connection (timeout 00:04:57) [common]
09:25:04.713415  Setting prompt string to ['/ #']
09:25:04.713515  Forcing a shell prompt, looking for ['/ #']
09:25:04.764033  <1>[    1.813727] [00000000] *pgd
09:25:04.764212  expect-shell-connection: Wait for prompt ['/ #'] (timeout 00:05:00)
09:25:04.764336  Waiting using forced prompt support (timeout 00:02:30)
09:25:04.764527  =00000000
09:25:04.764622  <0>[    1.814136] Internal error: : 1b [#1] SMP ARM
09:29:05.333341  ShellCommand command timed out.: Sending # in case of corruption. Connection timeout 00:05:00, retry in 00:00:30
09:29:05.333574  pattern: ['/ #']
09:29:05.434391  #
09:29:35.519458  ShellCommand command timed out.: Sending # in case of corruption. Connection timeout 00:05:00, retry in 00:00:30
09:29:35.519706  pattern: ['/ #']
09:29:35.620520  #
09:30:01.713627  end: 2.3 expect-shell-connection (duration 00:04:57) [common]
09:30:01.714008  boot-image-retry failed: 1 of 1 attempts. 'expect-shell-connection timed out after 297 seconds'
09:30:01.714287  end: 2 boot-image-retry (duration 00:05:00) [common]

The same issue occurred with the same job run in all the labs, so it seems to be fully reproducible. It may either be an issue in LAVA itself or potentially in kernelci-backend when parsing the callback results, maybe when a kernel panic was also detected by LAVA.

Here's the original LAVA job for the example above:
https://lava.collabora.co.uk/scheduler/job/2482262

and the auto-login-action test case is marked as PASS, the job failed at the expect-shell-connection stage:
https://lava.collabora.co.uk/results/2482262/lava

It seems look like auto-login-action was wrongly set to PASS, maybe as a side-effect of detecting a kernel panic.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions