trunner: add missing expect during flash procedure #386

maska989 · 2025-01-13T09:32:44Z

Description

This change resolves the issue where the flashing process executes a reboot after successful flashing, causing the next test case to receive output from the flash reboot and detect the PSH prompt too early

Motivation and Context

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Chore (refactoring, style fixes, git/CI config, submodule management, no code logic changes)

How Has This Been Tested?

Already covered by automatic testing.
New test added: (add PR link here).
Tested by hand on: armv7a9-zynq7000-zedboard.

Checklist:

My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes.
All new and existing linter checks and tests passed.
My changes generate no new compilation warnings for any of the targets.

Special treatment

This PR needs additional PRs to work (list the PRs, preferably in merge-order).
I will merge this PR by myself when appropriate.

github-actions · 2025-01-13T09:46:24Z

Unit Test Results

4 472 tests - 3 486 4 181 ✅ - 3 235 15m 26s ⏱️ - 25m 13s
250 suites - 220 291 💤 - 251
1 files ± 0 0 ❌ ± 0

Results for commit d9c7158. ± Comparison against base commit 1c676a4.

This pull request removes 3486 tests.

flash ‑ armv7a9-zynq7000-qemu:flash
flash ‑ armv7m4-stm32l4x6-nucleo:flash
flash ‑ ia32-generic-qemu:flash
flash ‑ riscv64-generic-qemu:flash
phoenix-rtos-tests/cpp/hello-cpp ‑ armv7a9-zynq7000-qemu:phoenix-rtos-tests/cpp/hello-cpp
phoenix-rtos-tests/cpp/hello-cpp ‑ ia32-generic-qemu:phoenix-rtos-tests/cpp/hello-cpp
phoenix-rtos-tests/cpp/hello-cpp ‑ riscv64-generic-qemu:phoenix-rtos-tests/cpp/hello-cpp
phoenix-rtos-tests/initfini/main ‑ armv7a9-zynq7000-qemu:phoenix-rtos-tests/initfini/main
phoenix-rtos-tests/initfini/main ‑ armv7m4-stm32l4x6-nucleo:phoenix-rtos-tests/initfini/main
phoenix-rtos-tests/initfini/main ‑ ia32-generic-qemu:phoenix-rtos-tests/initfini/main
…

♻️ This comment has been updated with latest results.

nalajcie · 2025-01-13T12:29:56Z

hmm, I think it's invalid to assume every device will boot to psh prompt.

If I understand correctly - the underlying issue is that we're rebooting twice between reboot and first test - so we should focus on removing the underlying issue instead :).

maska989 · 2025-01-13T13:09:55Z

hmm, I think it's invalid to assume every device will boot to psh prompt.

If I understand correctly - the underlying issue is that we're rebooting twice between reboot and first test - so we should focus on removing the underlying issue instead :).

I don't think this is an issue (ash only used in QEMU right now), every target after flashing is rebooted to ensure that process ended correctly (Right now we know only about correct pass of data from host device, and nothing about that system start or not after flash) + 70% of our physical devices are using changing mode after flash and need to be shut down using power cut, without this it will hang on reboot without effect. This change will pinpoint this behavior in the future and separate flashing process from tests cases.
In addition, we're using often different reboots in test cases and flash procedure, removing one of reboot will clog test campaign (using --no-flash flag or using environment with old setting from previous campaign) or flash (lack of state change on device).

But it is my personal opinion, and I'm open to discussion about improvements :)

nalajcie · 2025-01-13T14:39:44Z

hmm, I think it's invalid to assume every device will boot to psh prompt.
If I understand correctly - the underlying issue is that we're rebooting twice between reboot and first test - so we should focus on removing the underlying issue instead :).

I don't think this is an issue (ash only used in QEMU right now), every target after flashing is rebooted to ensure that process ended correctly (Right now we know only about correct pass of data from host device, and nothing about that system start or not after flash) + 70% of our physical devices are using changing mode after flash and need to be shut down using power cut, without this it will hang on reboot without effect. This change will pinpoint this behavior in the future and separate flashing process from tests cases. In addition, we're using often different reboots in test cases and flash procedure, removing one of reboot will clog test campaign (using --no-flash flag or using environment with old setting from previous campaign) or flash (lack of state change on device).

But it is my personal opinion, and I'm open to discussion about improvements :)

Some more remarks:

flashing procedure correctness is checked by inspecting the logs while flashing
whether the device would boot after the flashing procedure or not is not an indicator of flashing failure (flashing might be ok, firmware might be broken) - so IMHO it should be visible in other test result than flash
we can't universally assume that every device with plo would boot to psh shell - we have this information on per-target basis encoded as harness chain - so the proposed change breaks the architecture of trunner - that's why I'm against merging it in current form
we should either optimize the second reboot (in a graceful way - handling special cases as --no-flash flag, etc.) or eg. drain the currently buffered data before next test execution (this is currently done only after each test) or while rebooting, etc.

maska989 · 2025-01-13T14:52:53Z

hmm, I think it's invalid to assume every device will boot to psh prompt.
If I understand correctly - the underlying issue is that we're rebooting twice between reboot and first test - so we should focus on removing the underlying issue instead :).

I don't think this is an issue (ash only used in QEMU right now), every target after flashing is rebooted to ensure that process ended correctly (Right now we know only about correct pass of data from host device, and nothing about that system start or not after flash) + 70% of our physical devices are using changing mode after flash and need to be shut down using power cut, without this it will hang on reboot without effect. This change will pinpoint this behavior in the future and separate flashing process from tests cases. In addition, we're using often different reboots in test cases and flash procedure, removing one of reboot will clog test campaign (using --no-flash flag or using environment with old setting from previous campaign) or flash (lack of state change on device).
But it is my personal opinion, and I'm open to discussion about improvements :)

Some more remarks:
* flashing procedure correctness is checked by inspecting the logs while flashing

* whether the device would boot after the flashing procedure or not is not an indicator of flashing failure (flashing might be ok, firmware might be broken) - so IMHO it should be visible in other test result than `flash`

* we can't universally assume that every device with `plo` would boot to `psh` shell - we have this information on per-target basis encoded as harness chain - so **the proposed change breaks the architecture of `trunner`** - that's why I'm **against** merging it in current form

* we should either optimize the second reboot (in a graceful way - handling special cases as `--no-flash` flag, etc.) or eg. drain the currently buffered data before next test execution (this is currently done only after each test) or while rebooting, etc.

You're absolutely right, after rethink this approach is destructive for trunner, I will consider soften up this second reboot with some buffer changes to nullify this behavior of double reboot situation.

Thanks for visiting this draft pr and give me some advices :)

nalajcie · 2025-01-14T16:56:36Z

trunner/test_runner.py

@@ -244,6 +244,10 @@ def run_tests(self, tests: Sequence[TestOptions]) -> Sequence[TestResult]:
        # Ensure first test will start with reboot
        last_test_failed = True

+        # Unload data before starting tests
+        if self.ctx.should_flash is not False and self.ctx.target.experimental is False:


I don't understand why it's disabled for experimental targets.

Can't we just always drain the buffer (without any IF?) - the same way as below:
self.target.dut.read(timeout=0.1)

Please note that this is still a race - new data might be produced between reading the buffer and issuing reboot before the first test.

maska989 force-pushed the maska989/flash_missing_expect branch from b1a6bc8 to 17dcde8 Compare January 13, 2025 09:47

maska989 force-pushed the maska989/flash_missing_expect branch 2 times, most recently from 2a7f1a7 to 9449ff3 Compare January 14, 2025 10:12

trunner: fix output after flash

d9c7158

maska989 force-pushed the maska989/flash_missing_expect branch from 9449ff3 to d9c7158 Compare January 14, 2025 12:23

nalajcie reviewed Jan 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

trunner: add missing expect during flash procedure #386

trunner: add missing expect during flash procedure #386

maska989 commented Jan 13, 2025 •

edited

Loading

github-actions bot commented Jan 13, 2025 •

edited

Loading

nalajcie commented Jan 13, 2025

maska989 commented Jan 13, 2025 •

edited

Loading

nalajcie commented Jan 13, 2025

maska989 commented Jan 13, 2025

nalajcie Jan 14, 2025

trunner: add missing expect during flash procedure #386

Are you sure you want to change the base?

trunner: add missing expect during flash procedure #386

Conversation

maska989 commented Jan 13, 2025 • edited Loading

Description

Motivation and Context

Types of changes

How Has This Been Tested?

Checklist:

Special treatment

github-actions bot commented Jan 13, 2025 • edited Loading

Unit Test Results

nalajcie commented Jan 13, 2025

maska989 commented Jan 13, 2025 • edited Loading

nalajcie commented Jan 13, 2025

maska989 commented Jan 13, 2025

nalajcie Jan 14, 2025

Choose a reason for hiding this comment

maska989 commented Jan 13, 2025 •

edited

Loading

github-actions bot commented Jan 13, 2025 •

edited

Loading

maska989 commented Jan 13, 2025 •

edited

Loading