User/britel/go e2e poc#533

Draft
Britel wants to merge 21 commits into main from user/britel/go-e2e-poc

@Britel Britel commented Feb 24, 2026

Convert Trident E2E Tests from Python/YAML to Go Storm Framework

Summary

Converts all Trident E2E validation tests from Python pytest and YAML ADO pipeline templates into Go, fully integrated into the existing storm framework (tools/storm/e2e/). This eliminates the Python/pytest runtime dependency, consolidates the test stack into a single language, and moves test execution orchestration from YAML pipelines into Go code.

21 commits · 45 files changed · +5,411 / -2,948 lines

Category                     Added     Removed
Go (validation + infra)      +5,129    -8
Go tests                     +2,034
Python (pytest E2E)          +4        -1,894
YAML (pipeline templates)    +103      -1,008
Documentation                +175      -18

Motivation

The E2E test infrastructure was split across three technologies:

  • Go (storm) — scenario orchestration, VM setup, A/B updates
  • Python (pytest) — post-deployment system validation
  • YAML (ADO pipelines) — test dispatching, metrics, JUnit publishing

This required maintaining two test runtimes, artifact passing between pytest and storm, and test logic encoded in hard-to-test YAML templates. This PR unifies everything into Go.

What Changed

Phase 1: Validation Test Infrastructure

  • validate_common.go — Shared SSH command runners and 15+ parsers for system command output (blkid, lsblk, mount, passwd/group, efibootmgr, cryptsetup, dmsetup, veritysetup, luksDump JSON, systemd-sysext)
  • testselection.go — Parser for test-selection.yaml configs that drive per-configuration test enablement with ring-level overrides
  • discover.go — Enhanced to read test-selection configs during scenario discovery and attach test tags to scenarios

Phase 2: Core Validation Tests

  • validate_base.go — Partition validation (blkid/lsblk/mount), user validation (/etc/passwd, /etc/group), UEFI fallback boot configuration checks
  • validate_encryption.go — LUKS2/TPM2 verification: cryptsetup status, dmsetup info, luksDump JSON metadata (systemd-tpm2 tokens, KDF params, PCR policies), A/B volume pairs, swap activation

Phase 3: Remaining Validation Tests

  • validate_verity.go — DM-Verity: veritysetup status, data/hash device mapping, A/B active volume matching
  • validate_extensions.go — systemd-sysext/confext status, extension path and name verification
  • validate_rollback.go — Health check rollback: servicing state, error messages, failure log validation
  • validate_ab_staged.go — Staged A/B update state verification

Phase 4: Pipeline Simplification

  • JUnit XML output — Added the -j flag to storm-trident run in test_execution_template.yml, followed by an ADO PublishTestResults step
  • Metrics & logs — Boot metrics collection (metrics.go) and artifact publishing (logs.go) moved from YAML pipeline steps into the storm scenario
  • Pipeline enablement — Enabled all 18 VM/HOST configurations in storm_e2e.yml (removed dev filter)

Phase 5: Cleanup

  • Deleted Python E2E tests: conftest.py, base_test.py, encryption_test.py, extensions_test.py, verity_test.py, rollback_test.py, ab_update_staged_test.py, pytest.ini
  • Deleted YAML templates: e2e-test-run.yml, e2e-test-abupdate-scenario.yml, e2e-ab-update-stage-finalize-test-run.yml
  • Updated documentation — E2E README with validation test cases, test-selection format, and configuration profiles

Test Coverage

All new Go code includes unit tests (2,034 lines of test code):

  • Parser tests — Real-world output samples for all 15+ parsers
  • Test selection — 19 unit tests covering YAML parsing, ring overrides, edge cases
  • Discovery — Integration tests for test-selection loading during scenario discovery
  • Registration — Verification that RegisterTestCases() correctly wires up validation for all 19 configuration profiles
  • Pipeline validation — Tests verifying ADO pipeline matrix generation contracts
  • Feature parity — Comprehensive cross-check that Go test coverage matches the original pytest suite across all configurations

Configuration Coverage

All 19 Trident configurations are validated:

base · simple · combined · encrypted-partition · encrypted-raid · encrypted-swap · extensions · health-checks-install · memory-constraint-combined · misc · raid-big · raid-mirrored · raid-resync-small · raid-small · rerun · root-verity · split · usr-verity · usr-verity-raid

How to Verify

# Run all E2E unit tests
cd tools/storm && go test ./e2e/... -v

# Run a specific validation test suite
cd tools/storm && go test ./e2e/scenario/ -run TestValidateCommon -v

# Run a storm scenario locally
storm-trident run --config base --ring pr-e2e

Britel and others added 21 commits February 23, 2026 22:47
Implement shared Go helpers for the E2E validation test conversion:

SSH helpers:
- sudoCommand(): Run commands with sudo via SSH, return trimmed stdout
- runCommand(): Run commands without sudo via SSH

Parsers for system command output:
- ParseBlkid(): Parse standard blkid output (key=value per device)
- ParseBlkidExport(): Parse blkid --output export format
- ParseLsblk(): Parse lsblk -J -b JSON output with FlattenPartitions()
- ParseMount(): Parse mount output, FindRootDevice() helper
- ParsePasswd(): Parse /etc/passwd into username-keyed map
- ParseGroup(): Parse /etc/group into group-keyed map with members
- ParseEfiBootMgr(): Parse efibootmgr output for boot entries
- ParseKeyValueLines(): Generic key:value parser (cryptsetup, dmsetup, etc.)
- ParseTable(): Whitespace-separated table parser with header row

Crypto/storage parsers:
- ParseVeritySetupStatus(): Parse veritysetup status output
- ParseLuksDump(): Parse cryptsetup luksDump JSON metadata
- ParseDevMdListing(): Parse ls -l /dev/md for RAID name resolution
- GetRaidNameFromDeviceName(): Resolve /dev/mdN to /dev/md/name
- ParseSysextStatus(): Parse systemd-sysext status JSON

Host config helpers:
- IsPartition()/IsRaid(): Check device type in host status
- CheckPathExists(): Verify remote path exists
- ParseTridentGetOutput(): Parse trident get YAML output

Unit tests: 26 tests covering all parsers with real-world output samples.

Task: bn-f2c3
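
As an illustration of the parser style listed above, here is a minimal sketch of a blkid --output export parser. It is not the code from validate_common.go: the function name parseBlkidExport, its return type, and the sample input are assumptions. It relies only on the export format itself, where each device is a block of KEY=value lines and blocks are separated by blank lines.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseBlkidExport is a hypothetical stand-in for ParseBlkidExport.
// It returns one property map per device, keyed by the DEVNAME value.
func parseBlkidExport(out string) map[string]map[string]string {
	devices := make(map[string]map[string]string)
	current := make(map[string]string)

	// flush stores the block collected so far, if it named a device.
	flush := func() {
		if name, ok := current["DEVNAME"]; ok {
			devices[name] = current
		}
		current = make(map[string]string)
	}

	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			flush() // a blank line ends the current device block
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // ignore lines that are not KEY=value
		}
		current[key] = value
	}
	flush() // the last block may not be followed by a blank line
	return devices
}

func main() {
	// Made-up sample input for illustration only.
	sample := "DEVNAME=/dev/sda1\nUUID=1234-ABCD\nTYPE=vfat\n\n" +
		"DEVNAME=/dev/sda2\nTYPE=crypto_LUKS\n"
	for name, props := range parseBlkidExport(sample) {
		fmt.Println(name, props["TYPE"])
	}
}
```

Keying the result by DEVNAME keeps lookups such as "is this device crypto_LUKS?" to a single map access in the validators.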
Implement TestSelection struct and parser for test-selection.yaml files found
in trident_configurations/*/test-selection.yaml. Supports:
- Parsing 'compatible' marker lists (base test tags)
- Ring-level overrides (weekly, daily, post_merge, pullrequest, validation)
  with add/remove operations
- TestTags() for base compatible tags with 'test:' prefix
- TestTagsForRing() for resolved tags after ring-specific overrides
- HasMarker() for checking individual marker membership

Includes 19 unit tests covering: simple/complex YAML parsing, all ring
overrides, edge cases (empty input, invalid YAML, duplicate adds,
nonexistent removes), and real config format validation.

Task: bn-0279
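
The ring override resolution described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in testselection.go: the struct and field names are hypothetical, YAML decoding is left out (the struct is assumed to be already populated), and applying adds before removes is a guess at the ordering.

```go
package main

import (
	"fmt"
	"sort"
)

// ringOverride lists markers added to or removed from the base list for one ring.
type ringOverride struct {
	Add    []string
	Remove []string
}

// testSelection mirrors an already-decoded test-selection.yaml.
// Field names are assumptions made for this sketch.
type testSelection struct {
	Compatible []string
	Rings      map[string]ringOverride
}

// testTagsForRing resolves the markers for a ring and returns them as
// sorted "test:"-prefixed tags. Duplicate adds and removes of unknown
// markers are harmless because a set is used internally.
func (ts testSelection) testTagsForRing(ring string) []string {
	set := make(map[string]bool)
	for _, m := range ts.Compatible {
		set[m] = true
	}
	if ov, ok := ts.Rings[ring]; ok {
		for _, m := range ov.Add {
			set[m] = true
		}
		for _, m := range ov.Remove {
			delete(set, m)
		}
	}
	tags := make([]string, 0, len(set))
	for m := range set {
		tags = append(tags, "test:"+m)
	}
	sort.Strings(tags)
	return tags
}

func main() {
	ts := testSelection{
		Compatible: []string{"base", "encryption"},
		Rings: map[string]ringOverride{
			"weekly": {
				Add:    []string{"rollback", "base"},      // "base" is a duplicate add
				Remove: []string{"encryption", "missing"}, // "missing" does not exist
			},
		},
	}
	fmt.Println(ts.testTagsForRing("weekly"))      // [test:base test:rollback]
	fmt.Println(ts.testTagsForRing("pullrequest")) // [test:base test:encryption]
}
```

A ring without an override entry simply falls back to the base compatible list.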
Add ParseCryptsetupStatus() with CryptsetupStatus struct for cryptsetup
status output (cipher, keysize, key location, device, mode, etc.).

Add ParseDmsetupInfo() with DmsetupInfo struct for dmsetup info output
(name, state, UUID, major/minor, open count, etc.).

Both parsers delegate to ParseKeyValueLines internally but expose typed
fields for direct access. Unit tests cover basic output, empty input,
and variant states (PLAIN dm-crypt, SUSPENDED device).

bn-965d
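
A rough sketch of the "typed fields over a generic key:value parser" pattern described above, assuming cryptsetup status prints lines such as "cipher:  aes-xts-plain64" and "keysize: 512 bits". The names parseKeyValueLines, cryptsetupStatus, and parseCryptsetupStatus are lowercase stand-ins, and the chosen fields are assumptions rather than the PR's actual struct.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKeyValueLines splits each "key: value" line at the first colon.
// Lines without a colon (such as the status banner) are skipped.
func parseKeyValueLines(out string) map[string]string {
	kv := make(map[string]string)
	for _, line := range strings.Split(out, "\n") {
		key, value, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		kv[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return kv
}

// cryptsetupStatus exposes a few fields with concrete types (hypothetical layout).
type cryptsetupStatus struct {
	Type        string
	Cipher      string
	KeySizeBits int
	Device      string
	Mode        string
}

// parseCryptsetupStatus delegates to the generic parser, then converts fields.
func parseCryptsetupStatus(out string) (cryptsetupStatus, error) {
	kv := parseKeyValueLines(out)
	st := cryptsetupStatus{
		Type:   kv["type"],
		Cipher: kv["cipher"],
		Device: kv["device"],
		Mode:   kv["mode"],
	}
	// keysize is printed as "<n> bits"; keep only the number.
	if fields := strings.Fields(kv["keysize"]); len(fields) > 0 {
		n, err := strconv.Atoi(fields[0])
		if err != nil {
			return st, fmt.Errorf("invalid keysize %q: %w", kv["keysize"], err)
		}
		st.KeySizeBits = n
	}
	return st, nil
}

func main() {
	// Made-up sample output for illustration only.
	sample := `/dev/mapper/luks-root is active and is in use.
  type:    LUKS2
  cipher:  aes-xts-plain64
  keysize: 512 bits
  device:  /dev/sda3
  mode:    read/write`
	st, err := parseCryptsetupStatus(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("%+v\n", st)
}
```

Keeping the generic parser separate lets dmsetup info and similar outputs reuse it while each typed wrapper owns its own conversions.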
…scovery

- Load test-selection.yaml for each configuration during discovery
- Parse test selection markers and convert to test tags (e.g. test:base)
- Append test tags to scenario tags for storm filtering
- Store test tags in TridentE2EScenario with TestTags()/HasTestTag() accessors
- Add discover_test.go with path helper and integration tests
- Add trident_test.go with TestTags/HasTestTag unit tests
- All 48 tests pass

Convert base_test.py test_partitions to Go. Validates partitions via
blkid, lsblk -J -b, and mount. Checks A/B volume pairs, root mount
points, RAID arrays, partition sizes and UUIDs.

- Add parseSizeToBytes helper for partition size conversion
- Add validatePartitions method on TridentE2EScenario
- Register as validate-partitions test case (gated on test:base tag)
- Add unit tests for parseSizeToBytes

Closes: bn-9225
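
The commit does not show which size formats parseSizeToBytes accepts, so the following is only a sketch under an assumption: sizes are either a plain byte count or an integer with a binary K/M/G/T suffix (1K = 1024 bytes). The real helper may accept other forms.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSizeToBytes converts strings such as "512", "8M", or "1G" to bytes.
// Assumption: suffixes are binary multiples; overflow is not checked.
func parseSizeToBytes(s string) (uint64, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty size")
	}
	multipliers := map[byte]uint64{
		'K': 1 << 10,
		'M': 1 << 20,
		'G': 1 << 30,
		'T': 1 << 40,
	}
	mult := uint64(1)
	if m, ok := multipliers[s[len(s)-1]]; ok {
		mult = m
		s = s[:len(s)-1] // drop the suffix before parsing the number
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	return n * mult, nil
}

func main() {
	for _, in := range []string{"512", "8M", "1G", "bogus"} {
		n, err := parseSizeToBytes(in)
		fmt.Println(in, n, err)
	}
}
```

Converting the configured size to bytes lets it be compared directly with the byte counts that lsblk -b reports.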
Convert base_test.py test_users and test_uefi_fallback to Go:
- validateUsers: checks /etc/passwd and /etc/group against host config
- validateUefiFallback: validates UEFI fallback boot configuration
- Register both as test cases in RegisterTestCases() under test:base tag

Convert encryption_test.py test_encryption to Go. Validates:
- blkid --output export device type (crypto_LUKS)
- cryptsetup status (cipher aes-xts-plain64, keysize 512)
- dmsetup info (CRYPT-LUKS2 UUID, active state)
- cryptsetup luksDump --dump-json-metadata (systemd-tpm2 token,
  pbkdf2/sha512 KDF, UKI vs non-UKI PCR policies)
- A/B volume pair active/inactive state with findmnt
- Swap device activation
- Partition and RAID backing device resolution

Handles SELinux workaround for luksDump (lvm_t permission quirk).
Register as validateEncryption test case method on TridentE2EScenario.

bn-3756
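
The luksDump metadata check can be sketched as below. This is not the code from validate_encryption.go: the function name checkTPM2Enrollment and the struct are hypothetical, and only a small subset of the LUKS2 JSON metadata (keyslots with their KDF, tokens with their type, keyslot references, and tpm2-pcrs) is decoded. The expected PCR set for UKI versus non-UKI images is not stated in the commit, so the sketch only returns the PCR list for the caller to compare.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// luksMetadata decodes only the parts of the LUKS2 JSON metadata used here.
type luksMetadata struct {
	Keyslots map[string]struct {
		KDF struct {
			Type string `json:"type"`
			Hash string `json:"hash"`
		} `json:"kdf"`
	} `json:"keyslots"`
	Tokens map[string]struct {
		Type     string   `json:"type"`
		Keyslots []string `json:"keyslots"`
		PCRs     []int    `json:"tpm2-pcrs"`
	} `json:"tokens"`
}

// checkTPM2Enrollment finds a systemd-tpm2 token, verifies that the keyslots
// it references use pbkdf2/sha512, and returns the token's PCR list.
func checkTPM2Enrollment(raw []byte) ([]int, error) {
	var meta luksMetadata
	if err := json.Unmarshal(raw, &meta); err != nil {
		return nil, fmt.Errorf("decode luksDump metadata: %w", err)
	}
	for id, tok := range meta.Tokens {
		if tok.Type != "systemd-tpm2" {
			continue
		}
		for _, slotID := range tok.Keyslots {
			slot, ok := meta.Keyslots[slotID]
			if !ok {
				return nil, fmt.Errorf("token %s references missing keyslot %s", id, slotID)
			}
			if slot.KDF.Type != "pbkdf2" || slot.KDF.Hash != "sha512" {
				return nil, fmt.Errorf("keyslot %s uses KDF %s/%s, want pbkdf2/sha512",
					slotID, slot.KDF.Type, slot.KDF.Hash)
			}
		}
		return tok.PCRs, nil
	}
	return nil, fmt.Errorf("no systemd-tpm2 token found")
}

func main() {
	// Trimmed, made-up metadata for illustration only.
	sample := []byte(`{
	  "keyslots": {"1": {"type": "luks2", "kdf": {"type": "pbkdf2", "hash": "sha512"}}},
	  "tokens": {"0": {"type": "systemd-tpm2", "keyslots": ["1"], "tpm2-pcrs": [7]}}
	}`)
	pcrs, err := checkTPM2Enrollment(sample)
	fmt.Println(pcrs, err)
}
```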
Add validate-encryption test case registration gated by the
test:encryption tag, following the same pattern as the existing
test:base validators. This wires up the already-implemented
validateEncryption function so it runs for configurations with
encryption volumes.

Convert verity_test.py test_verity_root to Go. Validates:
- /dev/mapper/root exists in blkid output
- veritysetup status reports type VERITY, status verified, mode readonly
- Data and hash device mapping matches host status verity configuration
- A/B active volume matching with partition vs RAID backing devices

Register as validate-verity test case, conditional on test:root_verity
or test:usr_verity tags from test-selection.yaml.

Convert extensions_test.py test_extensions to Go. Validates:
- systemd-sysext/confext status --json=pretty output parsing
- Extension paths exist on target OS
- Extension names appear in active extension list

Registered as validate-extensions test case gated by test:extensions tag.
All 41 scenario tests pass.
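
A sketch of how the systemd-sysext status JSON could be decoded. It assumes the output is an array of objects with a hierarchy field and an extensions field that is either the string "none" or a list of extension names; that shape is an assumption about the tool's output, and parseSysextStatus here is a hypothetical stand-in for the PR's ParseSysextStatus.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hierarchyStatus lists the extensions merged into one hierarchy (e.g. /usr).
type hierarchyStatus struct {
	Hierarchy  string
	Extensions []string
}

// parseSysextStatus decodes the status array. The "extensions" field is read
// as raw JSON first because it may be a string or a list.
func parseSysextStatus(raw []byte) ([]hierarchyStatus, error) {
	var entries []struct {
		Hierarchy  string          `json:"hierarchy"`
		Extensions json.RawMessage `json:"extensions"`
	}
	if err := json.Unmarshal(raw, &entries); err != nil {
		return nil, fmt.Errorf("decode sysext status: %w", err)
	}
	result := make([]hierarchyStatus, 0, len(entries))
	for _, e := range entries {
		st := hierarchyStatus{Hierarchy: e.Hierarchy}
		if len(e.Extensions) > 0 {
			var names []string
			if err := json.Unmarshal(e.Extensions, &names); err == nil {
				st.Extensions = names
			} else {
				var single string
				if err := json.Unmarshal(e.Extensions, &single); err != nil {
					return nil, fmt.Errorf("unexpected extensions value for %s", e.Hierarchy)
				}
				if single != "none" {
					st.Extensions = []string{single}
				}
			}
		}
		result = append(result, st)
	}
	return result, nil
}

func main() {
	// Made-up sample for illustration only.
	sample := []byte(`[
	  {"hierarchy": "/usr", "extensions": ["ext-a", "ext-b"], "since": 1700000000},
	  {"hierarchy": "/opt", "extensions": "none"}
	]`)
	status, err := parseSysextStatus(sample)
	fmt.Println(status, err)
}
```

A validator can then check that every extension name expected from the host configuration appears in one of the returned lists.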
Convert rollback_test.py test_rollback to Go. Validates:
- servicingState matches expected state (not-provisioned for clean install,
  provisioned for A/B update scenarios)
- lastError contains 'Failed health check(s)'
- abActiveVolume absent when not-provisioned, unchanged when provisioned
- Exactly 1 health-check-failure log file exists
- Log contains expected failure messages (script failure, systemd service
  not found errors)

Register as validateRollback test case conditional on test:rollback tag
(health-checks-install config).

Convert ab_update_staged_test.py test_ab_update_staged to Go. Validates:
- trident get shows servicingState == 'ab-update-staged'
- abActiveVolume unchanged from pre-update (volume-a)

Registered as validate-ab-staged test case in the split A/B update flow.
Split the combined ab-update test case into separate stage, validate-staged,
and finalize steps with dedicated abStageOs and abFinalizeOs methods.
Add comprehensive unit tests verifying that RegisterTestCases() correctly
registers verity, extensions, rollback, and ab-staged validation test cases
across all 19 trident configuration profiles.

Tests include:
- TestRegisterTestCases_Phase3_Verity: validates verity registration for
  root_verity/usr_verity test tags
- TestRegisterTestCases_Phase3_Extensions: validates extensions registration
- TestRegisterTestCases_Phase3_Rollback: validates rollback registration
- TestRegisterTestCases_Phase3_AbStaged: validates ab-staged registration
  for configs with AB update
- TestRegisterTestCases_AllPhase3_FeatureParity: comprehensive feature parity
  check across all 19 config profiles matching pytest test suite coverage
- TestDiscoverTridentScenarios_Phase3TestTags: discovery-level Phase 3 tag
  verification (extends as ALLOWED_CONFIGS expands)

Delete all Python E2E test files that have been replaced by Go storm
framework equivalents:
- conftest.py, base_test.py, encryption_test.py, extensions_test.py
- verity_test.py, rollback_test.py, ab_update_staged_test.py
- pytest.ini

Remove pip install pytest/fabric step from e2e-test-run.yml.
Clean Python-specific entries from tests/e2e_tests/.gitignore.

Preserves helpers/, trident_configurations/, and target-configurations.yaml
which are still referenced by active pipeline templates.

Add -j flag to storm-trident run in test_execution_template.yml to produce
JUnit XML output for each E2E scenario. Add handle-junit-test-results.yml
template step to publish test results to ADO.

Storm's built-in JUnit reporter produces XML compatible with ADO's
PublishTestResults task, with pass/fail/skip status, timing, and failure
details for each test case.

JUnit XML files are written to the output directory as
<scenario>_<job-attempt>.junit.xml and published as build artifacts.

Addresses: Phase 4 YAML pipeline simplification - JUnit integration
- Add metrics.go: Boot metrics collection via SSH/systemd-analyze after
  clean install and A/B updates, writing to boot-metrics.jsonl
- Add logs.go: Publish logstream, tracestream, and boot metrics files as
  artifacts via ArtifactBroker for ADO collection
- Register collect-install-boot-metrics after check-trident-ssh
- Register collect-boot-metrics after each AB update (regular and split)
- Register publish-logs at end of scenario for artifact collection
- Add unit tests for boot timing parsing and millisecond conversion
- Update feature parity tests for new test case registration
- Update E2E README with metrics and log collection documentation

This moves boot-metrics and display-logs functionality from separate YAML
pipeline steps into the storm scenario, simplifying the pipeline to a
single storm-trident run invocation.
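
The boot timing parsing and millisecond conversion mentioned above might look roughly like this. It is a sketch, not metrics.go: it assumes the summary line printed by systemd-analyze has the form "Startup finished in 2.345s (kernel) + ... = ...", handles only the min, s, and ms units, and uses the hypothetical name parseBootTimings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
	"time"
)

// phaseRe captures a duration such as "2.345s" or "1min 2.345s" followed by
// the phase name in parentheses, e.g. "(kernel)".
var phaseRe = regexp.MustCompile(`((?:[0-9.]+(?:min|ms|s) ?)+)\(([a-z]+)\)`)

// parseBootTimings returns the duration of each boot phase in milliseconds.
func parseBootTimings(out string) (map[string]int64, error) {
	timings := make(map[string]int64)
	for _, m := range phaseRe.FindAllStringSubmatch(out, -1) {
		// Rewrite "1min 2.345s" as "1m2.345s" so time.ParseDuration accepts it.
		compact := strings.ReplaceAll(strings.TrimSpace(m[1]), " ", "")
		compact = strings.ReplaceAll(compact, "min", "m")
		d, err := time.ParseDuration(compact)
		if err != nil {
			return nil, fmt.Errorf("parse duration %q for %s: %w", m[1], m[2], err)
		}
		timings[m[2]] = d.Milliseconds()
	}
	if len(timings) == 0 {
		return nil, fmt.Errorf("no boot phases found in systemd-analyze output")
	}
	return timings, nil
}

func main() {
	// Made-up sample output for illustration only.
	sample := "Startup finished in 2.345s (kernel) + 1.234s (initrd) + 1min 2.345s (userspace) = 1min 5.924s"
	timings, err := parseBootTimings(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	// One JSON object per line is the shape a .jsonl file expects.
	line, err := json.Marshal(timings)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(line))
}
```

Appending one such JSON line per boot (clean install, then each A/B update) would give a boot-metrics.jsonl file that is easy to diff across runs.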
Enable all 18 VM host configurations in the storm E2E pipeline by
removing the ALLOWED_CONFIGS filter in invert.py. Previously only the
'base' configuration was enabled during development.

Changes:
- invert.py: Remove ALLOWED_CONFIGS=['base'] filter, allow all configs.
  Empty list now means all configs pass through. The VM host filter
  remains, so BM and container configs stay excluded until those runtimes
  are implemented in storm.
- configurations.yaml: Regenerated with all 18 VM host configs across
  rings (pr-e2e through full-validation).
- storm_e2e.yml: Remove development TODO, document VM HOST enablement
  and pending BM/container support.

The test_execution_template.yml already uses a single storm-trident run
invocation that replaces the multi-step e2e-test-run.yml,
e2e-test-abupdate-scenario.yml, and e2e-ab-update-stage-finalize-test-run.yml
templates. JUnit result publishing and artifact upload are preserved.

Delete e2e-test-run.yml, e2e-test-abupdate-scenario.yml, and
e2e-ab-update-stage-finalize-test-run.yml from testing_common/.

These templates used the old pytest-based E2E test flow which has been
replaced by storm-trident via the storm_e2e.yml → test_execution_template.yml
pipeline path (Phase 4: YAML Pipeline Simplification).

Updated netlaunch-testing.yml and baremetal-testing.yml to remove references
to the deleted e2e-test-run.yml template.

… format

- Add Test Selection section documenting test-selection.yaml format,
  tag mapping, and all 19 configuration profiles with their markers
- Add Validation Test Cases section documenting core, tag-gated,
  and A/B update test cases with source file references
- Remove outdated IN DEVELOPMENT note from Discovery section
- Fix function name reference (DiscoverTridentScenarios)
- Fix minor typos in Discovery section

…verage

Add TestDiscoverTridentScenarios_All19Configs to verify:
- All 19 configuration directories exist in embedded data
- All 18 VM/HOST configs produce discovered scenarios
- raid-big is correctly excluded from VM/HOST (BM-only)

Add TestDiscoverTridentScenarios_FullValidationRingCoverage to verify:
- All 18 VM/HOST scenarios include the full-validation ring
- Correct count of scenarios for full-validation matrix

Validates that the simplified pipeline (Phase 4) is correctly configured
for ADO execution:
- PR-E2E ring produces VM/HOST scenarios (13 configs)
- Matrix JSON format matches test_execution_template.yml expectations
- ADO output variable naming convention is valid
- Full-validation ring covers all 18 VM/HOST configurations

These tests verify the contract between storm-trident matrix generation
and the ADO pipeline templates without requiring ADO access.