Skip to content

baseosmgr: Phase 2 — Zboot/kubeapi/HVType seams + tests#5923

Draft
eriknordmark wants to merge 5 commits into
lf-edge:masterfrom
eriknordmark:baseosmgr-unit-tests-phase2
Draft

baseosmgr: Phase 2 — Zboot/kubeapi/HVType seams + tests#5923
eriknordmark wants to merge 5 commits into
lf-edge:masterfrom
eriknordmark:baseosmgr-unit-tests-phase2

Conversation

@eriknordmark
Copy link
Copy Markdown
Contributor

Description

Phase 2 of bringing `pkg/pillar/cmd/baseosmgr/` to non-trivial unit-test
coverage. Builds on #5922 (Phase 1).

The PR is split into two commits so the refactor can be reviewed
independently from the tests:

  1. `baseosmgr: add Zboot/kubeapi/HVType seams` — refactor only;
    no behaviour change. Three seams let the partition state machine,
    the kubevirt drain gate, and the EVE-k personality check be driven
    from unit tests without invoking `/sbin/zboot`, persisting
    grubenv, or pulling in a real zedkube subscription:

    • Zboot interface wraps the subset of `pkg/pillar/zboot` that
      baseosmgr uses (15 functions: Get/Is on partitions, Set / Mark
      state writers, `WriteToPartition` for the install worker).
      `realZboot` adapts the package and stores the log so the
      wrapped signatures don't have to.
    • `seams` struct holds the four small one-off function-pointer
      seams (`isHVTypeKube`, `isVersionHVTypeKube`,
      `getNodeDrainStatus`, `requestNodeDrain`).
      `baseOsMgrContext.seams` holds an instance; `defaultSeams(logArg)`
      returns the production wiring around `base.IsHVTypeKube` /
      `base.IsVersionHVTypeKube` and the kubeapi package, mirroring
      the existing `ctx.paths` grouping pattern.

    Production call sites in `shouldDeferForNodeDrain` and
    `doBaseOsStatusUpdate` now go through `ctx.zboot.<...>` and
    `ctx.seams.<...>` instead of calling the underlying packages
    directly. `installWorker` type-asserts its `ctxPtr` argument
    so it can use `ctx.zboot.WriteToPartition`.

  2. `baseosmgr: add Phase-2 unit tests` — 9 new test files plus
    extensions to `mocks_test.go` (`mockZboot`, `mockWorker`),
    `forcefallback_test.go`, `wrappers_test.go`, `helpers_test.go`.

Coverage delta

Source Before (Phase 1) After (Phase 2)
`go test -count=1 -cover ./cmd/baseosmgr/...` 16.1% 75.3%

Per-area highlights (from `go tool cover -func`):

  • `getPartitionState`, `createZbootStatus`, `getZbootStatus`,
    `updateAndPublishZbootStatus[All]`,
    `baseOsGetActivationStatus`, `baseOsSetPartitionInfoInStatus`,
    `validatePartition`, `validateAndAssignPartition`,
    `isImageInErrorState`, `installWorker`, `processInstallWorkResult`
    → 100%
  • `handleForceFallback` 94.7%, `handleZbootTestComplete` ≈ 90%+,
    `handleUpdateRetryCounter` ≈ 90%+, `updateBaseOsStatusOnReboot` /
    `handleOtherPartRebootReason` covered across the matching-version /
    version-mismatch / not-inprogress / rebooted-from-current branches.
  • `shouldDeferForNodeDrain` covered across NOTSUPPORTED / UNKNOWN /
    NOTREQUESTED / FAILEDDRAIN / FAILEDCORDON / REQUESTED / STARTING /
    COMPLETE.

The remaining ~25% is intentional and split across:

  • `Run()` lifecycle + the `initialize*` pubsub-wiring helpers
    (covered transitively by eden tests).
  • The `realZboot` adapter methods (which exist precisely so we don't
    exercise the live `zboot` package in unit tests; they are exercised
    end-to-end by the eden `tests/update_eve_image/*` suite and by the
    upcoming `tests/baseosmgr/` eden tests for `force_fallback` /
    `retry_update`).
  • The `agentlog.HandleGlobalConfig`-backed branch of
    `handleGlobalConfigImpl` (needs a real pubsub.Subscription to
    exercise).

Why draft

Drafted because the Phase 3 Eden tests (`tests/baseosmgr/force_fallback`
and `tests/baseosmgr/retry_update`) are not yet implemented and may
inform feedback on the seams added here.

Happy to mark ready for review once #5922 has had a pass, or sooner
if reviewers want to look now.

PR dependencies

Builds on #5922.

How to test and validate this PR

```sh
cd pkg/pillar
go test -count=1 -race -cover ./cmd/baseosmgr/...
go test -count=1 -coverprofile=/tmp/b.cov ./cmd/baseosmgr/...
go tool cover -func=/tmp/b.cov
```

Expected: all tests pass, coverage ≈ 75.3%.

For semantic-equivalence review of the refactor commit: every external
entry point keeps its signature and call sites; the only new code paths
exposed to production are the small adapter wrappers in `interfaces.go`
which delegate to the underlying packages. The default seams in `Run()`
preserve the previous direct-call behaviour byte-for-byte.

Changelog notes

No user-facing changes.

PR Backports

  • 16.0-stable: No, test-only addition + non-functional refactor.
  • 14.5-stable: No, test-only addition + non-functional refactor.
  • 13.4-stable: No, test-only addition + non-functional refactor.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device — N/A, host-side unit tests
  • I've tested my PR on arm64 device — N/A, host-side unit tests
  • I've written the test verification instructions
  • I've set the proper labels to this PR

eriknordmark and others added 5 commits May 8, 2026 02:07
Adds pkg/pillar/docs/baseosmgr.md describing the responsibilities,
pubsub inputs and outputs, internal logical components, control flow
through the five main paths (boot, BaseOsConfig install, content-tree
progress, test-complete commit, reboot-after-failed-update) plus the
force-fallback and retry-update side channels, and a debugging section.
Structured to mirror the existing nim.md and nodeagent.md so the pillar
docs remain consistent across microservices.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Lifts the three on-disk paths baseosmgr persists across reboots
(current_retry_update_counter, config_retry_update_counter,
forceFallbackCounter) out of package-level constants and into a
pathConfig struct hung off baseOsMgrContext. defaultPathConfig()
returns the production paths under /persist/, so behaviour is
unchanged. Run() initialises ctx.paths early.

The five callers — saveCurrentRetryUpdateCounter,
saveConfigRetryUpdateCounter, readSavedCurrentRetryUpdateCounter,
readSavedConfigRetryUpdateCounter, readForceFallbackCounter and
writeForceFallbackCounter — now take *baseOsMgrContext and consult
ctxPtr.paths instead of the constants. The corresponding callers in
Run() and handleForceFallback are updated.

This is the minimal seam needed for upcoming unit tests so the
counter persistence helpers can be driven against t.TempDir()
without touching /persist/.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds unit-test coverage for baseosmgr's pure helpers, lookup
functions, publication helpers and the persistent counter files.
No production behaviour change. The previous commit added the
pathConfig seam that lets the counter tests run against t.TempDir()
instead of /persist/.

Coverage of the package goes from 0% to 16.1% of statements
(go test -count=1 -cover). The remaining uncovered code is the
zboot-driven state machine (validatePartition,
validateAndAssignPartition, doBaseOsActivate, handleZbootTestComplete,
handleForceFallback, updateBaseOsStatusOnReboot, the install worker,
shouldDeferForNodeDrain) plus Run() and the agentlog-backed global
config branch — to be unlocked by a Zboot/kubeapi seam in Phase 2.

Files added (one per logical concern):

- helpers_test.go      log init helper
- mocks_test.go        mockPubSub (Publication+Subscription) and the
                       newTestCtx fixture (per-test temp paths +
                       all pubs/subs wired up)
- baseosmgr_test.go    validateBaseOsConfig, appendError,
                       publishBaseOSMgrStatus
- lookups_test.go      lookupBaseOsConfig / Status / StatusByPartLabel /
                       StatusesByContentID / ConfigByVersion,
                       lookupContentTreeStatus
- publish_test.go      publishBaseOsStatus, unpublishBaseOsStatus,
                       publishZbootStatus
- counters_test.go     save / readSaved current and config retry update
                       counters; absent / present / overwrite / garbage
- forcefallback_test.go read / write force-fallback counter; absent /
                        present / overwrite / garbage
- contenttree_test.go  checkContentTreeStatus missing / loaded-no-change /
                       loaded-from-initial / error;
                       handleContentTreeStatusImpl no-fanout
- globalconfig_test.go handleGlobalConfigImpl / Delete non-global key
                       fast paths
- wrappers_test.go     trivial Delete dispatch wrappers,
                       maybeRetryInstall non-TooEarly / no-config branches,
                       shouldDeferForNodeDrain NOTSUPPORTED short-circuit,
                       handleNodeDrainStatusImpl non-UPDATE ignore

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three seams let the partition state machine, the kubevirt drain
gate, and the EVE-k personality check be driven from unit tests
without invoking /sbin/zboot, persisting grubenv, or pulling in a
real zedkube subscription.

- Zboot interface wraps the subset of pkg/pillar/zboot that
  baseosmgr uses (15 functions: Get/Is on partitions, Set / Mark
  state writers, WriteToPartition for the install worker).
  realZboot adapts the package and stores the log so the wrapped
  signatures don't have to. baseOsMgrContext.zboot is defaulted
  to it; tests substitute a mockZboot.
- A seams struct holds the four small one-off function-pointer
  seams (isHVTypeKube, isVersionHVTypeKube, getNodeDrainStatus,
  requestNodeDrain). baseOsMgrContext.seams holds an instance;
  defaultSeams(logArg) returns the production wiring around
  base.IsHVTypeKube / base.IsVersionHVTypeKube and the kubeapi
  package, mirroring the existing ctx.paths grouping pattern.

Production call sites in shouldDeferForNodeDrain and
doBaseOsStatusUpdate go through ctx.zboot.<...> and
ctx.seams.<...> instead of calling the underlying packages
directly. installWorker now type-asserts its ctxPtr argument so
it can use ctx.zboot.WriteToPartition.

Test fixture (mocks_test.go) is extended with mockZboot, drain
seam knobs, HV-type seam knobs and a seedZbootStatus helper that
mirrors mockZboot's view of a partition into the published
ZbootStatus the way Run() does at startup. Existing Phase-1 tests
keep passing.

No production behaviour change.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Builds on the Zboot / kubeapi / HVType seams to bring baseosmgr's
package coverage from 16.1% to 75.3% of statements (go test -count=1
-cover). No production change. The remaining uncovered code is
Run() and the initialize* pubsub-wiring helpers, the realZboot
adapter (which delegates to the live zboot package and is exercised
by eden tests, not unit tests), and the agentlog-backed
HandleGlobalConfig path.

New test files:

- partition_test.go     getPartitionState, createZbootStatus,
                        getZbootStatus, updateAndPublishZbootStatus,
                        updateAndPublishZbootStatusAll,
                        baseOsGetActivationStatus,
                        baseOsSetPartitionInfoInStatus across
                        active / inprogress / unused / valid /
                        invalid / current / other matrices.
- validate_test.go      validatePartition (other-inprogress same/
                        different version) and validateAndAssignPartition
                        (TooEarly, fresh assignment, already assigned,
                        Activate=false).
- dobaseos_test.go      doBaseOsStatusUpdate branches (ContentTree
                        error, already-on-current, already-in-other-
                        deactivated, EVE-k mismatch, EVE-k-version-
                        check error), doBaseOsActivate preconditions
                        (not-other partition, missing partition status,
                        wrong partition state, missing content tree,
                        size precheck), doBaseOsInactivate, doBaseOsRemove
                        / doBaseOsUninstall, removeBaseOsStatus,
                        baseOsHandleStatusUpdate happy publish,
                        handleBaseOsConfigCreate / Modify validations,
                        baseOsHandleStatusUpdateUUID early returns.
- version_test.go       checkInstalledVersion match / mismatch /
                        no-partition-label / absent-status.
- testcomplete_test.go  handleZbootTestComplete (no-change, not-
                        current-partition, current-not-inprogress,
                        happy commit, MarkActive failure, false-flip)
                        plus handleZbootConfigImpl dispatcher.
- reboot_test.go        updateBaseOsStatusOnReboot (matching version,
                        version mismatch, other-not-inprogress) and
                        handleOtherPartRebootReason (rebooted-from-
                        current is no-op, empty-reason synthesizes
                        power-failure message), plus
                        handleNodeAgentStatusImpl latching and
                        dispatching.
- retrycounter_test.go  isImageInErrorState across all six branches,
                        and handleUpdateRetryCounter across no-status,
                        not-active counter-stable, not-active counter-
                        change-ignored, failed-image bump-and-reinstall,
                        failed-image no-change, and the active-path
                        save+publish.
- forcefallback_test.go (extended) handleForceFallback first-observe,
                        no-change, missing-current, current-not-active,
                        empty-other-version, other-not-unused, happy-
                        path flip-and-persist, BaseOsStatus republish;
                        handleZedAgentStatusImpl dispatch.
- worker_test.go        installWorker (no-target / happy / write-error),
                        installDownloadedObject (not-loaded / no-final-
                        dir / first-call-submits / result-present /
                        result-error), installDownloadedObjects (missing
                        ct / not-loaded), processInstallWorkResult,
                        AddWorkInstall (submits / submit-error).
- nodedrain_test.go     shouldDeferForNodeDrain across UNKNOWN /
                        NOTREQUESTED / FAILEDDRAIN / FAILEDCORDON /
                        REQUESTED / STARTING / COMPLETE; plus
                        handleNodeDrainStatusImpl FAILEDCORDON,
                        FAILEDDRAIN, COMPLETE-with/without-deferred,
                        bad-arg guard rails.
- wrappers_test.go      (extended) trivial Create / Modify dispatch
                        wrappers for NodeAgentStatus, ZbootConfig,
                        ZedAgentStatus, NodeDrainStatus, GlobalConfig
                        and ContentTreeStatus, plus
                        handleBaseOsConfigDeleteByStatus and
                        handleBaseOsConfigDelete known-key.
- mocks_test.go         (extended) mockWorker satisfying
                        worker.Worker for the AddWorkInstall /
                        installDownloadedObject paths.
- helpers_test.go       (extended) shared errBoom test sentinel.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2026

Codecov Report

❌ Patch coverage is 57.84314% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 23.14%. Comparing base (63fa63d) to head (fbfe4e6).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
pkg/pillar/cmd/baseosmgr/interfaces.go 0.00% 28 Missing ⚠️
pkg/pillar/cmd/baseosmgr/baseosmgr.go 0.00% 5 Missing ⚠️
pkg/pillar/cmd/baseosmgr/handlebaseos.go 90.00% 5 Missing ⚠️
pkg/pillar/cmd/baseosmgr/paths.go 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5923      +/-   ##
==========================================
+ Coverage   21.62%   23.14%   +1.51%     
==========================================
  Files         464      476      +12     
  Lines       83994    85731    +1737     
==========================================
+ Hits        18166    19839    +1673     
+ Misses      64300    64160     -140     
- Partials     1528     1732     +204     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant