tests/zedagent: add zedagent integration test suite #1152
Open
eriknordmark wants to merge 8 commits into
Add two escript scenarios covering the NIM startup file-ingest path that no Go unit test reaches: cmd/nim/nim.go ingestDevicePortConfig() and ingestDevicePortConfigFile(), plus the hasPersistLastconfig() short-circuit.

nim_lastconfig_blocks_ingest verifies that with /persist/checkpoint/lastconfig present NIM emits the explicit suppression log line, leaves /run/global/DevicePortConfig/ empty, and adds no "override" entry to DevicePortConfigList.

nim_override_json_ingest verifies that with lastconfig deleted and the /persist/checkpoint directory chattr +i'd to defeat zedagent's race to recreate it, NIM picks up an override.json under /config/DevicePortConfig/, copies it to /run/global/DevicePortConfig/, registers the "override" key in DevicePortConfigList, and stamps ConfigSource.Origin = OVERRIDE (=3).

Both scripts mount /dev/sda4 (vfat CONFIG partition) at runtime to inject the override file, since /config is a read-only tmpfs at runtime in the QEMU/LinuxKit EVE image. They also document a non-obvious eden CLI gotcha: `eden eve ssh '<multi-line>'` collapses newlines to spaces, so all multi-command shell snippets must be joined with `;` or `&&` on a single line.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
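The newline-collapsing behaviour of `eden eve ssh` noted in this commit suggests normalising snippets before they are passed on. The following is a minimal sketch only: `join_ssh_commands` is a hypothetical helper (it is not part of eden), and the `/mnt/config` mount point in the sample is an assumed placeholder.

```python
def join_ssh_commands(snippet: str, separator: str = " && ") -> str:
    """Join the non-empty lines of a shell snippet into one line.

    Hypothetical helper, not part of eden. It only illustrates the
    single-line convention described in the commit message.
    """
    commands = [line.strip() for line in snippet.splitlines()]
    return separator.join(cmd for cmd in commands if cmd)


# Sample snippet; /mnt/config is an assumed mount point for illustration.
snippet = """
mkdir -p /mnt/config
mount /dev/sda4 /mnt/config
ls /mnt/config/DevicePortConfig
"""
one_liner = join_ssh_commands(snippet)
print(one_liner)
```

Naive line joining like this only works for flat command lists; multi-line constructs such as here-documents or `if ... fi` blocks would still need to be rewritten by hand.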
Extend the NIM startup-matrix coverage from 2 to 6 of the 6 rows in the matrix documented in pkg/pillar/docs/nim-eden-test-plan.md.

nim_usb_json_ingest (R1, sibling of override case): verifies that ingestDevicePortConfigFile() derives DPC.Key from the file basename when the JSON has no explicit Key field, so a usb.json results in DPCList["usb"] not DPCList["override"].

nim_bootstrap_supersedes_override (R3): verifies cmd/nim's bootstrap-skip branch: when /config/bootstrap-config.pb is present, legacy *.json files under /config/DevicePortConfig/ are NOT copied into /run/global/. The test deliberately uses a 1-byte placeholder pb because cmd/nim only checks file existence, not content.

nim_lastconfig_blocks_bootstrap (R6): verifies the expectBootstrapDPCs reset branch at nim.go:214: with lastconfig present, NIM does not wait for an installer DPC even if bootstrap-config.pb exists. Asserts steady-state (DPCList has zedagent entry) and that the pb file is not consumed/deleted.

nim_bootstrap_only (R2): exercises the full bootstrap-pb decode path end-to-end: signed pb on /config, lastconfig deleted, NIM ingests via zedagent's republish. Currently `skip`'d via [!exec:nim-bootstrap-pb-gen] until that host-side helper is added; the binary's specification is documented in the file's header.

All four reuse the patterns established by nim_lastconfig_blocks_ingest and nim_override_json_ingest: /dev/sda4 mount for /config writes, chattr +i on /persist/checkpoint to defeat zedagent's lastconfig-recreate race, semicolon-joined ssh commands (multi-line single-quoted ssh strings collapse newlines to spaces), and DPCList polling (durable) rather than /run/global/ polling (tmpfs-wiped).

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
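The key-derivation rule checked by nim_usb_json_ingest can be pictured with a short sketch. This is an illustration under stated assumptions, not the code in cmd/nim: `derive_dpc_key` is a hypothetical name, and the sketch assumes that an explicit non-empty `Key` field, when present, takes precedence over the file name.

```python
import json
from pathlib import PurePosixPath


def derive_dpc_key(file_path: str, raw_json: str) -> str:
    """Return the explicit Key if present, else the file basename.

    Hypothetical illustration of the behaviour the test asserts;
    the precedence of an explicit Key is an assumption.
    """
    doc = json.loads(raw_json)
    explicit = doc.get("Key")
    if explicit:
        return explicit
    # "usb.json" -> "usb", "override.json" -> "override"
    return PurePosixPath(file_path).stem


print(derive_dpc_key("/config/DevicePortConfig/usb.json", "{}"))
```

Under this rule a file staged as usb.json without a `Key` field is registered under "usb", which is the distinction the test relies on to tell it apart from the "override" entry.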
Adds tests/network/cmd/nim-bootstrap-pb-gen, a host-side helper that signs an EdgeDevConfig JSON into a /config/bootstrap-config.pb using the eden controller signing key.

The nim_bootstrap_*.txt testdata invokes it at runtime to stage controllable bootstrap configs and verify content-level round-trip behavior: a distinctive Logicallabel baked into the bootstrap pb reappears in the device's DevicePortConfigList, confirming end-to-end propagation from /config to NIM's pubsub state.

The testdata writes to the CONFIG partition via `eve config mount /run/<path>`, a device-agnostic interface that exposes the persistent partition read/write regardless of which block device backs it.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nim_dpcl_reapplied_after_reboot: two-reboot test that isolates the pubsub persistent-publication reload path from any re-ingestion. First reboot stages and ingests an override.json under a distinctive Logicallabel "reapply-test" with TimePriority=1990-01-01 (the entry stays in DPCList without becoming the active DPC, so eth0's effective Logicallabel is unchanged for subsequent tests). Second reboot removes the file and clears the chattr +i directory flag so the override CANNOT be re-ingested. The DPCList entry surviving the second reboot is positive evidence that NIM's pubDevicePortConfigList (Persistent:true) is being reloaded on agent startup. Without this isolation, every other test re-ingests the file it observes, masking a regression in the persistent reload path.

nim_flowlog_acl_reconcile: toggle test that drives the flowlog gate end to end: NetworkInstanceConfig with EnableFlowlog=false produces no CONNMARK marking rules in iptables mangle table; with EnableFlowlog=true the "SSH and Guacamole mark" multiport rule (matching tcp dports 22,4822) appears via DpcReconciler's getIntendedMarkingRules; deleting the NI removes the rules again. Skips itself on kube EVE because r.HVTypeKube unconditionally installs the marking rules regardless of EnableFlowlog, which makes the iptables witness incapable of distinguishing the two states.

Both tests use the established test idioms (eve config mount, eve exec pillar jq, single-line ssh, defensive pre-cleanup, 3-consecutive-success ssh stabilization).

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
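The "3-consecutive-success ssh stabilization" idiom mentioned above can be sketched generically. This is an assumption-laden illustration rather than the testscript's implementation: `wait_until_stable` and its parameters are hypothetical names, and the probe stands in for whatever ssh reachability check the tests actually run.

```python
import time
from typing import Callable


def wait_until_stable(probe: Callable[[], bool], needed: int = 3,
                      max_attempts: int = 60, delay: float = 0.0) -> bool:
    """Return True once `probe` succeeds `needed` times in a row.

    Hypothetical helper: any failure resets the streak, so a link
    that flaps during reboot is not reported as ready.
    """
    streak = 0
    for _ in range(max_attempts):
        if probe():
            streak += 1
            if streak >= needed:
                return True
        else:
            streak = 0  # a single failure restarts the count
        if delay:
            time.sleep(delay)
    return False


# Simulated probe results: one early success followed by a failure
# does not count towards the final streak.
outcomes = iter([True, False, True, True, True])
print(wait_until_stable(lambda: next(outcomes), needed=3, max_attempts=5))
```

Requiring a streak instead of a single success trades a few extra probes for protection against declaring readiness while the device is still settling after a reboot.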
LPS multi-port + wireless mismatch, diag-endpoint GCP propagation)

radio_silence_persistence (eclient): extends radio_silence.txt by toggling radio silence ON via the local-manager LPS server, then rebooting EVE and asserting that /run/zedagent/ZedAgentStatus/zedagent.json on the device still carries RadioSilence.Imposed=true after the reboot (the pubsub topic file is keyed by agent name, so the path is zedagent.json, not global.json). The assertion reads EVE-side pubsub directly so it does not depend on the LPS app having reconnected. Exercises zedagent's lastradioconfig persistence on disk and NIM's subZedAgentStatus subscription delivering the restored ZedAgentStatus on a fresh boot through cmd/nim.handleZedAgentStatusImpl. A reset-radio-state.sh helper wipes /persist/checkpoint/lastradioconfig and reboots before the test starts, so the precondition wait-radio-status=false converges even if a prior test or a prior failed run left lastradioconfig with Imposed=true. The substantive test claim is the post-reboot Imposed=true read; the test does not depend on a post-reboot toggle-OFF round-trip, which can hang on the EVE write path.

lps_all_mgmt_ports_overridden (eclient): applies an LPS local network config that overrides BOTH eth0 (DNS) and eth1 (MTU) on the QEMU device model where both adapters are management uplinks. After both ports flip to configApplied=true via the local-manager API, the test asserts the behavioural witnesses on the device: resolv.conf shows the LPS-supplied DNS, ifconfig shows the LPS-supplied MTU. Exercises dpcmanager/lps.go loadLpsConfig and mergeWithLpsConfig in the multi-port case, the structural precondition of areAllMgmtPortUsingLpsConfig() suppression. (The actual fallback suppression behaviour requires SDN-driven controller-failure injection and is out of scope.)

lps_wireless_type_mismatch (eclient): sends an LPS config for eth0 with WirelessDeviceType=WIRELESS_TYPE_WIFI on the QEMU model where eth0 has WirelessType=None. The pre-merge guard in mergeWithLpsConfig rejects the LPS port. Test asserts the distinctive "wireless type mismatch" error string in the local-manager network-info, configApplied=false on the LPS side, and configApplied=true on the controller side. Cleanly skips on the QEMU device model since both adapters are Ethernet: any LPS DPC with WIRELESS_TYPE_WIFI is rejected upstream with "missing WiFi configuration" rather than reaching the branch under test; activating this test requires a custom devmodel.json with a wireless port.

nim_diag_remote_endpoints (network): sets diag.probe.remote.http.endpoint via eden controller update and asserts the new value reaches /run/zedagent/ConfigItemValueMap/global.json under .GlobalSettings["diag.probe.remote.http.endpoint"].StrValue. Tests the upstream propagation (eden→adam→zedagent) only; the downstream NIM consumption (in-memory connTester.DiagRemoteEndpoints and the actual probes on controller failure) needs SDN-driven failure injection and is documented as out of scope.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
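The lookup that nim_diag_remote_endpoints performs on ConfigItemValueMap can be sketched as a small JSON query. This is an illustration only: the field path `.GlobalSettings[...].StrValue` comes from the commit message, while the helper name `get_str_config_item` and the sample endpoint value are hypothetical.

```python
import json
from typing import Optional


def get_str_config_item(raw_json: str, key: str) -> Optional[str]:
    """Return .GlobalSettings[key].StrValue, or None if it is absent.

    Hypothetical helper mirroring the jq-style path named in the
    commit message; it tolerates missing intermediate objects.
    """
    doc = json.loads(raw_json)
    item = doc.get("GlobalSettings", {}).get(key)
    if not isinstance(item, dict):
        return None
    return item.get("StrValue")


# Sample document; the endpoint URL is a made-up placeholder.
sample = json.dumps({
    "GlobalSettings": {
        "diag.probe.remote.http.endpoint": {
            "StrValue": "http://probe.example.test"
        }
    }
})
print(get_str_config_item(sample, "diag.probe.remote.http.endpoint"))
```

Returning `None` for a missing item lets a polling loop distinguish "not propagated yet" from "propagated with an unexpected value".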
Adds four pre-onboard variants of the NIM /config-ingest tests:

- nim_bootstrap_only_preonboard
- nim_bootstrap_supersedes_override_preonboard
- nim_override_only_preonboard
- nim_globalconfig_only_preonboard

Each test owns its full eden lifecycle (stop, wipe, eden setup with --eve-bootstrap-file or --eve-config-dir, eden start; no onboard). With adam never learning the device exists, /config-sourced state stays authoritative for the entire verification window, closing the controller-takeover race that makes the onboarded nim_bootstrap_supersedes_override test a known false negative.

debug.enable.ssh rides inside the staged config in every variant and gates sshd's iptables open via SSHAuthorizedKeys. SSH readiness doubles as the bug-class canary for the lf-edge/eve#5584/#5775 cert-chain regression class: when zedagent rejects the bootstrap pb (or fails to apply GlobalConfig), SSHAuthorizedKeys is never set, the iptables INPUT rule for tcp/22 stays REJECT, and the test times out, a race-free signal.

The four tests share preonboard-template.json (a sanitized EdgeDevConfig template) and are grouped in eden.network-preonboard.tests.txt for invocation via `eden test ./tests/network -s eden.network-preonboard.tests.txt`. tests/network/Makefile's setup target globs *.tests.txt so any additional manifest in the directory stages alongside the default eden.network.tests.txt automatically.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Yetus's detsecrets (detect-secrets v1.2.0) plugin flags the literal password value as Secret Keyword. The value is meaningless -- the test rejects the port on wireless-type mismatch before any credentials are inspected -- but the keyword detector pattern-matches "password": "<any-string>" and trips.

detect_secrets/filters/heuristic.py is_templated_secret exempts values shaped like <foo>, so use "<redacted>" as the sentinel. Test still exercises the same code path with no behavioural change.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
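The reason "<redacted>" passes the scanner can be illustrated with a simplified stand-in for the exemption. This sketch is not the detect-secrets source: `looks_templated` is a hypothetical function that models only the angle-bracket shape named in the commit message, and the real filter may accept other shapes as well.

```python
def looks_templated(value: str) -> bool:
    """Return True for placeholder-shaped values such as "<redacted>".

    Simplified stand-in for the heuristic the commit refers to; it
    covers only the "<foo>" shape and is not detect-secrets code.
    """
    return len(value) >= 2 and value.startswith("<") and value.endswith(">")


for candidate in ("<redacted>", "plain-value"):
    print(candidate, looks_templated(candidate))
```

Because the test rejects the port before any credential is read, swapping the literal for a placeholder-shaped sentinel changes what the scanner sees without changing what the test exercises.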
Adds an Eden testscript suite under tests/zedagent/ that exercises the zedagent microservice end-to-end against a live EVE instance: device info, metrics, app/NI info, attestation FSM, and the /config/-ingest paths consulted at boot. Ten scenarios in two manifests.

eden.zedagent.tests.txt (eight tests, run against an onboarded EVE): device_info_completeness, config_items_and_status, maintenance_mode, app_metrics_detail, network_instance_info_metrics, attest_flow (skips without eve.tpm), bootstrap_config_item_ingest, global_config_file_ingest. The two /config-ingest scenarios stage a bootstrap-config.pb or GlobalConfig/global.json after the device is onboarded, freeze /persist/checkpoint so zedagent cannot recreate lastconfig, and reboot. bootstrap_config_item_ingest polls /run/zedagent/ConfigItemValueMap/global.json for its marker (which survives because adam echoes the same configItems back). global_config_file_ingest greps the newlog for the "/config/GlobalConfig contains:" notice instead, because parseConfigItems on adam's first fetch rebuilds globalConfig from defaults+adam's-items and wipes any item adam doesn't push; the log line is the only persistent signal.

eden.zedagent-preonboard.tests.txt (two tests, each owns its full eden lifecycle: stop, wipe, eden setup with --eve-bootstrap-file or --eve-config-dir, eden start, no onboard): bootstrap_config_item_ingest_preonboard, global_config_file_ingest_preonboard. With adam never learning the device exists, /config-sourced ConfigItemValueMap stays authoritative throughout the verification window: no controller-takeover race, no log-grep fallback needed.

debug.enable.ssh rides inside the staged config in both preonboard tests and gates sshd's iptables open via SSHAuthorizedKeys. SSH readiness doubles as the bug-class canary for the lf-edge/eve#5584/#5775 cert-chain regression class: a zedagent that rejects the bootstrap pb (or fails to apply GlobalConfig) leaves SSHAuthorizedKeys unset and the test times out, a loud, race-free signal.

zedagent_test.go provides TestInfo, TestMetric, and TestFlowLog helpers that the testscripts invoke via the `test` command. The preonboard scenarios use preonboard-template.json from PR lf-edge#1165's tests/network/testdata/.

The post-onboard suite was validated against a QEMU-based coverage-instrumented EVE; the six baseline scenarios plus the two /config-ingest tests achieve substantially higher cmd/zedagent coverage than the unit tests alone.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary

Adds an Eden testscript suite under `tests/zedagent/` that exercises the zedagent microservice end-to-end against a live EVE instance: device info, metrics, app/NI info, attestation FSM, and the `/config/`-ingest paths consulted at boot. Ten scenarios in two manifests.

`eden.zedagent.tests.txt` (eight tests, run against an onboarded EVE): `device_info_completeness`, `config_items_and_status`, `maintenance_mode`, `app_metrics_detail`, `network_instance_info_metrics`, `attest_flow` (skips without `eve.tpm`), `bootstrap_config_item_ingest`, `global_config_file_ingest`. The two `/config`-ingest scenarios stage a `/config/bootstrap-config.pb` or `/config/GlobalConfig/global.json` after the device is onboarded, freeze `/persist/checkpoint` so zedagent cannot recreate `lastconfig`, and reboot. `bootstrap_config_item_ingest` polls `/run/zedagent/ConfigItemValueMap/global.json` for its marker (which survives because adam echoes the same configItems back). `global_config_file_ingest` greps the newlog for the `"/config/GlobalConfig contains:"` notice instead, because `parseConfigItems` on adam's first fetch rebuilds globalConfig from defaults + adam's-items and wipes any item adam doesn't push; the log line is the only persistent signal.

`eden.zedagent-preonboard.tests.txt` (two tests, each owns its full eden lifecycle: stop, wipe, `eden setup` with `--eve-bootstrap-file` or `--eve-config-dir`, `eden start`, no onboard): `bootstrap_config_item_ingest_preonboard`, `global_config_file_ingest_preonboard`. With adam never learning the device exists, `/config`-sourced ConfigItemValueMap stays authoritative throughout the verification window: no controller-takeover race, no log-grep fallback needed.

`debug.enable.ssh` rides inside the staged config in both preonboard tests and gates sshd's iptables open via `SSHAuthorizedKeys`. SSH readiness doubles as the bug-class canary for the lf-edge/eve#5584 / lf-edge/eve#5775 cert-chain regression class: a zedagent that rejects the bootstrap pb (or fails to apply GlobalConfig) leaves `SSHAuthorizedKeys` unset and the test times out, a loud, race-free signal.

`zedagent_test.go` provides `TestInfo`, `TestMetric`, and `TestFlowLog` helpers that the testscripts invoke via the `test` command. The preonboard scenarios use `preonboard-template.json` from #1165's `tests/network/testdata/`.

Test plan

- `eden test ./tests/zedagent` against an onboarded EVE: all eight post-onboard tests pass; `attest_flow` self-skips on non-TPM QEMU.
- `eden test ./tests/zedagent -s eden.zedagent-preonboard.tests.txt`: both preonboard tests pass (~3.5 min total).
🤖 Generated with Claude Code