server: populate EdgeDevConfig.controllercert_confighash #152
Merged
milan-zededa merged 1 commit on May 11, 2026
EVE's handleControllerCertsSha (zedagent/handlecertconfig.go:622-631) compares EdgeDevConfig.controllercert_confighash from each config pull against the device's saved value, and triggers a /certs refetch on a mismatch. Adam never set this field, so the trigger was dead and the only fast path to refetching /certs was the auth-envelope SenderCertHash mismatch raised by signing-cert rotation. Pure encrypt-cert rotation had no fast trigger: the cipher decrypt path silently fails when the encrypt cert hash isn't in pubControllerCert, and the periodic controllerCertsTask defaults to CertInterval=24h. See lf-edge/eve#5926.

Compute a base64-URL-encoded sha256 over the raw bytes of signing.pem and encrypt.pem (the two files getAllCerts already serves to /certs). Any rotation of either file changes the bytes, flips the hash, and EVE's existing handleControllerCertsSha path schedules an immediate /certs refetch. The hash is opaque - EVE only compares it for equality - so file-byte hashing is sufficient and avoids parsing.

Set the field on EdgeDevConfig before configProcess computes ConfigHash so a cert-chain change also flips ConfigHash. Otherwise adam would return 304 Not Modified for any pure cert rotation that didn't otherwise change the device config, and EVE would never see the new controllercert_confighash to compare against.

The v1 API has no controller cert chain (it's used only by very old EVE versions that don't sign config envelopes); pass an empty hash so its config response keeps the historical zero-value behavior.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
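The hash described above can be sketched as follows. This is a minimal illustration, not adam's actual code: the helper name controllerCertHash and the sample PEM contents are hypothetical, and it assumes the two files are simply concatenated in signing-then-encrypt order before hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// controllerCertHash (hypothetical name) returns a base64-URL-encoded SHA-256
// over the raw bytes of the signing and encrypt PEM files. No certificate
// parsing is done: the consumer only compares the result for equality.
func controllerCertHash(signingPEM, encryptPEM []byte) string {
	h := sha256.New()
	h.Write(signingPEM) // assumption: signing.pem bytes come first
	h.Write(encryptPEM) // followed by encrypt.pem bytes
	return base64.URLEncoding.EncodeToString(h.Sum(nil))
}

func main() {
	// Placeholder file contents; real inputs would be read from disk.
	signing := []byte("-----BEGIN CERTIFICATE-----\nsigning-v1\n-----END CERTIFICATE-----\n")
	encryptOld := []byte("-----BEGIN CERTIFICATE-----\nencrypt-v1\n-----END CERTIFICATE-----\n")
	encryptNew := []byte("-----BEGIN CERTIFICATE-----\nencrypt-v2\n-----END CERTIFICATE-----\n")

	before := controllerCertHash(signing, encryptOld)
	after := controllerCertHash(signing, encryptNew)

	// Rotating only the encrypt cert is enough to change the hash.
	fmt.Println("hash changed:", before != after)
}
```

Because the digest is fed the raw file bytes, even a whitespace-only change to either file would flip the value, which is harmless when the device only checks for inequality.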
eriknordmark
added a commit
to eriknordmark/eden
that referenced
this pull request
May 9, 2026
End-to-end test that rotates the controller's ECDH encryption cert twice and verifies that user-data cipher blocks tagged with the encrypt cert hash keep decrypting on the device.

Companion to ctrl_cert_change.txt: that test rotates the signing cert and incidentally exercises the encryption path because Eden historically reused signing-key.pem for ECDH derivation; this test deploys apps with --use-encrypt-cert so their cipher contexts reference the controller's encrypt cert (Type=3) and rotates that cert specifically.

Each rotation rotates *both* signing and encrypt certs because EVE's only fast trigger for /certs refetch is signing-cert mismatch in the auth-envelope SenderCertHash. The cipher decrypt path silently fails when the encrypt cert hash is unknown without triggering refetch, adam doesn't populate EdgeDevConfig.controllercert_confighash so handleControllerCertsSha is dead, and the periodic controllerCertsTask defaults to CertInterval=24h. See lf-edge/eve#5926 for the design gap and lf-edge/adam#152 for the fix; once that adam patch ships in an Eden-tracked release, the test could be extended with an encrypt-cert-only rotation step.

Order within each rotation: change-encrypt-cert first, then change-signing-cert. change-encrypt-cert re-encrypts the encrypt-tagged cipher blocks and writes the new encrypt files on adam's disk. change-signing-cert is a no-op for those cipher blocks (its reencryptConfigs filter skips them), but its on-disk signing-key swap is what makes adam sign the next auth envelope with a key the device hasn't seen, triggering SenderStatusCertMiss. By the time the device refetches /certs, both new certs are on adam's disk and arrive in a single round-trip.

Verification proceeds in three layers:

1. Fresh-app deployment (eclient2 after first rotation, eclient3 after second). The app encrypts with the just-rotated ECDH key; reaching RUNNING means EVE successfully fetched the new encrypt cert into pubControllerCert and decrypted the cipher block.
2. check_encrypt_cert.sh walks /run/zedagent/ControllerCert/ (or /persist/status/zedagent/ControllerCert/ on EVE versions where that pubsub is Persistent: true) and byte-matches a Type=3 entry against encrypt-new.pem. The script normalizes for adam's strings.TrimSpace before computing sha256 to match the bytes actually published.
3. Reboot survival: after both rotations and a reboot, all three apps come back RUNNING and check_encrypt_cert.sh re-confirms the latest rotated encrypt cert is still advertised.

Two rotations exercise both controllercerts.bak code paths inside EVE's MaybeSaveControllerCerts: the first rotation runs with no .bak yet, the second runs with .bak from the first rotation present.

Registered as test 24/26 in tests/workflow/smoke.tests.txt.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
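The whitespace normalization mentioned for check_encrypt_cert.sh can be illustrated in Go. This is only a sketch of the idea, not a translation of the script: the function names trimmedSHA256 and matchesPublished are hypothetical, and it assumes the only difference between the on-disk PEM and the published bytes is leading or trailing whitespace.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// trimmedSHA256 (hypothetical) hashes PEM bytes after stripping leading and
// trailing whitespace, so a trailing newline in the file on disk does not
// cause a spurious mismatch against bytes that were trimmed before publishing.
func trimmedSHA256(pem []byte) [32]byte {
	return sha256.Sum256(bytes.TrimSpace(pem))
}

// matchesPublished (hypothetical) reports whether the on-disk PEM and the
// bytes published on the device are identical once both are normalized.
func matchesPublished(onDiskPEM, published []byte) bool {
	return trimmedSHA256(onDiskPEM) == trimmedSHA256(published)
}

func main() {
	// Placeholder contents standing in for encrypt-new.pem and a Type=3 entry.
	onDisk := []byte("-----BEGIN CERTIFICATE-----\nencrypt-v2\n-----END CERTIFICATE-----\n")
	published := bytes.TrimSpace(onDisk)
	stale := []byte("-----BEGIN CERTIFICATE-----\nencrypt-v1\n-----END CERTIFICATE-----")

	fmt.Println(matchesPublished(onDisk, published)) // same cert, trailing newline ignored
	fmt.Println(matchesPublished(onDisk, stale))     // different cert
}
```

Without the trim step, the first comparison would fail purely because of the trailing newline, even though the certificate is the same.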
This was referenced May 9, 2026
uncleDecart
approved these changes
May 11, 2026
Member
uncleDecart
left a comment
LGTM. @milan-zededa, do any code paths need to be fixed regarding multiple EVE instances?
milan-zededa
approved these changes
May 11, 2026
Contributor
I believe the code is OK, |
eriknordmark
added a commit
to eriknordmark/eden
that referenced
this pull request
May 11, 2026
eriknordmark
added a commit
to lf-edge/eden
that referenced
this pull request
May 13, 2026
Summary

Adam never set EdgeDevConfig.controllercert_confighash on its config response. EVE's handleControllerCertsSha (pkg/pillar/cmd/zedagent/handlecertconfig.go:622-631) compares this field against the device's saved value and triggers a /certs refetch on mismatch, but with adam emitting the empty string, the trigger was dead. The only fast path to a /certs refetch was a signing-cert rotation that raised SenderStatusCertMiss from auth-envelope verification.

This made pure encrypt-cert rotation effectively unobservable on the device until the periodic controllerCertsTask fired (default CertInterval = 24h): the cipher decrypt path silently returns "Controller Certificate get fail" without triggering any refetch. See lf-edge/eve#5926 for the broader gap analysis.

Change
Compute base64.URLEncoding.EncodeToString(sha256(signing.pem || encrypt.pem)) and populate EdgeDevConfig.controllercert_confighash. Any rotation of either file flips the hash; EVE compares it against its saved value and schedules a refetch.

The hash is opaque to EVE (only equality matters), so hashing the raw file bytes is sufficient and avoids cert parsing. The field is set before ConfigHash is computed so a cert-chain rotation also flips ConfigHash; otherwise adam would return 304 Not Modified for any pure cert rotation that didn't otherwise change the device config and EVE would never see the new field.

The v1 API path passes an empty hash (v1 has no controller cert chain).
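The ordering requirement can be seen in a small simulation. This is a sketch under stated assumptions rather than adam's code: deviceConfig, configHash, and serve are hypothetical stand-ins for EdgeDevConfig, the ConfigHash computation, and the config handler, and the setBeforeHash flag exists only to contrast the two orderings.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// deviceConfig is a hypothetical, stripped-down stand-in for EdgeDevConfig.
type deviceConfig struct {
	Body                     string // everything else in the device config
	ControllerCertConfigHash string // stand-in for controllercert_confighash
}

// configHash stands in for the ConfigHash computation: it covers every field
// that is populated at the time it is called.
func configHash(c deviceConfig) string {
	sum := sha256.Sum256([]byte(c.Body + "\x00" + c.ControllerCertConfigHash))
	return base64.URLEncoding.EncodeToString(sum[:])
}

// serve mimics one config pull. lastSeen is the ConfigHash the device already
// holds. When setBeforeHash is false the cert hash is left out of the hashed
// config, modelling the ordering the PR avoids.
func serve(body, certHash, lastSeen string, setBeforeHash bool) (int, string) {
	cfg := deviceConfig{Body: body}
	if setBeforeHash {
		cfg.ControllerCertConfigHash = certHash
	}
	hash := configHash(cfg)
	if hash == lastSeen {
		return 304, hash // Not Modified: device never sees the new cert hash
	}
	return 200, hash
}

func main() {
	// Field set before hashing: a pure cert rotation yields a fresh 200.
	_, seen := serve("unchanged-config", "certs-v1", "", true)
	status, _ := serve("unchanged-config", "certs-v2", seen, true)
	fmt.Println(status) // 200

	// Field not covered by the hash: the same rotation is masked by a 304.
	_, seen = serve("unchanged-config", "certs-v1", "", false)
	status, _ = serve("unchanged-config", "certs-v2", seen, false)
	fmt.Println(status) // 304
}
```

In the second case the device config body is identical across both pulls, so nothing in the hashed content reflects the rotation and the device keeps getting 304 until some unrelated config change happens.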
Test plan

- go build ./... clean.
- gofmt -l pkg/server/ clean.
- go test ./pkg/server/... passes (no test files in this package today).
- ctrl_encrypt_cert_change rotating only the encrypt cert (no signing rotation). Verify a freshly-deployed --use-encrypt-cert app reaches RUNNING, which proves EVE refetched /certs via handleControllerCertsSha and decrypted with the new ECDH key.

🤖 Generated with Claude Code