fix(kube): persist k3s node password across reboots #5970
Open
naiming-zededa wants to merge 1 commit into
16606d8 to
56e0e9c
Compare
cshari-zededa
approved these changes
May 18, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##           master    #5970      +/-   ##
==========================================
+ Coverage   20.64%   21.05%   +0.41%
==========================================
  Files         489      499      +10
  Lines       90431    92129    +1698
==========================================
+ Hits        18667    19399     +732
- Misses      70187    70972     +785
- Partials     1577     1758     +181
56e0e9c to
90aebbf
Compare
# password rides along in the cluster -> single-node snapshot for free
# (no separate snapshot helper needed).
#
# NOTE: this is a green-field fix. Devices that have already booted under
You need to update these comments since I believe you fixed the brown field installs too.
Contributor
Author
Sure. Updated.
zedi-pramodh
approved these changes
May 18, 2026
zedi-pramodh
left a comment
LGTM, just address the comments.
The k3s node password lives on a tmpfs overlay and is regenerated on every reboot, causing NodePasswordValidationFailed errors against the server-side etcd secret and potentially leaving the node stuck NotReady.
Persist the password to /var/lib/k3s-node-password (inside the TPM-sealed vault) so it survives reboots. Restore it before k3s starts; save it inside check_start_k3s immediately after k3s launches, covering both first-init and restart paths.
Also handle the brownfield case: if there is no k3s-node-password in the persistent /var/lib, the node flags this and deletes its own node-password secret.
Signed-off-by: naiming-zededa <naiming@zededa.com>
90aebbf to
ca89af2
Compare
Description
The k3s node password lives on a tmpfs overlay and is regenerated on
every reboot, causing NodePasswordValidationFailed errors against the
server-side etcd secret and potentially leaving the node stuck
NotReady.
Persist the password to /var/lib/k3s-node-password (inside the
TPM-sealed vault) so it survives reboots. Restore it before k3s
starts; save it inside check_start_k3s immediately after k3s launches,
covering both first-init and restart paths.
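The restore-then-save flow described here could look roughly like the following. This is a minimal sketch, not the code in this PR: the function names are hypothetical, and the live password location is assumed to be the upstream k3s default `/etc/rancher/node/password`. Only `/var/lib/k3s-node-password` and the ordering (restore before k3s starts, save after it launches) come from the description.

```shell
#!/bin/sh
# Hypothetical helpers; names and the LIVE_PW default are assumptions.
LIVE_PW="${LIVE_PW:-/etc/rancher/node/password}"    # assumed k3s default location
SAVED_PW="${SAVED_PW:-/var/lib/k3s-node-password}"  # persistent copy named in this PR

# Copy the persisted password into place; meant to run before k3s starts.
restore_node_password() {
    [ -s "$SAVED_PW" ] || return 0       # nothing saved yet, e.g. first boot
    mkdir -p "$(dirname "$LIVE_PW")"
    cp -p "$SAVED_PW" "$LIVE_PW"
}

# Persist the password k3s is using; meant to run after k3s has launched.
save_node_password() {
    [ -s "$LIVE_PW" ] || return 1        # k3s has not written the file yet
    # Skip the write when the saved copy is already identical.
    [ -f "$SAVED_PW" ] && cmp -s "$LIVE_PW" "$SAVED_PW" && return 0
    tmp="$SAVED_PW.tmp.$$"
    # Write to a temporary file first so a crash cannot leave a partial copy.
    cp -p "$LIVE_PW" "$tmp" && chmod 600 "$tmp" && mv -f "$tmp" "$SAVED_PW"
}
```

Calling `save_node_password` from the same place on both first-init and restart keeps the two paths identical, which matches the intent stated above.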
Also handle the brownfield case: if there is no k3s-node-password
in the persistent /var/lib, the node flags this and deletes its own
node-password secret.
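A rough sketch of the brownfield check follows, under stated assumptions: the function names and the `KUBECTL` override are hypothetical, and the secret name follows the upstream k3s convention of `<node-name>.node-password.k3s` in the `kube-system` namespace. The PR's actual detection and flagging logic may differ.

```shell
#!/bin/sh
# Hypothetical helpers; only the persisted path comes from the PR description.
SAVED_PW="${SAVED_PW:-/var/lib/k3s-node-password}"
KUBECTL="${KUBECTL:-kubectl}"   # single command name; override for testing

# Exit 0 when no persisted password exists, i.e. the node booted
# before this fix was installed.
is_brownfield_node() {
    [ ! -s "$SAVED_PW" ]
}

# Delete this node's server-side password secret so the server can accept
# a freshly registered password (assumed k3s naming convention).
delete_own_node_password_secret() {
    node="$1"
    "$KUBECTL" -n kube-system delete secret "${node}.node-password.k3s" --ignore-not-found
}
```

A caller might run `is_brownfield_node && delete_own_node_password_secret "<node-name>"`, where `<node-name>` is a placeholder for the node's own name.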
PR dependencies
How to test and validate this PR
With this patch in eve-k, start from a device in single-node mode and convert it to be part of a cluster.
After it has joined the cluster, restart the device and run kubectl describe node;
the events should not include NodePasswordValidationFailed.
Also test the brownfield case on an existing eve-k cluster.
After the first image upgrade with this patch, the node-password is removed from
the cluster; it takes a second reboot or a k3s restart for the fix to take effect.
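The event check in the steps above can be scripted. The helper below is a hypothetical convenience, not part of the PR: it reads `kubectl describe node` output on stdin and succeeds only when no NodePasswordValidationFailed event appears.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the describe output on stdin
# contains no NodePasswordValidationFailed event.
node_events_clean() {
    ! grep -q 'NodePasswordValidationFailed'
}

# Example usage (replace <node-name> with the node under test):
#   kubectl describe node <node-name> | node_events_clean && echo "no password validation failures"
```

For the brownfield case, the same check is only meaningful after the second reboot or k3s restart, since the first boot with the patch is when the old secret is removed.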
Changelog notes
fix(kube): persist k3s node password across reboots
PR Backports
Checklist