docs(tpmmgr): expand architecture document #5934

Draft
eriknordmark wants to merge 1 commit into lf-edge:master from eriknordmark:tpmmgr-architecture-doc

Conversation

@eriknordmark
Contributor

Description

Replaces the 30-line stub at pkg/pillar/docs/tpmmgr.md with a full
architecture document covering:

  • tpmmgr's dual shape — a single-shot CLI invoked from the device
    boot scripts (createDeviceCert, createCerts, saveTpmInfo,
    diagnostic print* / test* subcommands) plus a long-running
    service started under zedbox.
  • pubsub I/O — EdgeNodeCert (persistent, ECDH/quote/EK certs with
    the EK's TPM2B_PUBLIC attached as metadata), AttestQuote
    (signed quote + PCRs 0–23), and TpmSanityStatus (consumed by
    nodeagent to drive MaintenanceModeReasonTpmEncFailure /
    TpmQuoteFailure).
  • the well-known TPM handles and NV indices the agent uses
    (EK/SRK/AIK/quote/ECDH/device key persistent handles, device-cert
    NV index, credentials NV index).
  • the CLI subcommand catalog with each boot-script caller, and the
    TPM-rooted vs soft cert-creation paths.
  • the periodic 1-hour TPM sanity check (encrypt/decrypt round trip
    + quote) that catches latent TPM failures before the next baseos
    upgrade or attestation cycle.
  • four control-flow paths — first-boot provisioning, long-running
    startup, attestation quote, and periodic sanity check.
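
For reviewers who want a mental model of the sanity check listed above, here is a minimal Go sketch. It is illustrative only and is not the code in pkg/pillar/cmd/tpmmgr: `SanityStatus`, `checkFn`, `runSanityCheck`, and `errSimulated` are hypothetical names, and the two probes are plain functions so the example runs without a TPM. The real `TpmSanityStatus` fields may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a TPM error; hypothetical, for this sketch only.
var errSimulated = errors.New("simulated TPM error")

// SanityStatus is a hypothetical stand-in for the TpmSanityStatus record;
// the real field names in pillar may differ.
type SanityStatus struct {
	EncFailed   bool
	QuoteFailed bool
	Detail      string
}

// checkFn abstracts one TPM probe so the sketch runs without hardware.
type checkFn func() error

// runSanityCheck runs both probes and records which of them failed.
// Both probes always run, so one failure does not hide the other.
func runSanityCheck(encRoundTrip, quote checkFn) SanityStatus {
	var st SanityStatus
	if err := encRoundTrip(); err != nil {
		st.EncFailed = true
		st.Detail = "encrypt/decrypt: " + err.Error()
	}
	if err := quote(); err != nil {
		st.QuoteFailed = true
		if st.Detail != "" {
			st.Detail += "; "
		}
		st.Detail += "quote: " + err.Error()
	}
	return st
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errSimulated }

	// Healthy TPM: neither flag is set.
	fmt.Printf("%+v\n", runSanityCheck(ok, ok))
	// Quote probe fails: only QuoteFailed is set.
	fmt.Printf("%+v\n", runSanityCheck(ok, bad))
}
```

In a long-running service, a function like this would presumably be driven by a one-hour `time.Ticker` and its result published for nodeagent to consume; that wiring is left out to keep the sketch short.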

References from the original doc are carried forward in a "Further
reading" section: trustedcomputinggroup.org, the LF Edge "Device
Identity, Onboarding, Security Foundation" and "Device Identity
rooted at TPM" wiki pages, and a link to
https://github.com/google/go-tpm.

The doc is structured to mirror nodeagent.md and baseosmgr.md
so the pillar docs remain consistent across microservices, as part
of the ongoing effort to give every pillar agent an architecture
doc and unit-test suite.

How to test and validate this PR

Docs-only change. Validation is a markdown review:

  • Render pkg/pillar/docs/tpmmgr.md (e.g. on github.com) and
    spot-check formatting (tables, fenced code blocks, links).
  • Cross-reference the "Components" section against
    pkg/pillar/cmd/tpmmgr/tpmmgr.go,
    pkg/pillar/scripts/device-steps.sh, and
    pkg/pillar/evetpm/*.go for technical accuracy.
  • Confirm the well-known TPM handles and NV indices listed match
    pkg/pillar/evetpm/tpm.go.
  • Confirm the CLI subcommand catalog matches the runCommand switch
    in cmd/tpmmgr/tpmmgr.go.

No code changes; no automated test required.

Changelog notes

No user-facing changes.

PR Backports

Docs-only refactor, no need to backport.

  • 16.0-stable: No, docs-only refactor.
  • 14.5-stable: No, docs-only refactor.
  • 13.4-stable: No, docs-only refactor.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device — N/A, docs only
  • I've tested my PR on arm64 device — N/A, docs only
  • I've written the test verification instructions
  • I've set the proper labels to this PR
  • I've checked the boxes above, or I've provided a good reason why I didn't check them

Replace the 30-line stub with a full architecture doc covering the
dual single-shot CLI / long-running service shape, pubsub I/O
(EdgeNodeCert, AttestQuote, TpmSanityStatus), the well-known TPM
handles and NV indices, the CLI subcommand catalog with each
boot-script caller, TPM-rooted vs soft cert creation, attestation
quote handling, the periodic TPM sanity check that drives
nodeagent's MaintenanceModeReasonTpmEncFailure / TpmQuoteFailure
paths, and the four control-flow paths through the agent. The
debugging section covers the canonical pubsub records, on-disk
file locations, the diagnostic CLI subcommands, and how to force
each transition.

Preserves the references the original doc carried
(trustedcomputinggroup.org, the LF Edge "Device Identity" wiki
pages, https://github.com/google/go-tpm, and the
/persist/newlog/devUpload log location).

Structured to mirror nodeagent.md and baseosmgr.md so the pillar
docs remain consistent across microservices.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: eriknordmark <erik@zededa.com>
eriknordmark force-pushed the tpmmgr-architecture-doc branch from 3bf903b to e29d62e on May 20, 2026 14:12