runtime-tools: log container adjustments. #268

Draft
klihub wants to merge 2 commits into containerd:main from
klihub:devel/audit-logging

@klihub
Member

@klihub klihub commented Feb 3, 2026

This PR implements NRI audit logging for OCI Spec adjustments, which has been identified as one of the missing things we need to add (be)for(e) a v1.0. This patch

  • adds an extended generator option for runtimes to set an external audit event logger, and
  • updates the generator to use the configured logger for audit messages when an OCI Spec is adjusted
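
A minimal sketch of how such an option could look, assuming a functional-option style. The names `LogFn`, `WithLogger`, `NewGenerator`, `AddEnv` and the message text are hypothetical and are not taken from the patch itself:

```go
package main

import "fmt"

// Fields carries structured key/value details for a logged message.
type Fields = map[string]any

// LogFn is a hypothetical logger callback supplied by the runtime.
type LogFn func(event string, fields Fields)

// Generator is a simplified stand-in for the OCI Spec generator wrapper.
type Generator struct {
	log LogFn
}

// Option is a hypothetical functional option for the generator.
type Option func(*Generator)

// WithLogger sets the external audit logger (hypothetical name).
func WithLogger(fn LogFn) Option {
	return func(g *Generator) { g.log = fn }
}

// NewGenerator applies the given options; without a logger, messages are discarded.
func NewGenerator(opts ...Option) *Generator {
	g := &Generator{log: func(string, Fields) {}}
	for _, o := range opts {
		o(g)
	}
	return g
}

// AddEnv stands in for one adjustment and logs it through the configured logger.
func (g *Generator) AddEnv(plugin, name string) {
	g.log("add process env", Fields{"plugin": plugin, "value.name": name})
}

func main() {
	g := NewGenerator(WithLogger(func(event string, f Fields) {
		fmt.Println(event, f["plugin"], f["value.name"])
	}))
	g.AddEnv("example-plugin", "FOO")
}
```

With this shape the runtime decides where messages end up, and a generator created without the option simply drops them.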

Here are updated trees for containerd and CRI-O:

@klihub klihub force-pushed the devel/audit-logging branch 2 times, most recently from a6cac4e to c0863cc Compare February 4, 2026 07:09
@klihub klihub changed the title from "runtime-tools: emit audit events for OCI Spec mutation." to "runtime-tools: emit audit log messages for OCI Spec mutation." Feb 4, 2026
@klihub klihub changed the title from "runtime-tools: emit audit log messages for OCI Spec mutation." to "runtime-tools: emit audit log messages for adjustments." Feb 4, 2026
@klihub klihub force-pushed the devel/audit-logging branch from c0863cc to 9953d06 Compare February 4, 2026 15:39
@klihub klihub marked this pull request as ready for review February 4, 2026 17:11
@klihub
Member Author

klihub commented Feb 4, 2026

@mikebrow @samuelkarp PTAL

Member

@mikebrow mikebrow left a comment

Cool. Maybe add a little more detail to the short explanation in the README.md, and perhaps links to the containerd/CRI-O use of these opts (where the integration test happens).

@klihub
Member Author

klihub commented Feb 6, 2026

@samuelkarp I have a few questions.

  • Related to the approach taken here, is this roughly what you had in mind?
  • Related to event details, how detailed should the logged events be, and do we want to log them unconditionally? The PR now logs detailed events unconditionally, except in a few extreme cases where details could get really verbose. But should this be configurable, or should details be logged at a different logging level?
  • About the logged events/messages: the main messages are now exported consts, with the idea that someone might want to build tooling where it could come in handy to have them exported. But I don't know if this really makes sense. Any thoughts?

@chrishenzie
Contributor

Thanks for jumping on this, @klihub.

You raised some important questions in your comment that I think we should probably settle on before finalizing the implementation. Since o11y is a key requirement for GA, could we open a GitHub issue to agree on the specific design goals and requirements first?

We can treat this PR as a PoC to inform that discussion, but I'd feel more comfortable if we aligned on the "what" and "why" in a design issue before we iterate further on the "how" here.

@klihub
Member Author

klihub commented Feb 6, 2026

could we open a GitHub issue to agree on the specific design goals and requirements first?

I created issue #270 for that.

@klihub klihub marked this pull request as draft February 6, 2026 19:01
@klihub klihub force-pushed the devel/audit-logging branch 4 times, most recently from 95bd1fb to ef60131 Compare February 19, 2026 06:24
@klihub klihub changed the title from "runtime-tools: emit audit log messages for adjustments." to "runtime-tools: log container adjustments." Feb 19, 2026
}

- func (f *FieldOwners) compoundOwnerMap(field int32) (map[string]string, bool) {
+ func (f *FieldOwners) CompoundOwnerMap(field int32) (map[string]string, bool) {
Contributor

I don't think this method is used in the generator so it could remain unexported.

}

// Fields can be used to pass extra information for logged messages.
type Fields = map[string]any
Contributor

If this is a map, how do we ensure deterministic ordering of logged fields? Is that an exercise left for the Logger?

Member Author

Yes. This is (expected to be) ultimately passed on to slog, logrus, or some other structured logger, which then deals with it as it sees fit. This is no different from how you would expect (or not expect) a log.G(ctx).WithFields(Fields{...}) in containerd core code to result in deterministic logging order.
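
As an illustration of leaving ordering to the logger, here is a sketch of a logger function that sorts the keys of a `Fields` map before handing them to `log/slog`. The helper names `sortedAttrs` and `render`, and the message text, are assumptions made for this example only; the timestamp is dropped so repeated calls can be compared:

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"sort"
)

// Fields mirrors the map-based alias under discussion.
type Fields = map[string]any

// sortedAttrs converts Fields to slog attributes in ascending key order,
// so the emitted line does not depend on Go's random map iteration order.
func sortedAttrs(f Fields) []any {
	keys := make([]string, 0, len(f))
	for k := range f {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	attrs := make([]any, 0, len(keys))
	for _, k := range keys {
		attrs = append(attrs, slog.Any(k, f[k]))
	}
	return attrs
}

// render logs one message through a text handler and returns the resulting line.
func render(msg string, f Fields) string {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{
		// Drop the top-level time attribute to keep the output reproducible.
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{}
			}
			return a
		},
	})
	slog.New(h).Info(msg, sortedAttrs(f)...)
	return buf.String()
}

func main() {
	fmt.Print(render("add process env", Fields{"value.name": "FOO", "plugin": "example"}))
}
```

Whether sorting is worth the cost is a choice for each logger; a JSON consumer, for instance, may not care about key order at all.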

Member

It's worth considering here whether we want a map or to have named fields. A lot of the lines in this diff use the same names (value.name, value.value, nri.plugin).

Member Author

The repetitive logging of value.name and value.value in most cases is due to most of the adjusted fields having scalar values.

By 'named fields', do you mean a struct whose fields are the 'union' of all possible fields (IOW, what are now keys) that we log for various adjustments, and then setting those fields in the struct as opposed to key/value pairs in a map?

Wouldn't that be a bit unnatural, as the provided logger function is just acting as a postman towards a structured logger? And wouldn't it also force the logger function to interpret the struct, so it can pick only the fields that are set and pass them on to a structured logger?

if key, marked := m.IsMarkedForRemoval(); !marked {
	g.log(AuditAddProcessEnv,
		Fields{
			"value.name": m.Key,
Contributor

It depends on the Logger implementation, but this value. prefix will produce longer lines. Is the idea here for the Logger to parse this prefix in order to potentially handle these fields a certain way?

Member Author

No, the idea is to avoid conflicting with some existing key in an outer logging context. In this particular instance, to avoid a known conflict in CRI-O, which uses the name key to put the container name in the logging context, so that every logged message has (among others) also the name of the associated container. There would certainly be other ways to try avoiding it, and it would also be possible to not log adjustments this way but as a single value formatted with %v. The approach taken here is just one possibility.

Member Author

But nri.plugin could really just be plugin everywhere. I really don't think that could conflict with anything already in the context.

Member

The Logger can also add prefixes or adjust as necessary, especially if we have well-known fields here. I don't think we need to think too hard about collision. The Logger could unconditionally prefix nri. on every Field key.
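
A sketch of what unconditional prefixing on the logger side might look like. The helper `withPrefix` is hypothetical, and the unprefixed keys `plugin` and `name` only illustrate the idea of shorter well-known field names:

```go
package main

import "fmt"

// Fields mirrors the map-based alias under discussion.
type Fields = map[string]any

// withPrefix returns a copy of f with prefix prepended to every key,
// leaving the caller's map untouched.
func withPrefix(prefix string, f Fields) Fields {
	out := make(Fields, len(f))
	for k, v := range f {
		out[prefix+k] = v
	}
	return out
}

func main() {
	in := Fields{"plugin": "example-plugin", "name": "FOO"}
	out := withPrefix("nri.", in)
	fmt.Println(out["nri.plugin"], out["nri.name"])
}
```

A runtime whose logging context already uses a `name` key could apply this in its logger function, so the generator itself would not need to worry about collisions.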


// CreateContainer relays the corresponding CRI request to plugins.
- func (r *Adaptation) CreateContainer(ctx context.Context, req *CreateContainerRequest) (*CreateContainerResponse, error) {
+ func (r *Adaptation) CreateContainer(ctx context.Context, req *CreateContainerRequest) (*CreateContainerResponse, *api.OwningPlugins, error) {
Contributor

Can you provide more context why we need to return the OwningPlugins?

Member Author

The approach taken by this PR is to log adjustments to the OCI Spec as/when they are made.

Therefore we need to pass ownership information along with the actual adjustment results, so we can eventually shove it in our runtime-tools/generator wrapper which then will use this new information to pull out the responsible/source plugin when it logs OCI Spec changes as they are being made.

With everything else staying the same (so still the same approach, by and large), one alternative to this detail would be to add a new OwningPlugins owners = 4; to the CreateContainerResponse message in api.proto, document that it is for the runtime, with anything filled in on the stub/plugin side simply getting ignored, and then, instead of returning it separately here, just set that new response.Owner field to it and let it travel in the response alongside the rest of the data.
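
To make the two shapes easier to compare, here is a simplified sketch. The types below are stand-ins rather than the real NRI API types, and the field layout, the `env.FOO` key, and the function names are assumptions for illustration:

```go
package main

import "fmt"

// OwningPlugins is a simplified stand-in for api.OwningPlugins: it maps an
// adjusted field to the plugin that set it (hypothetical shape).
type OwningPlugins struct {
	Owners map[string]string
}

// CreateContainerResponse is a simplified stand-in for the real response.
type CreateContainerResponse struct {
	Env    map[string]string
	Owners *OwningPlugins // only used by the embedded alternative
}

// createSeparate mirrors the approach in this PR: ownership information is
// returned next to the response.
func createSeparate() (*CreateContainerResponse, *OwningPlugins, error) {
	owners := &OwningPlugins{Owners: map[string]string{"env.FOO": "example-plugin"}}
	return &CreateContainerResponse{Env: map[string]string{"FOO": "bar"}}, owners, nil
}

// createEmbedded mirrors the alternative: ownership travels inside the response.
func createEmbedded() (*CreateContainerResponse, error) {
	rpl, owners, err := createSeparate()
	if err != nil {
		return nil, err
	}
	rpl.Owners = owners
	return rpl, nil
}

func main() {
	_, owners, _ := createSeparate()
	fmt.Println(owners.Owners["env.FOO"])

	rpl, _ := createEmbedded()
	fmt.Println(rpl.Owners.Owners["env.FOO"])
}
```

The embedded variant keeps the Go signature unchanged at the cost of a response field that plugins can see but that is meaningful only to the runtime.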

Add an option for setting an external audit event logger
and use any configured logger to emit audit events as we
adjust the OCI Spec.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>

Expose owning plugins for adjustments returned by CreateContainer.
Include plugin in errors which originate from processing a request
by a plugin.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/audit-logging branch from ef60131 to 6e1b91b Compare February 23, 2026 18:15

g.log(AuditAddProcessEnv,
	Fields{
		"value.name": m.Key,
		"value.value": "<value omitted>",
Member

Suggested change
- "value.value": "<value omitted>",
+ "value.value": "<removed>",

might be more clear.

Though <removed> (or any string we pick here) could be an actual value. Might be worth making sure the value name has a - prefix?

Member Author

@klihub klihub Feb 27, 2026

The "value omitted" content is an attempt to indicate that we do not log the values of environment variables. @chrishenzie was concerned about possibly logging secrets in the environment, so I changed it to this.

But my impression is that your comment is about a different context: when we remove environment variables. Those we log with a different message, and with only the variable name included. Or did I misinterpret your comment?
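
The distinction can be sketched as two small helpers. The message constants and helper names here are hypothetical; only the `<value omitted>` placeholder and the `value.name`/`value.value` keys come from the diff:

```go
package main

import "fmt"

// Fields mirrors the map-based alias under discussion.
type Fields = map[string]any

const (
	msgAddEnv    = "add process env"    // hypothetical message text
	msgRemoveEnv = "remove process env" // hypothetical message text
	valueOmitted = "<value omitted>"    // placeholder used in the diff
)

// envAdded describes an added variable. The value is accepted but never
// logged, since environment variables may contain secrets.
func envAdded(plugin, name, value string) (string, Fields) {
	_ = value // deliberately unused
	return msgAddEnv, Fields{
		"plugin":      plugin,
		"value.name":  name,
		"value.value": valueOmitted,
	}
}

// envRemoved describes a removed variable: a different message, name only.
func envRemoved(plugin, name string) (string, Fields) {
	return msgRemoveEnv, Fields{
		"plugin":     plugin,
		"value.name": name,
	}
}

func main() {
	msg, f := envAdded("example-plugin", "TOKEN", "s3cr3t")
	fmt.Println(msg, f["value.name"], f["value.value"])

	msg, f = envRemoved("example-plugin", "TOKEN")
	fmt.Println(msg, f["value.name"])
}
```

Because removal uses its own message and carries no value field, a placeholder string cannot be mistaken for a removal marker in this layout.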

4 participants