Skip to content

Comments

Add writable-cgroups experimental plugin#269

Open
chrishenzie wants to merge 1 commit intocontainerd:mainfrom
chrishenzie:writable-cgroups-experiment
Open

Add writable-cgroups experimental plugin#269
chrishenzie wants to merge 1 commit intocontainerd:mainfrom
chrishenzie:writable-cgroups-experiment

Conversation

@chrishenzie
Copy link
Contributor

Adds a new writable-cgroups plugin, designed to enable safe delegation of cgroup management to containers. This plugin allows containers to mount /sys/fs/cgroup as read-write, enabling workloads (like AI/ML frameworks) to manage their own sub-cgroups.

This plugin serves as a reference implementation and test-bed for validating the nsdelegate security model proposed in KEP-5474 as an alternative to introducing new Kubernetes API fields.

@Divya063 @samuelkarp

@chrishenzie
Copy link
Contributor Author

I'm still unsure on the name, because all the other NRI plugins have "verb-focused" names. Maybe cgroup-adjuster is more appropriate?

Adds a new `writable-cgroups` plugin, designed to enable safe delegation
of cgroup management to containers. This plugin allows containers to
mount `/sys/fs/cgroup` as read-write, enabling workloads (like AI/ML
frameworks) to manage their own sub-cgroups.

This plugin serves as a reference implementation and test-bed for
validating the `nsdelegate` security model proposed in KEP-5474 as an
alternative to introducing new Kubernetes API fields.

Signed-off-by: Chris Henzie <chrishenzie@gmail.com>
@chrishenzie chrishenzie force-pushed the writable-cgroups-experiment branch from c8215f8 to 78fbc4d Compare February 5, 2026 03:01

const (
// WritableCgroupsAnnotation is the annotation key that enables writable cgroups.
WritableCgroupsAnnotation = "cgroups.noderesource.dev/writable"
Copy link
Member

@klihub klihub Feb 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Anything against using writable-cgroups.noderesource.dev as the annotation key ?

@klihub
Copy link
Member

klihub commented Feb 5, 2026

I'm still unsure on the name, because all the other NRI plugins have "verb-focused" names. Maybe cgroup-adjuster is more appropriate?

cgroup-adjuster sounds very generic to me compared to what the plugin does, implying more adjustments to containers' cgroups rather than their cgroup mounts. Maybe cgroup-mount-adjuster ?

@klihub
Copy link
Member

klihub commented Feb 5, 2026

@chrishenzie I only skimmed through it quickly yet, but it LGTM. Should we also add a contrib/kustomize/writable-cgroups (or whatever we end up calling this) like we have for the other plugins ?

@samuelkarp
Copy link
Member

I think writable-cgroups is a fine name. We don't have to have everything named -adjuster.

response to [KEP-5474](https://github.com/kubernetes/enhancements/issues/5474),
which originally proposed adding explicit API support for writable cgroups.

However, the evolving consensus is to move away from API additions and instead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is the current consensus is still that a user-facing API change is desired, but that writable cgroups should only be available when nsdelegate is present.

Comment on lines +93 to +94
only. It is not recommended for production use until the behavior is
standardized in upstream container runtimes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think once the behavior is standardized there's no need for this plugin either?

Comment on lines +156 to +162
if info.FSType != "cgroup2" {
return true, false
}
if info.Mountpoint == "/sys/fs/cgroup" || info.Mountpoint == "/sys/fs/cgroup/unified" {
return false, true
}
return true, false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find these hard to read:

if (condition) {
	// error case
}
if (condition) {
	// success case
}
// error case

I'd swap the logic in the second conditional (apply De Morgan's law) to make the indented return also an error so the final line is the success case (i.e., the mount we're looking for).

if info.Mountpoint != "/sys/fs/cgroup" && info.Mountpoint != "/sys/fs/cgroup/unified"

return slices.Contains(strings.Split(mounts[0].VFSOptions, ","), "nsdelegate"), nil
}

// isWritableCgroupsEnabled checks if the pod has the writable cgroups annotation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// isWritableCgroupsEnabled checks if the pod has the writable cgroups annotation.
// isWritableCgroupsEnabled checks if the container has the writable cgroups annotation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants