Add writable-cgroups experimental plugin#269
Add writable-cgroups experimental plugin#269chrishenzie wants to merge 1 commit intocontainerd:mainfrom
Conversation
|
I'm still unsure on the name, because all the other NRI plugins have "verb-focused" names. Maybe |
Adds a new `writable-cgroups` plugin, designed to enable safe delegation of cgroup management to containers. This plugin allows containers to mount `/sys/fs/cgroup` as read-write, enabling workloads (like AI/ML frameworks) to manage their own sub-cgroups. This plugin serves as a reference implementation and test-bed for validating the `nsdelegate` security model proposed in KEP-5474 as an alternative to introducing new Kubernetes API fields. Signed-off-by: Chris Henzie <chrishenzie@gmail.com>
c8215f8 to
78fbc4d
Compare
|
|
||
| const ( | ||
| // WritableCgroupsAnnotation is the annotation key that enables writable cgroups. | ||
| WritableCgroupsAnnotation = "cgroups.noderesource.dev/writable" |
There was a problem hiding this comment.
nit: Anything against using writable-cgroups.noderesource.dev as the annotation key ?
|
|
@chrishenzie I only skimmed through it quickly yet, but it LGTM. Should we also add a |
|
I think |
| response to [KEP-5474](https://github.com/kubernetes/enhancements/issues/5474), | ||
| which originally proposed adding explicit API support for writable cgroups. | ||
|
|
||
| However, the evolving consensus is to move away from API additions and instead |
There was a problem hiding this comment.
My understanding is the current consensus is still that a user-facing API change is desired, but that writable cgroups should only be available when nsdelegate is present.
| only. It is not recommended for production use until the behavior is | ||
| standardized in upstream container runtimes. |
There was a problem hiding this comment.
I think once the behavior is standardized there's no need for this plugin either?
| if info.FSType != "cgroup2" { | ||
| return true, false | ||
| } | ||
| if info.Mountpoint == "/sys/fs/cgroup" || info.Mountpoint == "/sys/fs/cgroup/unified" { | ||
| return false, true | ||
| } | ||
| return true, false |
There was a problem hiding this comment.
I find these hard to read:
if (condition) {
// error case
}
if (condition) {
// success case
}
// error case
I'd swap the logic in the second conditional (apply De Morgan's law) to make the indented return also an error so the final line is the success case (i.e., the mount we're looking for).
if info.Mountpoint != "/sys/fs/cgroup" && info.Mountpoint != "/sys/fs/cgroup/unified"| return slices.Contains(strings.Split(mounts[0].VFSOptions, ","), "nsdelegate"), nil | ||
| } | ||
|
|
||
| // isWritableCgroupsEnabled checks if the pod has the writable cgroups annotation. |
There was a problem hiding this comment.
| // isWritableCgroupsEnabled checks if the pod has the writable cgroups annotation. | |
| // isWritableCgroupsEnabled checks if the container has the writable cgroups annotation. |
Adds a new
writable-cgroupsplugin, designed to enable safe delegation of cgroup management to containers. This plugin allows containers to mount/sys/fs/cgroupas read-write, enabling workloads (like AI/ML frameworks) to manage their own sub-cgroups.This plugin serves as a reference implementation and test-bed for validating the
nsdelegatesecurity model proposed in KEP-5474 as an alternative to introducing new Kubernetes API fields.@Divya063 @samuelkarp