sig-node: Kubelet-in-UserNS, aka Rootless mode #1371

Merged
merged 1 commit into from
May 24, 2021

AkihiroSuda
Member

AkihiroSuda commented Nov 18, 2019

Allow running all Kubernetes components (kubelet, CRI, OCI, CNI, and all kube-*) as a non-root user on the host, by using a user namespace and cgroup v2 (#1370).
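
For readers unfamiliar with the mechanism, here is a minimal sketch (not taken from the KEP or the Kubernetes PR) of how a process might detect that it is running inside a user namespace, by inspecting `/proc/self/uid_map`. The function names `parse_id_map` and `running_in_userns` are hypothetical, and treating an empty map as "not in a user namespace" is an assumption made for this illustration.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside_id, outside_id, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        entries.append(tuple(int(f) for f in fields))
    return entries


def running_in_userns(uid_map_text):
    """Return True if the given uid_map contents indicate a non-initial user namespace.

    Assumption for this sketch: the initial namespace shows a single identity
    mapping covering the whole 32-bit range ("0 0 4294967295"); an empty map is
    treated conservatively as "not in a user namespace".
    """
    entries = parse_id_map(uid_map_text)
    if not entries:
        return False
    return entries != [(0, 0, 4294967295)]


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as f:
            print("in user namespace:", running_in_userns(f.read()))
    except OSError:
        print("uid_map not available on this system")
```

A rootless setup typically maps UID 0 inside the namespace to the unprivileged user on the host, plus a subordinate UID range, which is why anything other than the full identity mapping is treated as "inside a user namespace" here.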

Rootless mode has already been adopted by k3s.

Also, kind already supports Rootless Docker/Podman with unmodified Kubernetes, but it relies on a very dirty hack to avoid sysctl errors, so this KEP still needs to be accepted.

Replaces #1084


POC: https://github.com/rootless-containers/usernetes
Kubernetes PR: kubernetes/kubernetes#92863
Tracking issue: #2033

@k8s-ci-robot added the `cncf-cla: yes` label Nov 18, 2019
@k8s-ci-robot
Contributor

Hi @AkihiroSuda. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the `needs-ok-to-test` and `size/L` labels Nov 18, 2019
@k8s-ci-robot added the `kind/kep` and `sig/node` labels Nov 18, 2019
@AkihiroSuda
Member Author

@BenTheElder
Member

/ok-to-test

@k8s-ci-robot added the `ok-to-test` label and removed the `needs-ok-to-test` label Nov 18, 2019
@AkihiroSuda force-pushed the rootless2 branch 3 times, most recently from 474b798 to 92d0c3d, November 18, 2019 21:17
@dims
Member

dims commented Nov 25, 2019

/assign @dchen1107 @derekwaynecarr

@AkihiroSuda
Member Author

ping

@dims
Member

dims commented Jan 27, 2020

@AkihiroSuda i've added this to the weekly sig-node meeting. let's see if we can get some eyes there

@AkihiroSuda
Member Author

Thanks, but I'm not likely to be able to attend the meeting due to the timezone, sorry.

@dims
Member

dims commented Jan 27, 2020

@AkihiroSuda no worries, i will raise it on your behalf :)

@giuseppe
Member

isn't it blocked on #1370?

@AkihiroSuda
Member Author

> isn't it blocked on #1370?

Yes, with respect to the cgroup part, but can the other parts be discussed and merged in the meantime?
(Anyway, #1370 seems ready to be merged?)

@derekwaynecarr
Member

I think we should tackle cgroups v2 first which has a related KEP.

I would prefer that we track rootless behaviors in the cgroups v2 KEP, which still needs further iteration.

@AkihiroSuda force-pushed the rootless2 branch 4 times, most recently from eedcda6 to 6a14171, May 14, 2021 04:47
@AkihiroSuda
Member Author

Updated PR to address comments, thanks all for reviewing!

@utkarsh2102 left a comment

Hi, I was just going through this and found some super-trivial things that you might be interested in? 😅

keps/sig-node/2033-rootless/README.md (4 outdated, resolved review threads)
Member

@thockin left a comment

This seems like an incredibly expensive way (from a network POV) to get rid of root-components and the resulting semantics seem dodgy (those rlimits are configured for a reason).

That said, the request, as far as I can see, is pretty non-invasive, so who am I to tell you what (not) to do?

keps/sig-node/2033-rootless/README.md (outdated, resolved review thread)
know that this has succeeded?
-->

- Allow `kubelet` and `kube-proxy` to be executed inside user namespaces created by a non-root user. See ["Required changes to Kubernetes"](#required-changes-to-kubernetes).
Member

... windows has these 'kernel silos' or something like that, need @jsturtevant to weigh in here, on what this would mean for hostProcess containers....

@thockin
Member

thockin commented May 18, 2021 via email

@AkihiroSuda
Member Author

> This seems like an incredibly expensive way (from a network POV) to get rid of root-components and the resulting semantics seem dodgy (those rlimits are configured for a reason).

There is an experiment to remove the slirp overhead by using SECCOMP_IOCTL_NOTIF_ADDFD (kernel 5.9): https://github.com/rootless-containers/bypass4netns
It is also possible to use LXC's setuid binary to get a native veth device.

@derekwaynecarr
Member

thank you @thockin @gnufied @jsafrane for sharing your insights as well.

it looks like there is no disagreement on exploring further iteration of the concept with the caveats enumerated for kubelet and kube-proxy, but definitely we have a lot to iron out as this evolves.

/approve
/lgtm

@k8s-ci-robot added the `lgtm` label May 19, 2021
@dims
Member

dims commented May 20, 2021

@ehashman please approve!

Member

@ehashman left a comment

Feature gate has been added, addressing the last of my PRR comments.

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AkihiroSuda, derekwaynecarr, ehashman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the `approved` label May 20, 2021
@AkihiroSuda
Member Author

@ehashman Thanks for the approval. Can we get this merged?

SIG Release approved this for the v1.22 milestone, with a merge deadline of May 25.
https://groups.google.com/g/kubernetes-sig-release/c/eggXiBwlzw8/m/ODnL8DYbAQAJ

@dims
Member

dims commented May 24, 2021

/hold cancel

@k8s-ci-robot removed the `do-not-merge/hold` label May 24, 2021
@k8s-ci-robot merged commit f0df4e3 into kubernetes:master May 24, 2021
@AkihiroSuda
Member Author

Thanks @dims!

Labels
`approved`, `cncf-cla: yes`, `kind/kep`, `lgtm`, `ok-to-test`, `sig/architecture`, `sig/node`, `size/XL`