-
Notifications
You must be signed in to change notification settings - Fork 195
CC | Run the current tests offloading the image to the guest without the forked containerd #5774
base: CCv0
Are you sure you want to change the base?
Conversation
/fidencio-test |
6667a20
to
0cb184e
Compare
/fidencio-test |
0cb184e
to
e57b089
Compare
/fidencio-test |
.ci/install_nydus_snapshotter.sh
Outdated
install -D -m 755 "bin/containerd-nydus-grpc" "${nydus_snapshotter_binary_target_dir}/containerd-nydus-grpc" | ||
install -D -m 755 "bin/nydus-overlayfs" "${nydus_snapshotter_binary_target_dir}/nydus-overlayfs" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should add sudo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definity should! :-)
e57b089
to
4c6b741
Compare
/fidencio-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In f8cdfd2#diff-9ed0363d00e4ed02d42e6994eed9e2f806052abc3cd08d66bbb64d5c211f25a1, since it includes both host sharing and guest pulling tests, the snapshotter configuration file requires switching. This entails deleting the image remove_test_image
used for the test and configuring the snapshotter configure_nydus_snapshotter
. However, the test cases here solely involve guest pulling, a better approach would be to delete the test image before running the test scripts, followed by restarting the snapshotter process. The same operation can be performed after the test concludes, rather than repeating the deletion and configuration for each testcase. However, I believe it's fine to do in every test case. So we may need to add the related functions to integration/confidential/lib.sh
.
@@ -31,7 +31,17 @@ RUNTIMECLASS="${RUNTIMECLASS:-kata}" | |||
test_tag="[cc][agent][kubernetes][containerd]" | |||
|
|||
setup() { | |||
setup_common | |||
remove_test_image "$image_unsigned_protected" || true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we should remove all images in setup()
like this:
remove_test_image "$image_cosigned" || true
remove_test_image "$image_cosigned_other" || true
remove_test_image "$image_simple_signed" || true
remove_test_image "$image_signed_protected_other" || true
remove_test_image "$image_unsigned_protected" || true
remove_test_image "$image_unsigned_unprotected" || true
remove_test_image "$image_authenticated" || true
@@ -152,4 +162,7 @@ setup() { | |||
|
|||
teardown() { | |||
teardown_common | |||
remove_test_image "$image_unsigned_protected" || true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we should remove all images in teardown()
like this:
remove_test_image "$image_cosigned" || true
remove_test_image "$image_cosigned_other" || true
remove_test_image "$image_simple_signed" || true
remove_test_image "$image_signed_protected_other" || true
remove_test_image "$image_unsigned_protected" || true
remove_test_image "$image_unsigned_unprotected" || true
remove_test_image "$image_authenticated" || true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with the CI job only pulling the image on the guest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An image is associated with a snapshotter. Therefore, prior to utilizing the snapshotter, it's necessary to remove all tested images. Otherwise, the image pull request will not be directed to nydus-snapshotter, as containerd interprets the image as already available. So we should remove all tested images before running the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we neglect to remove the images, an error will occur when creating the container:
Warning FailedCreatePodSandBox 3s kubelet Failed to create pod sandbox:
rpc error: code = NotFound desc = failed to create containerd container: error unpacking image:
failed to extract layer sha256:1021ef88c7974bfff89c5a0ec4fd3160daac6c48a075f74cff721f85dd104e68:
failed to get reader from content store: content digest sha256:fbe1a72f5dcd08ba4ca3ce3468c742786c1f6578c1f6bb401be1c4620d6ff705: not found
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we neglect to remove the images, an error will occur when creating the container:
This is a huge drawback, IMHO.
It forces users to cleanup all the images on their systems before using nydus, making this approach only okay for a clean installation. :-/
4c6b741
to
2e90792
Compare
/fidencio-test |
2e90792
to
1536bac
Compare
/fidencio-test |
2 similar comments
/fidencio-test |
/fidencio-test |
3e1c451
to
d1052a1
Compare
/fidencio-test |
d1052a1
to
c30416d
Compare
/fidencio-test |
c30416d
to
57a1fff
Compare
/fidencio-test |
57a1fff
to
c070c2a
Compare
/fidencio-test |
c070c2a
to
43652ea
Compare
/fidencio-test |
And let's hope for the buest. :-) Signed-off-by: Fabiano Fidêncio <[email protected]>
43652ea
to
2b5ca0a
Compare
/fidencio-test |
4f17f93
to
0a967f9
Compare
/fidencio-test |
Chengyu - just to give you an update on what I've been trying today. The test that was 12 was still failing and after printing the containerd, kata and kubelet logs I couldn't spot an error, so I wondered if the problem was due to the fact that nydus snapshotter had already cached the docker config.json, so the missing credentials weren't a problem. I didn't know how to reset the credentials, so I tried moving the test above the others, but in http://jenkins.katacontainers.io/view/CCv0/job/tests-CCv0-ubuntu-20.04-x86_64-CC_CRI_CONTAINERD_K8S-IMAGE_OFFLOAD_TO_GUEST-PR/28/consoleFull it still fails and I can't see an obvious error related to the image pull failing on the host, so we might need to look elsewhere. I'll spend more time debugging tomorrow, but wanted to let you know what I've done in case you have any ideas whilst I'm asleep. |
2023-09-26.09-20-36.mp4 |
@stevenhorsman I have tested in local machine and I can find the log "failed to resolve reference "quay.io/kata-containers/confidential-containers-auth:test": unexpected status from HEAD request". |
Consider printing the log after |
.ci/containerd_nydus_setup.sh
Outdated
NYDUS_SNAPSHOTTER_TARFS_CONFIG="/usr/local/share/nydus-snapshotter/config-coco-host-sharing.toml" | ||
NYDUS_SNAPSHOTTER_GUEST_CONFIG="/usr/local/share/nydus-snapshotter/config-coco-guest-pulling.toml" | ||
NYDUS_SNAPSHOTTER_CONFIG="${NYDUS_SNAPSHOTTER_CONFIG:-${NYDUS_SNAPSHOTTER_TARFS_CONFIG}}" | ||
NYDUS_SNAPSHOTTER_TARFS_EXPORT_MODE="${PULL_ON_HOST_EXPORT_MODE=:-image_block}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fidencio I found why tests pulling image on the host are all failed. There is an extraneous '=' to get PULL_ON_HOST_EXPORT_MODE
, causing the value of export_mode to be export_mode = ":-image_block"
. So if we remove '=' in "${PULL_ON_HOST_EXPORT_MODE=:-image_block}"
, it should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed - good spot
Ensure nydus is setup to be used by containerd as part of our tests. Signed-off-by: Fabiano Fidêncio <[email protected]> Signed-off-by: ChengyuZhu6 <[email protected]>
Signed-off-by: Fabiano Fidêncio <[email protected]>
Signed-off-by: Fabiano Fidêncio <[email protected]>
The reason for that being: ``` The test case 5 failed due to a specific behavior when using containerd and the snapshotter to download images. Containerd needs to fetch both the manifest and configuration of the image. In this case, both images`quay.io/kata-containers/confidential-containers:signed` and `quay.io/kata-containers/confidential-containers:unsigned` have the same IDs (sha256). Consequently, test case 4 downloaded image `quay.io/kata-containers/confidential-containers:signed`. So, in test case 5, when containerd detected that the image ID already existed, it used the manifest and image name from `quay.io/kata-containers/confidential-containers:signed` and passed it to kata instead of `quay.io/kata-containers/confidential-containers:unsigned`, resulting in the use of image `quay.io/kata-containers/confidential-containers:signed`. This explains the error in test case 5. As a temporary measure, deleting the image before running each test case should address this. ``` from: https://cloud-native.slack.com/archives/C039JSH0807/p1695618313572309?thread_ts=1695591000.697989&cid=C039JSH0807 Signed-off-by: Fabiano Fidêncio <[email protected]>
cbfe02e
to
2970527
Compare
/fidencio-test |
Locally I'm seeing
but I've fixed the logging now, so we'll see what the automation comes up with and pick the best message :) |
I see the following messages in the CI logs as well:
so I'll work on fixing up the grep to try and match that properly |
d11e8ef
to
2970527
Compare
/fidencio-test |
Update image pull tests to provide registry creds on the host for kubernetes and nydus snapshotter to use Co-authored-by: ChengyuZhu6 <[email protected]> Signed-off-by: stevenhorsman <[email protected]>
Signed-off-by: stevenhorsman <[email protected]>
2970527
to
c63008b
Compare
/fidencio-test |
Signed-off-by: stevenhorsman <[email protected]>
/fidencio-test |
So I think the plan beyond this draft PR is to switch the old jobs (that run the encrypted tests) to use image offload in https://github.com/confidential-containers/operator/compare/main...fidencio:cc-operator:remote_snapshotter?expand=1, so that should cover this. I think we are keeping this open for the pull on host testing |
Oh sorry. I found the tests have included encrypted tests. |
@fidencio @stevenhorsman The reason the tests for pulling images on the host are failing is because the latest version of
|
Signed-off-by: ChengyuZhu6 <[email protected]>
/fidencio-test |
SSIA.
I'm opening the PR, but the majority of the content was written by (and the credits are properly given to) @ChengyuZhu6.
PLEASE, DO NOT RUN
/test
HERE YET.