Get error: content digest sha256: ***: not found when exporting image #2631

Closed
sunchunming opened this issue Feb 14, 2022 · 25 comments

@sunchunming

sunchunming commented Feb 14, 2022

Could you help check the issue below and advise if there is any update? Thank you.

When exporting an image, I got the error below:

#18 exporting to image
#18 pushing layers 0.1s done
#18 ERROR: content digest sha256:97bac3dab075a8e745a60a2e05e9f678053d6bca7ad1d109867220704b154443: not found

Then I checked the directory '~/.local/share/buildkit/runc-native/content/blobs/sha256': the file 97bac3dab075a8e745a60a2e05e9f678053d6bca7ad1d109867220704b154443 is missing, although the digest still appears in one of the image manifest files.
BuildKit version: 0.9.0. I did not configure GC in the buildkitd configuration file. I suspect a bug in the default GC that deletes a shared cache record.
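
A quick way to confirm whether a blob is really absent from the local content store is to look it up by digest and, if it is present, re-hash it. The sketch below assumes the `<root>/blobs/sha256/<hex>` layout described above; the function name `blob_status` is hypothetical and this is not part of BuildKit.

```python
import hashlib
from pathlib import Path


def blob_status(content_root: str, digest: str) -> str:
    """Return 'missing', 'corrupt', or 'ok' for a sha256 digest.

    Assumes blobs are stored at <content_root>/blobs/sha256/<hex>.
    """
    algo, _, hex_part = digest.partition(":")
    if algo != "sha256" or len(hex_part) != 64:
        raise ValueError(f"unsupported digest: {digest!r}")

    blob = Path(content_root).expanduser() / "blobs" / algo / hex_part
    if not blob.is_file():
        return "missing"

    # Re-hash in chunks so large layers do not have to fit in memory.
    hasher = hashlib.sha256()
    with blob.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return "ok" if hasher.hexdigest() == hex_part else "corrupt"


if __name__ == "__main__":
    # Path and digest taken from the report above.
    root = "~/.local/share/buildkit/runc-native/content"
    digest = "sha256:97bac3dab075a8e745a60a2e05e9f678053d6bca7ad1d109867220704b154443"
    print(digest, blob_status(root, digest))
```

A result of `missing` for a digest that a manifest still references would match the symptom reported here; it does not by itself say whether GC or something else removed the file.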

Some debug logs:
time="2022-02-23T05:59:48Z" level=debug msg=push
time="2022-02-23T05:59:48Z" level=debug msg="fetch response received" response.header.accept-ranges=bytes response.header.cache-control="max-age=31536000" response.header.connection=keep-alive response.header.content-length=306 response.header.content-type=application/octet-stream response.header.date="Wed, 23 Feb 2022 05:59:48 GMT" response.header.docker-content-digest="sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614" response.header.docker-distribution-api-version=registry/2.0 response.header.etag=""sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614"" response.header.server=nginx response.header.set-cookie="sid=6bd09de24143b204fc15afffd55b05a0; Path=/; HttpOnly" response.status="200 OK"
time="2022-02-23T05:59:48Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/jimidatalhwebservices/blobs/sha256:5e44ff2aeae6efe1449c178bf8edd1e7ced6fa5510fda8a3edc8d99c5fd64cc0"
time="2022-02-23T05:59:48Z" level=debug msg="do request" request.header.content-type=application/octet-stream request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=PUT
time="2022-02-23T05:59:48Z" level=debug msg=push
time="2022-02-23T05:59:48Z" level=debug msg="do request" request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, /" request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=HEAD
time="2022-02-23T05:59:48Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/jimidatalhwebservices/blobs/sha256:d9a2c8ccae4221b6b060b63bf56e53dad2295c1ce9d2cf1eb047ebd4eba1b297"
time="2022-02-23T05:59:48Z" level=warning msg="failed to update distribution source for layer sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: content digest sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found"
......
time="2022-02-23T06:08:09Z" level=debug msg="checking and pushing to" url="http://harbor.jd.com/v2/jpipe-test/prod/dbbakmasterlb/blobs/sha256:ce68c6bbd0a17f0742f673fb01cf19b88b575861ea0b45533ddbd82a068d1246"
time="2022-02-23T06:08:09Z" level=debug msg="do request" request.header.accept="application/vnd.docker.image.rootfs.diff.tar.gzip, /" request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=HEAD
time="2022-02-23T06:08:09Z" level=debug msg="fetch response received" response.header.connection=keep-alive response.header.content-length=0 response.header.content-type="text/plain; charset=utf-8" response.header.date="Wed, 23 Feb 2022 06:08:09 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.docker-upload-uuid=94aee7cd-b8fa-4b04-a973-767fa2e87226 response.header.location="http://harbor.jd.com/v2/jpipe-test/prod/dbbakmasterlb/blobs/uploads/94aee7cd-b8fa-4b04-a973-767fa2e87226?_state=O1cQSYNegKctbLxK1DNYrkatUhKemITsws5DR0VRnpt7Ik5hbWUiOiJqcGlwZS10ZXN0L3Byb2QvZGJiYWttYXN0ZXJsYiIsIlVVSUQiOiI5NGFlZTdjZC1iOGZhLTRiMDQtYTk3My03NjdmYTJlODcyMjYiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjItMDItMjNUMDY6MDg6MDkuNDc0NTc4NDkxWiJ9" response.header.range=0-0 response.header.server=nginx response.header.set-cookie="sid=d63550dbee60e074afbebdfa4c37ba99; Path=/; HttpOnly" response.status="202 Accepted"
time="2022-02-23T06:08:09Z" level=debug msg="do request" request.header.content-type=application/octet-stream request.header.user-agent=containerd/1.6.0-beta.1+unknown request.method=PUT
time="2022-02-23T06:08:09Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = content digest sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found\n"
time="2022-02-23T06:08:09Z" level=debug msg="session finished: "
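
When the daemon log is long, it can help to pull out only the digests reported as "not found" and see at which log level they first appear. This is a minimal sketch that assumes the logfmt-style lines shown above; `missing_digests` is a hypothetical helper name, not a BuildKit tool.

```python
import re
from collections import Counter

NOT_FOUND = re.compile(r"content digest (sha256:[0-9a-f]{64}): not found")
LEVEL = re.compile(r"\blevel=(\w+)")


def missing_digests(lines):
    """Count (level, digest) pairs for 'content digest ...: not found' messages."""
    counts = Counter()
    for line in lines:
        level_match = LEVEL.search(line)
        level = level_match.group(1) if level_match else "unknown"
        # Count each digest at most once per log line.
        for digest in set(NOT_FOUND.findall(line)):
            counts[(level, digest)] += 1
    return counts


if __name__ == "__main__":
    # Two lines trimmed from the debug log above.
    sample = [
        'time="2022-02-23T05:59:48Z" level=warning msg="failed to update distribution source for layer '
        'sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: content digest '
        'sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found"',
        'time="2022-02-23T06:08:09Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: '
        'content digest sha256:32678decbeb81d3211ddd542bd383f7ff304d63af7a78321e7b01b4021f65614: not found"',
    ]
    for (level, digest), n in sorted(missing_digests(sample).items()):
        print(f"{level:8} {n:3} {digest}")
```

In the excerpt above, the same digest shows up first in a warning and about eight minutes later in the error that fails the solve.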

@Shaked

Shaked commented Feb 27, 2022

Hey @sunchunming I am experiencing the same issue. Using buildx with a Buildkit daemon on k8s with an SSD PV/C for /var/lib/buildkit.

I tried removing part of the cache manually using buildctl du and buildctl prune -f but no luck.

I really don’t want to delete the entire cache but I can’t seem to come up with a better solution.

I also discussed this with @tonistiigi on Slack but unfortunately we couldn’t figure it out.

Can you share your complete setup? Are you using BuildKit daemon on docker or k8s? How are you trying to export the image? The more details the better.

@Shaked

Shaked commented Feb 27, 2022

I have some interesting updates.

I use an API key for nvcr.io because it provides better rate limits. The API key is valid; I checked it on both my local machine and another machine. When I remove it, I see:

--------------------
   1 | >>> FROM nvcr.io/nvidia/l4t-base:r32.5.0
   2 |     #@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4
   3 |
--------------------
error: failed to solve: failed to fetch oauth token: unexpected status: 401

Now, when I use it (which is what I have done up until now), I see a very strange log entry in buildkitd:

time="2022-02-27T15:36:27Z" level=debug msg="fetch response received" response.header.connection=keep-alive response.header.content-length=195 response.header.content-type=text/html response.header.date="Sun, 27 Feb 2022 15:36:27 GMT" response.header.server=nginx/1.14.2 response.header.www-authenticate="Bearer realm="https://nvcr.io/proxy_auth\",scope=\"repository:nvidia/l4t-base:pull,push\"" response.status="401 Unauthorized" spanID=d02539606761f313 traceID=f91390545742112856d34f33eacbdb4d

There's only one entry like that when I build after deleting the cache manually as mentioned in my previous comment:

$ cat only-32.Dockerfile
FROM nvcr.io/nvidia/l4t-base:r32.5.0

$ docker buildx build -t test .  -o type=oci,dest=/tmp/test-2 -f only-32.Dockerfile

[+] Building 13.9s (6/6) FINISHED
 => [internal] load build definition from only-32.Dockerfile                                           0.1s
 => => transferring dockerfile: 157B                                                                   0.0s
 => [internal] load .dockerignore                                                                      0.1s
 => => transferring context: 2B                                                                        0.0s
 => [internal] load metadata for nvcr.io/nvidia/l4t-base:r32.5.0                                       2.3s
 => [auth] nvidia/l4t-base:pull,push token for nvcr.io                                                 0.0s
 => [1/1] FROM nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb4  4.0s
 => => resolve nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb4  0.0s
 => => sha256:c8ffbfd7f0aa94bae883cd524d5117f0276ae94540bdb1940a8bf8271b980aae 325B / 325B             0.2s
 => => sha256:95f34310bbda7c3839275a982b49ed6165bf70ebbb4085d5f5815b84bdc89681 497.47kB / 497.47kB     0.4s
 => => sha256:08fb1eee5328933b0a0b71eaaf1e45ad3607827ff49e5bfbf5cc7a7c3a90724e 17.64kB / 17.64kB       1.4s
 => => sha256:024ce79b6790ac770e29c293c9a1476ac9fca8a069b7c6ced5172e4a5afc7807 38.09MB / 38.09MB       1.4s
 => => sha256:1c40f77bb35b9702f8640aa488bfed5c932c2fccb2ff523341a0a7ec2cb3bb3f 249B / 249B             0.5s
 => => sha256:ba785779122ac085811010e33fd7dc9c4dc727a64f5ad3a887f9d360a33fa63f 231B / 231B             0.3s
 => => sha256:f79d264135a36850f7ebcbdb9f258dccc39af3e3840aa3a594f4c3d1e7beaffc 230B / 230B             0.2s
 => => sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 172.36MB / 172.36MB     3.2s
 => => sha256:211b56b73ff1859029c7ed05ec69b0ecacf623ca6872232e73463f6a8a6897ac 2.86MB / 2.86MB         0.5s
 => => sha256:1990ecf0bfb7695832472f00a1414640925d373084a88bdaf3894be44adaedb0 327B / 327B             0.4s
 => => sha256:78aca4be1f3b23abc769e3fa3cb1834494b97c18eeaf3ec0853642762f45046b 249.95kB / 249.95kB     0.2s
 => => sha256:678c9d1557e9bce46106636e21eefe03c621acb00d09ede0a8f4425b1fdfeb48 209.78kB / 209.78kB     0.2s
 => => sha256:ec17ad7cab010d70d38d5b2133b4e5ccff281c5a6a912dbc7d019b8482b468bc 277B / 277B             0.2s
 => => sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 15.93MB / 15.93MB       2.3s
 => => sha256:17f974a43cf97d28ea73fc04f696a91c40dc3d73e5209b794a2e07909cd455d1 290.99kB / 290.99kB     0.2s
 => ERROR exporting to oci image format                                                                8.9s
 => => exporting layers                                                                                0.0s
 => => exporting manifest sha256:e355d31beb979b06c03c5f1012451a5e3df83fa51e3035327e7c3c01bddee3ba      0.0s
 => => exporting config sha256:9732e3064b63a1fdf836bca7cb4848bd5b1bb76a6a8c284268b845da0940875d        0.0s
 => => sending tarball                                                                                 4.6s
------
 > exporting to oci image format:
------
error: failed to solve: failed to get reader: content digest sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b: not found

I believe that this might be related to something like: google/go-containerregistry#728

I saw that this is used here:

buildkit/go.sum

Lines 667 to 669 in 6fa5a92

github.com/google/go-containerregistry v0.0.0-20191010200024-a3d713f9b7f8/go.mod h1:KyKXa9ciM8+lgMXwOVsXi7UxGrsf9mM61Mzs+xKUrKE=
github.com/google/go-containerregistry v0.1.2/go.mod h1:GPivBPgdAyd2SU+vf6EpsgOtWDuPqjW0hJZt4rNdTZ4=
github.com/google/go-containerregistry v0.5.1/go.mod h1:Ct15B4yir3PLOP5jsy0GNeYVaIZs/MK/Jz5any1wFW0=

What do you think?

EDIT:

The 401 Unauthorized happens because I have multiple auths in my .docker/config.json, e.g. ACR, GCR, DockerHub, and nvcr. Once I tried using only nvcr, the "401 Unauthorized" disappeared, but this message still appears:

error: failed to solve: failed to get reader: content digest sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b: not found
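
To see which registries a Docker client config carries credentials for, the `auths` section of `config.json` can be listed. The sketch below is illustrative only: `configured_registries` is a hypothetical helper, the sample entries are made up, and credentials supplied through `credHelpers` or `credsStore` are not covered.

```python
import json


def configured_registries(config_text: str) -> list[str]:
    """Return the registry keys found under "auths" in a Docker config.json."""
    config = json.loads(config_text)
    return sorted(config.get("auths", {}))


if __name__ == "__main__":
    # Made-up sample; a real file would hold base64 credentials under "auth".
    sample = json.dumps(
        {
            "auths": {
                "nvcr.io": {"auth": "<redacted>"},
                "gcr.io": {"auth": "<redacted>"},
                "https://index.docker.io/v1/": {"auth": "<redacted>"},
            }
        }
    )
    for registry in configured_registries(sample):
        print(registry)
```

Passing the contents of `~/.docker/config.json` instead of the sample shows the real list, which makes it easier to trim the config down to a single registry as described above.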

@Shaked

Shaked commented Feb 28, 2022

I kept investigating; maybe this will help figure something out.

I'm building this image:

FROM nvcr.io/nvidia/l4t-base:r32.5.0

Using this command:

docker buildx build -t test .  -o type=oci,dest=/tmp/test-2 -f only-32.Dockerfile --no-cache --progress=plain --platform linux/arm64 --pull

I also tried without --no-cache, --pull and --platform.

Steps taken:

  1. I tried to manually remove the cache using this script. It is not as effective as it should be, so I had to run it a few times until all cache entries related to c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4 were gone.
  2. Then I ran the build:
docker buildx build -t test .  -o type=oci,dest=/tmp/test-2 -f only-32.Dockerfile --no-cache --no-cache --progress=plain --platform linux/arm64 --pull
#1 [internal] load build definition from only-32.Dockerfile
#1 transferring dockerfile: 157B 0.0s done
#1 DONE 0.1s

#2 [internal] load .dockerignore
#2 transferring context: 2B 0.0s done
#2 DONE 0.1s

#3 [internal] load metadata for nvcr.io/nvidia/l4t-base:r32.5.0
#3 DONE 1.3s

#4 [1/1] FROM nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4
#4 resolve nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4 0.0s done
#4 DONE 1.7s

#5 exporting to oci image format
#5 exporting layers done
#5 exporting manifest sha256:e355d31beb979b06c03c5f1012451a5e3df83fa51e3035327e7c3c01bddee3ba 0.0s done
#5 exporting config sha256:9732e3064b63a1fdf836bca7cb4848bd5b1bb76a6a8c284268b845da0940875d 0.0s done
#5 ...

#4 [1/1] FROM nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4
#4 sha256:08fb1eee5328933b0a0b71eaaf1e45ad3607827ff49e5bfbf5cc7a7c3a90724e 17.64kB / 17.64kB 0.2s done
#4 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 172.36MB / 172.36MB 3.3s done
#4 sha256:c8ffbfd7f0aa94bae883cd524d5117f0276ae94540bdb1940a8bf8271b980aae 325B / 325B 0.6s done
#4 sha256:95f34310bbda7c3839275a982b49ed6165bf70ebbb4085d5f5815b84bdc89681 497.47kB / 497.47kB 0.7s done
#4 sha256:ec17ad7cab010d70d38d5b2133b4e5ccff281c5a6a912dbc7d019b8482b468bc 277B / 277B 0.5s done
#4 sha256:024ce79b6790ac770e29c293c9a1476ac9fca8a069b7c6ced5172e4a5afc7807 38.09MB / 38.09MB 1.0s done
#4 sha256:678c9d1557e9bce46106636e21eefe03c621acb00d09ede0a8f4425b1fdfeb48 209.78kB / 209.78kB 0.2s done
#4 sha256:ba785779122ac085811010e33fd7dc9c4dc727a64f5ad3a887f9d360a33fa63f 231B / 231B 0.2s done
#4 sha256:1990ecf0bfb7695832472f00a1414640925d373084a88bdaf3894be44adaedb0 327B / 327B 0.2s done
#4 sha256:17f974a43cf97d28ea73fc04f696a91c40dc3d73e5209b794a2e07909cd455d1 290.99kB / 290.99kB 0.2s done
#4 sha256:211b56b73ff1859029c7ed05ec69b0ecacf623ca6872232e73463f6a8a6897ac 2.86MB / 2.86MB 2.2s done
#4 sha256:78aca4be1f3b23abc769e3fa3cb1834494b97c18eeaf3ec0853642762f45046b 249.95kB / 249.95kB 0.2s done
#4 sha256:f79d264135a36850f7ebcbdb9f258dccc39af3e3840aa3a594f4c3d1e7beaffc 230B / 230B 2.0s done
#4 sha256:1c40f77bb35b9702f8640aa488bfed5c932c2fccb2ff523341a0a7ec2cb3bb3f 249B / 249B 1.8s done
#4 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 15.93MB / 15.93MB 0.4s done
#4 DONE 3.7s

#5 exporting to oci image format
#5 sending tarball
#5 sending tarball 5.4s done
#5 ERROR: failed to get reader: content digest sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b: not found
------
 > exporting to oci image format:
------
error: failed to solve: failed to get reader: content digest sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b: not found
  3. I ran the exact same build on my own machine (a non-ARM MacBook Pro):
╰─$ docker buildx build -t test .  -o type=oci,dest=/tmp/test-2 -f only-32.Dockerfile --no-cache --no-cache --progress=plain --platform linux/arm64 --pull
#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 2.9s done
#1 creating container buildx_buildkit_shaked
#1 creating container buildx_buildkit_shaked 1.2s done
#1 DONE 4.1s

#2 [internal] load build definition from only-32.Dockerfile
#2 transferring dockerfile: 82B done
#2 DONE 0.0s

#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s

#4 [internal] load metadata for nvcr.io/nvidia/l4t-base:r32.5.0
#4 ...

#5 [auth] nvidia/l4t-base:pull,push token for nvcr.io
#5 DONE 0.0s

#4 [internal] load metadata for nvcr.io/nvidia/l4t-base:r32.5.0
#4 DONE 4.5s

#6 [1/1] FROM nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4
#6 resolve nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4 done
#6 DONE 0.2s

#7 exporting to oci image format
#7 exporting layers done
#7 exporting manifest sha256:e355d31beb979b06c03c5f1012451a5e3df83fa51e3035327e7c3c01bddee3ba done
#7 exporting config sha256:9732e3064b63a1fdf836bca7cb4848bd5b1bb76a6a8c284268b845da0940875d done
#7 ...

#6 [1/1] FROM nvcr.io/nvidia/l4t-base:r32.5.0@sha256:c7207a13da6054c738b530aad93e8c620182db85eb7fbb40b103757145e8b5f4
#6 sha256:c8ffbfd7f0aa94bae883cd524d5117f0276ae94540bdb1940a8bf8271b980aae 325B / 325B 0.5s done
#6 sha256:024ce79b6790ac770e29c293c9a1476ac9fca8a069b7c6ced5172e4a5afc7807 38.09MB / 38.09MB 3.0s done
#6 sha256:08fb1eee5328933b0a0b71eaaf1e45ad3607827ff49e5bfbf5cc7a7c3a90724e 17.64kB / 17.64kB 1.3s done
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 35.65MB / 172.36MB 5.1s
#6 sha256:ec17ad7cab010d70d38d5b2133b4e5ccff281c5a6a912dbc7d019b8482b468bc 277B / 277B 0.9s done
#6 sha256:1c40f77bb35b9702f8640aa488bfed5c932c2fccb2ff523341a0a7ec2cb3bb3f 249B / 249B 0.4s done
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 10.49MB / 15.93MB 3.8s
#6 sha256:78aca4be1f3b23abc769e3fa3cb1834494b97c18eeaf3ec0853642762f45046b 249.95kB / 249.95kB 1.6s done
#6 sha256:678c9d1557e9bce46106636e21eefe03c621acb00d09ede0a8f4425b1fdfeb48 209.78kB / 209.78kB 0.6s done
#6 sha256:211b56b73ff1859029c7ed05ec69b0ecacf623ca6872232e73463f6a8a6897ac 2.86MB / 2.86MB 0.6s done
#6 sha256:f79d264135a36850f7ebcbdb9f258dccc39af3e3840aa3a594f4c3d1e7beaffc 230B / 230B 0.4s done
#6 sha256:95f34310bbda7c3839275a982b49ed6165bf70ebbb4085d5f5815b84bdc89681 497.47kB / 497.47kB 0.5s done
#6 sha256:8db8dbbd4bb96a8146070c22ef313721a32374ef32ee5fe36a513f74d3915284 220B / 220B 0.4s done
#6 sha256:1990ecf0bfb7695832472f00a1414640925d373084a88bdaf3894be44adaedb0 327B / 327B 0.4s done
#6 sha256:8e855b69096af720db04034cd3c4fe2c55396dc4ae77fbd28484e4e3b6b5cb6a 973B / 973B 0.4s done
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 0B / 24.57MB 0.3s
#6 sha256:ba785779122ac085811010e33fd7dc9c4dc727a64f5ad3a887f9d360a33fa63f 0B / 231B 0.3s
#6 sha256:ba785779122ac085811010e33fd7dc9c4dc727a64f5ad3a887f9d360a33fa63f 231B / 231B 0.6s done
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 2.10MB / 24.57MB 0.6s
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 11.53MB / 15.93MB 4.2s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 4.19MB / 24.57MB 0.8s
#6 sha256:17f974a43cf97d28ea73fc04f696a91c40dc3d73e5209b794a2e07909cd455d1 0B / 290.99kB 0.2s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 8.39MB / 24.57MB 0.9s
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 12.58MB / 15.93MB 4.5s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 10.49MB / 24.57MB 1.1s
#6 sha256:17f974a43cf97d28ea73fc04f696a91c40dc3d73e5209b794a2e07909cd455d1 290.99kB / 290.99kB 0.6s done
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 14.68MB / 24.57MB 1.2s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 46.14MB / 172.36MB 6.2s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 20.97MB / 24.57MB 1.4s
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 13.63MB / 15.93MB 4.8s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 24.57MB / 24.57MB 1.8s
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 14.68MB / 15.93MB 5.3s
#6 sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b 24.57MB / 24.57MB 1.9s done
#6 sha256:9b09da3b5483e510a6f560d65ce395bcf73dd272bd7afb04da70f67ced436a0d 15.93MB / 15.93MB 5.6s done
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 55.57MB / 172.36MB 7.1s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 66.17MB / 172.36MB 8.0s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 76.55MB / 172.36MB 8.7s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 85.98MB / 172.36MB 9.3s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 97.52MB / 172.36MB 10.2s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 109.05MB / 172.36MB 11.1s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 118.49MB / 172.36MB 11.9s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 128.97MB / 172.36MB 12.5s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 140.51MB / 172.36MB 13.4s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 150.99MB / 172.36MB 14.3s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 160.43MB / 172.36MB 15.3s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 170.92MB / 172.36MB 16.5s
#6 sha256:833dc3235950d9d4e94e6cccb0e9878439fcd55ab257113aaf24b9ce3f5b0e1c 172.36MB / 172.36MB 16.6s done
#6 DONE 16.6s

#7 exporting to oci image format
#7 sending tarball
#7 sending tarball 4.4s done
#7 DONE 21.4s
  4. I noticed that on my machine 5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b was not missing. I remembered telling @tonistiigi on Slack that this file was missing from /var/lib/buildkit/runc-overlayfs/content/blobs/sha256/5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b
  5. I tried copying my local 5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b to my remote buildkitd. That did not help, because the problem is not that the file doesn't exist but that, for some reason, BuildKit cannot pull or read it. The error is still:

error: failed to solve: failed to get reader: content digest sha256:5000a6c32c5a2a6ee6a3b76422f44b669563d129c78f566029f8527a887cc93b: not found

  6. The issue "Failed to get reader from content store" (#992) by @msg555 seems interesting and might be related.
  7. I'm not familiar with the code yet, but this seems like a good place to start:
    var ocidesc ocispecs.Descriptor
    if err := mount.WithTempMount(ctx, mounts, func(root string) error {
        ra, err := s.cs.ReaderAt(ctx, desc)
        if err != nil {
            return errors.Wrap(err, "failed to get reader from content store")
        }
        defer ra.Close()
  8. I will keep looking. IMO this seems like something that might become a huge issue.
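
Comparing the two `--progress=plain` logs by hand is tedious, so here is a minimal sketch that extracts the blob digests each run reports downloading and lists the ones seen in only one of them. It assumes the `#N sha256:<hex> <size> ...` line shape shown in the logs above; `pulled_blobs` and `only_in_second` are hypothetical helper names.

```python
import re

# Matches transfer lines such as "#6 sha256:<hex> 325B / 325B 0.5s done",
# but not "FROM ...@sha256:" or "exporting manifest sha256:" lines.
BLOB_LINE = re.compile(r"^#\d+ sha256:([0-9a-f]{64}) ")


def pulled_blobs(log_text: str) -> set[str]:
    """Return the set of blob digests (hex only) a plain-progress log reports."""
    blobs = set()
    for line in log_text.splitlines():
        match = BLOB_LINE.match(line.strip())
        if match:
            blobs.add(match.group(1))
    return blobs


def only_in_second(first_log: str, second_log: str) -> list[str]:
    """Digests reported in second_log but never in first_log, sorted."""
    return sorted(pulled_blobs(second_log) - pulled_blobs(first_log))
```

Applied to the logs above (remote run first, local run second), this should surface the three blobs the local run lists and the remote run never reports, including 5000a6c3..., the one named in the error.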

@lugeng
Contributor

lugeng commented Mar 8, 2022

+1, I also hit the same issue.

@okgolove

Experiencing the same issue. Any workaround here?

@Shaked

Shaked commented Apr 14, 2022

@okgolove which version of buildkit and buildx are you using?

@okgolove

@Shaked I'm using buildctl instead of buildx.
v0.10.1

@dweomer

dweomer commented Jul 12, 2022

Related to rancher/kim#74 I think. If this is indeed the same behavior (assuming you are using the containerd worker), workarounds are:

@imeoer
Contributor

imeoer commented Oct 26, 2022

Hi @sunchunming @Shaked @lugeng @okgolove @dweomer

We hit the same issue in our BuildKit production environment (buildkitd v0.10.3); builds fail about 1% of the time.

This patch fixes the issue for us; we have tested it with 5k image builds over several days.

We have observed that this problem always occurs during the push layers phase, and that the base image layer is not found.

By adding some logs, we found that at https://github.com/imeoer/buildkit/blob/26c11880022774bc6eca6376aef5e698ecf629c5/cache/refs.go#L276, !cr.getBlobOnly() is always true, so the image push does not download the lazy layer into the content store first and then throws the error at https://github.com/imeoer/buildkit/blob/26c11880022774bc6eca6376aef5e698ecf629c5/cache/remote.go#L304.

This issue seems difficult to reproduce; perhaps we can take a deeper look. cc @tonistiigi @sipsma
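
The failure mode described above can be illustrated with a toy model: a layer is known by digest but its bytes have not been downloaded yet, and a push that skips the "make it local first" step hits "not found" when it reads from the store. This is not BuildKit's code; every class and function name below is hypothetical and only mirrors the explanation in this comment.

```python
class NotFound(Exception):
    pass


class ContentStore:
    """Toy content store mapping digest -> bytes."""

    def __init__(self):
        self._blobs = {}

    def write(self, digest, data):
        self._blobs[digest] = data

    def reader_at(self, digest):
        if digest not in self._blobs:
            raise NotFound(f"content digest {digest}: not found")
        return self._blobs[digest]


class LazyLayer:
    """A layer known by digest whose bytes may still live only in the registry."""

    def __init__(self, digest, fetch):
        self.digest = digest
        self._fetch = fetch  # hypothetical callable that downloads the blob

    def ensure_local(self, store):
        try:
            store.reader_at(self.digest)
        except NotFound:
            store.write(self.digest, self._fetch())


def push(layers, store, unlazy_first):
    """Read every layer from the store, optionally downloading lazy ones first."""
    pushed = []
    for layer in layers:
        if unlazy_first:
            layer.ensure_local(store)
        pushed.append(store.reader_at(layer.digest))
    return pushed
```

With `unlazy_first=False` the toy `push` raises `NotFound` for a layer that was never downloaded, which is the shape of the error in this thread; with `unlazy_first=True` it succeeds.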

@mmmmmmmxl

@imeoer Thank you very much for your contribution. I tested about 3k image builds over two weeks in our build cluster, and the issue did not reproduce.

@jbguerraz

Same issue here. If the patch helps, why not merge it?

@imeoer
Contributor

imeoer commented Dec 25, 2022

@jbguerraz Will try to submit a PR.

@ahuret

ahuret commented Jan 20, 2023

Hey! Any news? :)

@ohmer

ohmer commented Feb 1, 2023

Looks like I am experiencing the same issue here. Using a version built from #3447, this all goes away.

Running in GitHub Actions. My workflow is derived from the testing image example (https://docs.docker.com/build/ci/github-actions/examples/#test-your-image-before-pushing-it):

    steps:
      - uses: actions/checkout@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2

      - name: Create Buildx local cache folder
        run: |
          BUILDX_CACHE_FOLDER=$(mktemp -d -q)
          echo "BUILDX_CACHE_FOLDER=${BUILDX_CACHE_FOLDER}" >> $GITHUB_ENV

      - name: Set Buildx remote cache locations
        run: |
          BASE="type=s3,region=us-west-1,bucket=xxx,prefix=${{ matrix.architecture.runner }}/,access_key_id=${{ env.AWS_ACCESS_KEY_ID }},secret_access_key=${{ env.AWS_SECRET_ACCESS_KEY }},session_token=${{ env.AWS_SESSION_TOKEN }}"
          case ${GITHUB_EVENT_NAME} in
            pull_request)
              echo 'BUILDX_CACHE_FROM<<EOF' >> $GITHUB_ENV
              echo "${BASE},name=${GITHUB_HEAD_REF}" >> $GITHUB_ENV
              echo "${BASE},name=${GITHUB_BASE_REF}" >> $GITHUB_ENV
              echo 'EOF' >> $GITHUB_ENV
              echo "BUILDX_CACHE_TO=${BASE},name=${GITHUB_HEAD_REF}" >> $GITHUB_ENV
              ;;
            push | workflow_dispatch)
              BUILDX_CACHE="${BASE},name=${GITHUB_REF_NAME#deploy/}"
              echo "BUILDX_CACHE_FROM=${BUILDX_CACHE}" >> $GITHUB_ENV
              echo "BUILDX_CACHE_TO=${BUILDX_CACHE}" >> $GITHUB_ENV
              ;;
            *)
              echo "Event not supported"
              exit 1
              ;;
          esac

      - name: Build and export image to Docker
        uses: docker/build-push-action@v3
        with:
          context: .
          file: docker/Dockerfile
          load: true
          tags: ${{ github.repository }}:test
          cache-from: ${{ env.BUILDX_CACHE_FROM }}
          cache-to: type=local,dest=${{ env.BUILDX_CACHE_FOLDER }},mode=max

...
some testing depending on docker --load
...

      - name: Build and push image to Amazon ECR
        uses: docker/build-push-action@v3
        with:
          context: .
          file: docker/Dockerfile
          push: ${{ startsWith(github.ref_name, 'deploy/') }}
          tags: ${{ steps.metadata.outputs.tags }}
          labels: ${{ steps.metadata.outputs.labels }}
          cache-from: type=local,src=${{ env.BUILDX_CACHE_FOLDER }}
          cache-to: ${{ env.BUILDX_CACHE_TO }}
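
The "Set Buildx remote cache locations" step is the least obvious part of this workflow, so here is a small sketch of the same selection logic, written in Python only for readability. It assumes the behaviour of the shell `case` statement above; `cache_locations` and `github_env_multiline` are hypothetical names, and the second helper just reproduces the `NAME<<EOF ... EOF` form the step appends to `$GITHUB_ENV`.

```python
def cache_locations(event, base, head_ref="", base_ref="", ref_name=""):
    """Return (cache_from, cache_to) the way the workflow step derives them."""
    if event == "pull_request":
        # Read from the PR branch cache first, then the target branch cache.
        cache_from = [f"{base},name={head_ref}", f"{base},name={base_ref}"]
        return cache_from, f"{base},name={head_ref}"
    if event in ("push", "workflow_dispatch"):
        # Mirrors ${GITHUB_REF_NAME#deploy/}: drop a leading "deploy/".
        ref = f"{base},name={ref_name.removeprefix('deploy/')}"
        return [ref], ref
    raise ValueError(f"Event not supported: {event}")


def github_env_multiline(name, values, delimiter="EOF"):
    """Format a multi-line value as it is appended to $GITHUB_ENV."""
    return "\n".join([f"{name}<<{delimiter}", *values, delimiter]) + "\n"


if __name__ == "__main__":
    base = "type=s3,region=us-west-1,bucket=xxx"
    cache_from, cache_to = cache_locations(
        "pull_request", base, head_ref="feature", base_ref="main"
    )
    print(github_env_multiline("BUILDX_CACHE_FROM", cache_from), end="")
    print(f"BUILDX_CACHE_TO={cache_to}")
```

For pull requests the first build therefore imports from two S3 cache names, while the second build imports only from the local cache directory written by the first.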

Failures have been very consistent. It started around the 0.11.2 release so I first thought of a regression. Pinning version 0.11.1 did not change the outcome, oddly.

Versions used are GitHub Actions defaults:

Would really appreciate some help. All I can offer is a bunch of testing, not a Go expert here...

@tonistiigi
Member

tonistiigi commented Feb 1, 2023

@ohmer How often do you see it? Do you have a public workflow that we could use for debugging?

edit: also, did you test with vanilla 0.11.2 as well? There were some other patches in 0.11.

@ohmer

ohmer commented Feb 1, 2023

@tonistiigi Over the past 2 days, the workflow failure rate was around 95%, from a sample of about 100 runs (we are a small team). It is a private repository that really can't be made public, but I am happy to share anything not related to our app code. I did not have the buildkit version pinned until the workflow started to fail, so I was running 0.11 until 0.12 was published (4 days ago?). I only switched to my fork late today (I am in NZ, 6:45pm here).

@tonistiigi
Member

@ohmer

ohmer commented Feb 1, 2023

@tonistiigi will try this tomorrow on my private repo. In the meantime, I extracted the workflow to a public repo and can reproduce: https://github.com/sharesight/buildkit-debug/actions/runs/4062589459/jobs/6993829704. Bucket is public if you want to have a look at the cache state on S3.

@ahuret

ahuret commented Feb 1, 2023

FYI, I had the same issue. I compiled buildkit with the change from #3447 and did not hit the issue again. However, I can't confirm it works in every case.

@ohmer

ohmer commented Feb 2, 2023

@ohmer Could you check what this patch does for you: https://github.com/moby/buildkit/compare/master...tonistiigi:buildkit:blobonly-debug?expand=1

@tonistiigi I was not able to work on this today, but I have not forgotten about it.

@ohmer

ohmer commented Feb 7, 2023

@tonistiigi Was able to test your patch. Here is the method and result.

  1. Created a public repo: https://github.com/sharesight/buildkit-debug
  2. Added a simple workflow using well-known actions and a basic Dockerfile. Did not reproduce the problem; v0.11.2 was used.
  3. Created a first PR (https://github.com/sharesight/buildkit-debug/pull/1) triggering a complete rebuild of the image. Still did not reproduce the problem.
  4. Created a second PR (https://github.com/sharesight/buildkit-debug/pull/2) adding a more complicated Dockerfile. Workflow still using v0.11.2. Problem reproduced. The workflow ran twice and failed each time: https://github.com/sharesight/buildkit-debug/actions/runs/4062589459/attempts/1, https://github.com/sharesight/buildkit-debug/actions/runs/4062589459
  5. Added a commit to the second PR to use the patch above against revision v0.11.2. Ran the workflow twice and it succeeded each time: https://github.com/sharesight/buildkit-debug/actions/runs/4118068370/attempts/1, https://github.com/sharesight/buildkit-debug/actions/runs/4118068370. The buildkit image I built is publicly accessible: public.ecr.aws/m4n2v6l4/buildkit:blobonly-debug.

I built buildkit with docker buildx build --tag=moby/buildkit:local --output=type=docker . (the trailing dot is the build context). With my older Docker Desktop, the image was not exported to Docker by make images, which runs docker buildx build --tag=moby/buildkit:local --output=type=docker,buildinfo-attrs=true --attest=type=sbom --attest=type=provenance,mode=max . instead.

Is there anything else I should try? Does it help?

@sizov-kirill

sizov-kirill commented Feb 8, 2023

We ran into this issue in our GitHub Actions pipelines. You can find logs of the failed builds here and here. As a temporary fix, we switched to BuildKit version 0.10.0, and now it works fine. I hope it will be fixed in the future.

@imeoer
Contributor

imeoer commented Feb 10, 2023

It should have been fixed in #3566.

@pr3d4t0r

@imeoer - The same issue just popped up. I was able to build and push an image without issues two weeks ago, retried today, same error: #25 ERROR: failed to push pr3d4t0r/kallisto:3.2.0: content digest sha256:8b150fd943bcd54ef788cece17523d19031f745b099a798de65247900d102e18: not found

@imeoer
Contributor

imeoer commented Feb 12, 2023

@pr3d4t0r Did you apply the patch before running the test?
