App of apps being overwritten by image-updater #896
Comments
I am trying to use those filters but I am getting these errors:
What am I doing wrong? 😬 |
I am using kustomize to update the deployment of argocd-image-updater:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-image-updater
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-image-updater
  template:
    spec:
      volumes:
        - name: scripts
          configMap:
            name: argocd-image-updater-scripts
            defaultMode: 0777
      containers:
        - name: argocd-image-updater
          args:
            - run
            - --match-application-label app.company.com/name=myapp
          volumeMounts:
            - name: scripts
              mountPath: /scripts |
In your snippet, the command-line switch and its parameter are being passed as a single argument. To fix it, you can either use
args:
  - run
  - --match-application-label=app.company.com/name=myapp
(note the equal sign between the parameter and the value) or
args:
  - run
  - --match-application-label
  - app.company.com/name=myapp |
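To make the difference concrete, here is a minimal Python sketch of how a GNU-style long option is typically paired with its value. It is an illustration only, not the parser that argocd-image-updater actually uses, and the helper name `pair_long_options` is made up for this example. Each item under `args:` reaches the process as exactly one argv element, with no shell splitting on spaces.

```python
def pair_long_options(argv):
    """Pair each "--name" option with its value.

    Illustrative rule set (an assumption, not the real parser):
    "--name=value" is split at the first "=", while a bare "--name"
    takes the next argv element as its value.
    """
    parsed = {}
    i = 0
    while i < len(argv):
        token = argv[i]
        if token.startswith("--"):
            body = token[2:]
            if "=" in body:
                name, value = body.split("=", 1)
            elif i + 1 < len(argv):
                name, value = body, argv[i + 1]
                i += 1  # the next element was consumed as the value
            else:
                name, value = body, None
            parsed[name] = value
        i += 1
    return parsed


# One argv element containing a space: the option name absorbs the space
# and everything up to the first "=", so it matches no known flag.
broken = ["run", "--match-application-label app.company.com/name=myapp"]

# The two fixes suggested above.
fixed_equals = ["run", "--match-application-label=app.company.com/name=myapp"]
fixed_split = ["run", "--match-application-label", "app.company.com/name=myapp"]

for argv in (broken, fixed_equals, fixed_split):
    print(pair_long_options(argv))
```

In the broken form the parsed option name is `match-application-label app.company.com/name`, which is why the flag is not recognised; both fixed forms yield the name `match-application-label` with the full label selector as its value.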
The root app-of-apps is behaving as if it were the app itself, which results in two Argo apps being responsible for the same resources, which causes |
According to the logs, it does not even update the
|
Do the logs help you, @chengfang? |
What are the SharedResourceWarning's details? |
|
At this point, I highly doubt that this has to do with the Image Updater. Image Updater itself doesn't manage or manipulate resources such as ConfigMaps, or other types. Are you using a mono repo for all your apps, including the root app by any chance? |
Ok I understand. But explain why it is working in |
Can you post your root application's spec here? |
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  finalizers:
    - resources-finalizer.argocd.argoproj.io
  labels:
    app.kubernetes.io/managed-by: argocd-autopilot
    app.kubernetes.io/name: root
  name: root
  namespace: argocd
spec:
  destination:
    namespace: argocd
    server: https://kubernetes.default.svc
  ignoreDifferences:
    - group: argoproj.io
      jsonPointers:
        - /status
      kind: Application
  project: default
  source:
    path: projects
    repoURL: https://gitlab.com/xxx/argocd.git
  syncPolicy:
    automated:
      allowEmpty: true
      prune: true
      selfHeal: true
    syncOptions:
      - allowEmpty=true
status:
  health: {}
  summary: {}
  sync:
    comparedTo:
      destination: {}
      source:
        repoURL: ""
    status: "" |
I assume that is the version stored in Git; I am more interested in the live resource in the cluster. I suspect something (maybe image updater) might have fiddled with the source block there. |
I'll update to |
I have encountered a similar problem. But I am not using the "App of Apps" pattern. My setup looks like this. I deploy multiple nginx servers with different Docker images/tags in different kubernetes namespaces. They all use the same helm chart for deployment, but with different values.yaml. Each application has its own ArgoCD application resource, which is applied manually to kubernetes.
With argocd-image-updater v0.14.0 everything works as intended. After I updated argocd-image-updater to v0.15.0, something strange happened. Our monitoring issued an alert because no more metrics could be collected from App-B-Staging. I started to investigate and noticed that the namespace app-b-staging was empty. The deployment was gone. Then I checked argocd and I got a SharedResourceWarning for App-B-Staging. ArgoCD was trying to deploy App-B-Staging to the app-b-dev namespace with the app-b:dev Docker image. Somehow the configurations must have been mixed up. After that, I downgraded argocd-image-updater back to v0.14.0, reapplied the ArgoCD application resources, and everything worked as expected. |
Same issue for a week with the I tried hard refresh, rollout of most of the ArgoCD's components and even a manual cleaning into Note Update - Same as @LGLN-LS, downgrading argocd-image-updater to |
Thanks everyone! Appreciate the insights here. I assume y'all who are hitting on this problem are using the default |
I am using both methods. But the one which fails is the default one indeed. |
Indeed it's the specs from recently auto-updated apps without Git |
I found this issue because version 0.15.0 overwrote my application, too. In our case, one of our own "Application" instances trumped its resources into another "Application" instance. |
I had observed the same behaviors with I managed to see that within the app of apps, the last application in the list (lexicographically) was overwritten with its So eventually I had:
It also removed those resources at some point, effectively causing my cluster to be in a bad state. It happened both with a regular app of apps, but also with an applicationset setup. I opened up an issue with ArgoCD argoproj/argo-cd#20440 that explains some of my behavior, but that was before I found this issue |
I also had this problem, and in my case I was using the app of apps pattern. Funnily enough, after deploying the new version of image updater, it replaced the application for the Prometheus instance with an App service of the pattern :S |
This bug is particularly bad if you have any apps with
We just had a bunch of resources deleted because of this failure mode. |
Thanks for all the inputs. I've tested with a sample app but couldn't reproduce the issue. The updates were correctly written to git repo kustomization.yaml for each child app, and the |
@chengfang I don't know if it makes a difference, but we don't use the git write-back. After a brief glance at the code it looks like the spec is only updated when you use the |
After changing the write-back-method to argocd, I was able to reproduce it with the sample app. After the image-update run, the 2 child apps (app1, app2) got removed, and the root app was erroneously updated to be like app2. Will keep looking.
kubectl describe -n argocd app root
History:
  Deploy Started At: 2024-11-05T19:14:05Z
  Deployed At: 2024-11-05T19:14:05Z
  Id: 0
  Initiated By:
    Automated: true
  Revision: e64ea670f0a64d8f5671c261c68fe52ff0b1c7c3
  Source:
    Path: app-of-apps/apps
    Repo URL: https://github.com/chengfang/image-updater-examples.git
    Target Revision: main
  Deploy Started At: 2024-11-05T19:16:24Z
  Deployed At: 2024-11-05T19:16:24Z
  Id: 1
  Initiated By:
    Automated: true
  Revision: e64ea670f0a64d8f5671c261c68fe52ff0b1c7c3
  Source:
    Kustomize:
      Images:
        nginx:1.12.2
    Path: app-of-apps/source/overlays/app2
    Repo URL: https://github.com/chengfang/image-updater-examples.git
    Target Revision: main |
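The telltale sign in the history above is that the root app's source path changes between entry 0 and entry 1 even though the Git revision stays the same. A minimal Python sketch of that check follows, assuming the history has already been extracted into plain dictionaries. The field names `id` and `path` and the helper `source_path_changes` are hypothetical and chosen for this example; the values are copied from the output above.

```python
# History entries transcribed from the `kubectl describe` output above.
# The dictionary layout is an assumption made for this sketch.
history = [
    {"id": 0, "path": "app-of-apps/apps"},
    {"id": 1, "path": "app-of-apps/source/overlays/app2"},
]


def source_path_changes(entries):
    """Return (id, old_path, new_path) for each entry whose path differs
    from the entry immediately before it."""
    changes = []
    for prev, curr in zip(entries, entries[1:]):
        if prev["path"] != curr["path"]:
            changes.append((curr["id"], prev["path"], curr["path"]))
    return changes


for entry_id, old, new in source_path_changes(history):
    print(f"history id {entry_id}: source path changed from {old} to {new}")
```

For a root app that only points at a directory of child Application manifests, any change of this kind that does not come from a Git commit is worth investigating.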
I wish I saw this issue before we updated to v0.15.0 this morning. Spent several hours trying to track down very weird behavior. We also follow the App of Apps (of Apps) pattern and were seeing one "leaf" app update to match the resources of another and couldn't figure out what was going on. Like others, reverting back to v0.14.0 resolved it. All apps are git-hosted Helm charts, and the Apps/AppSets often override Helm values inline and make use of image-updater with argo writeback. I'm not sure how much help I would be on this, but happy to help test/troubleshoot however possible. |
Signed-off-by: Cheng Fang <[email protected]>
With the linked PR, my sample app of apps now works and updates correctly. It will be great if we can get some more testing from your real apps. |
@chengfang if an image is available then I can give it a try |
If you could upload an image, I will test it |
I'm not using app of apps, and yet two different ArgoCD applications running on different clusters have a bunch of "SharedResourceWarning" messages after updating image updater to 0.15.0. I would just remove the 0.15.0 release and make it unavailable before other people stumble upon it. The blast radius for this issue is inexplicably large and may break production for a lot of users. |
I pushed the fixed image (my local build) to my personal quay repo for testing purposes: https://quay.io/repository/cfang/argocd-image-updater?tab=tags |
Thanks @chengfang for your effort. I can confirm that the problem existed in version 0.15.0 (App of Apps, no write-back to Git). With your version 0.15.1 everything works as expected. |
I can confirm as well that with @chengfang's version 0.15.1 everything works correctly. |
Make sure to clear out any added resources after rolling back to 0.14.0. |
…argoproj-labs#918) Signed-off-by: Cheng Fang <[email protected]>
…5] (#920) Signed-off-by: Pasha Kostohrys <[email protected]> Signed-off-by: Cheng Fang <[email protected]> Co-authored-by: pasha-codefresh <[email protected]>
I agree that 0.15.0 should have been recalled. In our case, it basically destroyed the deployed app with all the volumes. It took me hours to recover from that, including recovering a backup. If I had no backup, I would have lost some data forever. |
The patch release v0.15.1 was just released: https://github.com/argoproj-labs/argocd-image-updater/releases/tag/v0.15.1 |
Even in dev/test environments 0.15.0 can be very destructive. I don't see the reason to keep the release public. |
I don't really think this fairly holds up when argocd itself literally points to argocd-image-updater in the top 3 of its recommended blogposts/presentations. The fact is that a lot of people who use argocd also use image-updater, because keeping control over which images actually enter the cluster is important! And they do it in production. I think the correct course of action is to pull 0.15.0. No need to leave that release public when it can be so incredibly destructive. Furthermore, I think argocd-image-updater should acknowledge that a great many people who use argocd also use argocd-image-updater and that it would be proper to start treating it as a production-critical project by now. For example, the warning warns about potentially a lot of breaking changes, but the previous release with breaking changes was 0.12.0, which is more than two and a half years old! Of course, with that kind of stability, people will start to consider argocd-image-updater stable even if you keep a big warning on the project 🙈. I agree with you that they shouldn't do that and that the warning is unambiguous, but in my experience this is not how people actually behave. edit: ps.: I made an issue for this! |
Describe the bug
I am using ArgoCD with the “App of Apps” pattern. After updating argocd-image-updater to version 0.15.0, I encountered an unexpected side effect.
When updating the image of a child application, ArgoCD also updates the parent application (“App of Apps”). This causes a resource conflict, as both the child application (“myapp”) and the parent application (“root”) end up supervising the same resources.
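The conflict described here comes down to one resource being claimed by more than one Application. The Python sketch below illustrates that idea only; it is not how Argo CD computes its SharedResourceWarning internally, and the resource tuples, the `myapp-ns` namespace, and the helper `find_shared_resources` are hypothetical.

```python
from collections import defaultdict


def find_shared_resources(app_resources):
    """Map each (kind, namespace, name) resource claimed by more than one
    application to the sorted list of applications claiming it."""
    owners = defaultdict(set)
    for app, resources in app_resources.items():
        for resource in resources:
            owners[resource].add(app)
    return {
        resource: sorted(apps)
        for resource, apps in owners.items()
        if len(apps) > 1
    }


# Hypothetical state after the bug: "root" has been rewritten to look like
# "myapp", so both applications now claim the same Deployment.
app_resources = {
    "root": [("Deployment", "myapp-ns", "myapp")],
    "myapp": [
        ("Deployment", "myapp-ns", "myapp"),
        ("Service", "myapp-ns", "myapp"),
    ],
}

for resource, apps in find_shared_resources(app_resources).items():
    print(resource, "is claimed by", apps)
```

In a healthy app-of-apps setup the root application would only own the child Application objects, so a check like this would return nothing.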
To Reproduce
Expected behavior
Only the child application (“myapp”) should be updated when its image is changed, without the parent application (“root”) taking control over the same resources.
Additional context
This issue was not present with the previous version of argocd-image-updater (0.14.x).
Version