Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1656 delete work webhook #1659

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

1656 delete work webhook #1659

wants to merge 6 commits into from

Conversation

EricDurante
Copy link
Collaborator

I tried to make this as simple as I could.

Since deleting/removing things in ScholarSphere seems kind of complicated (and I'm not that familiar with it), I'm not even completely sure that this satisfies the requirements for this issue. So I'd appreciate it if someone could check me on that.

A couple of things to note specifically:

  • In addition to a withdrawn state, work versions can also be in a removed state. I'm not sure what the difference is, but since I haven't seen any place in the code where work versions get transitioned into the removed state, I've only dealt with the withdrawn state.
  • I'm not sure what does or what should happen if a work is withdrawn and then the same work is subsequently re-published with the creation of a new version. I assume that RMD will find the work again and create a new open access location for it based on the publication's DOI?
  • I've made the RMD hostname configurable - both for the new webhook and for the existing RmdClient class. So this will need to be configured appropriately for the various deployments before release (in addition to the new shared secret for the webhook call).
  • When I pulled main, I had a couple dozen failing tests in my dev environment, and I'm not sure why. There are still tests that are failing for me in this branch (though somehow there are less now), and some seem to be failing for different reasons than when I started. I haven't tried to dig in to what's going on with this yet. I'm going to see what (if anything) is failing on CI first.
  • Obviously, I haven't attempted to do anything to handle any errors produced by the HTTP call to RMD. I'm not sure if or how we'd want to handle any such errors, but I can make changes if anybody thinks that there's anything specific we should try to do at this point. It's definitely not the end of the world if one of these calls fails somehow, but if it failed because of a network issue or something like that, it would be ideal if we were notified.

I realized that I had this configuration locally, but it hadn’t been
added to the sample config file.
@EricDurante EricDurante marked this pull request as draft December 19, 2024 18:55
@ajkiessl
Copy link
Collaborator

From an initial look, here are my thoughts on some of the questions above (and the code):

Since deleting/removing things in ScholarSphere seems kind of complicated (and I'm not that familiar with it), I'm not even completely sure that this satisfies the requirements for this issue

I haven't quite had the time before break to dig very deep into this, but I think what you've done here:

    event :withdraw do
      after_commit do
        if work.withdrawn?
          WorkRemovedWebhookJob.perform_later(work.uuid)
        end
      end

should work. Since we don't want the callback triggered unless the last work version is being withdrawn.

Also, triggering the callback after destroying a work sounds good too. You might have some unneeded callbacks sent when an unpublished draft is destroyed, though.

Those are the only too ways to remove a work right now and should satisfy the requirements.

In addition to a withdrawn state, work versions can also be in a removed state

The "removed" state and "remove" event appear to be deprecated and must be a remnant of old scholarsphere, so we shouldn't need to worry about that. There aren't any work versions in production in the removed state, and as you mentioned, there's nowhere in the code to transition work versions to "removed".

I'm not sure what does or what should happen if a work is withdrawn and then the same work is subsequently re-published with the creation of a new version. I assume that RMD will find the work again and create a new open access location for it based on the publication's DOI?

Yeah, that's what will happen. I'm also assuming that's what we'd want to happen.

When I pulled main, I had a couple dozen failing tests in my dev environment, and I'm not sure why. There are still tests that are failing for me in this branch (though somehow there are less now), and some seem to be failing for different reasons than when I started. I haven't tried to dig in to what's going on with this yet. I'm going to see what (if anything) is failing on CI first.

I see CI is failing at the bundle step. It seems to be working over here: #1658 though, so we may need to pull in those changes. Not sure if Justin is ready for that.

It's definitely not the end of the world if one of these calls fails somehow, but if it failed because of a network issue or something like that, it would be ideal if we were notified.

If a job fails it will end up in the dead queue in Sidekiq. So we can address it or rerun it from there. So I agree that it's not a big deal to fine-tune error handling just yet.

Tests are erroring out with this not being set to a valid URL.
It was hard-coded to this value in the source previously, and there are
apparently some tests that depend on it being set as such.
Dann’s gone :(
@EricDurante EricDurante marked this pull request as ready for review January 14, 2025 21:41
@EricDurante
Copy link
Collaborator Author

In response to @ajkiessl's comment (which I think is correct),

Also, triggering the callback after destroying a work sounds good too. You might have some unneeded callbacks sent when an unpublished draft is destroyed, though.

I took a look to see if it would be possible to only fire the webhook upon deletion of published works. The problem that I found is that a Work can't be deleted without first deleting its WorkVersions, and WorkVersions that are in the published state actually can't be deleted per the application logic. Admin's can of course delete Works that have been published, but as far as I can tell, they do so by first putting all of the Work's WorkVersions in a non-published state, then deleting each of the WorkVersions, then finally deleting the Work. This means that at the point when the Work is deleted, we no longer have the information that we would need to determine if the Work had previously been published and thus whether or not we should trigger the webhook.

I can think of somewhat awkward workarounds for preserving information about prior published state or perhaps for determining whether or not the webhook should be triggered each time a WorkVersion is deleted rather than when the Work is deleted. However, they're not perfect, and I also think that they're probably unnecessary. If ScholarSphere sends a webhook with a URL for a ScholarSphere OpenAccessLocation that doesn't exist in RMD, then RMD will simply do nothing and respond with a 404, and I think that'll be the end of the story. If that were going to happen really frequently, then maybe we'd want to try to avoid the unnecessary webhook requests, but I think it's going to happen rarely.

So, I'm inclined to leave that part working as it currently is (that is, always triggering the webhook when a Work is deleted regardless of whether or not it was ever published) because it seems like both the easier and the safer thing to do. Otherwise, I think that this is ready to go.

@Smullz622 & @jlandiseigsti, Do you agree with the above, or do you think I'm missing anything? Do you see any other issues?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants