Bulkrax OAI importer issues #247

Closed
Tracked by #113
KatharineV opened this issue Oct 9, 2023 · 17 comments

@KatharineV
Collaborator

I created two Bulkrax OAI importers on demo.adventist-knapsack. Each importer pulled 10 records from the adl:image OAI set. Both importers have a status of "Complete (with failures)." The error code is ArgumentError. All 20 records came into the repo, but no files came in.

  1. First importer, selected to skip thumbnails
  2. Second importer, selected to include thumbnails

The importers did not map metadata exactly according to the ADL and Hyku Maps document.

Work on Demo (first importer): https://demo.adventist-knapsack-staging.notch8.cloud/concern/images/20119303_untitled?locale=en
Work on ADL Prod (first importer): https://adl.b2.adventistdigitallibrary.org/concern/images/20119303_untitled
Work's OAI metadata: http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=20119303

Work on Demo (second importer): https://demo.adventist-knapsack-staging.notch8.cloud/concern/images/20119409_may_covington?locale=en
Work on ADL Prod (second importer): https://adl.b2.adventistdigitallibrary.org/concern/images/20119409_may_covington
Work's OAI metadata: http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=20119409

MAPPING ISSUES:

  • <identifier> should map to the Identifier (local) field. It did not map at all.
  • <aark_id> should map to AARK Identifier. It mapped there and also mapped to Identifier (local). Note that on prod it maps to both, creating a second Identifier (local) entry after the contents of <identifier> map there.
  • Related URL and Thumbnail URL failed to map. No files imported. No URL shows in the related URL field. Expected behavior would bring in files and show the URL.
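
To make the expected identifier mapping concrete, here is a minimal plain-Ruby sketch. It does not use Bulkrax or its configuration API; the `EXPECTED_MAPPING` hash, the `map_record` helper, the target keys, and the sample values are all assumptions made up for illustration. It only shows the intended outcome: `<identifier>` lands in Identifier (local), and `<aark_id>` lands in AARK Identifier and nowhere else.

```ruby
# Hypothetical illustration only: plain Ruby, not Bulkrax's mapping API.
# Source element names come from the issue; target keys are assumed names.
EXPECTED_MAPPING = {
  "identifier" => :identifier, # <identifier> -> Identifier (local)
  "aark_id"    => :aark_id     # <aark_id>    -> AARK Identifier only
}.freeze

# Apply the mapping to a record (a hash of source element => value).
# Elements without a mapping entry are ignored.
def map_record(record, mapping = EXPECTED_MAPPING)
  record.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(element, value), mapped|
    target = mapping[element]
    mapped[target] << value if target
  end
end

# Sample values are placeholders, not real ADL data.
record = { "identifier" => "sample-local-id", "aark_id" => "sample-aark-id" }
mapped = map_record(record)
puts mapped.inspect
```

With a mapping like this, the AARK value would never appear under the local identifier, which is the behavior the first two bullets ask for.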
@KatharineV KatharineV added bug something isn't working Knapsack Upgrade labels Oct 9, 2023
@KatharineV
Collaborator Author

Continued testing shows that all OAI sets import with status "Complete (with failures)." The same ArgumentError occurs across all tests.

Checking metadata mapping for all OAI sets and work types that import via the existing Adventist OAI shows the same identifier and AARK identifier mapping errors and the same failure to import files. Other fields map as expected.

Work types tested:
Published work (adl:book and adl:issue)
Generic work (adl:other)
Thesis work (adl:thesis)

@KatharineV
Collaborator Author

Please note that importers I created for testing on 10/9 did not have files attached at the time, and no files have rendered in the repo. Today (10/11) I reran an adl:image importer and selected "reharvest." The files are still not showing up.

https://demo.adventist-knapsack-staging.notch8.cloud/importers/35?locale=en

@jeremyf
Contributor

jeremyf commented Oct 13, 2023

Triage:

  • On previous adventist we're using v5.3.0
  • Knapsack uses v5.4.1

I have compared the parser mappings, and Knapsack adds a handler for description.abstract import, which means we should otherwise have the same mappings.

I'll be looking into the loading sequence as well as the difference between v5.3.0 and v5.4.1 to see what might have introduced the errors.

@jeremyf
Contributor

jeremyf commented Oct 13, 2023

One possible bug introduced is from this PR: samvera/bulkrax#853

@KatharineV
Collaborator Author

Team, I've continued testing Bulkrax imports to Knapsack, and I want to report that a CSV with a valid URL in the Related URL field has not imported the file. The importer says it failed due to an argument error, but the metadata has imported and the work is created. The file is what's missing, and it did import correctly on SDAPI staging (see links below). The title of this ticket should perhaps more accurately read "Bulkrax importer issues on Knapsack." Both OAI and CSV imports are impacted.

Knapsack (file failed to import): https://demo.adventist-knapsack-staging.notch8.cloud/importers/80?locale=en
SDAPI Staging (file imported as expected): https://sdapi.s2.adventistdigitallibrary.org/importers/141?locale=en

This is critical and I would mark this ticket among the highest priorities to fix with whatever remains of our Knapsack/upgrade hours.

kirkkwang referenced this issue Nov 19, 2023
This commit will fix the OAI importer by adjusting the `FileSetActor`
and `ImportUrlJob`.  The reason it wasn't working is that it was
based on Hyrax 2.9.6 method signatures, which have changed in Hyrax 3.5.0.

Ref:
  - https://github.com/scientist-softserv/adventist-dl/issues/624
@kirkkwang kirkkwang self-assigned this Nov 19, 2023
kirkkwang referenced this issue Nov 20, 2023
# Story

This commit will fix the OAI importer by adjusting the `FileSetActor`
and `ImportUrlJob`. The reason it wasn't working is that it was
based on Hyrax 2.9.6 method signatures, which have changed in Hyrax 3.5.0.

Ref:
  - https://github.com/scientist-softserv/adventist-dl/issues/624
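
As a rough illustration of how a stale method signature can surface as an `ArgumentError`, the sketch below uses entirely hypothetical names (`Upstream::Actor`, `LegacyOverrideActor`, `create_content`, `attempt_import`); none of them are the real Hyrax signatures. It assumes a framework method gained an extra positional argument while a local override kept the older arity.

```ruby
# Hypothetical names throughout; these are NOT the real Hyrax signatures.
module Upstream
  # Stand-in for a framework class whose method gained a third argument
  # in a newer release.
  class Actor
    def create_content(file, relation, from_url)
      "stored #{file}"
    end
  end

  # Stand-in for framework code that calls the method the "new" way.
  def self.import(actor, file)
    actor.create_content(file, :original_file, true)
  end
end

# A local override still written against the older two-argument signature.
class LegacyOverrideActor < Upstream::Actor
  def create_content(file, relation)
    "stored #{file}"
  end
end

# Run the import and return either the result or the error that was raised.
def attempt_import(actor)
  Upstream.import(actor, "image.tif")
rescue ArgumentError => e
  e
end

puts attempt_import(Upstream::Actor.new)           # succeeds
puts attempt_import(LegacyOverrideActor.new).class # ArgumentError
```

In this sketch the record-level work could still complete before the failing call, which would be consistent with metadata importing while the file step fails. That is an interpretation, not something the commit confirms.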

# Expected Behavior Before Changes
Executing an OAI import would result in a `RuntimeError`.

# Expected Behavior After Changes
Executing an OAI import should result in a successful import.

# Screenshots / Video


![image](https://github.com/scientist-softserv/adventist_knapsack/assets/19597776/8379038b-8cce-4447-a956-7fd67ab8de37)


![image](https://github.com/scientist-softserv/adventist_knapsack/assets/19597776/41387134-dda0-46c4-8a1b-aaefac8ec4f5)
@KatharineV
Collaborator Author

An update: As of today (11-29-2023), trying to create either a CSV or OAI importer causes an error message to appear, and nothing works. It's not that the importers are created and then fail; they aren't created at all.

Image

@kirkkwang kirkkwang added the needs rework issue needs additional work label Nov 29, 2023
@kirkkwang
Contributor

Thanks for checking, @KatharineV. We'll take a look.

@kirkkwang
Contributor

@KatharineV It seems it was a Fedora issue. We've since restarted Fedora and it should be working now. Can you check again when you get a chance?

@kirkkwang kirkkwang removed the needs rework issue needs additional work label Nov 29, 2023
@KatharineV
Collaborator Author

@kirkkwang The CSV importer worked beautifully this time: https://adl.adventist-knapsack-staging.notch8.cloud/importers/35?locale=en

The OAI importer looks successful, but the works it created won't open. I can see them in the catalog view, but trying to open the works generates an error message. Could it be related to the Fedora issue?

image

image

image

https://adl.adventist-knapsack-staging.notch8.cloud/concern/theses/20121848_problems_in_presenting_the_gospel_to_the_hindu_mind

kirkkwang referenced this issue Nov 29, 2023
This commit will update the hyrax-webapp submodule where we move the
`#solr_document` method from private to public.  The method being
private was causing a crash in the Universal Viewer.  As per Hyrax, the
method should have been public in the first place.

Ref:
  - samvera/hyku@dbe996e
  - https://github.com/samvera/hyrax/blob/b334e186e77691d7da8ed59ff27f091be1c2a700/app/presenters/hyrax/file_set_presenter.rb#L10
  - https://github.com/scientist-softserv/adventist-dl/issues/624
kirkkwang referenced this issue Nov 29, 2023
# Story

This commit will update the hyrax-webapp submodule where we move the
`#solr_document` method from private to public. The method being private
was causing a crash in the Universal Viewer. As per Hyrax, the method
should have been public in the first place.

Ref:
  - samvera/hyku@dbe996e
  - https://github.com/samvera/hyrax/blob/b334e186e77691d7da8ed59ff27f091be1c2a700/app/presenters/hyrax/file_set_presenter.rb#L10
  - https://github.com/scientist-softserv/adventist-dl/issues/624
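
The effect of method visibility described above can be sketched in plain Ruby. The classes and the `title_for` helper below are hypothetical stand-ins, not the Hyku or Hyrax source; they only show that an outside caller (standing in for the viewer) gets a `NoMethodError` when `solr_document` is private and a normal result when it is public.

```ruby
# Hypothetical classes for illustration; not the Hyku or Hyrax source.
class PrivateDocPresenter
  def initialize(solr_document)
    @solr_document = solr_document
  end

  private

  # Private: only callable from inside the presenter itself.
  def solr_document
    @solr_document
  end
end

class PublicDocPresenter
  def initialize(solr_document)
    @solr_document = solr_document
  end

  # Public: callable by collaborators such as a viewer.
  def solr_document
    @solr_document
  end
end

# Stand-in for an external caller reading the document.
# Returns the title, or the NoMethodError if the reader is not accessible.
def title_for(presenter)
  presenter.solr_document.fetch("title")
rescue NoMethodError => e
  e
end

doc = { "title" => "Sample title" }
puts title_for(PublicDocPresenter.new(doc))        # prints the title
puts title_for(PrivateDocPresenter.new(doc)).class # NoMethodError
```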

# Screenshots / Video

## Before (on staging)
<img width="1361" alt="image"
src="https://github.com/scientist-softserv/adventist_knapsack/assets/19597776/40dc261a-943d-4e2a-bdea-ad5b8b648b90">

## After
<img width="1575" alt="image"
src="https://github.com/scientist-softserv/adventist_knapsack/assets/19597776/f5777733-55af-4bb6-8cae-18739a487633">
@ShanaLMoore ShanaLMoore reopened this Nov 29, 2023
@ShanaLMoore

@KatharineV I believe this issue has been resolved. I clicked on the link you've provided and see the work.

Image

@KatharineV
Collaborator Author

@ShanaLMoore It's weirdly still not loading for me! I tried Firefox, Edge, and Chrome in case it was a browser issue. I got the "We're sorry but something went wrong" message on all three attempts.

@ShanaLMoore

ShanaLMoore commented Nov 30, 2023

@KatharineV I needed to restart Fedora. We are (still) looking into this issue, but this particular work wasn't loading because it couldn't connect to Fedora. Please try it again.

image

@KatharineV
Collaborator Author

The work page opens now, but the work still doesn't display to me as it did for you in the screenshot above. Some aspects of the page aren't loading yet. See below:

image

@ShanaLMoore ShanaLMoore added the needs rework issue needs additional work label Nov 30, 2023
@ShanaLMoore

ShanaLMoore commented Nov 30, 2023

Pulling this back to In Development.

Jeremy, LaRita, and I confirmed that the page loads but we don't see a file (Chrome, Safari).

I see the file attached in Firefox even after a cache refresh, and can download it too, but I'm in the minority, so I think this is a moot point. I'll eventually restart all the things to check again, but LaRita and Jeremy confirmed they can't see the file in Firefox either.

However, I can find the file set in Knapsack staging's Rails console using find and where clauses. I can also see the Solr document.

See slack thread for more sleuthing notes: https://assaydepot.slack.com/archives/C0311DN2YCA/p1701385487550419?thread_ts=1701364436.127349&cid=C0311DN2YCA

@ShanaLMoore ShanaLMoore self-assigned this Feb 12, 2024
@ShanaLMoore

ShanaLMoore commented Feb 13, 2024

> This is critical and I would mark this ticket among the highest priorities to fix with whatever remains of our Knapsack/upgrade hours.

ref: #247

It looks like this issue with the CSV import has been resolved.

On knapsack staging I created the same importer and it imported with success.

Additionally, the following previously missing mappings are now present:

  • aark id
  • identifier
  • related url

Image

Looking into OAI next...

@ShanaLMoore

ShanaLMoore commented Feb 13, 2024

Hi @KatharineV

I'd hate to ask this, but I'm wondering if you can retest this ticket and/or lay out the steps to reproduce your issue. It's a rather old ticket now, so I also totally understand if you don't remember, in which case I'd suggest we close this one and create new issues as they arise.

In the above comment I successfully imported the Related URL CSV on the demo tenant of Adventist Knapsack (see results).

I've also successfully created an OAI article importer (limit 3). Here are the results. I've recreated the image ones too, as detailed in the description.

Image

I am able to visit each imported record and download the PDFs. Additionally I see the identifier and aark identifiers displayed.

At this time it isn't clear to me what the issue is to be able to address it. Please let us know how you'd like to proceed.

@ShanaLMoore ShanaLMoore removed the needs rework issue needs additional work label Feb 13, 2024
@KatharineV
Collaborator Author

@ShanaLMoore Thanks for the heads-up. I did retest this ticket and I'm seeing the same thing as you. Whatever was causing issues before is no longer present. I will close this ticket and create a new one if new issues arise, as you suggest.

@kirkkwang kirkkwang transferred this issue from notch8/adventist-dl May 10, 2024