Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added update_locations.py, moved subjects files to repeatable, added … #22

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

crugas
Copy link
Collaborator

@crugas crugas commented Jan 14, 2025

…write_to_file to utilities.py and utilities_tests.py. Closes #17

Description

Adds update_locations.py to the repeatable python scripts directory, and adds the associated updatelocations_tests.py file to tests. It also adds a write_to_file() function (and associated unittests) to utilities.py, which writes data to a jsonlines file to be saved in the user's test_data folder. This was originally created to store JSON data of metadata records being modified, in case we would need to go back and retrieve old data before it was modified.

Smaller changes include docstrings to utilities_tests.py, some cleanup of imports of delete_dometadata.py, and moving new/merg/update_subjects.py files to the repeatable directory.

Related GitHub Issue

#17

Testing

See updatelocations_tests.py and utilities_tests.py for the tests created for these changes.

Screenshot(s):

Checklist

  • ✔️ Have you assigned at least one reviewer?
  • 🔗 Have you referenced any issues this PR will close?
  • ⬇️ Have you merged the latest upstream changes into your branch?
  • 🧪 Have you added tests to cover these changes? If not, why:
  • 📘 Have you updated/added any relevant readmes/wiki pages/comments in the codebase?
  • 📚 Have you updated/added any external documentation (e.g. Confluence, AirTable, GitHub Projects)?

…write_to_file to utilities.py and utilities_tests.py
@crugas crugas requested a review from lorawoodford January 14, 2025 22:40
@crugas crugas linked an issue Jan 14, 2025 that may be closed by this pull request
Copy link
Contributor

@lorawoodford lorawoodford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crugas Comments inline. Let me know if you have thoughts.

Takes a location record and adds a new repository code to it, assigning it to that repository
Args:
location_data (dict): the Location JSON data
repository_code (int): the repository code to add to the location
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nitpicky and not enough to warrant changing it at this point, but just a suggestion that the term "code" here is a bit misleading as it suggest you're looking for the alpha "repo code" (like "NMAI"), rather than the integer type id of the repo.

env_file = find_dotenv(f'.env.prod')
load_dotenv(env_file)
local_aspace = ASpaceAPI(os.getenv('as_api'), os.getenv('as_un'), os.getenv('as_pw'))
uris = read_csv(str(Path(os.getcwd(), "../../test_data/ACMA_PROD_Locations_URIs-13-01-25.csv")))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hardcode this file path here it becomes a burden to rerun since a code modification will be needed, right? Can this be handled as an argument passed at the time the script is invoked instead? That way we can go on running this with new data again and again without ever having to modify this script.

return updated_repo

def main():
original_location_json = str(Path('../../test_data', 'update_locations_original_data.jsonl'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect to find a call to the test_data directory from any production/runnable script, and this might end up being misleading. What is the purpose of this? It's more of a logging functionality, right? Should this be stored there instead? Semantically speaking, the contents of this file may or may not have anything to to with test data... it could be the record of actually updated production data and having it is test_data makes that confusing.

location_type, location_id = location_uri[1], location_uri[2]
location_data = local_aspace.get_object(location_type, location_id)
write_to_file(original_location_json, location_data)
updated_location = add_repo(location_data, 24)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded 24 here means that this isn't repurposable for future use if a different repo needs us to run this. Could this be an argument passed to the script at the time it is called?

tests/updatelocations_tests.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Locations for ACMA
2 participants