Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sprint1 Staging #76

Closed
wants to merge 31 commits into from
Closed

Sprint1 Staging #76

wants to merge 31 commits into from

Conversation

jsavell
Copy link
Member

@jsavell jsavell commented Apr 30, 2024

This PR encompasses:
#64
#65
#66
#67
#68
#69
#70
#71
#72
#74
#77

The following new application.yml settings are now introduced:

app.dataLoader.initialize: true
app.filter.embargoTypeNone: None
app.filter.submissionTypeNone: None

Their respective environment variable names are:

APP_DATALOADER_INITIALIZE="true"
APP_FILTER_EMBARGOTYPENONE="None"
APP_FILTER_SUBMISSIONTYPENONE="None"

The following SQL may be needed to updating existing repositories when not using the ddl-auto to update.

Submission Type (List):

INSERT INTO submission_list_column (id,title,input_type_id) VALUES (DEFAULT,'Submission Type (List)', 1);
INSERT INTO submission_list_column_value_path (submission_list_column_id,value_path,value_path_order) VALUES (currval('submission_list_column_id_seq'),'submissionTypes',0)
INSERT INTO submission_list_column_value_path (submission_list_column_id,value_path,value_path_order) VALUES (currval('submission_list_column_id_seq'),'name',1)

Embargo Type:

INSERT INTO submission_list_column (id,title,input_type_id) VALUES (DEFAULT,'Embargo Type', 1);
INSERT INTO submission_list_column_value_path (submission_list_column_id,value_path,value_path_order) VALUES (currval('submission_list_column_id_seq'),'embargoTypes',0)
INSERT INTO submission_list_column_value_path (submission_list_column_id,value_path,value_path_order) VALUES (currval('submission_list_column_id_seq'),'name',1)

kaladay and others added 30 commits November 27, 2023 14:41
… the admin list.

This is the first solution described below that fully hides the "Search Box" from the filter list to avoid this confusion.
The previous solution of hiding the "Search Box" is incomplete and still displays the "Search Box" in the "Disabled Filters" column.
The "Search Box" is filtered out of the "Disabled Filters" by this commit.

The following are the results of the investigation into this problem:
The Search Box filter, when selected, gets saved to the database and returned to the UI on an appropriate `/submission-list/filter-columns-by-user` call.

What is happening is that the Search Box is explicitly being removed from the filter list:
- https://github.com/TexasDigitalLibrary/Vireo/blob/61fb5873989eee667951f7b80244fa4f91c084ea/src/main/webapp/app/controllers/submission/submissionListController.js#L150

And then later is handled differently as an "Excluded Column":
- https://github.com/TexasDigitalLibrary/Vireo/blob/db2b4cacea96a29912138626fe1ed5985cb5d073/src/main/webapp/app/controllers/submission/submissionListController.js#L172

It appears to be originally added here with the exluded column:
- TexasDigitalLibrary@c39ff04
- TexasDigitalLibrary#285

Vireo Slack messages relating to Search Box:
- https://texas-digital-library.slack.com/archives/C7R78CADB/p1550676084006100
- https://texas-digital-library.slack.com/archives/C7R78CADB/p1682428409804819?thread_ts=1682428314.318099&cid=C7R78CADB
- https://texas-digital-library.slack.com/archives/C7R78CADB/p1550696674010100

Referenced Github issues from the conversations:
- TexasDigitalLibrary#1146
… handling.

The relocation allows for faster transactional operation.
With the loop outside of the transaction, then the transaction can be committed for each pass.
This then allows for reduced memory requirements for large transactions (such as when generate a huge amount of submissions).

The exception handling is improved by catching the exceptions at the CLI level such that exceptions can be reported without crashing the entire server.
Now, if a CLI fails due to some error, then the server does not exit.
Relocate "generate" CLI code outside of transaction and improve error handling.
…flow.

The organization is being changed but the dirty flag is not being set.
Manually set the dirty flag before sending.

The `OrganizationRepo` listeners are either not doing anything or they are not checking for when status is a failure.
The `ApiResponseActions.READ` and `ApiResponseActions.BROADCAST` never do anything because the `resetManageOrganization()` only does anything when the organization is provided.
No organization is being provided.
…t does something.

The `OrganizationRepo.listen(ApiResponseActions.READ,...` call was removed for being a no-op.
Even older code suggests that this should in fact be doing something.
Add the function call back, but this time add the selected organization if one exists so that the function is not a no-op.

Whether or not this change fixes or breaks something is unclear.
Be wary of regressions.
Be hopeful of bonus bug fixes.
Issue 58: Unable to re-arrange the submission workflow in Manage Workflow.
…everal others.

The embargoes are now generated from the actual list of embargoes.
If there are no embargoes then it falls back to a randomly generated string.

Handle other similar cases that are easy to resolve at the same time.
This makes the generated data more accurate and easier to test.
Especially when using them for filtering.
This was identified by @jsavell.
Improve CLI generation, fixing embargoes to be actual embargoes and several others.
There is existing (dead) code for this already.
Re-vitalize the dead code, clean it up, and get it working.

The `embargoTypes.name` SQL in the back-end operates differently in that it is not involved in part of a column name and is exclusively for filtering.
This changes the nature of the query and I opted to go with sub-queries for selecting the IDs because I can design it this way and avoid `LEFT JOIN` (or `INNER JOIN`).
This has not seen extensive performance testing but this new SQL query conditions should be run through some performance testing.

I went with `Embargo Type` for the new custom filter because the existing code used that name.

A custom filter `uniqueEmbargoType` is created because the `unique`/`uniq` angularjs is not available.
Using `unique` or `uniq` yields an error like this:
```
Unknown provider: uniqueFilterProvider <- uniqueFilter
```

Using the custom filter (`uniqueEmbargoType`) avoids having to pass the additional ` | filter:{isActive: true}` and similar.
This can be done and is done within the same filter.

There are no existing filter unit tests to base off of.
I may follow up with a commit that adds filter unit tests after reviewing other angularjs projects where we have implemented a filter unit test.

The addition of the `Embargo Type` causes a test in the `SystemDataLoaderTest` class to fail due to a hard-coded value representing all of the default filters.
This has been updated to include the new filter in the total.

The UI has an existing filter view, but it was slightly tweak that filter view to address problems in the user experience.
I opted to make its design more closely match that of the Status filter (which goes in line with the issue request using Status filter as an example).
This adds the filter (in the context of AngularJS) for the Embargo Type.
Issue 54: Add custom filter for embargo type by list.
This borrows heavily on the work resolved for Issue 54 (see commit cbe080b).

Most of the field values Controller end points are under the Submission Controller rather than its own Controller.
This introduces complications because the UI is using "FieldValue" and mapping it to a Submission Controller.
Getting the list of FieldValues, independent of a Submission, does not belong in the Submission Controller.
There are a lot of work-arounds or other unusual design changes here to accommodate this situation.
A new `FieldValueRepo` is added on the UI.

This adds new functionality for selecting all Field Values associated with a given Field Predicate.
This is used to select the specific Field Predicate that is hard-coded as `submission_type`.

It turns out the escape string usage in `cbe080bcdc4477bc1f5f8ceeff27abbf0bb94528` is not entirely correct.
The string does need to be escaped, but not the `%` and not the `_`.
Only the slashes.
Change the `escapeString()` function to handle this use case.

The **Submission Type** column already exists.
Create a new one called `Submission Type (List)`.
This exposed a problem in the design where the names of the columns are used to generate the array key names and HTML attribute names.
The use of `(` and `)` is not well supported there.
Break out the sanitizer used in several places into a single function call and have that function call also handle `(` and `)`.
Other cases are not handled and I am leaving those to be handled on an as-needed basis.
To avoid conflicts with the existing **Submission Type**, the new **Submission Type (List)** is suffixed with `List` in several places.

The `submissionListColumns` variable in the `processUpdate` of the `SubmissionListController` is being used for both the **Manage Filter Columns** and the **Submission List Columns**.
This is a problem where the new `Submission Type (List)` is only a Filter and is not a Column to select.
The code had to be re-worked to address this.
I chose the quick and simple approach of making another variable.
There may be better ways but this seems good enough at this time.

A recent change introduced setting `$dirty` on `$scope.getSelectedOrganization()` but that did not handle the case where `$scope.getSelectedOrganization()` is undefined or NULL.
This broke some unit tests but is now fixed.

Update the CLI tests to generate Submission Types that are not always different.
The Submission Type is created outside of the transaction loop to ensure only a small set is created.
If there already exists Submission Types then no new ones are created and the existing ones are re-used.
This allows for building generated test data on a system with existing data already populated.
…ted to embargo.

This is just a copy and paste mistake.
Issue 55: Add custom filter for submission type by list.
Doing a `findAll()` and then counting the length of the array is highly enefficient.
Using `findAll()` slowing down the generation of large data sets for testing.
Change the code to an SQL count via the JPA `count()` method.
The test data consisted of randomly generated data with:
- 44075 submissions.
- 1458591 field values.

The submission list page took ~3.5 minutes to load the Submission Type list data.
Switching to an explicit query, the submission list page took ~320 milliseconds to load the Submission Type list data.

This is an overwhelming clear improvement that shows that the use of Spring+JPA internals are highly inefficient for the Field Values and related data.
For the filters, the actual model does not need to be loaded and only the value.

The use of a JDBC query is chosen after doing tests against Postgresql and H2.
The use of a native query via `@Query` worked for Postgresql but H2 used clob2 and failed to convert the results into strings.
Using a literal query is chosen over a JPA query because the JPA wants to understand the model structure.
The Field Predicate is not loaded immediately available for JPA for the FieldValue JPA Repository.

Only very basic SQL is used in the query and so there should be very little SQL compatibility risks.
The biggest risk is that of the table naming structure.
This structure appears to be sufficiently consistent between Postgresql and H2 in this regard.
… changes.

I accidentally removed too much code when I cleaned up the previous commit (abb885a) to separate unrelated changes.
This adds back the changes that I forgot to commit.

This updates the CLI to use the `getAllValuesByFieldPredicateValue()` call.
The behavior of the case of `None` is treated as as synonymous with a NULL value for that field in the database.
I've looked over the different use cases and have found that in some cases `Unknown` and `Unassigned` are used.

Make this `None` filter value customizable rather than hard-coding the opinion of `None` being NULL.

The `app.filter.embargoTypeNone` and `app.filter.submissionTypeNone` may now be changed to something other than `None`.
The default remains set to `None`.
The query is not a static final string and this line is no longer needed or used.
…ance_follow_up

Issue 55: Use a explicit Query for fetching the Submission Type list.
Make the Embargo Type and Submission Type `None` values configurable.
…erge

[73][MERGE] Add config option to bypass data loader
These are added as commented out so that the default `application.yml` settings are used.

The `APP_DATALOADER_INITIALIZE` should only be `true` for the initial setup but after that it is recommended to set this to `false`.

The `APP_FILTER_EMBARGOTYPENONE` and `APP_FILTER_SUBMISSIONTYPENONE` should be set to `None` as the default.
These can be changed as needed, such as setting `APP_FILTER_SUBMISSIONTYPENONE` to `Unknown`.
Add new environment variables to the example environment file.
@jsavell jsavell changed the base branch from main to tamu-main May 7, 2024 20:47
@jsavell jsavell marked this pull request as ready for review May 7, 2024 20:48
@jsavell jsavell requested a review from a team May 9, 2024 14:49
@jsavell
Copy link
Member Author

jsavell commented May 31, 2024

Closing because all our updates were successfully merged upstream.

@jsavell jsavell closed this May 31, 2024
@jsavell jsavell deleted the sprint1-staging branch May 31, 2024 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants