store pca scores results #5265

Open
wants to merge 50 commits into
base: master

Member

@isaak isaak commented Jan 3, 2025

Description

-- adds an option to store the scores of the top 10 principal components. For now, the option is available when a user submits the PCA job to run in the background and expects an email notification about the job status.

-- adds pca stat terms to the cxgn_statistics.obo file
-- updates pca selenium tests
-- updates fixture with the db patches and loaded cxgn_statistics.obo
-- updates mech tests in t/unit_mech/AJAX/Search/Trait.t, t/unit_mech/AJAX/_BrAPI_v1.t, t/unit_mech/AJAX/_BrAPIv2_phenotyping.t. Updates expected counts of records, pages, total cvterms, etc., to reflect the new cvterm entries in the fixture db.

Required:
-- run db patch
-- reload cxgn_ontology.obo
-- potentially delete the duplicate 'VARIABLE_OF' cvterm entry in the database

image

Checklist

  • Refactoring only
  • Documentation only
  • Fixture update only
  • Bug fix
    • The relevant issue has been closed.
    • Further work is required.
  • New feature
    • Relevant tests have been created and run.
    • Data was added to the fixture
      • Data was added via a patch in /t/data/fixture/patches/.
    • User-Facing Change
      • The user manual in /docs has been updated.
    • Any new Perl has been documented using perldoc.
    • Any new JavaScript has been documented using JSDoc.
    • Any new legacy JavaScript has been moved from /js to /js/source/legacy.

@isaak isaak mentioned this pull request Jan 7, 2025
Contributor

@chris263 chris263 left a comment

:-)

@ryan-preble
Contributor

I ran the db patches and tried to save PCA results and was met with an error... I sent a slack message about it

@isaak
Member Author

isaak commented Jan 10, 2025

> I ran the db patches and tried to save PCA results and was met with an error... I sent a slack message about it

Please run this: #5038

and follow this message from @naama:
log into your postgres db and search for 'variable_of' cvterm

you will probably find 2, and you need to have only one with a cv_id of the 'relationship ontology'

you want to delete the duplicated 'variable_of' cvterm, but before deleting it you want to update your cvterm_relationship table: begin; update cvterm_relationship set type_id = CORRECT_VARIABLE_OF_CVTERM_ID where type_id = CVTERM_ID_TO_BE_DELETED;
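
The cleanup described in that message could be sketched roughly as follows. This is only an illustration under stated assumptions: it runs against an in-memory SQLite database with a heavily trimmed copy of the cv, cvterm and cvterm_relationship tables rather than a real Postgres database, the cv name `relationship` and all ids are made up, and `merge_duplicate_cvterm` is a hypothetical helper, not something in the codebase.

```python
import sqlite3

# Assumed name of the cv that should own 'variable_of'; check your own cv table.
RELATIONSHIP_CV = "relationship"


def merge_duplicate_cvterm(conn, keep_id, drop_id):
    """Re-point relationships from drop_id to keep_id, then delete drop_id."""
    # 'with conn' wraps both statements in one transaction:
    # commit on success, roll back if either statement fails.
    with conn:
        conn.execute(
            "UPDATE cvterm_relationship SET type_id = ? WHERE type_id = ?",
            (keep_id, drop_id),
        )
        conn.execute("DELETE FROM cvterm WHERE cvterm_id = ?", (drop_id,))


# Toy stand-in for the real tables (trimmed columns, made-up ids).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, cv_id INTEGER, name TEXT);
CREATE TABLE cvterm_relationship (
    cvterm_relationship_id INTEGER PRIMARY KEY,
    type_id INTEGER, subject_id INTEGER, object_id INTEGER);
INSERT INTO cv VALUES (1, 'relationship'), (2, 'other_cv');
INSERT INTO cvterm VALUES (10, 1, 'variable_of'), (11, 2, 'variable_of');
INSERT INTO cvterm_relationship VALUES (100, 11, 500, 600), (101, 10, 501, 601);
""")

# Find every 'variable_of' term together with the cv it belongs to.
rows = conn.execute(
    "SELECT t.cvterm_id, cv.name FROM cvterm t "
    "JOIN cv ON cv.cv_id = t.cv_id WHERE t.name = 'variable_of'"
).fetchall()

# Keep the one in the relationship cv, merge all others into it.
keep_id = next(term_id for term_id, cv_name in rows if cv_name == RELATIONSHIP_CV)
for term_id, _cv_name in rows:
    if term_id != keep_id:
        merge_duplicate_cvterm(conn, keep_id, term_id)
```

On a real database it would be worth checking, before the delete, whether any other tables also reference the cvterm id that is being removed; the sketch only covers cvterm_relationship, as in the message above.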

@ryan-preble
Contributor

> > I ran the db patches and tried to save PCA results and was met with an error... I sent a slack message about it
>
> Please run this: #5038
>
> and follow this message from @naama: log into your postgres db and search for 'variable_of' cvterm
>
> you will probably find 2, and you need to have only one with a cv_id of the 'relationship ontology'
>
> you want to delete the duplicated 'variable_of' cvterm, but before deleting it you want to update your cvterm_relationship table: begin; update cvterm_relationship set type_id = CORRECT_VARIABLE_OF_CVTERM_ID where type_id = CVTERM_ID_TO_BE_DELETED;

Tried this, still does not work. Current issues I get (running breedbase docker):

  • Analysis details page is broken:
    image
  • After running db patches and reloading ontologies, the error persists. Command used:
    perl /home/production/cxgn/chado_tools/chado/bin/gmod_load_cvterms.pl -H dbhost_db -D dbname -s SGNSTAT -r postgres -d Pg -p dbpass /home/production/cxgn/sgn/ontology/cxgn_statistics.obo -u
    The variable_of cvterm is not duplicated but is missing outright. Note that the test fixture does not appear to have this by default.
  • Saving PCA results is bugged. Trying to do two PCAs on one dataset results in the following output:
    image
    This is a bit confusing because one cannot tell which PCA each save button belongs to.
  • Pressing the save button always results in an error on the first press, saying each plot in the dataset (outliers_test_dataset from the test fixture in this case) is invalid. The error message fades away, and pressing the button a second time appears to allow one to view the saved PCs. However, following that link takes you to the analysis details page, which is still broken.

I suspect all of these issues are caused by the cvterm problem, but I do not know how to fix it
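
To tell the cases apart (term missing, duplicated, or present only under the wrong cv), a small check along these lines might help. It is a sketch under assumptions: it uses an in-memory SQLite stand-in with only the cv and cvterm columns needed here, the cv name `relationship` is a guess at what the 'relationship ontology' cv is called, and `make_db` and `variable_of_status` are hypothetical helpers.

```python
import sqlite3


def make_db(terms):
    """Build a toy stand-in for the cv/cvterm tables.

    terms is a list of (cv_name, cvterm_name) pairs.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE cv (cv_id INTEGER PRIMARY KEY, name TEXT UNIQUE);"
        "CREATE TABLE cvterm (cvterm_id INTEGER PRIMARY KEY, cv_id INTEGER, name TEXT);"
    )
    for cv_name, term_name in terms:
        conn.execute("INSERT OR IGNORE INTO cv (name) VALUES (?)", (cv_name,))
        cv_id = conn.execute(
            "SELECT cv_id FROM cv WHERE name = ?", (cv_name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO cvterm (cv_id, name) VALUES (?, ?)", (cv_id, term_name)
        )
    return conn


def variable_of_status(conn, relationship_cv="relationship"):
    """Classify the state of the 'variable_of' cvterm."""
    rows = conn.execute(
        "SELECT cv.name FROM cvterm t JOIN cv ON cv.cv_id = t.cv_id "
        "WHERE t.name = ?",
        ("variable_of",),
    ).fetchall()
    cv_names = [row[0] for row in rows]
    if not cv_names:
        return "missing"      # no such term at all
    if relationship_cv not in cv_names:
        return "wrong_cv"     # term exists, but not under the expected cv
    if len(cv_names) > 1:
        return "duplicated"   # expected term plus at least one extra copy
    return "ok"


print(variable_of_status(make_db([])))                                # missing
print(variable_of_status(make_db([("relationship", "variable_of")])))  # ok
```

The same SELECT, run directly in psql against the real database, should show which of these situations applies.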

@isaak
Member Author

isaak commented Jan 13, 2025

The broken analysis details page is fixed in #5264. I merged master (with the fix) into the branch, so that should be resolved.

The 'variable_of' cvterm with the reference to the right cv should be in the database. That is required for anything related to traits.

Could you reproduce how the results links are all showing at the bottom, instead of right below the corresponding plots?

It should be as below:
image

@ryan-preble
Contributor

> The broken analysis details page is fixed in #5264. I merged master (with the fix) into the branch, so that should be resolved.
>
> Could you reproduce how the results links are all showing at the bottom, instead of right below the corresponding plots?

  1. Use a dataset to create a PCA. Use the link in the user's home account to view that analysis on the PCA page.
  2. At the PCA page, run the exact same PCA using the same dataset. The results will be immediately loaded, but with two identical plots and two identical buttons.

@isaak
Member Author

isaak commented Jan 13, 2025

> > The broken analysis details page is fixed in #5264. I merged master (with the fix) into the branch, so that should be resolved.
> > Could you reproduce how the results links are all showing at the bottom, instead of right below the corresponding plots?
>
> 1. Use a dataset to create a PCA. Use the link in the user's home account to view that analysis on the PCA page.
> 2. At the PCA page, run the exact same PCA using the same dataset. The results will be immediately loaded, but with two identical plots and two identical buttons.

Thanks, Ryan. I reproduced it. If you followed the results link from your profile page, it should have loaded the results in the PCA page like below. Doesn't it do the same for you? It is not necessary to click the run PCA button again for the same dataset; doing so displays the same output a second time. I will think about a solution for it anyway.

image
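
One possible way to avoid showing the same output twice is to key each displayed result by its dataset and parameters and skip the render when that key is already on the page. The sketch below is only an illustration of that idea in Python; it is not how the PCA page is implemented, and `PcaResultsView`, its `show` method, and the parameter names are all hypothetical.

```python
class PcaResultsView:
    """Tracks which PCA outputs are already displayed (hypothetical helper)."""

    def __init__(self):
        # Maps a (dataset_id, sorted parameters) key to the rendered output.
        self.outputs = {}

    def show(self, dataset_id, params, plot):
        """Display a plot unless an identical analysis is already shown.

        Returns True if the plot was added, False if it was a duplicate.
        """
        # Sorting the items makes the key independent of dict ordering.
        key = (dataset_id, tuple(sorted(params.items())))
        if key in self.outputs:
            return False
        self.outputs[key] = plot
        return True


view = PcaResultsView()
first = view.show("dataset_1", {"data_type": "genotype"}, "plot A")
second = view.show("dataset_1", {"data_type": "genotype"}, "plot A again")
print(first, second, len(view.outputs))  # True False 1
```

With a check like this, re-running the same PCA for the same dataset would leave a single plot and a single save button on the page.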

@ryan-preble
Contributor

> > > The broken analysis details page is fixed in #5264. I merged master (with the fix) into the branch, so that should be resolved.
> > > Could you reproduce how the results links are all showing at the bottom, instead of right below the corresponding plots?
> >
> > 1. Use a dataset to create a PCA. Use the link in the user's home account to view that analysis on the PCA page.
> > 2. At the PCA page, run the exact same PCA using the same dataset. The results will be immediately loaded, but with two identical plots and two identical buttons.
>
> Thanks, Ryan. I reproduced it. If you followed the results link from your profile page, it should have loaded the results in the PCA page like below. Doesn't it do the same for you? It is not necessary to click the run PCA button again for the same dataset; doing so displays the same output a second time. I will think about a solution for it anyway.

Yes, it works so long as one does not try to run the PCA again for the same dataset. However, I am still getting an error when actually saving the results. When clicking the save PCs button, I get the following error message:
image
image
I can then click the save button again, and it will allow me to visit the analysis details page. This mostly works; however, attempting to open the analysis results tab crashes it.
image

Any idea why the plots would be invalid? It could be related to the data in the test fixture - not sure.

@isaak
Member Author

isaak commented Jan 20, 2025

> 2. At the PCA page, run the exact same PCA using the same dataset. The results will be immediately loaded, but with two identical plots and two identical buttons.

Fixed in 7e74097.
