Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle long requests with POST #42

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ESapenaVentura
Copy link
Contributor

@ESapenaVentura ESapenaVentura commented Jan 19, 2023

As explained in the documentation for Esearch and EFetch, when dealing with long lists of terms/ids, they need to be POSTed rather than included in the url as a parameter.

This fixes an issue with GSE196830, which has too many runs.

Also, minor fix on the requests library, which was imported twice

Copy link
Contributor

@amnonkhen amnonkhen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create a ticket citing relevant accession for which this problem was relevant. You can look as examples the existing geo-to-hca tickets.
Add the ticket id to the pr.

"usehistory": "y",
"format": "json",
}
if len(term) <= 200:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use "magic numbers" in the code.
Define it as a constant.
If the number's source's external documentation, please add a link to the relevant document.

Copy link
Contributor Author

@ESapenaVentura ESapenaVentura Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation is very cryptic, literally says more than several hundred characters long for eSearch, and if more than about 200 UIDs are to be provided, the request should be made using the HTTP POST method for EFetch. That's why I chose 200, the terms field can probably take more characters but I'd need to make tests to see how many hundred.

The documentation is here https://www.ncbi.nlm.nih.gov/books/NBK25499/#chapter4.ESearch for Esearch and https://www.ncbi.nlm.nih.gov/books/NBK25499/#_chapter4_EFetch_ for EFetch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add it as a constant, though!

if efetch_response.status_code == STATUS_ERROR_CODE:
raise handle_errors.NotFoundSRA(efetch_response, accessions)
return efetch_response
elif mode == 'prepare':
return Request(method='GET',
if len(accessions_string) <= 200:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My earlier comment about the '200' magic number is amplified here.
Using something more then once is even a stringer reason to have it as a constant.

@amnonkhen amnonkhen changed the title Properly handling long requests with POSTS handle long requests with POST Jan 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants