[ENG-6284] render tsv/csv #834
Conversation
Force-pushed from 2503bce to 47d2150
move SKIPPABLE_COLUMNS into osfmap
Force-pushed from 47d2150 to a81abf7
Force-pushed from b75d4e0 to ce20fc4
A couple nits or questions, but nothing blocking. Tests look sufficient, but behavior should still be confirmed manually on staging.
Pass complete
trove/trovesearch/page_cursor.py (outdated)

```diff
@@ -14,10 +14,13 @@
 MANY_MORE = -1
 MAX_OFFSET = 9997
+
+DEFAULT_PAGE_SIZE = 13
+MAX_PAGE_SIZE = 10000
```
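For context, a minimal sketch of how constants like these are typically used to bound a requested `page[size]` (assumed clamping behavior for illustration, not this module's actual code):

```python
# Assumed clamping behavior for illustration -- not the actual page_cursor logic.
DEFAULT_PAGE_SIZE = 13
MAX_PAGE_SIZE = 10000

def clamped_page_size(requested_size: int | None) -> int:
    """Fall back to the default when unspecified; never exceed the maximum."""
    if requested_size is None:
        return DEFAULT_PAGE_SIZE
    return max(1, min(requested_size, MAX_PAGE_SIZE))

assert clamped_page_size(None) == 13
assert clamped_page_size(123456) == 10000
```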
Minor: Is this maximum reasonable? Looks like it was previously 101.
Edit: I see the commit message called it "absurd," but I'm guessing it's also "justified for the sake of rendering files"?
yeah the need here is downloading all results in one response, but i hesitated to make that behavior automagic by mediatype... considered making `withFileName` obviate pagination whenever present, but overall i opted for consistent query param behavior, putting the onus on the client to string together all the params needed for the desired result (e.g. `acceptMediatype=text/csv&page[size]=10000&withFileName=my-file-name` for a full csv download with up to 10000 rows)
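For illustration, a client wanting a full csv in one response strings those params together itself -- roughly like this sketch (host, path, and search params are placeholders, not confirmed endpoints):

```python
# Rough client-side sketch of the query-param combination described above.
# The URL and search params are placeholders; only acceptMediatype, page[size],
# and withFileName come from this discussion.
import requests

response = requests.get(
    "https://example.org/trove/index-card-search",  # placeholder endpoint
    params={
        "cardSearchText": "open science",   # placeholder search param
        "acceptMediatype": "text/csv",
        "page[size]": 10000,
        "withFileName": "my-file-name",
    },
)
with open("my-file-name.csv", "wb") as f:
    f.write(response.content)
```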
if 10000 at once turns out to be unreasonable in practice... a more complicated (but less costly than all-at-once) alternative might be view logic that queries/renders smaller pages one at a time and streams the results
update: now streams, loading only one page (~100 rows) at a time, but streaming more than ~4000 items total still times out -- can further optimize or we can talk about increasing those timeouts for responses that are actively sending data...
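A minimal sketch of that page-at-a-time streaming shape (not the actual view code; `search_one_page` is a hypothetical stand-in for the real search/render logic):

```python
# Sketch of streaming csv output one page of results at a time.
# `search_one_page` is hypothetical; the StreamingHttpResponse usage is standard django.
import csv
import io
from django.http import StreamingHttpResponse

def _csv_line(row) -> str:
    _buf = io.StringIO()
    csv.writer(_buf).writerow(row)
    return _buf.getvalue()

def streamed_csv_view(request):
    def _iter_lines():
        cursor = None
        while True:
            # fetch ~100 rows per query; returns (rows, next_cursor_or_None)
            rows, cursor = search_one_page(request, page_size=100, cursor=cursor)
            for row in rows:
                yield _csv_line(row)
            if cursor is None:
                break
    return StreamingHttpResponse(_iter_lines(), content_type='text/csv')
```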
We might not run into those same timeouts for ~4k items with production resourcing (or configuration -- unsure where you got that figure, but by default most nginx timeouts are between successive operations rather than the whole response), but I suspect it's fine for now and we can reevaluate if encountering that issue later.
CardsearchResponse => CardsearchHandle, ValuesearchResponse => ValuesearchHandle
Force-pushed from 88e566a to f3def1e
LGTM
allow rendering search responses as lines of tab-separated or comma-separated values

main point:

- `simple_tsv` and `simple_csv` renderers in `trove.render` (rough shape sketched after this list)
- requested via `acceptMediatype=text/tab-separated-values` or `acceptMediatype=text/csv`
- default columns from `DEFAULT_TABULAR_SEARCH_COLUMN_PATHS` in `trove.vocab.osfmap`
- `withFileName=foo` query param to get a response with `Content-Disposition: attachment` and a filename based on "foo"

changes made along the way:

- `ProtoRendering` as renderer output type, to better decouple rendering from view logic
- `StreamableRendering` for responses that could be streamed, like csv/tsv (tho it's not currently handled any differently from `SimpleRendering`)
- `BaseRenderer` (and each existing renderer) updated to have a consistent call signature (and return `ProtoRendering`)
- `trove.render.get_renderer` replaced with `trove.render.get_renderer_type` -- instantiate the renderer with response data
- `trove.views._responder` with common logic for building a django `HttpResponse` for a `ProtoRendering`, including `withFileName`/`Content-Disposition`
- moved some definitions into `trove.vocab.osfmap` for easier reuse
- `trove.render.simple_json` split into `trove.render._simple_trovesearch` (for renderers that include only the list of search results)
- `tests.trove.derive._base` moved into `tests.trove._input_output_tests` (for tests following the same simple input/output pattern as deriver and renderer tests)
- `tests.trove.render` added to cover the new renderers `simple_tsv` and `simple_csv`, as well as the existing renderers `jsonapi`, `simple_json`, `jsonld`, and `turtle`
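As sketched loosely below, the general shape of the tabular rendering is one header row of column names followed by one row per search result. This is an illustrative sketch only, not the actual `simple_tsv`/`simple_csv` code; the function name and column handling are simplified assumptions.

```python
# Illustrative sketch of tab/comma-separated rendering of search results --
# not the actual simple_tsv/simple_csv renderers; names and column handling are simplified.
import csv
import io

def render_tabular(column_names: list[str], result_rows: list[dict], delimiter: str = '\t') -> str:
    _buf = io.StringIO()
    _writer = csv.writer(_buf, delimiter=delimiter)
    _writer.writerow(column_names)  # header row: one cell per configured column path
    for _result in result_rows:
        _writer.writerow(_result.get(_name, '') for _name in column_names)
    return _buf.getvalue()

# e.g. render_tabular(['title', 'dateCreated'],
#                     [{'title': 'hello', 'dateCreated': '2024-01-01'}])
```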