Issue on docs #116

Closed
tinneyi opened this issue Oct 15, 2024 · 3 comments

tinneyi commented Oct 15, 2024

Path: /restapi/pagination

Your docs state:
The minCursor and maxCursor fields in the response are boundaries that help you page through the result set.

For queries with descending order (_time DESC), use minCursor from the response as the cursor in your next request to go to the next page. You reach the end when your provided cursor matches the minCursor in the response.

For queries with ascending order (_time ASC), use maxCursor from the response as the cursor in your next request to go to the next page. You reach the end when your provided cursor matches the maxCursor in the response.
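
The rule quoted above can be sketched as two small helpers. This is only an illustration of the rule as quoted here: the names `next_cursor` and `reached_end` are made up for this sketch, and `status` is assumed to be the "status" object of a response, as printed in the examples in this issue.

```python
def next_cursor(status, descending=True):
    """Pick the cursor for the next request, following the quoted rule."""
    # Descending sorts continue from minCursor, ascending sorts from maxCursor.
    return status["minCursor"] if descending else status["maxCursor"]


def reached_end(sent_cursor, status, descending=True):
    """True when the cursor that was sent comes back as the boundary cursor."""
    if sent_cursor is None:
        # First request: no cursor was sent, so the rule has nothing to compare.
        return False
    return sent_cursor == next_cursor(status, descending)
```

On the very first request no cursor has been sent, so this rule by itself cannot report the end of the result set at that point.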

But this is not quite the whole story.

Descending

If your sort is "desc", the trailing segment of minCursor starts at a non-zero value (00001388 in the first response below) and counts down with each page until it reaches zero (00000000). At that point we know we have all the results.

EXAMPLE 1:
POST for first 10000 records of 28176.

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time | limit 10000", "startTime": "2024-08-01T10:10:10Z"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'

{'elapsedTime': 608386, 'minCursor': '0d44kphznnn5s-076250b666003989-00001388', 'maxCursor': '0d4jht0ruubk0-07685b45ed0027bd-00000d9b', 'blocksExamined': 6, 'rowsExamined': 28176, 'rowsMatched': 28176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}

POST for second 10000 records of 28176.

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time | limit 10000", "startTime": "2024-08-01T10:10:10Z", "cursor": "0d44kphznnn5s-076250b666003989-0000138"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'

{'elapsedTime': 371562, 'minCursor': '0d3woxi4micxs-076250b666003989-000007d0', 'maxCursor': '0d44kmblnhhj4-07685b45ed0027bd-000003e7', 'blocksExamined': 6, 'rowsExamined': 28176, 'rowsMatched': 18176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}

POST for the third and final page (the remaining 8176 records of 28176).

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time | limit 10000", "startTime": "2024-08-01T10:10:10Z", "cursor": "0d3woxi4micxs-076250b666003989-000007d0"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'

{'elapsedTime': 587296, 'minCursor': '0d3jzb5diil8g-076250b666003989-00000000', 'maxCursor': '0d49i41y7iuio-076250b666003989-0000163b', 'blocksExamined': 4, 'rowsExamined': 19692, 'rowsMatched': 8176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-24T12:24:54Z'}
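
To make the countdown easier to see, the trailing segment of each minCursor above can be read as a number. This sketch assumes the segment is hexadecimal, which nothing in this thread confirms; `cursor_suffix` is a hypothetical helper name.

```python
def cursor_suffix(cursor):
    """Return the last dash-separated segment of a cursor as an integer.

    Assumption: the segment is hexadecimal; this is a guess, not documented.
    """
    return int(cursor.rsplit("-", 1)[1], 16)


# minCursor values copied from the three descending responses in EXAMPLE 1
min_cursors = [
    "0d44kphznnn5s-076250b666003989-00001388",
    "0d3woxi4micxs-076250b666003989-000007d0",
    "0d3jzb5diil8g-076250b666003989-00000000",
]
suffixes = [cursor_suffix(c) for c in min_cursors]
print(suffixes)  # prints [5000, 2000, 0]
```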

Ascending

However, if you sort ascending, the trailing segment of minCursor begins at zero, increases on the next page, and then returns to zero on the last page of results.

EXAMPLE 2:
POST for first 10000 records of 28176.

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time asc | limit 10000", "startTime": "2024-08-01T10:10:10Z"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'
                                                 
{'elapsedTime': 414633, 'minCursor': '0d3jzb5diil8g-076250b666003989-00000000', 'maxCursor': '0d3xnxbrmi7sw-07685b451f000f26-00000376', 'blocksExamined': 6, 'rowsExamined': 28176, 'rowsMatched': 28176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}

POST for second 10000 records of 28176.

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time asc | limit 10000", "startTime": "2024-08-01T10:10:10Z", "cursor": "0d3xnxbrmi7sw-07685b451f000f26-00000376"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'

{'elapsedTime': 507944, 'minCursor': '0d3xnxvzayx34-076250b666003989-000007d0', 'maxCursor': '0d47oxdr4jv28-07685b45ed0027bd-00000757', 'blocksExamined': 6, 'rowsExamined': 28176, 'rowsMatched': 18176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}

POST for the third and final page (the remaining 8176 records of 28176).

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time asc | limit 10000", "startTime": "2024-08-01T10:10:10Z", "cursor": "0d47oxdr4jv28-07685b45ed0027bd-00000757"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'

{'elapsedTime': 273659, 'minCursor': '0d3jzb5diil8g-076250b666003989-00000000', 'maxCursor': '0d4jht0ruubk0-07685b45ed0027bd-00000d9b', 'blocksExamined': 5, 'rowsExamined': 25176, 'rowsMatched': 8176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}
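
One pattern in the status objects quoted in EXAMPLE 1 and EXAMPLE 2: rowsMatched drops by the page size on each request (28176, 18176, 8176), as if it counted the rows matched from the cursor onward. Assuming that reading is right, which nothing in this thread confirms, a last-page check might look like the hypothetical helper below.

```python
def looks_like_last_page(status, limit):
    """Heuristic only: the rows still matched fit within a single page."""
    return status["rowsMatched"] <= limit


# rowsMatched values copied from the three responses in EXAMPLE 2 (limit 10000)
pages = [{"rowsMatched": 28176}, {"rowsMatched": 18176}, {"rowsMatched": 8176}]
flags = [looks_like_last_page(p, 10000) for p in pages]
print(flags)  # prints [False, False, True]
```

This is an observation about these specific responses, not documented behavior, so it should be checked against the API before relying on it.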

QUESTION:

For ascending sorts, because minCursor starts and ends at zero, how do I know whether I already received all the records in the first POST?
EXAMPLE 3:
Here I set the limit to 30,000 with only 28,176 matching results, so the whole result set comes back in one go:

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time asc | limit 30000", "startTime": "2024-08-01T10:10:10Z"}' --output - | python -c 'import json,sys; print(json.load(sys.stdin)["status"])'  
                                                   
{'elapsedTime': 480645, 'minCursor': '0d3jzb5diil8g-076250b666003989-00000000', 'maxCursor': '0d4jht0ruubk0-07685b45ed0027bd-00000d9b', 'blocksExamined': 6, 'rowsExamined': 28176, 'rowsMatched': 28176, 'numGroups': 0, 'isPartial': False, 'cacheStatus': 1, 'minBlockTime': '2024-08-19T15:04:37Z', 'maxBlockTime': '2024-09-30T08:57:36Z'}

But I don't know that there are no more records, so I make a second POST using the maxCursor:

curl -sDX 'https://api.axiom.co/v1/datasets/_apl?format=tabular' -H 'Authorization: Bearer <REDACTED>' -H 'Accept: application/json' -H 'Content-Type: application/json' -d '{"apl": "bank_feeds | sort by _time asc | limit 30000", "startTime": "2024-08-01T10:10:10Z", "cursor": "0d4jht0ruubk0-07685b45ed0027bd-00000d9b"}' --output -
                                                                     
{"code":500,"message":"internal server error"}
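
The earlier commands pipe the body through a Python one-liner that indexes ["status"], which would raise a KeyError on an error body like the one above. Here is a small sketch of a parser that separates the two cases; `extract_status` is a hypothetical helper name, and the error shape is taken only from the response shown here.

```python
import json


def extract_status(body):
    """Return the "status" object, or raise with the server's error details."""
    payload = json.loads(body)
    if "status" in payload:
        return payload["status"]
    # Error bodies in this issue look like {"code": 500, "message": "..."}
    raise RuntimeError(f'{payload.get("code")}: {payload.get("message")}')
```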

I cannot see any way to correctly identify that there are no more records to ask for.

tothmano commented Jan 8, 2025

@tinneyi Many thanks for this and sorry about the very late reply.

We have recently changed the docs on pagination and recommend timestamp-based pagination for most users: https://axiom.co/docs/restapi/pagination

Could you please let me know if this addresses your concerns?
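
A minimal sketch of the general idea behind timestamp-based pagination, not of the procedure in the linked docs. `run_query` is a hypothetical callable standing in for the API request; it is assumed to return rows sorted by _time ascending, each a dict with a "_time" key, and only rows with _time strictly greater than `after`. Timestamps are assumed unique; with duplicates, rows sharing the boundary timestamp could be skipped.

```python
def paginate_by_time(run_query, limit, after=None):
    """Yield every row, restarting each query after the last _time seen.

    run_query(after, limit) is a hypothetical stand-in for the real request.
    """
    while True:
        rows = run_query(after, limit)
        yield from rows
        if len(rows) < limit:
            # A short (or empty) page means nothing is left to fetch.
            return
        # Continue from the timestamp of the last row on this page.
        after = rows[-1]["_time"]
```

With this shape the end of the result set is signalled by a short page rather than by comparing cursors.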

@tothmano tothmano self-assigned this Jan 8, 2025

tinneyi commented Jan 8, 2025 via email

tothmano commented

@tinneyi Many thanks for the detailed explanation. I'm sorry you had to figure this out yourself. We will do our best to provide better guidance and quicker responses in the future.
