indexer-alt: allow sequential pipeline immediate reset (#20119)
## Description

If the sequential pipeline committer can tell from its pending buffer that it could process more checkpoints, we now reset the polling interval immediately, so it does not wait for the next tick before issuing the next write. This mimics similar behaviour in the concurrent pipeline.

I made this change after noticing, on my local machine, how the pipeline behaves when ingestion is stuck retrying a checkpoint. When running locally, performance is usually limited by the checkpoint download rate. In a sequential pipeline, however, if one checkpoint fails to download, many later checkpoints can end up processed and pending behind it.

With the previous implementation, once ingestion had recovered (the missing checkpoint had been fetched), the pending buffer kept growing, because the committer could only land checkpoints at a rate of `MAX_BATCH_CHECKPOINTS / commit_interval`. If checkpoints were being added faster than that, it would never recover.

With this change, the pipeline recovers almost instantly. I expect that in GCP, where bandwidth is not the rate-limiting factor, this should also improve throughput during backfill and in synthetic benchmarks.

## Test plan

Run the indexer with a large ingestion buffer and high concurrency, wait for ingestion to fail to fetch a checkpoint, and then observe that the situation recovers (instead of getting worse until the pipeline eventually complains that it has too many pending checkpoints):

```
sui$ cargo run -p sui-indexer-alt --release --                                     \
  --database-url "postgres://postgres:postgrespw@localhost:5432/sui_indexer_alt" \
  indexer --remote-store-url https://checkpoints.mainnet.sui.io                  \
  --last-checkpoint 1200000 --pipeline sum_packages                              \
  --checkpoint-buffer-size 50000 --ingest-concurrency 20000
```

## Stack

- #20089
- #20114
- #20116
- #20117

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:
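
## Illustration

The recovery behaviour described under Description can be sketched with a small, std-only simulation. This is not the code from this change: `Committer`, `NextPoll`, `commit_batch`, and `interval_waits_to_drain` are hypothetical names, and the value `4` for `MAX_BATCH_CHECKPOINTS` is an assumption chosen to keep the numbers small. The sketch only shows why looking at the pending buffer after each write lets a backlog drain without waiting out one commit interval per batch.

```rust
use std::collections::BTreeSet;

/// Assumed batch cap for this sketch; the real value is not stated in the description.
const MAX_BATCH_CHECKPOINTS: usize = 4;

/// What the committer does after a write (hypothetical type).
#[derive(Debug, PartialEq)]
enum NextPoll {
    /// Wait for the regular commit interval to elapse.
    WaitForInterval,
    /// Poll again straight away: the next checkpoint is already pending.
    Immediate,
}

/// Toy stand-in for the sequential committer's state.
struct Committer {
    /// Sequence number of the next checkpoint that must be written.
    next_checkpoint: u64,
    /// Processed checkpoints waiting to be written, ordered by sequence number.
    pending: BTreeSet<u64>,
}

impl Committer {
    /// Write up to `MAX_BATCH_CHECKPOINTS` contiguous checkpoints in order,
    /// then decide from the pending buffer whether more work is guaranteed.
    fn commit_batch(&mut self) -> (usize, NextPoll) {
        let mut written = 0;
        while written < MAX_BATCH_CHECKPOINTS && self.pending.remove(&self.next_checkpoint) {
            self.next_checkpoint += 1;
            written += 1;
        }
        let next = if self.pending.contains(&self.next_checkpoint) {
            NextPoll::Immediate
        } else {
            NextPoll::WaitForInterval
        };
        (written, next)
    }
}

/// Count how many full commit intervals elapse before a contiguous backlog
/// of `backlog` checkpoints (starting at 0) is completely written.
fn interval_waits_to_drain(backlog: u64, immediate_reset: bool) -> usize {
    let mut committer = Committer {
        next_checkpoint: 0,
        pending: (0..backlog).collect(),
    };
    let mut waits = 0;
    let mut skip_wait = false;
    while !committer.pending.is_empty() {
        if !skip_wait {
            waits += 1;
        }
        let (_, next) = committer.commit_batch();
        // Old behaviour: always wait. New behaviour: skip the wait when more work is pending.
        skip_wait = immediate_reset && next == NextPoll::Immediate;
    }
    waits
}

fn main() {
    let without = interval_waits_to_drain(10, false);
    let with = interval_waits_to_drain(10, true);
    // prints "interval waits without reset: 3, with reset: 1"
    println!("interval waits without reset: {without}, with reset: {with}");
}
```

In this toy model, a backlog of 10 checkpoints needs three interval waits when every batch waits for the next tick, and one when the interval is reset as soon as the buffer shows the next checkpoint is ready. A gap at `next_checkpoint` still falls back to waiting, which matches the requirement that the sequential pipeline writes checkpoints in order.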