indexer-alt: epochs pipelines (#20150)
## Description

Adds two tables and pipelines, `kv_epoch_starts` and `kv_epoch_ends`, to index epoch information. These pipelines differ from epoch indexing in `sui-indexer` in a number of ways:

- They are append-only. The columns that are written at the start and end of the epoch are split into two separate tables that can be written to concurrently (by separate pipelines).
- The first row of `kv_epoch_starts` is written by the bootstrap process, which also seeds the `kv_genesis` table. This avoids having to condition on whether the checkpoint being processed is the genesis checkpoint in the main pipeline.
- Instead of indexing the number of transactions in the epoch, it tracks the transaction high watermark. Readers will need to read the records to calculate the total number of transactions (this avoids having to read the last epoch's total transactions in the write path).
- We index the `SuiSystemState` object as BCS, rather than the summary structure.
- We explicitly record whether the epoch advancement at the end of the epoch triggered safe mode (the system state object also tracks whether the epoch was started in safe mode).
- Fields related to information that came from `SystemEpochInfoEvent` have all been consolidated in `kv_epoch_ends`, and they are all optional, in case of safe mode.

It's worth elaborating on the last bullet point, because this is a subtle but large change:

- Today, `total_stake` and `storage_fund_balance` are written at the start of an epoch based on the fields of the `SystemEpochInfoEvent` emitted from the previous epoch, and are `NOT NULL`.
- The remaining fields are nullable, but only because they are written later, once the epoch is over.

This was awkward to work with in a couple of ways:

- It meant that for the genesis epoch, we needed to make some numbers up (all zeroes) because we did not have an event to read from.
- We had to do something similar if we hit safe mode.
- When indexing the start and end of epochs separately, it meant that we had to duplicate work (finding the system epoch info event).

By making the fields nullable, and consolidating them in `kv_epoch_ends`, we can simplify the pipelines:

- `kv_epoch_starts` and the bootstrapping logic can work purely based on the system state object.
- `kv_epoch_ends` can work purely based on the `SystemEpochInfoEvent`, and can leave fields `NULL` if we are in safe mode.

In the case of `kv_epoch_starts` we could also have cut down fields to just `epoch`, `cp_lo` and `system_state`. I chose not to do this because the system state is quite a large object, and it is beneficial to avoid having to deserialize it to answer simpler queries.

## Test plan

Ran the indexer on the first 1M checkpoints, and compared the resulting rows in the respective tables with the data that the current indexer produced:

```
sui_indexer_alt=# SELECT epoch, protocol_version, cp_lo, TO_TIMESTAMP(start_timestamp_ms / 1000), reference_gas_price FROM kv_epoch_starts;
 epoch | protocol_version | cp_lo  |      to_timestamp      | reference_gas_price
-------+------------------+--------+------------------------+---------------------
     0 |                4 |      0 | 2023-04-12 18:00:00+01 |                1000
     1 |                4 |   9770 | 2023-04-13 18:00:02+01 |                1000
     2 |                4 |  85169 | 2023-04-14 18:00:04+01 |                1000
     3 |                4 | 161192 | 2023-04-15 18:00:08+01 |                1000
     4 |                4 | 237074 | 2023-04-16 18:00:11+01 |                1000
     5 |                4 | 314160 | 2023-04-17 18:00:15+01 |                1000
     6 |                4 | 391107 | 2023-04-18 18:00:18+01 |                1000
     7 |                4 | 467716 | 2023-04-19 18:00:21+01 |                1000
     8 |                4 | 544978 | 2023-04-20 18:00:26+01 |                1000
     9 |                5 | 621933 | 2023-04-21 18:00:28+01 |                1000
    10 |                6 | 699410 | 2023-04-22 18:00:31+01 |                1000
    11 |                6 | 777074 | 2023-04-23 18:00:34+01 |                1000
    12 |                6 | 855530 | 2023-04-24 18:00:36+01 |                1000
    13 |                6 | 933559 | 2023-04-25 18:00:39+01 |                1000
(14 rows)

sui_indexer_alt=# SELECT epoch, cp_hi, tx_hi, TO_TIMESTAMP(end_timestamp_ms / 1000), safe_mode, storage_fund_balance, storage_fund_reinvestment, storage_charge, storage_rebate, stake_subsidy_amount, total_gas_fees, total_stake_rewards_distributed, leftover_storage_fund_inflow FROM kv_epoch_ends ORDER BY epoch ASC;
 epoch | cp_hi  | tx_hi  |      to_timestamp      | safe_mode | storage_fund_balance | storage_fund_reinvestment | storage_charge | storage_rebate | stake_subsidy_amount | total_gas_fees | total_stake_rewards_distributed | leftover_storage_fund_inflow
-------+--------+--------+------------------------+-----------+----------------------+---------------------------+----------------+----------------+----------------------+----------------+---------------------------------+------------------------------
     0 |   9770 |   9771 | 2023-04-13 18:00:02+01 | f         |                    0 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
     1 |  85169 |  85174 | 2023-04-14 18:00:04+01 | f         |              2973880 |                         0 |        3952000 |         978120 |                    0 |      102000000 |                       102000000 |                            0
     2 | 161192 | 161199 | 2023-04-15 18:00:08+01 | f         |            717398960 |                         0 |      715403200 |         978120 |                    0 |        1000000 |                         1000000 |                            0
     3 | 237074 | 237084 | 2023-04-16 18:00:11+01 | f         |            733657184 |                         0 |     1430198400 |     1413940176 |                    0 |        2000000 |                         2000000 |                            0
     4 | 314160 | 314171 | 2023-04-17 18:00:15+01 | f         |            733657184 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
     5 | 391107 | 391119 | 2023-04-18 18:00:18+01 | f         |            733657184 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
     6 | 467716 | 467730 | 2023-04-19 18:00:21+01 | f         |            735633184 |                         0 |        1976000 |              0 |                    0 |        1000000 |                         1000000 |                            0
     7 | 544978 | 544994 | 2023-04-20 18:00:26+01 | f         |            729859616 |                         0 |      702475600 |      708249168 |                    0 |        1000000 |                         1000000 |                            0
     8 | 621933 | 621950 | 2023-04-21 18:00:28+01 | f         |            729859616 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
     9 | 699410 | 699428 | 2023-04-22 18:00:31+01 | f         |            729859616 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
    10 | 777074 | 777093 | 2023-04-23 18:00:34+01 | f         |            729859616 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
    11 | 855530 | 855550 | 2023-04-24 18:00:36+01 | f         |            729859616 |                         0 |              0 |              0 |                    0 |              0 |                               0 |                            0
    12 | 933559 | 933586 | 2023-04-25 18:00:39+01 | f         |            735866656 |                         0 |       13832000 |        7824960 |                    0 |        6000000 |                         6000000 |                            0
(13 rows)
```

## Stack

- #20118
- #20132
- #20147
- #20148
- #20149

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:
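
As a reader-side illustration of the high-watermark design described in the Description, the sketch below derives per-epoch transaction counts from `tx_hi`. It is not taken from the PR: the `EpochEnd` struct, `tx_counts`, and `sample_rows` are hypothetical names, and it assumes that `tx_hi` is an exclusive upper bound on transaction sequence numbers and that rows arrive sorted by epoch starting from epoch 0. The sample values are the first three `kv_epoch_ends` rows from the test plan.

```rust
/// Hypothetical, trimmed-down view of a `kv_epoch_ends` row.
/// The real table has more columns; this struct is for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct EpochEnd {
    pub epoch: u64,
    /// Transaction high watermark (assumed to be an exclusive upper bound).
    pub tx_hi: u64,
    pub safe_mode: bool,
    /// Event-derived fields are optional, since they may be NULL in safe mode.
    pub total_gas_fees: Option<u64>,
}

/// Derive `(epoch, transaction count)` pairs from consecutive watermarks.
/// Assumes `ends` is sorted by epoch, contiguous, and starts at epoch 0,
/// so the watermark before the first row is 0.
pub fn tx_counts(ends: &[EpochEnd]) -> Vec<(u64, u64)> {
    let mut prev_hi = 0u64;
    ends.iter()
        .map(|end| {
            // Saturate rather than underflow if the input is not monotonic.
            let count = end.tx_hi.saturating_sub(prev_hi);
            prev_hi = end.tx_hi;
            (end.epoch, count)
        })
        .collect()
}

/// The first three `kv_epoch_ends` rows from the test plan output.
pub fn sample_rows() -> Vec<EpochEnd> {
    vec![
        EpochEnd { epoch: 0, tx_hi: 9771, safe_mode: false, total_gas_fees: Some(0) },
        EpochEnd { epoch: 1, tx_hi: 85174, safe_mode: false, total_gas_fees: Some(102_000_000) },
        EpochEnd { epoch: 2, tx_hi: 161199, safe_mode: false, total_gas_fees: Some(1_000_000) },
    ]
}
```

Under these assumptions, `tx_counts(&sample_rows())` gives 9771 transactions for epoch 0, 75403 for epoch 1, and 76025 for epoch 2. The subtraction happens entirely on the read path, which is the trade-off the description makes to keep the write path from reading the previous epoch's totals.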