-
Notifications
You must be signed in to change notification settings - Fork 6
Database Design
We use a few different data sources:
- ScyllaDB
- AWS S3
- NEAR JSON-RPC
The main source of data for the ReadRPC project is ScyllaDB. However, sometimes, the data size is too big to duplicate in ScyllaDB when the data is on AWS S3. Sometimes, we store a “map” of the AWS S3 data in the ScyllaDB to know exactly where to look.
Thus, the design of the ScyllaDB tables follows the approach for the most efficient way to read the data and return the result to a ReadRPC requester.
Data in ScyllaDB is filled with dedicated indexers built on top of the NEAR Lake Framework (Rust):
-
state-indexer
- everything account state related + metadata for blocks/chunks -
tx-indexer
- collecting “full” transactions (Transaction itself, related Receipts and ExecutionOutcomes) for faster reading
A contract state can be one of these types:
- Data - contract state data
- AccessKey_ - Access Keys added to an account
- Contract code - contract code itself
- Account - account-related data like balance
Since the read use cases for all these types are separate, we split them into separate tables to avoid unnecessary data filtering (following the recommended approach of ScyllaDB).
StateChangeWithCauseView
structure from near-primitives::views
is used.
While I haven’t created a doc where a reader can learn more about StateChangesWithCauseView
, I’ll try to explain in a few words here.
This structure represents all the changes to the state happening in the particular Block.
If a contract call happens and, as a result, its state has changed, it will be reflected in the StateChangeWithCauseView
for the block where it happened.
For example, if someone has created a post in NEAR Social, it means that the value of the state for the NEAR Social contract has changed
{"root.near": { post: { main: null } } }
{"root.near": { post: { main: { content: "Hello world!", type: "md", blockHeight: 100500 } } } }
Though the data structure might be different, all the state is represented as a key-value storage in the blockchain, so we can observe a change that will tell us:
The state for contract social.near
has been updated in this block. Now the value for the key <blob>
is <blob>
, thank you for your attention.
So, if we watch for all such changes, we can store them and have a history:
<block_height> <contract_id> <key> <value>
Thus, later, we can retrieve a value for a specific block.
account_id | block_height | block_hash | data_key | data_value |
---|---|---|---|---|
varchar | varint | varchar | varchar | BLOB |
hex of the key BLOB to be able to perform LIKE queries |
PARTITION KEY (account_id, data_key)
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
Note: NEAR blockchain nature allows more than one change to the same key within a single block. It will end up storing the very last of them, though. Thus, instead of relying on rewriting the same row repeatedly, we introduced a logic in the indexer where we ignore all the previous changes to the key and store only the last one in ScyllaDB.
P.S. The first naive approach was to rely on the UPSERT
, but asynchronous code screwed it, so we had the above logic.
account_id | block_height | block_hash | data_key | data_value |
---|---|---|---|---|
varchar | varint | varchar | varchar | BLOB |
hex of the key BLOB to be able to perform LIKE queries |
PARTITION KEY (account_id, data_key)
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
You may notice that the table state_changes_access_key
has the same structure as state_changes_data. This is because the data structure representing these types of data is the same. Thus, we just split the change types for more efficient read operations.
The same AccessKey
in the NEAR blockchain can be added and deleted for the same account a bunch of times within the same block, and yet we need to store only the very last change, so we have a custom logic to skip all the changes for the same account with the same AccessKey
except for the last change.
account_id | block_height | active_access_keys |
---|---|---|
varchar | varint | map<varchar, BLOB> |
Map key is the account’s PublicKey and the value is AccessKey (permission, allowance, etc.) |
PARTITION KEY account_id
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
This table efficiently answers the request “Give me all AccessKeys for the account for the given block”.
It stores the history of AccessKeys
for the account at a given moment.
The logic behind this table is hidden behind the feature account_access_keys
. Because having this naive approach leads to fat partitions, we need to redesign it entirely.
account_id | block_height | block_hash | data_value |
---|---|---|---|
varchar | varint | varchar | BLOB |
Contract code bytes |
PARTITION KEY account_id
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
This table stores the history of contract code changes.
account_id | block_height | block_hash | data_value |
---|---|---|---|
varchar | varint | varchar | BLOB |
Account data |
PARTITION KEY account_id
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
Similarly to contract code, this table stored the account-related data, represented by the near-primitives::account::Account
structure (amount, locked, code hash, storage usage, version)
It serves the needs for requests like “Give me account X for the block N”.
account_id | data_key |
---|---|
varchar | varchar |
Hex of the blob key to perform LIKE queries |
PARTITION KEY account_id
CLUSTERING KEY data_key
This helper table stores available contract keys, so-called the most recent ones.
By core primitives, I mean entities straight from the blockchain like Block
, Chunk
, Transaction
, Receipt
, ExecutionOutcome
, StateChanges
, etc.
chunk_hash | block_height | shard_id | stored_at_block_height |
---|---|---|---|
varchar | varint | varint | varint |
PRIMARY KEY (chunk_hash, block_height)
The chunks
table, along with the Global Secondary Index on block_height
serves the needs of methods like block
and chunk
.
This table is a “map” that helps the ReadRPC server to find the read data on AWS S3.
transaction_hash | block_height | account_id | transaction_details |
---|---|---|---|
varchar | varint | varchar | BLOB |
We used to have a TX hash collision on mainnet some time ago. Having this value prevents us from troubles | Borsh-serialized bytes of the structure TransactionDetails introduced in the tx-indexer (stores all the entities related to a single transaction) |
PARTITION KEY transaction_hash
CLUSTERING KEY block_height WITH CLUSTERING ORDER BY (block_height DESC)
This table serves the needs of a couple of methods. One of them returns the requested Transaction
with the initial ExecutionOutcome
and a Receipt
the Transaction
was converted into. The other one returns Transaction
and all the Receipts
and ExecutionOutcome
in the Transaction
chain.
We are collecting this data with the tx-indexer
and decided to store the entire response for fast response.
receipt_id | block_height | parent_transaction_hash | shard_id |
---|---|---|---|
varchar | varint | varchar | varint |
This is a way to make a dream come true | |||
PARTITION KEY receipt_id |
This table is a “map” to find Receipts
on AWS S3 faster. Meanwhile, storing the field parent_transaction_hash
is a way to make a brand-new method that can solve many problems for indexer developers.