Skip to content
This repository has been archived by the owner on Jun 9, 2024. It is now read-only.

perf(handler): add retry logic to broadcasting to cometbft mempool #1242

Merged
merged 4 commits into from
Oct 23, 2023

Conversation

itsdevbear
Copy link

@itsdevbear itsdevbear commented Oct 21, 2023

Summary by CodeRabbit

  • New Feature: Introduced a retry mechanism for failed transactions, enhancing the reliability of transaction broadcasting. This feature automatically attempts to re-broadcast any transactions that initially fail, improving the overall user experience by ensuring transactions are processed more consistently.
  • Test: Enhanced JSON RPC test handling with additional assertions for nonce values and a new helper function. This ensures more robust testing and validation of transactions, contributing to the overall stability and reliability of the system.
  • Refactor: Improved code structure and clarity by renaming and modifying key functions, making the system easier to maintain and update in the future.

@coderabbitai
Copy link

coderabbitai bot commented Oct 21, 2023

Walkthrough

The changes introduce a retry mechanism for failed transactions and enhance nonce handling in JSON RPC tests. The retry mechanism improves transaction broadcasting reliability, while the nonce handling changes ensure accurate transaction processing.

Changes

File Summary
cosmos/txpool/handler.go Introduced a retry mechanism for failed transactions. Added a new failedTx struct and failedTxs channel. Modified broadcastTransaction to track retries. Added a new failedLoop goroutine for retrying failed transactions. Renamed mainLoop to eventLoop.
e2e/localnet/.../fixture_test.go Enhanced nonce handling in JSON RPC tests. Added assertions for nonce values and a new helper function ExpectSuccessReceipt for transaction receipt verification.

"In the land of code, where logic is king, 🐇
A rabbit hopped forth, changes to bring. 📝
Transactions retry, if at first they fail, 🔄
Nonce handling improved, without any frail. 🛠️
Celebrate we must, for the code is now robust! 🎉"


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.

CodeRabbit Commands

  • Enter @coderabbitai pause as a PR comment to pause the reviews on a PR.
  • Enter @coderabbitai resume as a PR comment to resume the paused reviews.
  • For on-demand reviews, enter @coderabbitai review as a PR comment to trigger a review. This is useful when reviews are automatic disabled at the repository level.
  • Enter @coderabbitai: ignore in the PR description to stop a specific PR from being reviewed.

Note these commands work only in the PR comment. For conversations with the bot, please use the review comments on code diffs or files.

@itsdevbear itsdevbear changed the base branch from main to fix-test October 21, 2023 06:42
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between 4066dfb and 9ad14ff.
Files selected for processing (2)
  • cosmos/txpool/handler.go (9 hunks)
  • e2e/localnet/network/fixture_test.go (1 hunks)
Additional comments (Suppressed): 12
e2e/localnet/network/fixture_test.go (3)
  • 192-194: The result of contract.ConsumeGas is now assigned to tx which is used later to wait for the transaction receipt. This is a good practice as it allows for better tracking and handling of transactions.

  • 208-209: The ExpectSuccessReceipt function is used to wait for the transaction receipt and verify success. This is a good addition as it ensures that the transaction has been processed successfully before proceeding with the test.

  • 211-216: The test now checks that NonceAt and PendingNonce are equal after the transaction has been included in a block. This is a good practice as it verifies that the nonce values are updated correctly after a transaction.

cosmos/txpool/handler.go (9)
  • 38-44: The introduction of maxRetries and retryDelay constants is a good practice for managing the retry mechanism. However, ensure that these values are optimal for the system's performance and do not lead to unnecessary retries or delays.

  • 72-76: The failedTx struct is a good addition to handle failed transactions. It encapsulates the transaction and the number of retries, which is useful for the retry mechanism.

  • 94-94: The failedTxs channel is used to queue failed transactions for retry. This is a good approach to handle failed transactions and improve the reliability of transaction broadcasting.

  • 108-108: The failedTxs channel is initialized with the same size as the txsCh channel. Ensure that this size is optimal for the system's performance and does not lead to unnecessary memory usage.

  • 118-119: The mainLoop and failedLoop goroutines are started concurrently. This is a good approach to handle new transactions and retry failed transactions simultaneously.

  • 130-131: Two stop signals are sent to the stopCh channel to stop both the mainLoop and failedLoop goroutines. This is a good approach to ensure that both goroutines are stopped when the handler is stopped.

  • 156-173: The failedLoop function is a good addition to handle the retry of failed transactions. It listens for failed transactions from the failedTxs channel and retries them until the maximum number of retries is reached. The use of a delay between retries is a good practice to avoid overwhelming the system with retries.

  • 203-203: The broadcastTransaction function is now called with an initial retry count of 0 for each transaction. This is a good approach to track the number of retries for each transaction.

  • 208-227: The broadcastTransaction function now includes a retries parameter and adds failed transactions to the failedTxs channel for retry. This is a good approach to handle failed transactions and improve the reliability of transaction broadcasting. However, ensure that the retry mechanism does not lead to unnecessary retries or delays.

@coderabbitai
Copy link

coderabbitai bot commented Oct 21, 2023

CodeRabbit review skipped

By default, CodeRabbit only reviews PRs on the default branch. If you wish to have PRs reviewed on additional branches, you can configure that under the repository settings in the UI.

@codecov
Copy link

codecov bot commented Oct 21, 2023

Codecov Report

Merging #1242 (9a50c61) into main (4066dfb) will increase coverage by 0.01%.
Report is 1 commits behind head on main.
The diff coverage is 52.94%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1242      +/-   ##
==========================================
+ Coverage   49.23%   49.24%   +0.01%     
==========================================
  Files          77       77              
  Lines        4611     4630      +19     
==========================================
+ Hits         2270     2280      +10     
- Misses       2178     2187       +9     
  Partials      163      163              
Files Coverage Δ
cosmos/txpool/handler.go 68.53% <52.94%> (-4.32%) ⬇️

Base automatically changed from fix-test to main October 21, 2023 07:03
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between 9ad14ff and 9a50c61.
Files selected for processing (1)
  • cosmos/txpool/handler.go (9 hunks)
Additional comments (Suppressed): 10
cosmos/txpool/handler.go (10)
  • 28-29: The import of the time package is new and is used for the retry delay in the retry mechanism. Ensure that this package is compatible with the rest of the codebase.

  • 40-44: The constants maxRetries and retryDelay have been added for the retry mechanism. Ensure that these values are appropriate for the application's requirements.

  • 72-76: The failedTx struct has been introduced to represent transactions that failed to broadcast. This is a good approach to encapsulate the transaction and its retry count.

  • 93-95: The failedTxs channel has been added to the handler struct to store transactions that failed to broadcast. This is a good approach to queue failed transactions for retry.

  • 108-108: The failedTxs channel is initialized with the same size as the txsCh channel. Ensure that this size is appropriate for the expected number of failed transactions.

  • 118-119: Two goroutines, mainLoop and failedLoop, are started in the Start method. This is a good approach to handle new transactions and retry failed transactions concurrently.

  • 129-131: Two stop signals are sent to the stopCh channel to stop both the mainLoop and failedLoop goroutines. This is a good approach to ensure that both goroutines stop when the handler is stopped.

  • 156-173: The failedLoop function has been introduced to periodically retry failed transactions. This is a good approach to enhance the reliability of transaction broadcasting.

  • 203-203: The broadcastTransaction function is now called with a retries parameter. This is a good approach to track the number of retry attempts for each transaction.

  • 208-227: The broadcastTransaction function has been modified to handle failed transactions. If a transaction fails to broadcast, it is added to the failedTxs channel for retry. This is a good approach to handle failed transactions.

@itsdevbear
Copy link
Author

might upgrade this to use a heap + exponential backoff.

@calbera
Copy link
Contributor

calbera commented Oct 21, 2023

makes sense so far. can also get more sophisticated with retry priority

@itsdevbear
Copy link
Author

yup, was a 2am first pass for sure.

@itsdevbear
Copy link
Author

in practice, the retries will almost never be useful. If the retries are even required something has likely gone really wrong with comet.

@itsdevbear
Copy link
Author

I am still pro gossip via geth devp2p

@calbera
Copy link
Contributor

calbera commented Oct 21, 2023

yeah broadly agree, seems like overall more performant/battle-tested

@itsdevbear itsdevbear merged commit 12f300b into main Oct 23, 2023
14 checks passed
@itsdevbear itsdevbear deleted the retry-handler branch October 23, 2023 16:39
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants