Add broadcast_errors metric (#3710)
* Add metric for broadcast errors

* Update the guide with the new broadcast error metric

* Add changelog entry
ljoss17 authored Nov 28, 2023
1 parent e9e3736 commit 8e1391c
Showing 4 changed files with 42 additions and 0 deletions.
@@ -0,0 +1,3 @@
- Add a new metric `broadcast_errors` which
records the errors observed when broadcasting Txs
([\#3708](https://github.com/informalsystems/hermes/issues/3708))
14 changes: 14 additions & 0 deletions crates/relayer/src/chain/cosmos/retry.rs
@@ -107,6 +107,13 @@ async fn do_send_tx_with_account_sequence_retry(
refreshing account sequence number and retrying once"
);

telemetry!(
broadcast_errors,
&account.address.to_string(),
response.code.into(),
&response.log,
);

refresh_account_and_retry_send_tx_with_account_sequence(
rpc_client, config, key_pair, account, tx_memo, messages,
)
@@ -147,6 +154,13 @@ async fn do_send_tx_with_account_sequence_retry(
"failed to broadcast tx with unrecoverable error"
);

telemetry!(
broadcast_errors,
&account.address.to_string(),
code.into(),
&response.log
);

Ok(response)
}
}
24 changes: 24 additions & 0 deletions crates/telemetry/src/state.rs
@@ -197,6 +197,9 @@ pub struct TelemetryState {

/// Sum of rewarded fees over the past FEE_LIFETIME seconds
period_fees: ObservableGauge<u64>,

/// Number of errors observed by Hermes when broadcasting a Tx
broadcast_errors: Counter<u64>,
}

impl TelemetryState {
@@ -371,6 +374,13 @@ impl TelemetryState {
.u64_observable_gauge("ics29_period_fees")
.with_description("Amount of ICS29 fees rewarded over the past 7 days")
.init(),

broadcast_errors: meter
.u64_counter("broadcast_errors")
.with_description(
"Number of errors observed by Hermes when broadcasting a Tx",
)
.init(),
}
}

@@ -1069,6 +1079,20 @@ impl TelemetryState {
pub fn add_visible_fee_address(&self, address: String) {
self.visible_fee_addresses.insert(address);
}

/// Add an error and its description to the list of errors observed after broadcasting
/// a Tx with a specific account.
pub fn broadcast_errors(&self, address: &String, error_code: u32, error_description: &String) {
let cx = Context::current();

let labels = &[
KeyValue::new("account", address.to_string()),
KeyValue::new("error_code", error_code.to_string()),
KeyValue::new("error_description", error_description.to_string()),
];

self.broadcast_errors.add(&cx, 1, labels);
}
}

use std::sync::Arc;
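
The `broadcast_errors` method above increments a counter by 1 for one combination of the `account`, `error_code` and `error_description` labels. The following is a minimal sketch of that idea, assuming a plain `HashMap` in place of the real metrics backend. `BroadcastErrorLabels` and `LabeledCounter` are hypothetical names introduced for illustration and are not part of Hermes or OpenTelemetry.

```rust
use std::collections::HashMap;

/// Hypothetical label set mirroring the three labels attached in the commit:
/// `account`, `error_code` and `error_description`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct BroadcastErrorLabels {
    pub account: String,
    pub error_code: u32,
    pub error_description: String,
}

/// Hypothetical stand-in for a labeled counter (NOT the OpenTelemetry API).
/// Each distinct label combination is tracked as its own series.
#[derive(Default)]
pub struct LabeledCounter {
    series: HashMap<BroadcastErrorLabels, u64>,
}

impl LabeledCounter {
    /// Add `value` to the series identified by `labels`,
    /// creating the series at zero if it does not exist yet.
    pub fn add(&mut self, labels: BroadcastErrorLabels, value: u64) {
        *self.series.entry(labels).or_insert(0) += value;
    }

    /// Current value of one series, or 0 if it was never incremented.
    pub fn get(&self, labels: &BroadcastErrorLabels) -> u64 {
        self.series.get(labels).copied().unwrap_or(0)
    }

    /// Number of distinct label combinations seen so far.
    pub fn series_count(&self) -> usize {
        self.series.len()
    }
}
```

In this sketch, two errors with identical labels land in the same series, while a different `error_code` or `error_description` opens a new one. Because `error_description` is free text taken from the response log, the number of series may grow with the variety of log messages, which is worth keeping in mind when querying the metric.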
1 change: 1 addition & 0 deletions guide/src/documentation/telemetry/operators.md
@@ -142,6 +142,7 @@ the `backlog_oldest_sequence` that is blocked.
| `tx_latency_submitted` | Latency for all transactions submitted to a chain (i.e., difference between the moment when Hermes received an event until the corresponding transaction(s) were submitted), per chain, counterparty chain, channel and port | `u64` ValueRecorder | None |
| `cleared_send_packet_count_total`  | Number of SendPacket events received during the initial and periodic clearing, per chain, counterparty chain, channel and port | `u64` Counter | Packet workers enabled, and periodic packet clearing or clear on start enabled |
| `cleared_acknowledgment_count_total` | Number of WriteAcknowledgement events received during the initial and periodic clearing, per chain, counterparty chain, channel and port | `u64` Counter | Packet workers enabled, and periodic packet clearing or clear on start enabled |
| `broadcast_errors_total` | Number of errors observed by Hermes when broadcasting a Tx, per error type and account | `u64` Counter | Packet workers enabled |

Notes:
- The two metrics `cleared_send_packet_count_total` and `cleared_acknowledgment_count_total` are only populated if `tx_confirmation = true`.
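
The guide lists the metric as `broadcast_errors_total`, while the code registers the counter as `broadcast_errors`. The sketch below assumes the `_total` suffix is added when the counter is exported, and shows roughly what one sample could look like in the Prometheus text exposition format. `render_sample` and `escape_label_value` are hypothetical helpers written for this example, not Hermes functions.

```rust
/// Escape a label value for the Prometheus text format:
/// backslash, double quote and newline must be escaped.
pub fn escape_label_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for ch in value.chars() {
        match ch {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Render one sample line as `name{key="value",...} value`.
/// Hypothetical helper: the braces are omitted when there are no labels.
pub fn render_sample(name: &str, labels: &[(&str, &str)], value: u64) -> String {
    if labels.is_empty() {
        return format!("{} {}", name, value);
    }
    let rendered: Vec<String> = labels
        .iter()
        .map(|(key, val)| format!("{}=\"{}\"", key, escape_label_value(val)))
        .collect();
    format!("{}{{{}}} {}", name, rendered.join(","), value)
}
```

Escaping matters here because the `error_description` label carries the raw response log, which may contain quotes or newlines.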