Commit

Merge pull request #328 from reshmabidikar/work-for-TS-185
Changes for new metrics endpoint
sbrossie authored Jan 14, 2025
2 parents 0eb8d77 + 823e174 commit 56daa5f
Showing 1 changed file with 83 additions and 16 deletions.
99 changes: 83 additions & 16 deletions source/includes/_aviate-health-apis.md
@@ -335,24 +335,50 @@ None

If successful, returns a `HealthData` object.

### Retrieve Host Samples
### Retrieve Metrics

Returns metrics data for dashboards.
This endpoint returns metric data, which can be used to assess the health of the system and gain visibility into it. The metrics are computed by the Aviate plugin when the `com.killbill.billing.plugin.aviate.enableHealthReporter` property is set to `true`. Each metric is a time series that can be analyzed and/or displayed.

Some metrics are global to the deployment (i.e., independent of the number of Kill Bill nodes/containers deployed), while others are per-node. For per-node metrics, the `nodeName` parameter can be used to return metrics for a specific node.

**HTTP Request**

`GET /plugins/aviate-plugin/v1/health/host_samples`
`GET /plugins/aviate-plugin/v1/health/metrics`

> Example Request:
```shell
# Returns metric data for queue.bus.incoming, queue.bus.processing, and queue.bus.late from 2024-12-19T00:00:00 to 2025-01-04T11:59:00 with HOUR granularity
curl -X GET \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ${ID_TOKEN}' \
-H 'X-killbill-apiKey: bob' \
-H 'X-killbill-apisecret: lazar' \
http://127.0.0.1:8080/plugins/aviate-plugin/v1/health/host_samples?group=shiro.pool.Wait&from=2024-01-01T00:00:00&to=2025-03-14T00:00:00
'http://127.0.0.1:8080/plugins/aviate-plugin/v1/health/metrics?from=2024-12-19T00:00:00&to=2025-01-04T11:59:00&metricName=queue.bus.incoming&metricName=queue.bus.processing&metricName=queue.bus.late&granularity=HOUR'
# Returns the metric data for the logs.rates.error metric on the ip-172-31-6-87 node
curl -X GET \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ${ID_TOKEN}' \
-H 'X-killbill-apiKey: bob' \
-H 'X-killbill-apisecret: lazar' \
'http://127.0.0.1:8080/plugins/aviate-plugin/v1/health/metrics?nodeName=ip-172-31-6-87&metricName=logs.rates.error'

# Returns metric data for the logs.rates.error metric for all the nodes:
curl -X GET \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ${ID_TOKEN}' \
-H 'X-killbill-apiKey: bob' \
-H 'X-killbill-apisecret: lazar' \
'http://127.0.0.1:8080/plugins/aviate-plugin/v1/health/metrics?metricName=logs.rates.error'
# Returns metric data for the queue.bus.incoming metric for all the nodes (nodeName is ignored since queue.bus.incoming is a global metric):
curl -X GET \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer ${ID_TOKEN}' \
-H 'X-killbill-apiKey: bob' \
-H 'X-killbill-apisecret: lazar' \
'http://127.0.0.1:8080/plugins/aviate-plugin/v1/health/metrics?nodeName=ip-172-31-6-87&metricName=queue.bus.incoming'
```

```java
@@ -372,7 +398,18 @@ curl -X GET \

> Example Response:
TODO
````json
[
{
"nodeName": "",
"eventGroup": "gauge",
"eventCategory": "queue.bus.late",
"sampleKind": "value",
"samplesSerializationFormat": "csv",
"samples": "1736394067,0,1736394129,0,1736394189,0,1736394249,0,1736394309,0,1736394369,0,1736394429,0,1736394490,0,1736394550,0,1736394610,0,1736394670,0,1736394730,0,1736394790,0,1736394851,0,1736394911,0,1736394971,0,1736395031,0,1736395091,0,1736395151,0,1736395211,0,1736395272,0,1736395332,0,1736395392,0,1736395452,0,1736395512,0,1736395572,0,1736395632,0,1736395692,0,1736395753,0,1736395813,0,1736395873,0,1736395933,0,1736395993,0,1736396053,0,1736396113,0,1736396174,0,1736396234,0,1736396294,0,1736396354,0,1736396414,0,1736396474,0,1736396534,0,1736396595,0,1736396655,0,1736396715,0,1736396775,0,1736396835,0,1736396895,0,1736396955,0,1736397016,0,1736397076,0,1736397136,0,1736397196,0,1736397256,0,1736397316,0,1736397377,0,1736397437,0,1736397497,0,1736397557,0,1736397617,0,1736397677,0,1736397738,0,1736397798,0,1736397858,0,1736397918,0,1736397978,0,1736398038,0,1736398099,0,1736398159,0,1736398219,0,1736398279,0,1736398339,0,1736398399,0,1736398460,0,1736398520,0,1736398580,0"
}
]
````


**Request Body**
@@ -381,21 +418,51 @@ None

**Query Parameters**

| Name | Type | Required | Default | Description |
|------------------------|----------------|----------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| **startTimeParameter** | string | false | none | Start time for the samples |
| **endTimeParameter** | string | false | Current time | End time for the samples |
| **hostNames** | List of String | false | None | List of host names. Multiple host names can be specified by specifying a separate `hostNames` parameter corresponding to each host |
| **group** | string | false | None | Event group for the requested sample kinds |
| **category_and_sample_kind** | List of String | false | None | List of samples kinds (format: category,sample_kind). Multiple category and sample kinds can be specified by specifying a separate `category_and_sample_kind` parameter corresponding to each value |
| **granularity** | SampleGranularity | false | None | Granularity (One of `SECOND`, `MINUTE`, `HOUR`, `DAY`) |
| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| **from** | string | false | None | Date-time from which to start including metric data, in `yyyy-MM-ddTHH:mm:ss` format (for example, `2024-12-19T00:00:00`). If omitted, all data since the Aviate plugin started computing metrics is returned. |
| **to** | string | false | Current time | Date-time up to which metric data is included, in `yyyy-MM-ddTHH:mm:ss` format. If omitted, all data up to the current date-time is returned. |
| **nodeName** | List of String | false | None | Name of the node for which metric data should be returned. This parameter applies only to node-specific metrics; for global metrics it is ignored and global data is returned. If omitted for a node-specific metric, data for all nodes is returned. To specify multiple nodes, include a separate `nodeName` parameter for each node. |
| **metricName** | List of String | false | None | Name of the metric for which to return data (see the list below). To obtain data for multiple metrics, include a separate `metricName` parameter for each metric. |
| **granularity** | SampleGranularity | false | `MINUTE` | Time unit for the interval between consecutive data points. One of `MINUTE`, `HOUR`, `DAY`. |

Below is the list of metric names. Any of these can be specified as the value for the `metricName` parameter.

* queue.bus.late
* queue.bus.incoming
* queue.bus.processing
* queue.notifications.late
* queue.notifications.incoming
* queue.notifications.processing
* logs.rates.warn
* logs.rates.error
* servlets.responses.ok
* servlets.responses.created
* servlets.responses.badRequest
* servlets.responses.noContent
* servlets.responses.notFound
* servlets.responses.serverError
* servlets.responses.other
* main.pool.TotalConnections
* main.pool.ActiveConnections
* main.pool.IdleConnections
* main.pool.Wait
* osgi.pool.TotalConnections
* osgi.pool.ActiveConnections
* osgi.pool.IdleConnections
* osgi.pool.Wait
* shiro.pool.TotalConnections
* shiro.pool.ActiveConnections
* shiro.pool.IdleConnections
* shiro.pool.Wait
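
As a rough illustration of how the query parameters combine, the sketch below assembles a metrics URL with one `metricName` parameter per metric. The `build_metrics_url` helper is hypothetical (it is not part of Kill Bill or the Aviate plugin), and the host and time range are simply the values from the example request above.

```shell
# Hypothetical helper, not part of Kill Bill: prints a metrics URL.
# Arguments: base URL, from, to, granularity, then one or more metric names.
build_metrics_url() {
  base="$1"; from="$2"; to="$3"; granularity="$4"
  shift 4
  url="${base}/plugins/aviate-plugin/v1/health/metrics?from=${from}&to=${to}&granularity=${granularity}"
  # Each remaining argument becomes its own metricName parameter
  for metric in "$@"; do
    url="${url}&metricName=${metric}"
  done
  printf '%s\n' "$url"
}

# Prints the URL for three bus-queue metrics at HOUR granularity
build_metrics_url "http://127.0.0.1:8080" "2024-12-19T00:00:00" "2025-01-04T11:59:00" HOUR \
  queue.bus.incoming queue.bus.processing queue.bus.late
```

The printed URL can then be passed to `curl` in single quotes, together with the headers shown in the example request, so that the shell does not interpret the `&` characters.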

**Response**

If successful, returns a status code of 200 and the requested host data.
If successful, returns a status code of 200 and the requested metric data.
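
The `samples` field in the example response is a single CSV string. Judging from that example, the values appear to alternate between an epoch timestamp in seconds and the sample value; this layout is an inference from the example, not a documented guarantee. Under that assumption, a minimal sketch for splitting the string into one `timestamp value` pair per line could look like this (`samples_to_rows` is a hypothetical helper name):

```shell
# Hypothetical helper: turns "ts1,v1,ts2,v2,..." into one "ts value" row per line.
# Assumes the string alternates <epoch seconds>,<value>, as the example response suggests.
samples_to_rows() {
  printf '%s\n' "$1" | awk -F',' '{ for (i = 1; i < NF; i += 2) print $i, $(i + 1) }'
}

# First two samples from the example response
samples_to_rows "1736394067,0,1736394129,0"
# prints:
# 1736394067 0
# 1736394129 0
```

Keeping the rows as plain text makes it easy to pipe them into other tools for plotting or aggregation.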


// ### Fix Parked Accounts - This method is not implemented in the code, so not documenting it
<!-- ### Fix Parked Accounts - This method is not implemented in the code, so not documenting it
-->

### Fix Stuck Bus Entries

@@ -498,7 +565,7 @@ This endpoint generates a diagnostic report. The report includes logs, tenant co

A few pointers:

* if the `-H "Accept: application/zip" header is specified`, creates a zip file
* If the `-H "Accept: application/zip"` header is specified, creates a zip file
* At least one query parameter needs to be specified, otherwise an empty response is returned.
* HealthData and logs are returned only when the `-H "Accept: application/zip"` header is specified
* Logs from only a single node will be included