Description
When a workflow is deployed as a serverless Knative Service, triggering a new workflow instance automatically starts a pod for the workflow. After the instance finishes, Knative terminates the pod by scaling the corresponding Kubernetes deployment down to zero replicas. Workflow pods are therefore short-lived.
As a result, Prometheus may not get a chance to scrape the workflow's metrics before the pod is terminated. Those metrics are then lost, which makes the dashboards inaccurate.
This issue tracks a solution to that limitation: implement a metrics collector that acts as a push gateway, plus a Kogito extension that lets workflows push their metrics to the collector. Prometheus will then scrape the metrics from the collector instead of from the workflow pods. The Knative documentation describes such a collector for Knative's own components as an example:
https://knative.dev/docs/eventing/observability/metrics/collecting-metrics/#understanding-the-collector
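To make the push idea concrete, here is a minimal sketch of what the workflow side could look like. It assumes the collector accepts the Prometheus Pushgateway HTTP API (a `PUT` of text-format metrics to `/metrics/job/<job>/instance/<instance>`), which this issue does not confirm. The class name `WorkflowMetricsPusher`, the metric name, and the collector URL are all hypothetical and are not part of Kogito.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not an existing Kogito class. It pushes counters to a
// Pushgateway-style collector so they survive the pod being scaled to zero.
final class WorkflowMetricsPusher {

    // Builds the grouping-key URL: <base>/metrics/job/<job>/instance/<instance>.
    // Assumption: job and instance names contain no '/' characters.
    static URI pushUri(String baseUrl, String job, String instance) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/metrics/job/" + encode(job)
                + "/instance/" + encode(instance));
    }

    // Percent-encodes a single path segment.
    static String encode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Renders counters in the Prometheus text exposition format,
    // sorted by metric name so the output is deterministic.
    static String render(Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : new TreeMap<>(counters).entrySet()) {
            sb.append("# TYPE ").append(e.getKey()).append(" counter\n");
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Sends the rendered metrics with PUT, which replaces every metric
    // previously stored under the same grouping key. Returns the HTTP status.
    static int push(HttpClient client, URI uri, String body)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(5))
                .header("Content-Type", "text/plain; version=0.0.4")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    }
}
```

A workflow could call `push(HttpClient.newHttpClient(), pushUri("http://collector:9091", "my-workflow", podName), render(counters))` when an instance completes, or from a shutdown hook, so the data reaches the collector before Knative scales the deployment down. Prometheus would then be configured to scrape the collector rather than the workflow pods.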
Implementation ideas
No response