-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added metrics for master in cluster mode #324
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm, cluster aggregation was designed to aggregate homogenous data. For example, this will skew nodejs_eventloop_lag_seconds
, which is averaged. I guess if you're not collecting those default metrics in the master, then that won't affect the aggregation. The cluster master also isn't generally doing much work--I'm curious: what are you monitoring? At the least, I would note the above (regarding skewing when aggregating the same metrics from the master and workers) in the readme.
Also needs the changelog updated and ideally a test.
if (request.failed) return; // Callback already run with Error. | ||
if (request.failed) return; // Callback already run with Error. | ||
|
||
request.responses.push(Registry.globalRegistry.getMetricsAsJSON()); //add metrics of master |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you correct the indentation here please?
I get it, maybe then this isn't the best option. |
Going to close this for now, but if you want to update it so it's configurable and documented, happy to reconsider. #280 is a more versatile solution to the same issue however. |
@zbjornson you mentioned this fix would "this will skew nodejs_eventloop_lag_seconds, which is averaged" - is that because each child, by default, doesn't have a If so, depending on the use case the default averaging might not be ideal, however for backwards compatibility #280 would be less breaking and still flexible enough. For my case, the master is effectively a router, all workers are similar but may have different metric values such as lag depending on their workload, so I've added a
I've also popped some comments on #280 , I might be able to pick that up and add tests. Still interested in the above question to clarify my understanding (if you have the time :) ). |
In cluster mode, only metrics from workers are aggregated, and if you use metrics i master also this isn't shown in aggregated metrics. This fix adds master metrics also.