RFC: New vtgate Metrics #17585
Comments
I believe that implementing this RFC would solve the following issue: #16391
I thought it would be useful to list the personas that might influence or benefit from these metrics, since they are worth considering in the analysis. Here are the most common ones in the context of a system like Vitess:
Updated the categories for
I like this renewed focus! The main question/concern I would have remains around breaking down the Complex category, as well as other known problematic categories.
Performance impact of queries in the Complex category has the potential to be exponential, so even a relatively small number of queries, percentage-wise, would always warrant a closer look. How can we use the proposed dimensions to "zoom in" on this category to better understand what is happening? Does the proposal intend to provide answers for the new types of diagnostic questions these categories will result in? To give a more practical example, how would we fill the blank in the following process?
One possible approach that doesn't rely purely on Metrics:
Are there ways we could answer the question using purely Metrics?
Unrelated to my prior comment, but there is another clear Metrics gap, in a perspective that can only be gained from the vtgate's POV:
TransactionsProcessedByShardDistribution
TransactionsProcessedByTransactionType
There might be more to explore there, but given that we already track the transaction context, it feels relatively trivial to start gathering some meaningful numbers about these events.
Good news! We do have these metrics now, in v22. We just added a boolean to indicate whether anything is modified inside the transaction.
It is not feasible to emit slow queries in metrics. We can use the querylog stream for this, or the plan stream, which exists today.
1. Summary
Vitess currently classifies query plan types in a way that is neither intuitive nor helpful for performance analysis. In particular, QueriesProcessed and QueriesRouted rely on plan-type designations that are inconsistent across different operators (e.g., IN, Concatenate, DDL, Reference, FkCascade, and InsertSelect). This proposal introduces two new metrics and deprecates the older, less-informative ones.
2. Motivation
Inconsistent Plan-Type Metric
Some operators (e.g., Route) reported the route type, while others forwarded whatever their child operator returned.
Limited Usefulness
QueriesProcessed and QueriesRouted provide only a coarse breakdown.
Need for Clarity
3. Proposed Changes
3.1 Deprecation
QueriesProcessed
QueriesRouted
3.2 New Metric: QueriesProcessedByQueryType
We propose categorizing queries into eight distinct buckets to capture both common and potentially problematic execution patterns:
Passthrough
MultiShard
Scatter
Lookup
Join
Complex
OnlineDDL
DirectDDL
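To make the eight buckets concrete, here is a minimal Go sketch of a counter keyed by query type. The `QueryType` constants mirror the bucket names above, but `QueryTypeCounter` and its methods are hypothetical stand-ins written for illustration, not the Vitess stats API. The `Share` helper is likewise an assumption, added to show how a small percentage of Complex queries could be surfaced.

```go
package main

import (
	"fmt"
	"sync"
)

// QueryType is a hypothetical enum for the eight buckets proposed in the RFC.
type QueryType string

const (
	Passthrough QueryType = "Passthrough"
	MultiShard  QueryType = "MultiShard"
	Scatter     QueryType = "Scatter"
	Lookup      QueryType = "Lookup"
	Join        QueryType = "Join"
	Complex     QueryType = "Complex"
	OnlineDDL   QueryType = "OnlineDDL"
	DirectDDL   QueryType = "DirectDDL"
)

// QueryTypeCounter is an illustrative labeled counter (assumption: not the
// real vtgate implementation). It keeps one running total per query type.
type QueryTypeCounter struct {
	mu     sync.Mutex
	counts map[QueryType]int64
}

// NewQueryTypeCounter returns an empty counter.
func NewQueryTypeCounter() *QueryTypeCounter {
	return &QueryTypeCounter{counts: make(map[QueryType]int64)}
}

// Add increments the bucket for qt by n.
func (c *QueryTypeCounter) Add(qt QueryType, n int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[qt] += n
}

// Get returns the current total for qt.
func (c *QueryTypeCounter) Get(qt QueryType) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[qt]
}

// Share returns the fraction of all counted queries that fall into qt,
// or 0 when nothing has been counted yet.
func (c *QueryTypeCounter) Share(qt QueryType) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	var total int64
	for _, v := range c.counts {
		total += v
	}
	if total == 0 {
		return 0
	}
	return float64(c.counts[qt]) / float64(total)
}

func main() {
	c := NewQueryTypeCounter()
	c.Add(Passthrough, 97)
	c.Add(Complex, 3)
	fmt.Printf("QueriesProcessedByQueryType{queryType=%q} %d\n", Complex, c.Get(Complex))
	fmt.Printf("Complex share: %.2f\n", c.Share(Complex))
}
```

A single label keeps cardinality fixed at eight series, which is one reason a closed set of buckets may be preferable to free-form plan-type strings.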
Example Metric Name
QueriesProcessedByQueryType{queryType="Passthrough"}, QueriesProcessedByQueryType{queryType="Complex"}, etc.
3.3 New Metric: QueriesProcessedByStatementType
Possible Categories (not exhaustive):
SELECT
INSERT
UPDATE
DELETE
SET
DDL (could be further subdivided if desired)
3.4 New Histogram: “Shards Accessed per Query”
Buckets: 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512
Metric name: ShardsAccessedHistogram (or similar).
This histogram shows the distribution of queries across shard counts, helping identify where queries may be fanning out more than expected.
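As a rough illustration of how the bucket list could be applied, the Go sketch below assigns each query's shard count to the first bucket whose bound is greater than or equal to it, with one extra overflow slot for counts above 512. Treating the listed values as inclusive upper bounds is an assumption, and `ShardsHistogram` is a hypothetical type written for this example rather than the Vitess histogram implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// shardBuckets holds the bucket bounds listed in the RFC, in ascending order.
// Assumption: each value is an inclusive upper bound.
var shardBuckets = []int{0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512}

// ShardsHistogram is an illustrative, non-cumulative histogram. counts[i]
// holds observations that fall into bucket i; the final slot is the overflow
// bucket for values above the largest bound.
type ShardsHistogram struct {
	counts []int64
}

// NewShardsHistogram allocates one slot per bound plus one overflow slot.
func NewShardsHistogram() *ShardsHistogram {
	return &ShardsHistogram{counts: make([]int64, len(shardBuckets)+1)}
}

// Observe records one query that accessed n shards.
func (h *ShardsHistogram) Observe(n int) {
	// SearchInts returns the smallest index i with shardBuckets[i] >= n,
	// or len(shardBuckets) when n exceeds every bound.
	i := sort.SearchInts(shardBuckets, n)
	h.counts[i]++
}

func main() {
	h := NewShardsHistogram()
	for _, n := range []int{1, 1, 1, 3, 16, 600} {
		h.Observe(n)
	}
	for i, c := range h.counts {
		if c == 0 {
			continue
		}
		if i < len(shardBuckets) {
			fmt.Printf("<=%d shards: %d queries\n", shardBuckets[i], c)
		} else {
			fmt.Printf(">%d shards: %d queries\n", shardBuckets[len(shardBuckets)-1], c)
		}
	}
}
```

Power-of-two bounds keep the bucket count small while still separating single-shard queries from modest fan-out and from full scatters on large keyspaces.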
4. Backward Compatibility
5. Open Questions
Are the proposed histogram buckets (0, 1, 2, 4, …) adequate for most production workloads?
QueriesProcessed and QueriesRouted?