-
Notifications
You must be signed in to change notification settings - Fork 0
Replace strings with byte slices for filters #23
Conversation
Note: still need to account for utf8 rune size in the slicing inside some of the filters. |
rules/rule.go
Outdated
@@ -330,52 +346,35 @@ func (rs *ruleSet) rollupResults(id string) []RollupResult { | |||
} | |||
|
|||
type matchedValue struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are doing N^2 passes below, this struct is no longer needed.
} | ||
// toRollupResults encodes rollup target name and values into ids for each rollup target | ||
func (rs *ruleSet) toRollupResults(id []byte, targets []RollupTarget) []RollupResult { | ||
// NB(r): This is n^2 however this should be quite fast still as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we replacing one-pass scan with n-pass scans? Is it to save the memory allocation for the map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly because using bytes makes this need to use a map of hashed IDs which seems more expensive overall in terms of creation and computation than simply just n^2 on a small set of matched rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, fair enough, if the number of rollup rules gets out of hand we can revisit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Opened issue #24 to tackle UTF8 in future. |
This change makes all APIs that interact with metric IDs and tags from string arguments to byte slices. This helps us use a lower common denominator for components that don't need any string representation of IDs but still need filtering capabilities. We've seen in benchmarks this drastically reduces overhead in our more high throughput components.
cc @xichen2020 @martin-mao @kobolog @kobolog