I opened this PR experimentally to resolve the OOM issue in warehouses. It is currently in draft mode; once the approach is approved, I will add the remaining pieces, such as test cases.
Jira id : https://segment.atlassian.net/browse/CONSENT-139
Purpose :
In the tsub library, we currently use ctlstore/ldb_reader.go (line 144 in 146e400), and the query is
SELECT * FROM tsub_store___rules_materialized_2 WHERE scope = ?
This loads ~83k rows into the cache map for scope=destinations, and on every TTL expiry all scope=destinations rules are loaded again, which causes OOM in warehouses.
To avoid this, we want to load only the rules specific to a warehouseId into the cache map for warehouses. That requires a query of the form
targetId LIKE %destinationId%
The existing method only supports exact matches using the = operator, so this PR introduces a new method for LIKE queries. The new function generates and executes the following query:
SELECT * FROM tsub_store___rules_materialized_2 WHERE target_id LIKE %destinationId%
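For illustration, here is a minimal sketch of how a LIKE query of this shape could be built with a bound pattern instead of inlining the value into the SQL text. The helper names escapeLike and buildContainsQuery are hypothetical and are not part of ctlstore's API; the sketch also assumes the underlying database accepts the standard ESCAPE clause.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLike escapes the LIKE wildcards (% and _) and the escape character
// itself, so that the value is matched literally inside the pattern.
func escapeLike(value string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return r.Replace(value)
}

// buildContainsQuery (hypothetical helper) returns a parameterized query and
// its bound argument, matching rows whose column contains value.
// Table and column names are interpolated into the SQL text, so they must
// come from trusted code, never from user input.
func buildContainsQuery(table, column, value string) (string, []interface{}) {
	query := fmt.Sprintf(`SELECT * FROM %s WHERE %s LIKE ? ESCAPE '\'`, table, column)
	return query, []interface{}{"%" + escapeLike(value) + "%"}
}

func main() {
	q, args := buildContainsQuery("tsub_store___rules_materialized_2", "target_id", "destinationId")
	fmt.Println(q)
	fmt.Println(args...)
}
```

Binding the pattern as an argument keeps the warehouse ID out of the SQL string, and escaping the wildcards prevents an ID containing % or _ from matching more rows than intended.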
Summary
Most of the memory is consumed by the storing and deserializing steps of the cacheRules(..) function in tsub's ctlstore.go.
We currently load ~83K rows from the db,
then store them and deserialize each row on every TTL expiry.
This is the main cause of OOM in warehouses; other storage destinations are fine with it.
If I reduce this number from ~83K to a two-digit count by loading rules by warehouseId instead of by scope,
then warehouses no longer hit OOM either.
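The effect described in this summary can be sketched as a TTL cache whose loader returns only the rules for one warehouse. This is a simplified illustration, not tsub's actual cacheRules(..) code: Rule, ruleCache, loader, and byTarget are hypothetical names, and the in-memory filter stands in for the LIKE query that would run in the database.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"time"
)

// Rule is a hypothetical, simplified stand-in for a tsub rule row.
type Rule struct {
	Scope    string
	TargetID string
}

// loader fetches the rules to cache; it stands in for the database query.
type loader func() []Rule

// ruleCache reloads its contents through load once the TTL has elapsed.
type ruleCache struct {
	mu       sync.Mutex
	ttl      time.Duration
	load     loader
	rules    []Rule
	loadedAt time.Time
}

// Rules returns the cached rules, reloading them on first use or after expiry.
func (c *ruleCache) Rules() []Rule {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.loadedAt.IsZero() || time.Since(c.loadedAt) >= c.ttl {
		c.rules = c.load()
		c.loadedAt = time.Now()
	}
	return c.rules
}

// byTarget returns a loader that keeps only rules whose TargetID contains id,
// mimicking in memory what the LIKE query would do on the database side.
func byTarget(all []Rule, id string) loader {
	return func() []Rule {
		var out []Rule
		for _, r := range all {
			if strings.Contains(r.TargetID, id) {
				out = append(out, r)
			}
		}
		return out
	}
}

func main() {
	all := []Rule{
		{Scope: "destinations", TargetID: "wh_1"},
		{Scope: "destinations", TargetID: "wh_2"},
		{Scope: "destinations", TargetID: "other"},
	}
	cache := &ruleCache{ttl: time.Minute, load: byTarget(all, "wh_1")}
	fmt.Println(len(cache.Rules()), "of", len(all), "rules cached")
}
```

Because each TTL expiry stores and deserializes only what the loader returns, the per-refresh cost scales with the rules for one warehouse rather than with every rule in the scope.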