-
Notifications
You must be signed in to change notification settings - Fork 762
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Concurrency To Download Vendor List #4005
base: master
Are you sure you want to change the base?
Changes from 3 commits
0c48c70
e0b1be2
0a5b937
7623704
e2baac5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -240,10 +240,21 @@ type Privacy struct { | |||
LMT LMT | ||||
} | ||||
|
||||
type VendorListFetcher struct { | ||||
// Max concurrent request for downloading the latest version of the vendor list from a specific major version. | ||||
// If it is 0 or negative, then means there is no limit for the max concurrency. | ||||
MaxConcurrencyInitFetchLatestVersion int `mapstructure:"max_concurrency_init_fetch_latest_version"` | ||||
|
||||
// Max concurrent request for downloading every specific version of the vendor list from its first version to its latest version. | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that the word "request" should be spelled in the plural form. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. good catch There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks ! |
||||
// If it is 0 or negative, then means there is no limit for the max concurrency. | ||||
MaxConcurrencyInitFetchSpecificVersion int `mapstructure:"max_concurrency_init_fetch_specific_version"` | ||||
} | ||||
|
||||
type GDPR struct { | ||||
Enabled bool `mapstructure:"enabled"` | ||||
HostVendorID int `mapstructure:"host_vendor_id"` | ||||
DefaultValue string `mapstructure:"default_value"` | ||||
Enabled bool `mapstructure:"enabled"` | ||||
HostVendorID int `mapstructure:"host_vendor_id"` | ||||
DefaultValue string `mapstructure:"default_value"` | ||||
Comment on lines
-244
to
+256
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that the vertical alignment was intentional. Why change the indentation style of those lines ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sebhtml do you mind elaborating more about but i am curious about other technical aspects related to it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The types are vertically aligned:
see prebid-server/config/config.go Line 258 in 0a5b937
I don't know if my comment is relevant though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be go fmt's decision; they actually look wrong before. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bsardo let me know your opinion |
||||
// TODO: migrate the fetcher timeout to VendorListFetcher | ||||
Timeouts GDPRTimeouts `mapstructure:"timeouts_ms"` | ||||
NonStandardPublishers []string `mapstructure:"non_standard_publishers,flow"` | ||||
NonStandardPublisherMap map[string]struct{} | ||||
|
@@ -255,6 +266,9 @@ type GDPR struct { | |||
// to DefaultValue | ||||
EEACountries []string `mapstructure:"eea_countries"` | ||||
EEACountriesMap map[string]struct{} | ||||
|
||||
// VendorListFetcher - configuration for fetching the vendor list from a remote source. | ||||
VendorListFetcher VendorListFetcher `mapstructure:"vendorlist_fetcher"` | ||||
} | ||||
|
||||
func (cfg *GDPR) validate(v *viper.Viper, errs []error) []error { | ||||
|
@@ -1140,6 +1154,8 @@ func SetupViper(v *viper.Viper, filename string, bidderInfos BidderInfos) { | |||
"FIN", "FRA", "GUF", "DEU", "GIB", "GRC", "GLP", "GGY", "HUN", "ISL", "IRL", "IMN", "ITA", "JEY", "LVA", | ||||
"LIE", "LTU", "LUX", "MLT", "MTQ", "MYT", "NLD", "NOR", "POL", "PRT", "REU", "ROU", "BLM", "MAF", "SPM", | ||||
"SVK", "SVN", "ESP", "SWE", "GBR"}) | ||||
v.SetDefault("gdpr.vendorlist_fetcher.max_concurrency_init_fetch_latest_version", 1) // by default one download at a time | ||||
v.SetDefault("gdpr.vendorlist_fetcher.max_concurrency_init_fetch_latest_version", 1) // by default one download at a time | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is this line repeated twice ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. good catch, will change it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. fixed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks ! |
||||
v.SetDefault("ccpa.enforce", false) | ||||
v.SetDefault("lmt.enforce", true) | ||||
v.SetDefault("currency_converter.fetch_url", "https://cdn.jsdelivr.net/gh/prebid/currency-file@1/latest.json") | ||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,11 +16,17 @@ import ( | |
"github.com/prebid/go-gdpr/vendorlist2" | ||
"github.com/prebid/prebid-server/v2/config" | ||
"golang.org/x/net/context/ctxhttp" | ||
"golang.org/x/sync/errgroup" | ||
) | ||
|
||
type saveVendors func(uint16, uint16, api.VendorList) | ||
type VendorListFetcher func(ctx context.Context, specVersion uint16, listVersion uint16) (vendorlist.VendorList, error) | ||
|
||
type vendorListVersionGroup struct { | ||
specVersion uint16 | ||
firstListVersion uint16 | ||
} | ||
|
||
// This file provides the vendorlist-fetching function for Prebid Server. | ||
// | ||
// For more info, see https://github.com/prebid/prebid-server/issues/504 | ||
|
@@ -32,7 +38,7 @@ func NewVendorListFetcher(initCtx context.Context, cfg config.GDPR, client *http | |
|
||
preloadContext, cancel := context.WithTimeout(initCtx, cfg.Timeouts.InitTimeout()) | ||
defer cancel() | ||
preloadCache(preloadContext, client, urlMaker, cacheSave) | ||
preloadCache(preloadContext, client, urlMaker, cacheSave, cfg.VendorListFetcher) | ||
|
||
saveOneRateLimited := newOccasionalSaver(cfg.Timeouts.ActiveTimeout()) | ||
return func(ctx context.Context, specVersion, listVersion uint16) (vendorlist.VendorList, error) { | ||
|
@@ -61,11 +67,8 @@ func makeVendorListNotFoundError(specVersion, listVersion uint16) error { | |
} | ||
|
||
// preloadCache saves all the known versions of the vendor list for future use. | ||
func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16, uint16) string, saver saveVendors) { | ||
versions := [2]struct { | ||
specVersion uint16 | ||
firstListVersion uint16 | ||
}{ | ||
func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16, uint16) string, saver saveVendors, conf config.VendorListFetcher) { | ||
versions := [2]vendorListVersionGroup{ | ||
{ | ||
specVersion: 2, | ||
firstListVersion: 2, // The GVL for TCF2 has no vendors defined in its first version. It's very unlikely to be used, so don't preload it. | ||
|
@@ -75,13 +78,38 @@ func preloadCache(ctx context.Context, client *http.Client, urlMaker func(uint16 | |
firstListVersion: 1, | ||
}, | ||
} | ||
for _, v := range versions { | ||
latestVersion := saveOne(ctx, client, urlMaker(v.specVersion, 0), saver) | ||
var wgLatestVersion errgroup.Group | ||
var wgSpecificVersion errgroup.Group | ||
|
||
for i := v.firstListVersion; i < latestVersion; i++ { | ||
saveOne(ctx, client, urlMaker(v.specVersion, i), saver) | ||
} | ||
if conf.MaxConcurrencyInitFetchLatestVersion > 0 { | ||
wgLatestVersion.SetLimit(conf.MaxConcurrencyInitFetchLatestVersion) | ||
} | ||
|
||
if conf.MaxConcurrencyInitFetchSpecificVersion > 0 { | ||
wgSpecificVersion.SetLimit(conf.MaxConcurrencyInitFetchSpecificVersion) | ||
} | ||
|
||
tsStart := time.Now() // For logging how long this takes | ||
for _, v := range versions { | ||
specVersion := v.specVersion | ||
firstVersion := v.firstListVersion | ||
wgLatestVersion.Go(func() error { | ||
latestVersion := saveOne(ctx, client, urlMaker(specVersion, 0), saver) | ||
for i := firstVersion; i < latestVersion; i++ { | ||
currentVersion := i | ||
wgSpecificVersion.Go(func() error { | ||
saveOne(ctx, client, urlMaker(specVersion, currentVersion), saver) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. to help reviewer understand the reason behind reason that i do not use channel or wait group+mutex is because the implementation of |
||
return nil | ||
}) | ||
} | ||
return nil | ||
}) | ||
} | ||
|
||
wgLatestVersion.Wait() | ||
wgSpecificVersion.Wait() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi ! I don't know if it matters. The WaitGroup Globally, you start Goroutines with a limit of So why have 2 wait groups instead of just 1 wait group ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sebhtml , good thought, however the reason i did it is to have fine granular control of two different behaviours:
plus provide sequential execution as default behaviour so the decision of tuning the controlling parameters is determined by the host. by using only one wait group, there are two thoughts behind it
then you have deadlock because one go routine is waiting for another go routine while only 1 go routine is available let me know if you have different idea There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh I see. You have a good point. I therefore resolve this thread. |
||
|
||
glog.Infof("Finished Preloading vendor lists within %v", time.Since(tsStart)) | ||
} | ||
|
||
// Make a URL which can be used to fetch a given version of the Global Vendor List. If the version is 0, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,7 @@ import ( | |
"net/http" | ||
"net/http/httptest" | ||
"strconv" | ||
"sync" | ||
"testing" | ||
|
||
"github.com/stretchr/testify/assert" | ||
|
@@ -192,12 +193,16 @@ type versionInfo struct { | |
} | ||
type saver []versionInfo | ||
|
||
var saverMutex sync.Mutex | ||
|
||
func (s *saver) saveVendorLists(specVersion uint16, listVersion uint16, gvl api.VendorList) { | ||
vi := versionInfo{ | ||
specVersion: specVersion, | ||
listVersion: listVersion, | ||
} | ||
saverMutex.Lock() | ||
*s = append(*s, vi) | ||
saverMutex.Unlock() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I think that deferring can avoid that: saverMutex.Lock()
defer saverMutex.Unlock()
*s = append(*s, vi) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. updated There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks |
||
} | ||
|
||
func TestPreloadCache(t *testing.T) { | ||
|
@@ -253,7 +258,10 @@ func TestPreloadCache(t *testing.T) { | |
defer server.Close() | ||
|
||
s := make(saver, 0, 5) | ||
preloadCache(context.Background(), server.Client(), testURLMaker(server), s.saveVendorLists) | ||
preloadCache(context.Background(), server.Client(), testURLMaker(server), s.saveVendorLists, config.VendorListFetcher{ | ||
MaxConcurrencyInitFetchLatestVersion: 2, | ||
MaxConcurrencyInitFetchSpecificVersion: 2, | ||
}) | ||
|
||
expectedLoadedVersions := []versionInfo{ | ||
{specVersion: 2, listVersion: 2}, | ||
|
@@ -284,12 +292,6 @@ var vendorList2Expected = testExpected{ | |
vendorPurposes: map[int]bool{1: false, 2: true, 3: true}, | ||
} | ||
|
||
var vendorListFallbackExpected = testExpected{ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this code block is not used at all so i deleted it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did you need to delete it to accomplish your goal in this PR ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. but why we need to keep it if the code is not used at all? there is no comment or description associated with the deleted test case. i used a practice called There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok. I see no problem with that. |
||
vendorListVersion: 215, // Values from hardcoded fallback file. | ||
vendorID: 12, | ||
vendorPurposes: map[int]bool{1: true, 2: false, 3: true}, | ||
} | ||
|
||
type vendorList struct { | ||
GVLSpecificationVersion uint16 `json:"gvlSpecificationVersion"` | ||
VendorListVersion uint16 `json:"vendorListVersion"` | ||
|
@@ -409,5 +411,9 @@ func testConfig() config.GDPR { | |
InitVendorlistFetch: 60 * 1000, | ||
ActiveVendorlistFetch: 1000 * 5, | ||
}, | ||
VendorListFetcher: config.VendorListFetcher{ | ||
MaxConcurrencyInitFetchLatestVersion: 2, | ||
MaxConcurrencyInitFetchSpecificVersion: 2, | ||
}, | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,13 @@ status_response: "ok" # default response string for /status endpoint | |
|
||
gdpr: | ||
default_value: "0" # disable gdpr, explicitly specifying a default value is a requirement in prebid server config | ||
vendorlist_fetcher: | ||
max_concurrency_init_fetch_latest_version: 2 | ||
max_concurrency_init_fetch_specific_version: 20 | ||
timeouts_ms: | ||
init_vendorlist_fetches: 30000 | ||
active_vendorlist_fetch: 30000 | ||
Comment on lines
+9
to
+14
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are those new configuration keys optional ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @zhongshixi I am not asking what are those new configuration keys for. I am asking whether or not the Prebid Server software works if you don't provide them. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sebhtml
we have default settings, which is literally the same as before ( sequential download ) for
let me know if i answered your question There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So it's not backward-compatible. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @sebhtml i am not following
in summary, i do not think this PR incurs any compatibility issue. So i would like to ask if could you point out exactly which part of the code logic is not backwards compatible, maybe i am actually missing something There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't know how to rephrase it. So I let it go. |
||
|
||
|
||
# set up stored request storage using local file system | ||
stored_requests: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the word "request" should be spelled in the plural form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !