Add IP quantization to obfuscation #801

VianneyRuhlmann · 2024-12-16T13:00:36Z

What does this PR do?

Adds IP quantization, which is used to obfuscate peer tag fields.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

pr-commenter · 2024-12-16T13:10:43Z

Benchmarks

Comparison

Benchmark execution time: 2024-12-17 10:30:21

Comparing candidate commit 806ae3d in PR branch vianney/trace_obfuscation/quantize_peer_ip with baseline commit d0a8039 in branch main.

Found 0 performance improvements and 4 performance regressions! Performance is the same for 47 metrics, 2 unstable metrics.

scenario:credit_card/is_card_number/x371413321323331

🟥 execution_time [+1.205µs; +1.207µs] or [+17.606%; +17.631%]
🟥 throughput [-21902433.982op/s; -21872653.281op/s] or [-14.990%; -14.969%]

scenario:credit_card/is_card_number_no_luhn/x371413321323331

🟥 execution_time [+1.205µs; +1.206µs] or [+17.600%; +17.624%]
🟥 throughput [-21894746.448op/s; -21864121.115op/s] or [-14.985%; -14.964%]

Candidate

Candidate benchmark details

Group 1

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching string interning on wordpress profile	execution_time	140.366µs	141.144µs ± 0.669µs	141.061µs ± 0.183µs	141.240µs	141.667µs	142.457µs	149.337µs	5.87%	9.427	111.203	0.47%	0.047µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching string interning on wordpress profile	execution_time	[141.051µs; 141.236µs] or [-0.066%; +0.066%]	None	None	None

Group 2

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
sql/obfuscate_sql_string	execution_time	70.296µs	70.539µs ± 0.197µs	70.519µs ± 0.048µs	70.573µs	70.666µs	70.710µs	73.102µs	3.66%	11.095	142.017	0.28%	0.014µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
sql/obfuscate_sql_string	execution_time	[70.511µs; 70.566µs] or [-0.039%; +0.039%]	None	None	None

Group 3

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
concentrator/add_spans_to_concentrator	execution_time	6.768ms	6.784ms ± 0.007ms	6.784ms ± 0.004ms	6.787ms	6.795ms	6.805ms	6.832ms	0.71%	2.113	11.253	0.10%	0.000ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
concentrator/add_spans_to_concentrator	execution_time	[6.783ms; 6.785ms] or [-0.014%; +0.014%]	None	None	None

Group 4

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
two way interface	execution_time	18.668µs	24.590µs ± 14.244µs	18.836µs ± 0.064µs	19.390µs	46.962µs	49.713µs	163.460µs	767.80%	5.558	45.965	57.78%	1.007µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
two way interface	execution_time	[22.616µs; 26.564µs] or [-8.028%; +8.028%]	None	None	None

Group 5

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
redis/obfuscate_redis_string	execution_time	37.859µs	38.434µs ± 0.820µs	38.069µs ± 0.068µs	38.168µs	40.141µs	40.181µs	41.973µs	10.25%	1.822	1.913	2.13%	0.058µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
redis/obfuscate_redis_string	execution_time	[38.320µs; 38.548µs] or [-0.296%; +0.296%]	None	None	None

Group 6

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
tags/replace_trace_tags	execution_time	2.586µs	2.656µs ± 0.016µs	2.656µs ± 0.004µs	2.663µs	2.684µs	2.690µs	2.705µs	1.84%	-0.865	3.078	0.60%	0.001µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
tags/replace_trace_tags	execution_time	[2.653µs; 2.658µs] or [-0.083%; +0.083%]	None	None	None

Group 7

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	188.954µs	191.570µs ± 0.994µs	191.505µs ± 0.559µs	192.191µs	193.142µs	193.889µs	194.351µs	1.49%	0.050	0.026	0.52%	0.070µs	1	200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	5145335.951op/s	5220155.904op/s ± 27084.954op/s	5221787.398op/s ± 15253.985op/s	5235361.025op/s	5267589.336op/s	5284189.769op/s	5292294.081op/s	1.35%	-0.018	0.027	0.52%	1915.195op/s	1	200
normalization/normalize_name/normalize_name/bad-name	execution_time	18.154µs	18.236µs ± 0.034µs	18.233µs ± 0.016µs	18.253µs	18.277µs	18.320µs	18.512µs	1.53%	2.690	20.168	0.19%	0.002µs	1	200
normalization/normalize_name/normalize_name/bad-name	throughput	54017704.074op/s	54837742.323op/s ± 102426.203op/s	54845520.390op/s ± 49391.575op/s	54884027.270op/s	54982195.334op/s	55012511.152op/s	55083764.667op/s	0.43%	-2.608	19.358	0.19%	7242.626op/s	1	200
normalization/normalize_name/normalize_name/good	execution_time	11.059µs	11.120µs ± 0.029µs	11.112µs ± 0.016µs	11.136µs	11.169µs	11.223µs	11.262µs	1.35%	1.206	3.267	0.26%	0.002µs	1	200
normalization/normalize_name/normalize_name/good	throughput	88796392.647op/s	89929702.995op/s ± 237075.664op/s	89991693.345op/s ± 130525.333op/s	90073717.975op/s	90270752.636op/s	90363853.516op/s	90420192.524op/s	0.48%	-1.176	3.139	0.26%	16763.781op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	execution_time	[191.433µs; 191.708µs] or [-0.072%; +0.072%]	None	None	None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...	throughput	[5216402.189op/s; 5223909.618op/s] or [-0.072%; +0.072%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	execution_time	[18.231µs; 18.240µs] or [-0.026%; +0.026%]	None	None	None
normalization/normalize_name/normalize_name/bad-name	throughput	[54823547.036op/s; 54851937.610op/s] or [-0.026%; +0.026%]	None	None	None
normalization/normalize_name/normalize_name/good	execution_time	[11.116µs; 11.124µs] or [-0.037%; +0.037%]	None	None	None
normalization/normalize_name/normalize_name/good	throughput	[89896846.588op/s; 89962559.402op/s] or [-0.037%; +0.037%]	None	None	None

Group 8

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	626.067µs	627.286µs ± 0.795µs	627.176µs ± 0.271µs	627.487µs	628.072µs	630.872µs	635.001µs	1.25%	5.662	47.408	0.13%	0.056µs	1	200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	1574799.961op/s	1594171.278op/s ± 2006.565op/s	1594447.950op/s ± 688.202op/s	1595024.662op/s	1596039.608op/s	1596726.022op/s	1597274.229op/s	0.18%	-5.597	46.556	0.13%	141.886op/s	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	465.259µs	466.160µs ± 0.337µs	466.146µs ± 0.217µs	466.374µs	466.730µs	466.974µs	467.381µs	0.26%	0.346	0.801	0.07%	0.024µs	1	200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	2139582.536op/s	2145186.269op/s ± 1549.233op/s	2145250.266op/s ± 999.994op/s	2146208.233op/s	2147609.496op/s	2148638.464op/s	2149339.745op/s	0.19%	-0.340	0.793	0.07%	109.547op/s	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	179.327µs	179.871µs ± 0.327µs	179.883µs ± 0.288µs	180.131µs	180.343µs	180.586µs	180.735µs	0.47%	0.198	-0.974	0.18%	0.023µs	1	200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	5532973.825op/s	5559547.090op/s ± 10089.941op/s	5559156.202op/s ± 8911.925op/s	5568546.761op/s	5573844.904op/s	5575902.409op/s	5576403.598op/s	0.31%	-0.193	-0.980	0.18%	713.467op/s	1	200
normalization/normalize_service/normalize_service/[empty string]	execution_time	47.437µs	47.579µs ± 0.057µs	47.576µs ± 0.039µs	47.620µs	47.665µs	47.714µs	47.757µs	0.38%	0.103	-0.079	0.12%	0.004µs	1	200
normalization/normalize_service/normalize_service/[empty string]	throughput	20939196.790op/s	21017739.094op/s ± 24992.476op/s	21018812.066op/s ± 17380.754op/s	21034945.022op/s	21060664.142op/s	21067454.516op/s	21080811.699op/s	0.29%	-0.097	-0.085	0.12%	1767.235op/s	1	200
normalization/normalize_service/normalize_service/test_ASCII	execution_time	49.613µs	49.889µs ± 0.133µs	49.867µs ± 0.073µs	49.941µs	50.150µs	50.217µs	50.670µs	1.61%	1.562	5.277	0.27%	0.009µs	1	200
normalization/normalize_service/normalize_service/test_ASCII	throughput	19735661.468op/s	20044612.202op/s ± 53166.233op/s	20053526.595op/s ± 29304.825op/s	20081754.926op/s	20108977.799op/s	20126884.617op/s	20155832.711op/s	0.51%	-1.524	5.028	0.26%	3759.420op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	execution_time	[627.176µs; 627.396µs] or [-0.018%; +0.018%]	None	None	None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...	throughput	[1593893.188op/s; 1594449.369op/s] or [-0.017%; +0.017%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	execution_time	[466.114µs; 466.207µs] or [-0.010%; +0.010%]	None	None	None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて	throughput	[2144971.561op/s; 2145400.978op/s] or [-0.010%; +0.010%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	execution_time	[179.826µs; 179.917µs] or [-0.025%; +0.025%]	None	None	None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters	throughput	[5558148.722op/s; 5560945.459op/s] or [-0.025%; +0.025%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	execution_time	[47.571µs; 47.587µs] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/[empty string]	throughput	[21014275.377op/s; 21021202.811op/s] or [-0.016%; +0.016%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	execution_time	[49.871µs; 49.907µs] or [-0.037%; +0.037%]	None	None	None
normalization/normalize_service/normalize_service/test_ASCII	throughput	[20037243.873op/s; 20051980.531op/s] or [-0.037%; +0.037%]	None	None	None

Group 9

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
write only interface	execution_time	1.437µs	3.244µs ± 1.440µs	3.089µs ± 0.020µs	3.105µs	3.136µs	14.227µs	15.239µs	393.26%	7.650	58.348	44.27%	0.102µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
write only interface	execution_time	[3.044µs; 3.443µs] or [-6.151%; +6.151%]	None	None	None

Group 10

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
ip_address/quantize_peer_ip_address_benchmark	execution_time	5.939µs	5.981µs ± 0.029µs	5.970µs ± 0.017µs	6.004µs	6.029µs	6.031µs	6.036µs	1.12%	0.534	-1.233	0.48%	0.002µs	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark	execution_time	[5.977µs; 5.985µs] or [-0.067%; +0.067%]	None	None	None

Group 11

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
credit_card/is_card_number/	execution_time	4.623µs	4.638µs ± 0.004µs	4.638µs ± 0.002µs	4.639µs	4.643µs	4.645µs	4.682µs	0.95%	4.807	49.117	0.09%	0.000µs	1	200
credit_card/is_card_number/	throughput	213594280.783op/s	215618568.059op/s ± 203184.499op/s	215628325.153op/s ± 83960.126op/s	215701121.647op/s	215861596.168op/s	216008653.458op/s	216303530.110op/s	0.31%	-4.728	48.177	0.09%	14367.314op/s	1	200
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	92.201µs	93.298µs ± 0.599µs	93.290µs ± 0.330µs	93.602µs	94.167µs	94.386µs	98.133µs	5.19%	2.590	19.665	0.64%	0.042µs	1	200
credit_card/is_card_number/ 3782-8224-6310-005	throughput	10190297.195op/s	10718768.512op/s ± 67741.213op/s	10719221.561op/s ± 37866.182op/s	10758155.379op/s	10819579.615op/s	10843305.236op/s	10845844.299op/s	1.18%	-2.318	17.003	0.63%	4790.027op/s	1	200
credit_card/is_card_number/ 378282246310005	execution_time	84.012µs	85.184µs ± 0.563µs	85.166µs ± 0.319µs	85.482µs	85.965µs	86.251µs	89.831µs	5.48%	2.818	21.528	0.66%	0.040µs	1	200
credit_card/is_card_number/ 378282246310005	throughput	11132005.109op/s	11739758.911op/s ± 76286.495op/s	11741754.440op/s ± 43760.329op/s	11783377.506op/s	11839443.342op/s	11862240.729op/s	11903053.062op/s	1.37%	-2.526	18.562	0.65%	5394.270op/s	1	200
credit_card/is_card_number/37828224631	execution_time	4.626µs	4.638µs ± 0.006µs	4.638µs ± 0.002µs	4.639µs	4.644µs	4.650µs	4.711µs	1.58%	8.382	96.540	0.13%	0.000µs	1	200
credit_card/is_card_number/37828224631	throughput	212268008.387op/s	215599727.432op/s ± 282779.203op/s	215632043.561op/s ± 76072.591op/s	215705214.681op/s	215825712.074op/s	215979883.430op/s	216149851.070op/s	0.24%	-8.270	94.788	0.13%	19995.509op/s	1	200
credit_card/is_card_number/378282246310005	execution_time	79.597µs	80.701µs ± 0.410µs	80.716µs ± 0.264µs	80.977µs	81.323µs	81.658µs	82.031µs	1.63%	0.073	0.179	0.51%	0.029µs	1	200
credit_card/is_card_number/378282246310005	throughput	12190443.832op/s	12391709.845op/s ± 62916.867op/s	12389149.600op/s ± 40449.784op/s	12430392.204op/s	12495651.559op/s	12534302.718op/s	12563327.020op/s	1.41%	-0.040	0.154	0.51%	4448.894op/s	1	200
credit_card/is_card_number/37828224631000521389798	execution_time	60.181µs	60.285µs ± 0.031µs	60.285µs ± 0.018µs	60.303µs	60.337µs	60.363µs	60.375µs	0.15%	-0.207	1.020	0.05%	0.002µs	1	200
credit_card/is_card_number/37828224631000521389798	throughput	16563171.788op/s	16587983.071op/s ± 8609.636op/s	16587783.098op/s ± 4822.408op/s	16592567.359op/s	16601037.898op/s	16612889.959op/s	16616565.054op/s	0.17%	0.212	1.024	0.05%	608.793op/s	1	200
credit_card/is_card_number/x371413321323331	execution_time	8.037µs	8.050µs ± 0.005µs	8.049µs ± 0.003µs	8.052µs	8.058µs	8.063µs	8.069µs	0.25%	0.464	1.765	0.06%	0.000µs	1	200
credit_card/is_card_number/x371413321323331	throughput	123928635.451op/s	124228967.055op/s ± 74327.191op/s	124232777.650op/s ± 38931.895op/s	124271327.498op/s	124331757.569op/s	124412271.631op/s	124431573.946op/s	0.16%	-0.458	1.756	0.06%	5255.726op/s	1	200
credit_card/is_card_number_no_luhn/	execution_time	4.623µs	4.637µs ± 0.003µs	4.637µs ± 0.002µs	4.639µs	4.642µs	4.644µs	4.645µs	0.17%	-0.241	3.028	0.06%	0.000µs	1	200
credit_card/is_card_number_no_luhn/	throughput	215296211.954op/s	215652509.877op/s ± 129088.186op/s	215662679.573op/s ± 74876.321op/s	215729571.132op/s	215813612.921op/s	215915087.163op/s	216319083.585op/s	0.30%	0.250	3.062	0.06%	9127.913op/s	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	75.474µs	75.840µs ± 0.124µs	75.850µs ± 0.088µs	75.929µs	76.030µs	76.116µs	76.187µs	0.44%	-0.198	-0.151	0.16%	0.009µs	1	200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	13125653.672op/s	13185765.590op/s ± 21490.359op/s	13183880.070op/s ± 15250.108op/s	13200735.941op/s	13222449.472op/s	13232311.041op/s	13249577.268op/s	0.50%	0.207	-0.151	0.16%	1519.598op/s	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	67.457µs	67.796µs ± 0.146µs	67.787µs ± 0.094µs	67.881µs	68.025µs	68.234µs	68.326µs	0.80%	0.563	0.832	0.21%	0.010µs	1	200
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	14635715.556op/s	14750180.688op/s ± 31747.089op/s	14752129.341op/s ± 20499.570op/s	14772536.384op/s	14799620.834op/s	14809467.028op/s	14824207.932op/s	0.49%	-0.547	0.798	0.21%	2244.858op/s	1	200
credit_card/is_card_number_no_luhn/37828224631	execution_time	4.622µs	4.637µs ± 0.003µs	4.637µs ± 0.002µs	4.639µs	4.641µs	4.642µs	4.645µs	0.17%	-0.621	4.882	0.05%	0.000µs	1	200
credit_card/is_card_number_no_luhn/37828224631	throughput	215300681.825op/s	215645636.044op/s ± 117809.127op/s	215662604.113op/s ± 73905.307op/s	215724797.566op/s	215783671.198op/s	215859999.464op/s	216333944.992op/s	0.31%	0.631	4.940	0.05%	8330.363op/s	1	200
credit_card/is_card_number_no_luhn/378282246310005	execution_time	63.453µs	63.756µs ± 0.132µs	63.763µs ± 0.082µs	63.835µs	63.950µs	64.131µs	64.211µs	0.70%	0.113	0.356	0.21%	0.009µs	1	200
credit_card/is_card_number_no_luhn/378282246310005	throughput	15573732.846op/s	15684885.109op/s ± 32392.716op/s	15683155.512op/s ± 20042.503op/s	15705189.390op/s	15739674.368op/s	15757859.804op/s	15759732.538op/s	0.49%	-0.099	0.339	0.21%	2290.511op/s	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	60.196µs	60.298µs ± 0.031µs	60.297µs ± 0.018µs	60.314µs	60.355µs	60.372µs	60.391µs	0.16%	0.169	0.613	0.05%	0.002µs	1	200
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	16558768.666op/s	16584315.001op/s ± 8536.527op/s	16584684.044op/s ± 4962.531op/s	16589886.805op/s	16596626.570op/s	16605527.460op/s	16612351.283op/s	0.17%	-0.165	0.614	0.05%	603.624op/s	1	200
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	8.036µs	8.050µs ± 0.004µs	8.050µs ± 0.003µs	8.052µs	8.058µs	8.061µs	8.066µs	0.21%	0.295	1.401	0.05%	0.000µs	1	200
credit_card/is_card_number_no_luhn/x371413321323331	throughput	123972081.447op/s	124227860.565op/s ± 66689.165op/s	124230921.396op/s ± 39086.040op/s	124269743.426op/s	124314567.833op/s	124413889.972op/s	124434463.882op/s	0.16%	-0.290	1.398	0.05%	4715.636op/s	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
credit_card/is_card_number/	execution_time	[4.637µs; 4.638µs] or [-0.013%; +0.013%]	None	None	None
credit_card/is_card_number/	throughput	[215590408.641op/s; 215646727.476op/s] or [-0.013%; +0.013%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	execution_time	[93.215µs; 93.381µs] or [-0.089%; +0.089%]	None	None	None
credit_card/is_card_number/ 3782-8224-6310-005	throughput	[10709380.231op/s; 10728156.793op/s] or [-0.088%; +0.088%]	None	None	None
credit_card/is_card_number/ 378282246310005	execution_time	[85.106µs; 85.262µs] or [-0.092%; +0.092%]	None	None	None
credit_card/is_card_number/ 378282246310005	throughput	[11729186.336op/s; 11750331.485op/s] or [-0.090%; +0.090%]	None	None	None
credit_card/is_card_number/37828224631	execution_time	[4.637µs; 4.639µs] or [-0.018%; +0.018%]	None	None	None
credit_card/is_card_number/37828224631	throughput	[215560536.955op/s; 215638917.910op/s] or [-0.018%; +0.018%]	None	None	None
credit_card/is_card_number/378282246310005	execution_time	[80.644µs; 80.758µs] or [-0.070%; +0.070%]	None	None	None
credit_card/is_card_number/378282246310005	throughput	[12382990.173op/s; 12400429.518op/s] or [-0.070%; +0.070%]	None	None	None
credit_card/is_card_number/37828224631000521389798	execution_time	[60.280µs; 60.289µs] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number/37828224631000521389798	throughput	[16586789.859op/s; 16589176.284op/s] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number/x371413321323331	execution_time	[8.049µs; 8.050µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number/x371413321323331	throughput	[124218666.021op/s; 124239268.089op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/	execution_time	[4.637µs; 4.637µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/	throughput	[215634619.496op/s; 215670400.258op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	execution_time	[75.822µs; 75.857µs] or [-0.023%; +0.023%]	None	None	None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005	throughput	[13182787.233op/s; 13188743.947op/s] or [-0.023%; +0.023%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	execution_time	[67.776µs; 67.816µs] or [-0.030%; +0.030%]	None	None	None
credit_card/is_card_number_no_luhn/ 378282246310005	throughput	[14745780.847op/s; 14754580.529op/s] or [-0.030%; +0.030%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	execution_time	[4.637µs; 4.638µs] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631	throughput	[215629308.832op/s; 215661963.257op/s] or [-0.008%; +0.008%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	execution_time	[63.738µs; 63.774µs] or [-0.029%; +0.029%]	None	None	None
credit_card/is_card_number_no_luhn/378282246310005	throughput	[15680395.790op/s; 15689374.427op/s] or [-0.029%; +0.029%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	execution_time	[60.294µs; 60.302µs] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number_no_luhn/37828224631000521389798	throughput	[16583131.920op/s; 16585498.081op/s] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	execution_time	[8.049µs; 8.050µs] or [-0.007%; +0.007%]	None	None	None
credit_card/is_card_number_no_luhn/x371413321323331	throughput	[124218618.089op/s; 124237103.042op/s] or [-0.007%; +0.007%]	None	None	None

Group 12

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
normalization/normalize_trace/test_trace	execution_time	272.479ns	285.214ns ± 14.324ns	277.341ns ± 3.463ns	289.685ns	317.626ns	328.296ns	330.804ns	19.28%	1.491	1.322	5.01%	1.013ns	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
normalization/normalize_trace/test_trace	execution_time	[283.229ns; 287.200ns] or [-0.696%; +0.696%]	None	None	None

Group 13

cpu_model	git_commit_sha	git_commit_date	git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	`806ae3d`	1734430762	vianney/trace_obfuscation/quantize_peer_ip

scenario	metric	min	mean ± sd	median ± mad	p75	p95	p99	max	peak_to_median_ratio	skewness	kurtosis	cv	sem	runs	sample_size
benching deserializing traces from msgpack to their internal representation	execution_time	60.350ms	60.887ms ± 0.356ms	60.996ms ± 0.298ms	61.103ms	61.514ms	61.795ms	62.063ms	1.75%	0.484	-0.155	0.58%	0.025ms	1	200

scenario	metric	95% CI mean	Shapiro-Wilk pvalue	Ljung-Box pvalue (lag=1)	Dip test pvalue
benching deserializing traces from msgpack to their internal representation	execution_time	[60.837ms; 60.936ms] or [-0.081%; +0.081%]	None	None	None

Baseline

Omitted due to size.

codecov-commenter · 2024-12-16T13:13:27Z

Codecov Report

Attention: Patch coverage is 96.98492% with 6 lines in your changes missing coverage. Please review.

Project coverage is 71.04%. Comparing base (d0a8039) to head (806ae3d).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #801      +/-   ##
==========================================
+ Coverage   70.92%   71.04%   +0.12%     
==========================================
  Files         313      314       +1     
  Lines       45752    45951     +199     
==========================================
+ Hits        32450    32647     +197     
- Misses      13302    13304       +2

Components	Coverage Δ
crashtracker	`38.27% <ø> (ø)`
crashtracker-ffi	`5.71% <ø> (ø)`
datadog-alloc	`98.73% <ø> (ø)`
data-pipeline	`92.06% <ø> (ø)`
data-pipeline-ffi	`90.54% <ø> (ø)`
ddcommon	`82.31% <ø> (ø)`
ddcommon-ffi	`65.52% <ø> (ø)`
ddtelemetry	`59.51% <ø> (ø)`
ddtelemetry-ffi	`22.46% <ø> (ø)`
dogstatsd	`89.46% <ø> (ø)`
dogstatsd-client	`79.77% <ø> (ø)`
ipc	`82.76% <ø> (-0.11%)`	⬇️
profiling	`84.31% <ø> (ø)`
profiling-ffi	`77.55% <ø> (ø)`
serverless	`0.00% <ø> (ø)`
sidecar	`40.67% <ø> (+0.11%)`	⬆️
sidecar-ffi	`2.06% <ø> (+0.88%)`	⬆️
spawn-worker	`54.37% <ø> (ø)`
tinybytes	`93.60% <ø> (ø)`
trace-mini-agent	`72.38% <ø> (ø)`
trace-normalization	`98.23% <ø> (ø)`
trace-obfuscation	`95.96% <96.98%> (+0.19%)`	⬆️
trace-protobuf	`77.67% <ø> (ø)`
trace-utils	`93.52% <ø> (ø)`

hoolioh

LGTM, although I would recommend using OnceLock to avoid adding another dependency.

hoolioh · 2024-12-31T11:30:58Z

trace-obfuscation/src/ip_address.rs

+// Copyright 2024-Present Datadog, Inc. https://www.datadoghq.com/
+// SPDX-License-Identifier: Apache-2.0
+
+use lazy_static::lazy_static;


In this case, since the OnceCell types are not Sync and a static must be Sync, you'll have to use OnceLock.

Suggested change

-use lazy_static::lazy_static;
+use std::sync::OnceLock;

hoolioh · 2024-12-31T11:32:52Z

trace-obfuscation/src/ip_address.rs

+lazy_static! {
+    static ref ALLOWED_IP_ADDRESSES: HashSet<&'static str> = HashSet::from([
+        // localhost
+        "127.0.0.1",
+        "::1",
+        // link-local cloud provider metadata server addresses
+        "169.254.169.254",
+        "fd00:ec2::254"
+    ]);
+
+    static ref PREFIX_REGEX: Regex = Regex::new(r"^((?:dnspoll|ftp|file|http|https):/{2,3})").unwrap();
+}


Suggested change

-lazy_static! {
-    static ref ALLOWED_IP_ADDRESSES: HashSet<&'static str> = HashSet::from([
-        // localhost
-        "127.0.0.1",
-        "::1",
-        // link-local cloud provider metadata server addresses
-        "169.254.169.254",
-        "fd00:ec2::254"
-    ]);
-
-    static ref PREFIX_REGEX: Regex = Regex::new(r"^((?:dnspoll|ftp|file|http|https):/{2,3})").unwrap();
-}
+static ALLOWED_IP_ADDRESSES: OnceLock<HashSet<&'static str>> = OnceLock::new();
+static PREFIX_REGEX: OnceLock<Regex> = OnceLock::new();

hoolioh · 2024-12-31T11:34:35Z

trace-obfuscation/src/ip_address.rs

+fn quantize_ip(s: &str) -> Option<String> {
+    let (prefix, stripped_s) = split_prefix(s);
+    if let Some((ip, suffix)) = parse_ip(stripped_s) {
+        if !ALLOWED_IP_ADDRESSES.contains(ip) {


Suggested change

-        if !ALLOWED_IP_ADDRESSES.contains(ip) {
+        if !ALLOWED_IP_ADDRESSES
+            .get_or_init(|| {
+                HashSet::from([
+                    // localhost
+                    "127.0.0.1",
+                    "::1",
+                    // link-local cloud provider metadata server addresses
+                    "169.254.169.254",
+                    "fd00:ec2::254",
+                ])
+            })
+            .contains(ip)
+        {
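The suggestion above can be tried in isolation. The following is a minimal sketch using only the standard library; the helper names `allowed_ip_addresses` and `is_allowed` are hypothetical and do not appear in the PR. It shows that a `OnceLock` static builds the set on first access and hands back the same instance afterwards.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

// Same static as in the suggestion: empty until first use.
static ALLOWED_IP_ADDRESSES: OnceLock<HashSet<&'static str>> = OnceLock::new();

/// Hypothetical helper (not in the PR): returns the allowlist,
/// building it the first time it is requested.
fn allowed_ip_addresses() -> &'static HashSet<&'static str> {
    ALLOWED_IP_ADDRESSES.get_or_init(|| {
        HashSet::from([
            // localhost
            "127.0.0.1",
            "::1",
            // link-local cloud provider metadata server addresses
            "169.254.169.254",
            "fd00:ec2::254",
        ])
    })
}

/// Hypothetical helper (not in the PR): true if `ip` is on the allowlist.
fn is_allowed(ip: &str) -> bool {
    allowed_ip_addresses().contains(ip)
}
```

Wrapping `get_or_init` in a small accessor keeps the initializer in one place, so call sites such as `quantize_ip` would not need to repeat the closure.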

hoolioh · 2024-12-31T11:40:47Z

trace-obfuscation/src/ip_address.rs

+fn split_prefix(s: &str) -> (&str, &str) {
+    if let Some(tail) = s.strip_prefix("ip-") {
+        ("ip-", tail)
+    } else if let Some(protocol) = PREFIX_REGEX.find(s) {


Suggested change

-    } else if let Some(protocol) = PREFIX_REGEX.find(s) {
+    } else if let Some(protocol) = PREFIX_REGEX
+        .get_or_init(|| Regex::new(r"^((?:dnspoll|ftp|file|http|https):/{2,3})").unwrap())
+        .find(s)
+    {
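To make clear what the prefix pattern accepts, here is a regex-free approximation written against the standard library only. It is a sketch, not the PR's implementation: the `SCHEMES` constant is hypothetical, and returning an empty prefix when nothing matches is an assumption, since the rest of `split_prefix` is not shown in this thread.

```rust
// Schemes taken from the pattern ^((?:dnspoll|ftp|file|http|https):/{2,3}).
const SCHEMES: [&str; 5] = ["dnspoll", "ftp", "file", "http", "https"];

/// Splits `s` into (prefix, rest). Assumption: an empty prefix is
/// returned when no known prefix is present.
fn split_prefix(s: &str) -> (&str, &str) {
    if let Some(tail) = s.strip_prefix("ip-") {
        return ("ip-", tail);
    }
    for scheme in SCHEMES {
        // Require "<scheme>:" followed by two or three slashes.
        if let Some(rest) = s.strip_prefix(scheme).and_then(|r| r.strip_prefix(':')) {
            let slashes = rest.bytes().take_while(|b| *b == b'/').take(3).count();
            if slashes >= 2 {
                // All prefix characters are ASCII, so this is a valid char boundary.
                return s.split_at(scheme.len() + 1 + slashes);
            }
        }
    }
    ("", s)
}
```

Under these assumptions, `split_prefix("http://10.0.0.1")` yields the prefix `http://` and the remainder `10.0.0.1`, while a string with a single slash after the colon is left unsplit.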

hoolioh · 2024-12-31T11:43:40Z

trace-obfuscation/Cargo.toml

@@ -14,6 +14,7 @@ serde = { version = "1.0.145", features = ["derive"] }
 serde_json = "1.0"
 url = "2.4.0"
 percent-encoding = "2.1"
+lazy_static = "1.4"


I would use the OnceCell variants. They are part of std, so there are fewer dependencies, and they provide some advantages over lazy_static: they accept closures as initializers, they are macro-free (I wouldn't count that as a massive advantage), and, in certain scenarios, they can be more efficient in terms of memory handling.

Suggested change

-lazy_static = "1.4"
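As an illustration of the "closures as initializers" point, the sketch below initializes a `OnceLock` from a closure that captures a runtime value, which a `lazy_static!` block cannot do because its initializer has no access to call-site variables. The static `PLACEHOLDER`, the function `placeholder`, and the default string `"redacted-ip"` are all hypothetical and are not part of the PR.

```rust
use std::sync::OnceLock;

// Hypothetical static (not in the PR): set once, on first access.
static PLACEHOLDER: OnceLock<String> = OnceLock::new();

/// Hypothetical helper: returns the replacement text, taking it from
/// `configured` on the first call and ignoring the argument afterwards.
fn placeholder(configured: Option<&str>) -> &'static str {
    PLACEHOLDER
        // The closure captures `configured`, a value only known at run time.
        .get_or_init(|| configured.unwrap_or("redacted-ip").to_owned())
        .as_str()
}
```

Only the first call's closure runs; later calls return the stored value regardless of the argument they pass.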

github-actions bot added the mini-agent label Dec 16, 2024

VianneyRuhlmann added 3 commits December 16, 2024 18:38

Add ip quantization

8313d24

Use Cow to reduce allocations

e64862d

Fix clippy lints

d6b9432

VianneyRuhlmann force-pushed the vianney/trace_obfuscation/quantize_peer_ip branch from 2c2511c to d6b9432 Compare December 16, 2024 17:39

VianneyRuhlmann marked this pull request as ready for review December 16, 2024 17:50

VianneyRuhlmann requested review from a team as code owners December 16, 2024 17:50

Merge branch 'main' into vianney/trace_obfuscation/quantize_peer_ip

806ae3d

hoolioh approved these changes Dec 31, 2024

hoolioh self-requested a review December 31, 2024 11:48

hoolioh approved these changes Dec 31, 2024
