[crashtracking] improve poll waiting logic #754

Merged: 3 commits, Nov 25, 2024


@sanchda sanchda commented Nov 22, 2024

What does this PR do?

The original implementation accidentally used a mutable array of immutable objects, so the interface always returned an error. Because this code is on the critical path for handling zombie processes, the bug had an adverse side effect on customer infrastructure.

This code also used a BorrowedFd, which is supposed to track an OwnedFd. This was problematic under some conditions, since the underlying implementation would use prctl() to check file descriptor liveness and panic in some edge cases. The code has been ported to libc and now uses RawFd exclusively to prevent this.
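As a rough illustration of polling a bare RawFd, here is a minimal sketch. It is not the code from this PR: to stay dependency-free it declares `poll` and `struct pollfd` by hand instead of using the libc crate, it assumes Linux (where `nfds_t` is `unsigned long` and the flag values are as shown), and the names `PollFd`, `PollOutcome`, and `poll_raw_fd` are hypothetical. Note that the array handed to `poll` must be mutable, because the kernel writes `revents` back into it.

```rust
use std::io;
use std::os::fd::RawFd;
use std::os::raw::{c_int, c_short, c_ulong};

// Mirrors `struct pollfd` from <poll.h>.
#[repr(C)]
struct PollFd {
    fd: c_int,
    events: c_short,
    revents: c_short,
}

// Flag values as defined on Linux (assumption: Linux target).
const POLLIN: c_short = 0x001;
const POLLHUP: c_short = 0x010;

unsafe extern "C" {
    // Assumption: nfds_t is `unsigned long`, as on Linux.
    fn poll(fds: *mut PollFd, nfds: c_ulong, timeout: c_int) -> c_int;
}

/// What a single poll call observed on the descriptor (hypothetical type).
#[derive(Debug, PartialEq)]
enum PollOutcome {
    TimedOut,
    Readable,
    HungUp,
    Other(c_short),
}

/// Polls one raw descriptor for up to `timeout_ms` milliseconds.
/// EINTR is not retried here; it surfaces as an Err.
fn poll_raw_fd(fd: RawFd, timeout_ms: c_int) -> io::Result<PollOutcome> {
    // Mutable on purpose: poll() fills in `revents`.
    let mut fds = [PollFd { fd, events: POLLIN, revents: 0 }];
    let rc = unsafe { poll(fds.as_mut_ptr(), fds.len() as c_ulong, timeout_ms) };
    match rc {
        -1 => Err(io::Error::last_os_error()),
        0 => Ok(PollOutcome::TimedOut),
        _ => {
            let revents = fds[0].revents;
            if revents & POLLHUP != 0 {
                Ok(PollOutcome::HungUp)
            } else if revents & POLLIN != 0 {
                Ok(PollOutcome::Readable)
            } else {
                Ok(PollOutcome::Other(revents))
            }
        }
    }
}
```

With a connected socket pair, `poll_raw_fd(fd, 0)` reports `TimedOut` while nothing has been written, and `Readable` once the peer writes a byte.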

Finally, this patch grants some additional time for reaping a PID. When a receiver process exceeds its timeout budget, it is sent a SIGKILL. However, the old behavior was to send SIGKILL and then immediately call waitpid(pid, ..., WNOHANG). On a saturated system (i.e., precisely the kind of system where a timeout might be necessary!), it may take some time for the receiver PID to respond to the SIGKILL.

In general, there's no way to provide a bounded guarantee for the duration of this reap operation, so an arbitrary number of scheduler slices is chosen as the maximum reaping wait duration.
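The kill-then-reap sequence described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: it uses `std::process::Child` (`kill` sends SIGKILL on Unix, `try_wait` is a non-blocking wait) rather than raw `kill`/`waitpid` on a PID, and the names `kill_and_reap` and `budget` are hypothetical.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Kills `child`, then repeatedly attempts a non-blocking reap until
/// either the child is reaped or `budget` has elapsed.
/// Returns Ok(Some(status)) when reaped, Ok(None) when the budget ran out.
fn kill_and_reap(child: &mut Child, budget: Duration) -> io::Result<Option<ExitStatus>> {
    // If the child already exited, reap it and skip the kill entirely.
    if let Some(status) = child.try_wait()? {
        return Ok(Some(status));
    }
    child.kill()?;

    let deadline = Instant::now() + budget;
    loop {
        // Non-blocking check, analogous to waitpid(pid, ..., WNOHANG).
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Out of budget: the caller decides what to do with a
            // child that may still become a zombie.
            return Ok(None);
        }
        // Yield briefly so a saturated system can deliver the signal.
        thread::sleep(Duration::from_millis(1));
    }
}
```

A single immediate `try_wait` after `kill` corresponds to the old behavior; the loop is what gives a busy system time to act on the signal. The 1 ms sleep interval is an arbitrary choice for the sketch.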

Motivation

Fix zombies

@sanchda sanchda requested a review from a team as a code owner November 22, 2024 18:46

pr-commenter bot commented Nov 22, 2024

Benchmarks

Comparison

Benchmark execution time: 2024-11-25 20:51:20

Comparing candidate commit 8074728 in PR branch sanchda/fix_poll_zombies with baseline commit bdbbd73 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 51 metrics, 2 unstable metrics.

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 18.279µs 24.451µs ± 14.485µs 18.529µs ± 0.084µs 18.910µs 47.313µs 50.310µs 157.177µs 748.29% 5.095 38.179 59.09% 1.024µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [22.444µs; 26.459µs] or [-8.210%; +8.210%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 68.939µs 69.078µs ± 0.144µs 69.058µs ± 0.040µs 69.101µs 69.194µs 69.296µs 70.775µs 2.49% 8.559 95.753 0.21% 0.010µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [69.058µs; 69.098µs] or [-0.029%; +0.029%] None None None

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 4.623µs 4.632µs ± 0.004µs 4.632µs ± 0.003µs 4.635µs 4.638µs 4.642µs 4.656µs 0.52% 0.814 4.307 0.09% 0.000µs 1 200
credit_card/is_card_number/ throughput 214760651.790op/s 215875063.962op/s ± 195540.286op/s 215871407.835op/s ± 135440.082op/s 216018713.084op/s 216158066.801op/s 216284750.093op/s 216324363.504op/s 0.21% -0.798 4.221 0.09% 13826.786op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 90.788µs 91.770µs ± 0.671µs 91.687µs ± 0.334µs 92.037µs 92.744µs 92.978µs 98.291µs 7.20% 4.792 42.734 0.73% 0.047µs 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 10173846.773op/s 10897360.266op/s ± 77162.762op/s 10906677.627op/s ± 39854.989op/s 10946144.293op/s 10977464.386op/s 11006785.374op/s 11014664.864op/s 0.99% -4.341 36.931 0.71% 5456.231op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 83.848µs 84.077µs ± 0.372µs 83.998µs ± 0.043µs 84.086µs 84.304µs 84.885µs 88.772µs 5.68% 10.331 125.472 0.44% 0.026µs 1 200
credit_card/is_card_number/ 378282246310005 throughput 11264858.016op/s 11894055.091op/s ± 50373.942op/s 11905093.594op/s ± 6034.973op/s 11910120.029op/s 11916260.214op/s 11922160.575op/s 11926305.853op/s 0.18% -10.058 120.548 0.42% 3561.976op/s 1 200
credit_card/is_card_number/37828224631 execution_time 4.612µs 4.627µs ± 0.006µs 4.627µs ± 0.003µs 4.629µs 4.634µs 4.642µs 4.686µs 1.28% 4.134 34.541 0.14% 0.000µs 1 200
credit_card/is_card_number/37828224631 throughput 213407435.660op/s 216119953.021op/s ± 300385.283op/s 216129973.890op/s ± 128759.917op/s 216267594.440op/s 216502749.974op/s 216687204.414op/s 216833761.719op/s 0.33% -4.054 33.638 0.14% 21240.447op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 81.037µs 81.207µs ± 0.122µs 81.181µs ± 0.045µs 81.232µs 81.445µs 81.655µs 81.785µs 0.74% 1.998 5.169 0.15% 0.009µs 1 200
credit_card/is_card_number/378282246310005 throughput 12227214.782op/s 12314193.910op/s ± 18501.917op/s 12318138.578op/s ± 6816.619op/s 12324203.377op/s 12335607.574op/s 12338683.247op/s 12340018.870op/s 0.18% -1.983 5.096 0.15% 1308.283op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 58.994µs 59.196µs ± 0.127µs 59.159µs ± 0.072µs 59.264µs 59.455µs 59.564µs 59.622µs 0.78% 1.027 0.694 0.21% 0.009µs 1 200
credit_card/is_card_number/37828224631000521389798 throughput 16772437.229op/s 16893170.466op/s ± 36260.575op/s 16903686.699op/s ± 20603.765op/s 16919739.838op/s 16937463.637op/s 16946668.856op/s 16950869.560op/s 0.28% -1.016 0.664 0.21% 2564.010op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 6.832µs 6.844µs ± 0.004µs 6.843µs ± 0.002µs 6.845µs 6.851µs 6.856µs 6.874µs 0.45% 2.042 11.292 0.06% 0.000µs 1 200
credit_card/is_card_number/x371413321323331 throughput 145474504.101op/s 146123544.609op/s ± 93963.996op/s 146134991.854op/s ± 44864.562op/s 146178873.500op/s 146231430.255op/s 146309969.641op/s 146361085.010op/s 0.15% -2.024 11.157 0.06% 6644.258op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 4.617µs 4.631µs ± 0.005µs 4.631µs ± 0.003µs 4.634µs 4.639µs 4.642µs 4.646µs 0.33% 0.155 0.130 0.11% 0.000µs 1 200
credit_card/is_card_number_no_luhn/ throughput 215241746.376op/s 215938487.180op/s ± 230355.301op/s 215942332.601op/s ± 146336.679op/s 216088980.049op/s 216315197.200op/s 216422922.874op/s 216607519.109op/s 0.31% -0.149 0.127 0.11% 16288.580op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 73.051µs 73.694µs ± 0.165µs 73.700µs ± 0.073µs 73.773µs 73.916µs 74.041µs 74.546µs 1.15% -0.087 5.195 0.22% 0.012µs 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 13414497.620op/s 13569778.853op/s ± 30436.014op/s 13568543.367op/s ± 13510.832op/s 13582053.357op/s 13620689.594op/s 13675573.636op/s 13689054.142op/s 0.89% 0.135 5.117 0.22% 2152.151op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 64.736µs 65.096µs ± 0.210µs 65.091µs ± 0.155µs 65.245µs 65.494µs 65.598µs 65.618µs 0.81% 0.388 -0.462 0.32% 0.015µs 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 15239687.630op/s 15362065.305op/s ± 49395.660op/s 15363086.348op/s ± 36567.904op/s 15399770.935op/s 15433387.764op/s 15444169.585op/s 15447409.645op/s 0.55% -0.374 -0.478 0.32% 3492.801op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 4.617µs 4.632µs ± 0.004µs 4.632µs ± 0.003µs 4.635µs 4.639µs 4.641µs 4.644µs 0.25% -0.071 0.145 0.09% 0.000µs 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 215335732.333op/s 215891683.590op/s ± 204315.941op/s 215873670.744op/s ± 141078.451op/s 216041620.424op/s 216189108.183op/s 216343195.231op/s 216575136.934op/s 0.32% 0.077 0.150 0.09% 14447.319op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 62.738µs 63.531µs ± 0.161µs 63.572µs ± 0.076µs 63.633µs 63.721µs 63.771µs 63.783µs 0.33% -1.467 2.965 0.25% 0.011µs 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 15678104.307op/s 15740410.022op/s ± 40080.747op/s 15730144.787op/s ± 18671.866op/s 15756546.597op/s 15825375.097op/s 15849784.068op/s 15939301.526op/s 1.33% 1.489 3.073 0.25% 2834.137op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 59.026µs 59.244µs ± 0.136µs 59.240µs ± 0.087µs 59.297µs 59.482µs 59.708µs 59.906µs 1.12% 1.288 3.008 0.23% 0.010µs 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 16692932.519op/s 16879402.512op/s ± 38578.081op/s 16880406.931op/s ± 24813.401op/s 16910888.926op/s 16927507.371op/s 16934041.573op/s 16941642.557op/s 0.36% -1.265 2.898 0.23% 2727.882op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 6.831µs 6.841µs ± 0.004µs 6.842µs ± 0.002µs 6.844µs 6.847µs 6.850µs 6.856µs 0.20% -0.268 0.415 0.06% 0.000µs 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 145866709.156op/s 146167371.011op/s ± 89332.511op/s 146161902.902op/s ± 51205.760op/s 146214435.021op/s 146343828.910op/s 146378248.892op/s 146382553.683op/s 0.15% 0.273 0.413 0.06% 6316.762op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [4.632µs; 4.633µs] or [-0.013%; +0.013%] None None None
credit_card/is_card_number/ throughput [215847963.958op/s; 215902163.965op/s] or [-0.013%; +0.013%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [91.677µs; 91.863µs] or [-0.101%; +0.101%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [10886666.250op/s; 10908054.283op/s] or [-0.098%; +0.098%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [84.026µs; 84.129µs] or [-0.061%; +0.061%] None None None
credit_card/is_card_number/ 378282246310005 throughput [11887073.747op/s; 11901036.435op/s] or [-0.059%; +0.059%] None None None
credit_card/is_card_number/37828224631 execution_time [4.626µs; 4.628µs] or [-0.019%; +0.019%] None None None
credit_card/is_card_number/37828224631 throughput [216078322.510op/s; 216161583.532op/s] or [-0.019%; +0.019%] None None None
credit_card/is_card_number/378282246310005 execution_time [81.190µs; 81.224µs] or [-0.021%; +0.021%] None None None
credit_card/is_card_number/378282246310005 throughput [12311629.722op/s; 12316758.098op/s] or [-0.021%; +0.021%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [59.178µs; 59.213µs] or [-0.030%; +0.030%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [16888145.099op/s; 16898195.833op/s] or [-0.030%; +0.030%] None None None
credit_card/is_card_number/x371413321323331 execution_time [6.843µs; 6.844µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number/x371413321323331 throughput [146110522.103op/s; 146136567.115op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/ execution_time [4.630µs; 4.632µs] or [-0.015%; +0.015%] None None None
credit_card/is_card_number_no_luhn/ throughput [215906562.150op/s; 215970412.209op/s] or [-0.015%; +0.015%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [73.671µs; 73.716µs] or [-0.031%; +0.031%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [13565560.714op/s; 13573996.992op/s] or [-0.031%; +0.031%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [65.067µs; 65.125µs] or [-0.045%; +0.045%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [15355219.542op/s; 15368911.069op/s] or [-0.045%; +0.045%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [4.631µs; 4.633µs] or [-0.013%; +0.013%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [215863367.365op/s; 215919999.814op/s] or [-0.013%; +0.013%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [63.509µs; 63.553µs] or [-0.035%; +0.035%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [15734855.216op/s; 15745964.828op/s] or [-0.035%; +0.035%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [59.225µs; 59.263µs] or [-0.032%; +0.032%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [16874055.961op/s; 16884749.063op/s] or [-0.032%; +0.032%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [6.841µs; 6.842µs] or [-0.008%; +0.008%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [146154990.384op/s; 146179751.638op/s] or [-0.008%; +0.008%] None None None

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 620.330µs 621.642µs ± 0.762µs 621.553µs ± 0.267µs 621.832µs 622.376µs 626.648µs 627.086µs 0.89% 4.982 32.457 0.12% 0.054µs 1 200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput 1594676.587op/s 1608645.270op/s ± 1960.700op/s 1608872.568op/s ± 691.469op/s 1609532.035op/s 1610487.097op/s 1611017.933op/s 1612045.202op/s 0.20% -4.947 32.143 0.12% 138.642op/s 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 466.243µs 467.103µs ± 0.316µs 467.084µs ± 0.198µs 467.283µs 467.683µs 467.888µs 468.137µs 0.23% 0.410 0.373 0.07% 0.022µs 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput 2136126.097op/s 2140857.798op/s ± 1445.963op/s 2140942.770op/s ± 907.296op/s 2141821.983op/s 2142910.293op/s 2143700.746op/s 2144805.771op/s 0.18% -0.405 0.369 0.07% 102.245op/s 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 191.179µs 191.749µs ± 0.191µs 191.748µs ± 0.128µs 191.859µs 192.030µs 192.167µs 192.759µs 0.53% 0.679 3.182 0.10% 0.014µs 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput 5187827.282op/s 5215161.570op/s ± 5198.981op/s 5215189.413op/s ± 3484.594op/s 5218927.834op/s 5222871.372op/s 5226089.271op/s 5230701.428op/s 0.30% -0.665 3.122 0.10% 367.623op/s 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 46.811µs 47.141µs ± 0.115µs 47.138µs ± 0.074µs 47.218µs 47.336µs 47.404µs 47.487µs 0.74% 0.111 0.054 0.24% 0.008µs 1 200
normalization/normalize_service/normalize_service/[empty string] throughput 21058260.644op/s 21213074.683op/s ± 51754.535op/s 21214387.747op/s ± 33111.851op/s 21245382.637op/s 21293046.651op/s 21333361.560op/s 21362497.053op/s 0.70% -0.096 0.052 0.24% 3659.598op/s 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 51.468µs 51.666µs ± 0.091µs 51.657µs ± 0.052µs 51.709µs 51.825µs 51.919µs 52.065µs 0.79% 0.988 2.404 0.18% 0.006µs 1 200
normalization/normalize_service/normalize_service/test_ASCII throughput 19206665.300op/s 19354972.833op/s ± 33992.185op/s 19358595.110op/s ± 19487.497op/s 19376893.807op/s 19398378.247op/s 19427710.394op/s 19429566.222op/s 0.37% -0.971 2.344 0.18% 2403.610op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [621.536µs; 621.748µs] or [-0.017%; +0.017%] None None None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput [1608373.536op/s; 1608917.004op/s] or [-0.017%; +0.017%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [467.059µs; 467.146µs] or [-0.009%; +0.009%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput [2140657.402op/s; 2141058.195op/s] or [-0.009%; +0.009%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [191.722µs; 191.775µs] or [-0.014%; +0.014%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput [5214441.041op/s; 5215882.098op/s] or [-0.014%; +0.014%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [47.125µs; 47.157µs] or [-0.034%; +0.034%] None None None
normalization/normalize_service/normalize_service/[empty string] throughput [21205902.002op/s; 21220247.363op/s] or [-0.034%; +0.034%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [51.654µs; 51.679µs] or [-0.024%; +0.024%] None None None
normalization/normalize_service/normalize_service/test_ASCII throughput [19350261.843op/s; 19359683.823op/s] or [-0.024%; +0.024%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching deserializing traces from msgpack to their internal representation execution_time 60.344ms 60.707ms ± 0.197ms 60.676ms ± 0.084ms 60.747ms 61.128ms 61.456ms 61.672ms 1.64% 2.101 5.967 0.32% 0.014ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching deserializing traces from msgpack to their internal representation execution_time [60.680ms; 60.734ms] or [-0.045%; +0.045%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.699µs 2.740µs ± 0.013µs 2.739µs ± 0.007µs 2.747µs 2.769µs 2.773µs 2.777µs 1.38% 0.454 0.930 0.47% 0.001µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.738µs; 2.742µs] or [-0.065%; +0.065%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 137.209µs 137.870µs ± 0.261µs 137.843µs ± 0.124µs 137.978µs 138.266µs 138.702µs 139.017µs 0.85% 1.006 3.481 0.19% 0.018µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [137.834µs; 137.906µs] or [-0.026%; +0.026%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 38.113µs 38.906µs ± 1.279µs 38.305µs ± 0.075µs 38.488µs 41.686µs 41.714µs 41.813µs 9.16% 1.686 0.887 3.28% 0.090µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [38.728µs; 39.083µs] or [-0.455%; +0.455%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 298.278ns 310.104ns ± 13.165ns 305.457ns ± 4.981ns 312.348ns 344.190ns 347.503ns 350.171ns 14.64% 1.636 1.720 4.23% 0.931ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [308.279ns; 311.928ns] or [-0.588%; +0.588%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
concentrator/add_spans_to_concentrator execution_time 9.153ms 9.191ms ± 0.015ms 9.189ms ± 0.010ms 9.199ms 9.215ms 9.226ms 9.276ms 0.94% 0.999 4.156 0.16% 0.001ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
concentrator/add_spans_to_concentrator execution_time [9.189ms; 9.193ms] or [-0.023%; +0.023%] None None None

Group 11

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 299.229µs 304.009µs ± 1.645µs 303.959µs ± 1.121µs 305.149µs 306.475µs 307.376µs 307.536µs 1.18% -0.174 -0.275 0.54% 0.116µs 1 200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput 3251652.622op/s 3289474.887op/s ± 17815.405op/s 3289914.391op/s ± 12093.385op/s 3301472.695op/s 3321469.782op/s 3331962.021op/s 3341920.832op/s 1.58% 0.201 -0.255 0.54% 1259.739op/s 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 28.089µs 28.288µs ± 0.104µs 28.284µs ± 0.054µs 28.336µs 28.444µs 28.731µs 28.893µs 2.15% 1.955 8.413 0.37% 0.007µs 1 200
normalization/normalize_name/normalize_name/bad-name throughput 34610516.871op/s 35351520.445op/s ± 129003.217op/s 35355533.890op/s ± 67691.691op/s 35433198.978op/s 35520213.863op/s 35568833.429op/s 35601185.930op/s 0.69% -1.884 7.969 0.36% 9121.905op/s 1 200
normalization/normalize_name/normalize_name/good execution_time 16.586µs 16.703µs ± 0.053µs 16.705µs ± 0.041µs 16.740µs 16.788µs 16.812µs 16.829µs 0.74% -0.008 -0.758 0.31% 0.004µs 1 200
normalization/normalize_name/normalize_name/good throughput 59419486.849op/s 59870324.532op/s ± 188889.357op/s 59862026.274op/s ± 145730.933op/s 60027158.561op/s 60164550.056op/s 60256907.334op/s 60291408.474op/s 0.72% 0.020 -0.761 0.31% 13356.495op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [303.781µs; 304.237µs] or [-0.075%; +0.075%] None None None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput [3287005.843op/s; 3291943.930op/s] or [-0.075%; +0.075%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [28.273µs; 28.302µs] or [-0.051%; +0.051%] None None None
normalization/normalize_name/normalize_name/bad-name throughput [35333641.840op/s; 35369399.050op/s] or [-0.051%; +0.051%] None None None
normalization/normalize_name/normalize_name/good execution_time [16.696µs; 16.710µs] or [-0.044%; +0.044%] None None None
normalization/normalize_name/normalize_name/good throughput [59844146.284op/s; 59896502.780op/s] or [-0.044%; +0.044%] None None None

Group 12

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz 8074728 1732567225 sanchda/fix_poll_zombies
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 1.403µs 3.261µs ± 1.445µs 3.105µs ± 0.020µs 3.123µs 3.150µs 14.256µs 15.303µs 392.77% 7.629 58.122 44.21% 0.102µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [3.060µs; 3.461µs] or [-6.143%; +6.143%] None None None

Baseline

Omitted due to size.

@sanchda sanchda requested a review from danielsn November 22, 2024 18:48

codecov-commenter commented Nov 22, 2024

Codecov Report

Attention: Patch coverage is 0% with 23 lines in your changes missing coverage. Please review.

Project coverage is 70.47%. Comparing base (bdbbd73) to head (8074728).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #754      +/-   ##
==========================================
- Coverage   70.49%   70.47%   -0.02%     
==========================================
  Files         297      297              
  Lines       43401    43411      +10     
==========================================
  Hits        30595    30595              
- Misses      12806    12816      +10     
Components Coverage Δ
crashtracker 43.48% <0.00%> (-0.14%) ⬇️
crashtracker-ffi 8.41% <ø> (ø)
datadog-alloc 98.73% <ø> (ø)
data-pipeline 89.09% <ø> (ø)
data-pipeline-ffi 0.00% <ø> (ø)
ddcommon 83.46% <ø> (ø)
ddcommon-ffi 69.12% <ø> (ø)
ddtelemetry 59.05% <ø> (ø)
ddtelemetry-ffi 22.13% <ø> (ø)
dogstatsd 89.45% <ø> (ø)
dogstatsd-client 79.77% <ø> (ø)
ipc 82.76% <ø> (ø)
profiling 84.30% <ø> (ø)
profiling-ffi 77.46% <ø> (ø)
serverless 0.00% <ø> (ø)
sidecar 38.01% <ø> (ø)
sidecar-ffi 0.00% <ø> (ø)
spawn-worker 50.36% <ø> (ø)
tinybytes 94.77% <ø> (ø)
trace-mini-agent 72.36% <ø> (ø)
trace-normalization 98.23% <ø> (ø)
trace-obfuscation 95.77% <ø> (ø)
trace-protobuf 77.67% <ø> (ø)
trace-utils 93.29% <ø> (ø)

@sanchda sanchda enabled auto-merge (squash) November 22, 2024 20:27
_ => Err(anyhow::anyhow!("poll returned unexpected result")),
},
let mut poll_fds = [pollfd {
fd: target_fd,

+1. BorrowedFd should preferably be used in conjunction with OwnedFd. borrow_raw, without any guarantee of the FD's lifetime, is problematic.

Probably the safest option would be to dup the fd and own it within the context of this function.

Otherwise the code looks correct, if somewhat "C-ish" Rust :)
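The dup suggestion could look something like the sketch below. It is only an illustration of the reviewer's idea, not code from this PR; the helper name `dup_to_owned` is hypothetical. It relies on the standard library's `BorrowedFd::borrow_raw` and `BorrowedFd::try_clone_to_owned`, so the borrowed view only has to stay valid for the duration of the call, after which the function holds its own descriptor.

```rust
use std::io;
use std::os::fd::{BorrowedFd, OwnedFd, RawFd};

/// Duplicates `fd` into an `OwnedFd` that is closed when dropped,
/// independently of the original descriptor.
///
/// # Safety
/// `fd` must be an open file descriptor for the duration of this call.
unsafe fn dup_to_owned(fd: RawFd) -> io::Result<OwnedFd> {
    if fd < 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "file descriptor must be non-negative",
        ));
    }
    // SAFETY: the caller guarantees `fd` is open while we borrow it.
    let borrowed = unsafe { BorrowedFd::borrow_raw(fd) };
    // Creates a new descriptor referring to the same open file description.
    borrowed.try_clone_to_owned()
}
```

Dropping the returned `OwnedFd` closes only the duplicate; the caller's original descriptor stays open.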


#758 to track

let reaping_allowed_ms = std::cmp::min(
timeout_ms.saturating_sub(start_time.elapsed().as_millis() as u32),
DD_CRASHTRACK_MINIMUM_REAP_TIME_MS,
);

let _ = reap_child_non_blocking(receiver_pid_as_pid, reaping_allowed_ms);

return Err(anyhow::anyhow!("Timeout waiting for child process to exit"));

In sidecar, we send kill and term when the timeout ends.

It looks like we're not doing that here either way, so a nonzero timeout will only reduce the incidence of zombies, not prevent them.
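For reference, the `reaping_allowed_ms` computation quoted in this thread takes whatever is left of the overall timeout and caps it at a constant. A minimal sketch of that arithmetic follows; the function name `remaining_budget_ms` and the `cap_ms` parameter are hypothetical, and the actual value of `DD_CRASHTRACK_MINIMUM_REAP_TIME_MS` is not shown in this conversation. Unlike a plain `as u32` cast, the sketch clamps very large elapsed times instead of truncating them.

```rust
use std::time::Duration;

/// Time left out of `timeout_ms` after `elapsed`, capped at `cap_ms`.
fn remaining_budget_ms(timeout_ms: u32, elapsed: Duration, cap_ms: u32) -> u32 {
    // Clamp before narrowing so huge elapsed values do not wrap around.
    let elapsed_ms = elapsed.as_millis().min(u32::MAX as u128) as u32;
    timeout_ms.saturating_sub(elapsed_ms).min(cap_ms)
}
```

When the timeout has already been used up, the result is 0, which is the case the comment above is concerned with: the reap attempt gets no extra time.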

@pawelchcki pawelchcki left a comment


Approved, because it's an improvement over the previous code. But it looks like some issues with zombies can still show up from time to time.

@sanchda sanchda merged commit 6fe032f into main Nov 25, 2024
32 checks passed
@sanchda sanchda deleted the sanchda/fix_poll_zombies branch November 25, 2024 21:04
_ => Err(anyhow::anyhow!("poll returned unexpected result")),
},
let mut poll_fds = [pollfd {
fd: target_fd,
revents: 0,
}];

match unsafe { poll(poll_fds.as_mut_ptr(), 1, timeout_ms) } {

style: this should use poll_fds.len() rather than the constant 1

revents if revents.contains(PollFlags::POLLHUP) => Ok(true),
_ => Err(anyhow::anyhow!("poll returned unexpected result")),
},
let mut poll_fds = [pollfd {

Add a comment explaining the meaning of the boolean result.
