# V8 Regression with Node.js 22 #166
A lot of these issues appear to have to do with relatively cheap functions. Maybe there is an additional overhead?
+1, something must be going wrong with the test harness. For example, one test …
Theory: before Node v22, V8 applied dead-code elimination and reduced simple test functions to no-ops, which artificially inflated performance numbers. Take this test:

```js
suite
  .add('using Array.includes', function () {
    const httpVersion = '1.1'
    const exists = ['2.0', '1.0', '1.1'].includes(httpVersion)
  })
```

This test is simple and doesn't do much, but it does do something. Yet on v21 it reports the same numbers as:

```js
suite
  .add('do nothing', function () { })
```

Which makes me think that V8 optimizes that test into a no-op. If I modify the test suite slightly, like so:

```js
const suite = createBenchmarkSuite('Array.includes vs raw comparison')

let exists // add a global variable so V8 does not do dead-code elimination on our test logic

suite
  .add('using Array.includes', function () {
    const httpVersion = '1.1'
    exists = ['2.0', '1.0', '1.1'].includes(httpVersion)
  })
  .add('using Array.includes (first item)', function () {
    const httpVersion = '2.0'
    exists = ['2.0', '1.0', '1.1'].includes(httpVersion)
  })
```

then the numbers change. Perhaps in V8 12 some optimization logic changed, and those simple tests are no longer reduced to no-ops by V8 but actually have to execute code. If true, it's not that the Node v22.1 numbers got slower; it's that the baseline from v21 is wrong.
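(As an aside, the same check can be reproduced outside the harness. A minimal sketch, where `bench` and `sink` are names made up for illustration rather than the suite's API; printing `sink` at the end keeps the work observably live:)

```js
// Sketch: keep the benchmarked result observable so V8's dead-code
// elimination cannot remove the work under test.
let sink = 0

function bench (fn, iterations) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < iterations; i++) {
    sink += fn() ? 1 : 0 // consume the return value on every iteration
  }
  const end = process.hrtime.bigint()
  return Number(end - start) / iterations // ns per op
}

const nsPerOp = bench(() => ['2.0', '1.0', '1.1'].includes('1.1'), 1e7)
console.log(`~${nsPerOp.toFixed(2)} ns/op (sink=${sink})`) // printing sink keeps it live
```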
It makes a lot of sense.
Yeah, could be it; looks to me like a reality check for micro-benchmarks 😄
But even with your suggestion, Node 22 seems slower overall in the benchmarks (macOS). Example:

Node 20:

## startsWith comparison
|name|ops/sec|samples|
|-|-|-|
|(short string) (true) String#startsWith|155,325,867|99|
|(short string) (true) String#slice and strict comparison|108,541,352|95|
|(long string) (true) String#startsWith|107,012,984|101|
|(long string) (true) String#slice and strict comparison|97,092,528|96|
|(short string) (false) String#startsWith|155,958,923|97|
|(short string) (false) String#slice and strict comparison|106,668,930|99|
|(long string) (false) String#startsWith|152,774,020|96|
|(long string) (false) String#slice and strict comparison|96,296,096|100|

Node 22:

## startsWith comparison
|name|ops/sec|samples|
|-|-|-|
|(short string) (true) String#startsWith|109,696,453|97|
|(short string) (true) String#slice and strict comparison|78,338,981|91|
|(long string) (true) String#startsWith|71,833,592|91|
|(long string) (true) String#slice and strict comparison|65,265,958|91|
|(short string) (false) String#startsWith|109,524,264|98|
|(short string) (false) String#slice and strict comparison|77,999,196|91|
|(long string) (false) String#startsWith|105,947,492|90|
|(long string) (false) String#slice and strict comparison|65,677,907|94|

Benchmark (modified from nodejs-bench-operations):

```js
const { createBenchmarkSuite } = require('../common')

const suite = createBenchmarkSuite('startsWith comparison')

const shortString = 'foobar'
const longString = 'foobar'.repeat(100)
const comparison = 'foo'
const comparison2 = 'bar'
let result = false

suite
  .add('(short string) (true) String#startsWith', function () {
    result = shortString.startsWith(comparison)
  })
  .add('(short string) (true) String#slice and strict comparison', function () {
    result = shortString.slice(0, comparison.length) === comparison
  })
  .add('(long string) (true) String#startsWith', function () {
    result = longString.startsWith(comparison)
  })
  .add('(long string) (true) String#slice and strict comparison', function () {
    result = longString.slice(0, comparison.length) === comparison
  })
  .add('(short string) (false) String#startsWith', function () {
    result = shortString.startsWith(comparison2)
  })
  .add('(short string) (false) String#slice and strict comparison', function () {
    result = shortString.slice(0, comparison2.length) === comparison2
  })
  .add('(long string) (false) String#startsWith', function () {
    result = longString.startsWith(comparison2)
  })
  .add('(long string) (false) String#slice and strict comparison', function () {
    result = longString.slice(0, comparison2.length) === comparison2
  })
  .run({ async: false })
```
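For reference, the first row of each table works out to roughly a 30% drop; a quick calculation with the numbers copied from above:

```js
// '(short string) (true) String#startsWith': Node 20 vs Node 22 ops/sec.
const node20 = 155_325_867
const node22 = 109_696_453
const dropPct = ((node20 - node22) / node20) * 100
console.log(`${dropPct.toFixed(1)}% fewer ops/sec on Node 22`) // ~29.4%
```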
Yes, and for these micro-benchmarks I also saw a ~30% drop. But with a plain loop:

```js
const longString = 'foobar'.repeat(100)
let result = false
const comparison = 'foo'

console.time('run')
for (let i = 0; i < 100_000_000; i++) {
  result = longString.startsWith(comparison)
}
console.timeEnd('run')
```

the results were ~identical between v20 and v22. This ignores any V8 optimizations for functions and directly checks this operation's performance. Something changed in how v22 runs the test harness, and that obscures the performance of the actual code under test.
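One way to tease apart "the operation got slower" from "per-call overhead got slower" is to time the same body behind a function call, roughly mimicking how the harness invokes each case (a sketch; V8 may well inline `testCase`, so treat any difference as suggestive):

```js
const longString = 'foobar'.repeat(100)
const comparison = 'foo'
let result = false

function testCase () {
  result = longString.startsWith(comparison)
}

console.time('direct loop')
for (let i = 0; i < 100_000_000; i++) {
  result = longString.startsWith(comparison)
}
console.timeEnd('direct loop')

console.time('through a call')
for (let i = 0; i < 100_000_000; i++) {
  testCase() // same work, but routed through a function call per iteration
}
console.timeEnd('through a call')
```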
Have you checked V8 optimization (maybe with …)?

I also suspect this is due to how dead-code elimination is put in place with this new compiler. Let me clarify a few things. However, if we include the operations, or somehow disable dead-code elimination (for instance, with %DoNotOptimize), the microbenchmark won't give us any insight, as it's unlikely that operation Y will behave the same way in production (where it's likely to be optimized). It would be pretty much the same as measuring FLOPS or CPU instructions. My purpose in bringing up this issue is to suggest that we investigate focusing our efforts on more realistic benchmarks; I have another repository for this, which I will make public soon. Something has changed; now we should understand whether that's indeed a regression (caused by not flagging some operations as "optimizable") or not.
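For anyone who wants to inspect the tier directly, here is a hedged sketch using V8 natives (run with `node --allow-natives-syntax`; the `%`-intrinsics below are real, but the optimization-status bitfield layout varies between V8 versions, so the bit tested is only indicative):

```js
// Run with: node --allow-natives-syntax check-opt.js
function isIncluded (httpVersion) {
  return ['2.0', '1.0', '1.1'].includes(httpVersion)
}

%PrepareFunctionForOptimization(isIncluded)
isIncluded('1.1')                       // gather type feedback
%OptimizeFunctionOnNextCall(isIncluded)
isIncluded('1.1')                       // trigger tier-up

const status = %GetOptimizationStatus(isIncluded)
console.log('status bits:', status.toString(2))
// Bit 4 has meant "currently optimized" in recent V8 builds (version-dependent).
console.log('optimized?', (status & (1 << 4)) !== 0)
```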
Please also consider the environment used for each benchmark (Ubuntu 22.04, x64, 4 vCPUs, 16 GB; m6a.xlarge). We are discussing some approaches for a more consistent and "close to reality" setup in RafaelGSS/nodejs-bench-operations#74.
Yes, I did. I also tried various combinations of that, plus …
Could be due to Turboshaft, introduced fully in V8 12.0; maybe it doesn't do dead-code elimination the same way.

> **Turboshaft: new architecture for the top-tier optimizing compiler**
>
> Maglev wasn't our only investment in improved compiler technology. We've also introduced Turboshaft, a new internal architecture for our top-tier optimizing compiler Turbofan, making it both easier to extend with new optimizations and faster at compiling. Since Chrome 120, the CPU-agnostic backend phases all use Turboshaft rather than Turbofan, and compile about twice as fast as before. This is saving energy and is paving the way for more exciting performance gains next year and beyond. Keep an eye out for updates!
So, I have investigated it a bit and it's unlikely to be a regression, but a different benchmark approach is required. Before Maglev, the benchmarks were optimized directly to Turbofan during the benchmark clock: execution went straight from the Ignition interpreter to Turbofan, so the samples were collected including portions of interpreted-code measurements + Turbofan-code measurements. Now, with Maglev in the middle, a few more operations are executed during the benchmark clock measurement.

Therefore, I'm adjusting the … Hence, for now, disregard these results. Results obtained with …
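To illustrate why mixing tiers inside the clock matters, a sketch that warms a function up before measuring (iteration counts are arbitrary, and this is not the harness's actual adjustment):

```js
const longString = 'foobar'.repeat(100)
let result = false

function testCase () {
  result = longString.startsWith('foo')
}

// Warm-up: give V8 time to tier up (Ignition -> Maglev -> Turbofan in Node 22's V8)
// before the clock starts, so samples come from steady-state code.
for (let i = 0; i < 1_000_000; i++) testCase()

console.time('measured')
for (let i = 0; i < 100_000_000; i++) testCase()
console.timeEnd('measured')
```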
The nodejs-bench-operations suite identified regressions between Node.js 21.7.3 and Node.js 22.1.0:

https://github.com/RafaelGSS/nodejs-bench-operations/actions/runs/8946165774/job/24576348629#step:5:56

Source: …