Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for missing cycles column and more #2588

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

travisdowns
Copy link
Contributor

@travisdowns travisdowns commented Dec 17, 2024

In this series:

Example results for perf_test_perf after the fix:

test                             iterations      median         mad         min         max      allocs       tasks        inst      cycles
perf_tests.test_simple_1          774019546     1.313ns     0.019ns     1.259ns     1.333ns       0.000       0.000         7.0         5.0
perf_tests.test_simple_n         3188188100     0.330ns     0.001ns     0.308ns     0.387ns       0.000       0.000         3.1         1.3
perf_tests.test_ready_async_1    3638567766     0.280ns     0.005ns     0.271ns     0.290ns       0.000       0.000         4.0         1.1
perf_tests.test_ready_async_n    3179866200     0.312ns     0.006ns     0.305ns     0.435ns       0.000       0.000         3.1         1.3
perf_tests.test_unready_async_1    28622990    34.253ns     0.303ns    33.950ns    36.198ns       1.000       2.000       360.1       125.3
perf_tests.test_unready_async_n  1082973200     0.942ns     0.011ns     0.897ns     1.031ns       0.020       0.030         9.3         3.5
fixture.test_fixture_1            781121567     1.303ns     0.006ns     1.282ns     1.309ns       0.000       0.000         7.0         5.0
fixture.test_fixture_n           3096749700     0.330ns     0.005ns     0.318ns     0.363ns       0.000       0.000         3.1         1.3
fixture.test_coro_1                71657240    13.640ns     0.179ns    13.461ns    14.588ns       1.000       0.000       150.1        49.2
fixture.test_coro_n              2214644700     0.475ns     0.019ns     0.453ns     0.722ns       0.010       0.000         4.6         1.9

@travisdowns travisdowns changed the title Td upstream 2587 cycles fix Fix for missing cycles column and more Dec 17, 2024
#include "seastar/coroutine/maybe_yield.hh"
#include "seastar/util/later.hh"
#include <seastar/core/coroutine.hh>
#include <seastar/testing/perf_tests.hh>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<> includes please.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov could we enforce it? By using -isystem include instead of -Iinclude, though it would take me several weeks to figure out how to do it with cmake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it wouldn't help.

Copy link
Contributor Author

@travisdowns travisdowns Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry this is how clangd auto-adds headers, let me see if I can configure it to understand the seastar way

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in push

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is not currently configurable though there has been a fair amount of effort to make it configurable, see:

clangd/clangd#1247
llvm/llvm-project#67749

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using -isystem include

FWIW using -isystem would actually fix clangd auto-add behavior to use <> instead. However, I believe it comes with downsides, e.g., some/all warnings are not reported in -isystem headers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tchaikov could we enforce it? By using -isystem include instead of -Iinclude, though it would take me several weeks to figure out how to do it with cmake.

created #2592

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that not reporting errors would make it worse.

Copy link
Contributor Author

@travisdowns travisdowns Dec 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After poking the LLVM bear it looks like this will be available in a future clangd version: llvm/llvm-project#67749 (comment)

/*
* This file is open source software, licensed to you under the terms
* of the Apache License, Version 2.0 (the "License"). See the NOTICE file
* distributed with this work for additional information regarding copyright
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are benchmarks which just test minimal version of the various
PERF_TEST flavors, with mostly empty test bodies. It is intended to
test the overhead of the test framework, so that changes can be made
without any large regressions. There are a lot of cases because
internally the main perf test loop has different flavors depending
on async or not, returning an iteration count or not, etc.

Please enrich the changelog with sample output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, sample out here (from after the coroutine change):

#2588 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you forgot to add it to the cover letter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been added it to the cover letter now.

@travisdowns travisdowns force-pushed the td-upstream-2587-cycles-fix branch from bf6880e to 759efe1 Compare December 18, 2024 20:34
@travisdowns
Copy link
Contributor Author

Sample output of perf_tests_perf:

test iterations median mad min max allocs tasks inst
perf_tests.test_simple_1 774019546 1.313ns 0.019ns 1.259ns 1.333ns 0.000 0.000 7.0
perf_tests.test_simple_n 3188188100 0.330ns 0.001ns 0.308ns 0.387ns 0.000 0.000 3.1
perf_tests.test_ready_async_1 3638567766 0.280ns 0.005ns 0.271ns 0.290ns 0.000 0.000 4.0
perf_tests.test_ready_async_n 3179866200 0.312ns 0.006ns 0.305ns 0.435ns 0.000 0.000 3.1
perf_tests.test_unready_async_1 28622990 34.253ns 0.303ns 33.950ns 36.198ns 1.000 2.000 360.1
perf_tests.test_unready_async_n 1082973200 0.942ns 0.011ns 0.897ns 1.031ns 0.020 0.030 9.3
fixture.test_fixture_1 781121567 1.303ns 0.006ns 1.282ns 1.309ns 0.000 0.000 7.0
fixture.test_fixture_n 3096749700 0.330ns 0.005ns 0.318ns 0.363ns 0.000 0.000 3.1
fixture.test_coro_1 71657240 13.640ns 0.179ns 13.461ns 14.588ns 1.000 0.000 150.1
fixture.test_coro_n 2214644700 0.475ns 0.019ns 0.453ns 0.722ns 0.010 0.000 4.6

These are benchmarks which just test minimal version of the various
PERF_TEST flavors, with mostly empty test bodies. It is intended to
test the overhead of the test framework, so that changes can be made
without any large regressions. There are a lot of cases because
internally the main perf test loop has different flavors depending
on async or not, returning an iteration count or not, etc.
Simplify and coroutinize main loop. This reduces the duplicated portion
in the main loop which handles the 4 compile-time cases of

{void returning, size_t returning} x {sync, async}

test case methods. We duplicate only as much as need to keep the main
loop efficient but try to share the remainder. We can also remove a lot
of boilerplate which is no longer needed in C++.

This fixes scylladb#2587 because the underlying bug there was a change in
one of the duplicated sections but not the other.

Performance should be similar after this change: any benchmark that does
substantial work won't see a change. Very short benchmarks are
sensitive to the work in the main loop and this is slightly better in
some cases and slightly worse in others. This was evaluated with
perf_tests_perf benchmark in this series.

We delegate the choice of whether to yield in the benchmark not to the
code under test (in the future returning case).

Fixes scylladb#2587.
@travisdowns travisdowns force-pushed the td-upstream-2587-cycles-fix branch from 759efe1 to ff20df3 Compare December 18, 2024 21:04
@travisdowns
Copy link
Contributor Author

Actually the output above shows the cycles column is missing from the .md output! Let me fix that.

Here's the non-md output corresponding to the above:

test                             iterations      median         mad         min         max      allocs       tasks        inst      cycles
perf_tests.test_simple_1          774019546     1.313ns     0.019ns     1.259ns     1.333ns       0.000       0.000         7.0         5.0
perf_tests.test_simple_n         3188188100     0.330ns     0.001ns     0.308ns     0.387ns       0.000       0.000         3.1         1.3
perf_tests.test_ready_async_1    3638567766     0.280ns     0.005ns     0.271ns     0.290ns       0.000       0.000         4.0         1.1
perf_tests.test_ready_async_n    3179866200     0.312ns     0.006ns     0.305ns     0.435ns       0.000       0.000         3.1         1.3
perf_tests.test_unready_async_1    28622990    34.253ns     0.303ns    33.950ns    36.198ns       1.000       2.000       360.1       125.3
perf_tests.test_unready_async_n  1082973200     0.942ns     0.011ns     0.897ns     1.031ns       0.020       0.030         9.3         3.5
fixture.test_fixture_1            781121567     1.303ns     0.006ns     1.282ns     1.309ns       0.000       0.000         7.0         5.0
fixture.test_fixture_n           3096749700     0.330ns     0.005ns     0.318ns     0.363ns       0.000       0.000         3.1         1.3
fixture.test_coro_1                71657240    13.640ns     0.179ns    13.461ns    14.588ns       1.000       0.000       150.1        49.2
fixture.test_coro_n              2214644700     0.475ns     0.019ns     0.453ns     0.722ns       0.010       0.000         4.6         1.9

@travisdowns
Copy link
Contributor Author

The unrelated fix for the entirely missing cycles column (not simply zero) in perf tests is over there: #2590

@travisdowns
Copy link
Contributor Author

Not sure who is waiting for who here, but I believe I have addressed all feedback. Can you take a look @tchaikov ?

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@tchaikov
Copy link
Contributor

Not sure who is waiting for who here, but I believe I have addressed all feedback. Can you take a look @tchaikov ?

sure. but we still need the blessings from the seastar maintainers.

@tchaikov
Copy link
Contributor

@scylladb/seastar-maint hello maintainers, could you help review this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

async PERF_TEST always returns 0 cycles
3 participants