test(IDX): don't run tests that take longer than 5 mins on PRs (#2017)

What === This PR proposes to forbid running tests in ci-main/bazel-test-all, that take longer than 5 minutes on PRs and instead only run them on pushes to master or when manually triggered via workflow_dispatch. Why === IDX has the ambitious goal of having a 20 minute 90th percentile of the duration of CI Main on PRs. See the top charts on the [GitHub IC Stats & SLO](https://superset.idx.dfinity.network/superset/dashboard/3/) dashboard for the current statistics (at the time of writing we're at around 31 minutes). Currently the dominating actions on critical paths on PRs are taken up by long running tests. To not suffer from these long running tests we propose to institute a rule that forbids tests taking longer than 5 minutes on PRs. Instead these tests have to move to a new workflow that runs these tests on a push to master. How === * The chart [Bazel Test All / P90 Test Duration / From Last Week Onwards](https://superset.idx.dfinity.network/explore/?slice_id=48) shows the 90th percentile of the duration of bazel tests on PRs (bazel-test-all) from last week till now. * All tests in that chart that take longer than 5 minutes have been tagged with `long_test`. * Why 5 minutes? On average system-tests on PRs take 3.6 minutes. 5 minutes seems like a nice round number. * IDX will keep an eye on that chart and will periodically move offending tests to `long_test`. * `ci-main/bazel-test-all` sets `--test_tag_filters=-long_test` on `pull_request` and `merge_group` events to prevent running these long tests on PRs. * Note that `ci-main/bazel-test-all` also runs for every push to master. It will then run the long_tests so in case there's a regression it's easy to determine which commit introduced it. Future Work === * We could consider moving the system-tests tagged with `system_test_hourly` to `long_test` as well and remove the `bazel-system-test-hourly` job from the `schedule-hourly` workflow to simplify CI. * IDX will likely introduce a Superset chart that shows for every test failure on master the first commit where the failure occurred. This will help us quickly finding the commit that introduced a regression. --------- Co-authored-by: IDX GitHub Automation <IDX GitHub Automation>
dfinity · Oct 18, 2024 · a25a338 · a25a338
1 parent da6ea99
commit a25a338
Show file tree

Hide file tree

Showing 19 changed files with 162 additions and 18 deletions.
diff --git a/.github/workflows-source/ci-main.yml b/.github/workflows-source/ci-main.yml
@@ -112,16 +112,32 @@ jobs:
         shell: bash
         run: |
           set -xeuo pipefail
+
+          # Determine which tests to skip and append 'long_test' for pull requests or merge groups
+          EXCLUDED_TEST_TAGS=(system_test_hourly system_test_nightly system_test_nightly_nns system_test_staging system_test_hotfix system_test_benchmark fuzz_test fi_tests_nightly nns_tests_nightly)
+          [[ "${{ github.event_name }}" =~ ^(pull_request|merge_group)$ ]] && EXCLUDED_TEST_TAGS+=(long_test)
+
+          # Export excluded tags as environment variable for ci/bazel-scripts/diff.sh
+          echo "EXCLUDED_TEST_TAGS=${EXCLUDED_TEST_TAGS[*]}" >> $GITHUB_ENV
+
+          # Prepend tags with '-' and join them with commas for Bazel
+          TEST_TAG_FILTERS=$(IFS=,; echo "${EXCLUDED_TEST_TAGS[*]/#/-}")
+
+          # Determine BAZEL_EXTRA_ARGS based on event type or branch name
+          BAZEL_EXTRA_ARGS="--test_tag_filters=$TEST_TAG_FILTERS"
           if [[ "${{ github.event_name }}" == 'merge_group' ]]; then
-            echo "BAZEL_EXTRA_ARGS=--test_timeout_filters=short,moderate --flaky_test_attempts=3" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --test_timeout_filters=short,moderate --flaky_test_attempts=3"
           elif [[ $BRANCH_NAME =~ ^hotfix-.* ]]; then
-            echo "BAZEL_EXTRA_ARGS=--test_timeout_filters=short,moderate" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --test_timeout_filters=short,moderate"
           else
-            echo "BAZEL_EXTRA_ARGS=--keep_going" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --keep_going"
           fi
+
+          # Export BAZEL_EXTRA_ARGS to environment
+          echo "BAZEL_EXTRA_ARGS=$BAZEL_EXTRA_ARGS" >> $GITHUB_ENV
       - name: Run Bazel Test All
         id: bazel-test-all
-        uses:  ./.github/actions/bazel-test-all/
+        uses: ./.github/actions/bazel-test-all/
         env:
           AWS_SHARED_CREDENTIALS_CONTENT: ${{ secrets.AWS_SHARED_CREDENTIALS_FILE }}
           # Only run ci/bazel-scripts/diff.sh on PRs that are not labeled with "CI_ALL_BAZEL_TARGETS".

diff --git a/.github/workflows/ci-main.yml b/.github/workflows/ci-main.yml
@@ -58,13 +58,29 @@ jobs:
         shell: bash
         run: |
           set -xeuo pipefail
+
+          # Determine which tests to skip and append 'long_test' for pull requests or merge groups
+          EXCLUDED_TEST_TAGS=(system_test_hourly system_test_nightly system_test_nightly_nns system_test_staging system_test_hotfix system_test_benchmark fuzz_test fi_tests_nightly nns_tests_nightly)
+          [[ "${{ github.event_name }}" =~ ^(pull_request|merge_group)$ ]] && EXCLUDED_TEST_TAGS+=(long_test)
+
+          # Export excluded tags as environment variable for ci/bazel-scripts/diff.sh
+          echo "EXCLUDED_TEST_TAGS=${EXCLUDED_TEST_TAGS[*]}" >> $GITHUB_ENV
+
+          # Prepend tags with '-' and join them with commas for Bazel
+          TEST_TAG_FILTERS=$(IFS=,; echo "${EXCLUDED_TEST_TAGS[*]/#/-}")
+
+          # Determine BAZEL_EXTRA_ARGS based on event type or branch name
+          BAZEL_EXTRA_ARGS="--test_tag_filters=$TEST_TAG_FILTERS"
           if [[ "${{ github.event_name }}" == 'merge_group' ]]; then
-            echo "BAZEL_EXTRA_ARGS=--test_timeout_filters=short,moderate --flaky_test_attempts=3" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --test_timeout_filters=short,moderate --flaky_test_attempts=3"
           elif [[ $BRANCH_NAME =~ ^hotfix-.* ]]; then
-            echo "BAZEL_EXTRA_ARGS=--test_timeout_filters=short,moderate" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --test_timeout_filters=short,moderate"
           else
-            echo "BAZEL_EXTRA_ARGS=--keep_going" >> $GITHUB_ENV
+              BAZEL_EXTRA_ARGS+=" --keep_going"
           fi
+
+          # Export BAZEL_EXTRA_ARGS to environment
+          echo "BAZEL_EXTRA_ARGS=$BAZEL_EXTRA_ARGS" >> $GITHUB_ENV
       - name: Run Bazel Test All
         id: bazel-test-all
         uses: ./.github/actions/bazel-test-all/

diff --git a/bazel/conf/.bazelrc.build b/bazel/conf/.bazelrc.build
@@ -54,12 +54,11 @@ build --flag_alias=timeout_value=//bazel:timeout_value
 # Exclude system tests by default
 # https://github.com/bazelbuild/bazel/issues/8439
 build --build_tag_filters="-system_test,-upload,-fuzz_test"
-test --test_tag_filters="-system_test,-post_master,-fuzz_test"
+test --test_tag_filters="-system_test,-fuzz_test"
 test:alltests --test_tag_filters=""
 test:paritytests --test_tag_filters="-system_test"
 build:ci --build_tag_filters="-system_test,-fuzz_test"
 build:ci --verbose_failures
-test:ci --test_tag_filters="-post_master,-system_test_hourly,-system_test_nightly,-system_test_nightly_nns,-system_test_staging,-system_test_hotfix,-system_test_benchmark,-fuzz_test,-fi_tests_nightly,-nns_tests_nightly"
 
 test --test_output=errors
 test --test_env=RUST_BACKTRACE=full

diff --git a/ci/bazel-scripts/diff.sh b/ci/bazel-scripts/diff.sh
@@ -55,7 +55,12 @@ fi
 if [ "${BAZEL_COMMAND:-}" == "build" ]; then
     TARGETS=$(bazel query "rdeps(//..., set(${files[*]}))")
 elif [ "${BAZEL_COMMAND:-}" == "test" ]; then
-    TARGETS=$(bazel query "kind(test, rdeps(//..., set(${files[*]}))) except attr('tags', 'manual|system_test_hourly|system_test_nightly|system_test_staging|system_test_hotfix|system_test_nightly_nns', //...)")
+    EXCLUDED_TAGS=(manual $EXCLUDED_TEST_TAGS)
+    EXCLUDED_TAGS=$(
+        IFS='|'
+        echo "${EXCLUDED_TAGS[*]}"
+    )
+    TARGETS=$(bazel query "kind(test, rdeps(//..., set(${files[*]}))) except attr('tags', '$EXCLUDED_TAGS', //...)")
 else
     echo "Unknown BAZEL_COMMAND: ${BAZEL_COMMAND:-}" >&2
     exit 1

diff --git a/rs/ethereum/ledger-suite-orchestrator/BUILD.bazel b/rs/ethereum/ledger-suite-orchestrator/BUILD.bazel
@@ -161,6 +161,9 @@ rust_ic_test(
         "LEDGER_ARCHIVE_NODE_CANISTER_WASM_PATH": "$(rootpath //rs/ledger_suite/icrc1/archive:archive_canister_u256.wasm.gz)",
     },
     proc_macro_deps = [],
+    tags = [
+        "long_test",  # since it takes longer than 5 minutes.
+    ],
     deps = [
         # Keep sorted.
         ":ledger_suite_orchestrator",

diff --git a/rs/tests/boundary_nodes/BUILD.bazel b/rs/tests/boundary_nodes/BUILD.bazel
@@ -19,10 +19,12 @@ system_test_nns(
         "HTTP_COUNTER_WASM_PATH": "$(rootpath //rs/tests/test_canisters/http_counter)",
         "ASSET_CANISTER_WASM_PATH": "$(rootpath @asset_canister//file)",
     },
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = BOUNDARY_NODE_GUESTOS_RUNTIME_DEPS + GUESTOS_RUNTIME_DEPS + TEST_CANISTERS_RUNTIME_DEPS + GRAFANA_RUNTIME_DEPS,
@@ -196,10 +198,12 @@ system_test_nns(
 
 system_test_nns(
     name = "api_bn_decentralization_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = BOUNDARY_NODE_GUESTOS_RUNTIME_DEPS + GUESTOS_RUNTIME_DEPS + GRAFANA_RUNTIME_DEPS,

diff --git a/rs/tests/consensus/BUILD.bazel b/rs/tests/consensus/BUILD.bazel
@@ -199,11 +199,13 @@ system_test_nns(
 
 system_test_nns(
     name = "subnet_splitting_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_splitting",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
@@ -345,10 +347,12 @@ system_test_nns(
 
 system_test_nns(
     name = "adding_nodes_to_subnet_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,

diff --git a/rs/tests/consensus/backup/BUILD.bazel b/rs/tests/consensus/backup/BUILD.bazel
@@ -31,11 +31,12 @@ BACKUP_RUNTIME_DEPS = ["//rs/tests:backup/binaries"]
 
 system_test_nns(
     name = "backup_manager_downgrade_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
-        "system_test_hourly",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     uses_guestos_dev_test = True,
@@ -54,11 +55,12 @@ system_test_nns(
 
 system_test_nns(
     name = "backup_manager_upgrade_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
-        "system_test_hourly",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     uses_guestos_dev_test = True,

diff --git a/rs/tests/consensus/orchestrator/BUILD.bazel b/rs/tests/consensus/orchestrator/BUILD.bazel
@@ -5,10 +5,12 @@ package(default_visibility = ["//rs:system-tests-pkg"])
 
 system_test_nns(
     name = "node_assign_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -31,10 +33,12 @@ system_test_nns(
 
 system_test_nns(
     name = "node_reassignment_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -96,10 +100,12 @@ system_test(
 
 system_test_nns(
     name = "ssh_access_to_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -138,10 +144,12 @@ system_test_nns(
 
 system_test_nns(
     name = "rotate_ecdsa_idkg_key_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,

diff --git a/rs/tests/consensus/subnet_recovery/BUILD.bazel b/rs/tests/consensus/subnet_recovery/BUILD.bazel
@@ -47,11 +47,13 @@ rust_library(
 
 system_test_nns(
     name = "sr_app_same_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_recovery",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
@@ -106,11 +108,13 @@ system_test_nns(
 
 system_test_nns(
     name = "sr_app_failover_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_recovery",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
@@ -167,11 +171,13 @@ system_test_nns(
 
 system_test_nns(
     name = "sr_app_no_upgrade_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_recovery",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
@@ -250,11 +256,13 @@ system_test_nns(
 
 system_test_nns(
     name = "sr_nns_same_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_recovery",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
@@ -274,11 +282,13 @@ system_test_nns(
 
 system_test_nns(
     name = "sr_nns_failover_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = False,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "experimental_system_test_colocation",
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
         "subnet_recovery",
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS

diff --git a/rs/tests/consensus/tecdsa/BUILD.bazel b/rs/tests/consensus/tecdsa/BUILD.bazel
@@ -31,10 +31,12 @@ system_test_nns(
 
 system_test_nns(
     name = "tecdsa_add_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -58,10 +60,12 @@ system_test_nns(
 
 system_test_nns(
     name = "tschnorr_message_sizes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -81,10 +85,12 @@ system_test_nns(
 
 system_test_nns(
     name = "tecdsa_remove_nodes_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -224,11 +230,13 @@ system_test_nns(
 
 system_test_nns(
     name = "tecdsa_signature_life_cycle_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     # Remove when CON-937 is resolved
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,
@@ -273,10 +281,12 @@ system_test_nns(
 
 system_test_nns(
     name = "tecdsa_signature_timeout_test",
+    extra_head_nns_tags = [],  # don't run the head_nns variant on nightly since it aleady runs on long_test.
     flaky = True,
     proc_macro_deps = MACRO_DEPENDENCIES,
     tags = [
         "k8s",
+        "long_test",  # since it takes longer than 5 minutes.
     ],
     target_compatible_with = ["@platforms//os:linux"],  # requires libssh that does not build on Mac OS
     runtime_deps = GUESTOS_RUNTIME_DEPS,