Add SDXL regression test on mi308 #19747
Conversation
IanNod commented Jan 21, 2025
- Adds a mi308 runner for regression testing and specifies the SKU for tuned specs
- Adds a new unet IR for SDXL with a 960x1024 image shape
- Adds a new tuning spec for the new unet IR on the mi308 SKU
(seeing all the force pushes for "typo fixes" to get tests working) When iterating on a single workflow job, you can put a git trailer in the PR description to run a specific selection of jobs: https://iree.dev/developers/general/contributing/#ci-behavior-manipulation. That reduces the load on unaffected runners and can save a respectable sum from the managed runner time that we pay for.

The PkgCI workflows are also designed to be run without needing to rebuild the packages at all, but that takes some workflow_dispatch triggering and possibly workflow surgery. If you expect to be debugging for a longer duration, I can help with that.

Do you have access to the runner type you are adding these tests for? Or a similar machine? I would start by getting the tests working locally before pushing to GitHub if possible. If tests can only be iterated on via GitHub Actions, that is quite unhealthy for development workflows and should be improved.
@@ -165,3 +173,32 @@ jobs:
        --timeout=600 \
        --retries 7
      echo "$(<job_summary.md )" >> $GITHUB_STEP_SUMMARY
      # Note: allowing 10% deviation from observed averages here to account for
      # different runner conditions.
      - name: "Running SDXL rocm pipeline benchmark (mi300)"
change name here for mi308
LGTM thanks
A few notes but looks good enough to me.
Can you also edit the PR title to match local project style (https://github.com/iree-org/iree/commits/main/)? Notably start with an uppercase letter and include a verb, like "Add SDXL regression test on mi308"
      source ${VENV_DIR}/bin/activate
      pytest ./experimental/benchmarks/sdxl/benchmark_sdxl_rocm.py \
        --goldentime-tolerance-multiplier 1.1 \
        --goldentime-rocm-e2e-ms 797.0 \
Nit: I'd round to 800 (I'm skeptical we can get significant figures within 1ms at this scale, 5ms maybe).
Suggested change:
-        --goldentime-rocm-e2e-ms 797.0 \
+        --goldentime-rocm-e2e-ms 800.0 \
        --goldentime-rocm-punet-int8-fp16-ms 138.0 \
        --goldentime-rocm-punet-int8-fp8-ms 147 \
more nits about precision in benchmark expectations, but you've probably looked at the times closer than me 😅
Suggested change:
-        --goldentime-rocm-punet-int8-fp16-ms 138.0 \
-        --goldentime-rocm-punet-int8-fp8-ms 147 \
+        --goldentime-rocm-punet-int8-fp16-ms 140.0 \
+        --goldentime-rocm-punet-int8-fp8-ms 150.0 \
(These tests are good at guarding against large regressions, but they will flake frequently if the tolerances are too tight and the runners experience different conditions)
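For context on how these flags interact, here is a minimal sketch of a golden-time check with a tolerance multiplier; this illustrates the idea only, not the actual logic in benchmark_sdxl_rocm.py, and the function and parameter names are hypothetical:

```python
# Hypothetical sketch of a golden-time regression check; the function and
# parameter names are illustrative, not taken from the real test suite.
def check_golden_time(
    observed_ms: float, golden_ms: float, tolerance_multiplier: float = 1.1
) -> None:
    """Fail if the observed time exceeds the golden time by more than the
    allowed tolerance (e.g. 1.1 allows a 10% regression before failing)."""
    limit_ms = golden_ms * tolerance_multiplier
    assert observed_ms <= limit_ms, (
        f"Benchmark regressed: {observed_ms:.1f}ms > {limit_ms:.1f}ms "
        f"({golden_ms:.1f}ms golden x {tolerance_multiplier} tolerance)"
    )

# With the numbers suggested above: a golden e2e time of 800.0ms and a 1.1x
# multiplier tolerates observed times up to 880.0ms.
check_golden_time(observed_ms=810.0, golden_ms=800.0)
```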
Oh no, another 200+ line spec file 😨. How does this one differ from the mi300 version?
The differences are mainly that the shapes used for the 960x1024 IR don't match, and the MI308 tunings are different from MI300. It seems we will ideally need a similar spec file for every SKU if we want the best performance after tuning.
sdxl_unet_fp16_960_1024_inference_input_0 = fetch_source_fixture(
    "https://sharkpublic.blob.core.windows.net/sharkpublic/ian/unet_npys/input1.npy",
Nit, not blocking: common infra should really be pointing at a common file path with some versioning built in (like a date) and some developer documentation explaining how to manage the files. Otherwise anyone wanting to modify this in the future will need to ask you personally what to do 😛
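To make the suggestion concrete, here is a sketch of what a versioned common path could look like; the directory layout, version tag, and fixture_url helper are all hypothetical, and fetch_source_fixture is assumed to accept a URL as in the snippet above:

```python
# Sketch of the versioned-common-path idea; the directory layout, the
# FIXTURE_VERSION tag, and the fixture_url() helper are all hypothetical.
FIXTURE_BASE = "https://sharkpublic.blob.core.windows.net/sharkpublic"
FIXTURE_VERSION = "2025_01_21"  # date-based version tag, bumped when files change

def fixture_url(name: str) -> str:
    """Build a URL under a shared, versioned prefix instead of a personal dir."""
    return f"{FIXTURE_BASE}/sdxl_fixtures/{FIXTURE_VERSION}/{name}"

# Usage mirroring the snippet above (assuming fetch_source_fixture takes a URL):
# sdxl_unet_fp16_960_1024_inference_input_0 = fetch_source_fixture(
#     fixture_url("unet_960_1024_input1.npy")
# )
```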
if os.path.isfile(
    f"{iree_test_path_extension}/attention_and_matmul_spec_unet_fp16_{sku}.mlir"
):
This seems brittle. Is this checking whether the test runner has a file in the cache? Should that file be coming from the source code or one of the fetch_source_fixtures? I'm worried this will make the tests even harder to understand and run locally.
This is grabbing the file from the source code. I changed the existing spec name to append the SKU (mi300) and added the one for mi308. If we don't have a tuning spec file for the SKU in question, then it shouldn't use one, since tuning from one SKU doesn't seem to be directly applicable to another.
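For readers following along, here is a sketch of the selection-with-fallback logic being described; the path construction mirrors the quoted snippet, while the surrounding function and the compiler flag are assumptions for illustration only:

```python
import os

# Sketch of the per-SKU spec selection described above; the variable names
# follow the quoted snippet, but the flag-building around them is hypothetical.
def tuning_spec_flags(iree_test_path_extension: str, sku: str) -> list[str]:
    """Use the tuning spec for this SKU if one exists; otherwise run untuned."""
    spec_path = (
        f"{iree_test_path_extension}/attention_and_matmul_spec_unet_fp16_{sku}.mlir"
    )
    if os.path.isfile(spec_path):
        # Assumed compiler flag name, shown for illustration.
        return [f"--iree-codegen-transform-dialect-library={spec_path}"]
    # No spec for this SKU: tunings don't transfer between SKUs, so compile untuned.
    return []
```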