Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SDXL regression test on mi308 #19747

Merged
merged 9 commits into from
Jan 25, 2025

Conversation

IanNod
Copy link
Contributor

@IanNod IanNod commented Jan 21, 2025

  • Adds mi308 runner for regression testing and specifies sku for tuned specs
  • Adds new unet IR for sdxl with 960x1024 image shape
  • New tuning spec for new unet IR for mi308 sku

@IanNod IanNod force-pushed the unet_mi308_regression_test branch from 6396140 to 33254c6 Compare January 21, 2025 16:40
@IanNod IanNod marked this pull request as draft January 21, 2025 16:40
@IanNod IanNod force-pushed the unet_mi308_regression_test branch 8 times, most recently from 6171e30 to 73ab365 Compare January 21, 2025 22:40
@ScottTodd
Copy link
Member

(seeing all the force pushes for "typo fixes" to get tests working)

When iterating on a single workflow job, you can put a git trailer like

ci-exactly: build_packages,regression_test

in the PR description to run a specific selection of jobs: https://iree.dev/developers/general/contributing/#ci-behavior-manipulation. That reduces the load on unaffected runners and can save a respectable sum from managed runner time that we pay for.

The PkgCI workflows are also designed to be run without needing to rebuild the packages at all, but that takes some workflow_dispatch triggering and possibly workflow surgery. If you expect to be debugging for a longer duration I can help with that.

Do you have access to the runner type you are adding these tests for? Or a similar machine? I would start by getting the tests working locally before pushing to GitHub if possible. If tests can only be iterated on via github actions then that is quite unhealthy for development workflows and should be improved.

@IanNod IanNod force-pushed the unet_mi308_regression_test branch 5 times, most recently from de6bf5c to 12acf02 Compare January 23, 2025 20:52
 - Adds new unet for sdxl image shape of 960x1024
 - Adds new tuning spec for new unet shapes
 - Adds mi308 to tests

Signed-off-by: Ian <[email protected]>
Signed-off-by: ianNod <[email protected]>
Signed-off-by: Ian <[email protected]>
Signed-off-by: ianNod <[email protected]>
Signed-off-by: Ian <[email protected]>
Signed-off-by: ianNod <[email protected]>
Signed-off-by: Ian <[email protected]>
Signed-off-by: ianNod <[email protected]>
@IanNod IanNod force-pushed the unet_mi308_regression_test branch from deb8eb2 to 2504703 Compare January 23, 2025 21:46
@IanNod IanNod marked this pull request as ready for review January 23, 2025 22:49
@IanNod IanNod requested a review from saienduri January 23, 2025 22:50
@@ -165,3 +173,32 @@ jobs:
--timeout=600 \
--retries 7
echo "$(<job_summary.md )" >> $GITHUB_STEP_SUMMARY
# Note: allowing 10% deviation from observed averages here to account for
# different runner conditions.
- name: "Running SDXL rocm pipeline benchmark (mi300)"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change name here for mi308

Copy link
Collaborator

@saienduri saienduri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks

@IanNod IanNod enabled auto-merge (squash) January 24, 2025 23:03
Copy link
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes but looks good enough to me.

Can you also edit the PR title to match local project style (https://github.com/iree-org/iree/commits/main/)? Notably start with an uppercase letter and include a verb, like "Add SDXL regression test on mi308"

source ${VENV_DIR}/bin/activate
pytest ./experimental/benchmarks/sdxl/benchmark_sdxl_rocm.py \
--goldentime-tolerance-multiplier 1.1 \
--goldentime-rocm-e2e-ms 797.0 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd round to 800 (I'm skeptical we can get significant figures within 1ms at this scale, 5ms maybe).

Suggested change
--goldentime-rocm-e2e-ms 797.0 \
--goldentime-rocm-e2e-ms 800.0 \

Comment on lines 194 to 195
--goldentime-rocm-punet-int8-fp16-ms 138.0 \
--goldentime-rocm-punet-int8-fp8-ms 147 \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more nits about precision in benchmark expectations, but you've probably looked at the times closer than me 😅

Suggested change
--goldentime-rocm-punet-int8-fp16-ms 138.0 \
--goldentime-rocm-punet-int8-fp8-ms 147 \
--goldentime-rocm-punet-int8-fp16-ms 140.0 \
--goldentime-rocm-punet-int8-fp8-ms 150.0 \

(These tests are good at guarding against large regressions, but they will flake frequently if the tolerances are too tight and the runners experience different conditions)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, another 200+ line spec file 😨. How does this one differ from the mi300 version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The differences are mainly the shapes used for the 960x1024 IR don't match, and MI308 tunings are different than MI300. It seems we ideally will need a similar spec file for every sku if we want best performance after tuning

Comment on lines +67 to +68
sdxl_unet_fp16_960_1024_inference_input_0 = fetch_source_fixture(
"https://sharkpublic.blob.core.windows.net/sharkpublic/ian/unet_npys/input1.npy",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, not blocking.

Common infra should really be pointing at a common file path with some versioning built in (like a date) and some developer documentation explaining how to manage the files. Otherwise anyone wanting to modify this in the future will need to ask you personally what to do 😛

Comment on lines +283 to +285
if os.path.isfile(
f"{iree_test_path_extension}/attention_and_matmul_spec_unet_fp16_{sku}.mlir"
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems brittle. Is this checking if the test runner has a file in the cache? Should that file be coming from the source code or one of the fetch_source_fixtures? I'm worried this will make the tests even harder to understand and run locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is grabbing the file in the source code. I changed the existing spec name to append the sku (mi300) and added the one for mi308. If we don't have a tuning spec file for the sku in question, then it shouldn't use one as tuning from one sku seems to not be directly applicable to another.

@IanNod IanNod disabled auto-merge January 24, 2025 23:16
@IanNod IanNod changed the title unet mi308 regression test Add SDXL regression test on mi308 Jan 24, 2025
@IanNod IanNod enabled auto-merge (squash) January 24, 2025 23:39
@IanNod IanNod merged commit ea599e2 into iree-org:main Jan 25, 2025
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants