[GPU] Use affine.delinearize_index for MMA tiles and vector distribution #19228

Open
wants to merge 2 commits into
base: main

krzysz00
Contributor

@krzysz00 krzysz00 commented Nov 20, 2024

This commit replaces some hand-written delinearizations in MMA tile generation and vector distribution with `affine.delinearize_index`.

The tricky part is that much of the MMA code computes `(id / stride) % size`, whereas the outputs of delinearize all have the form `(id % stride) / nextStride`. In all the cases at issue, we can use a utility that converts the arrays of sizes and strides into a permutation on a delinearization basis.
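
To illustrate the equivalence, here is a minimal Python sketch (not the code in this PR). The helper names `delinearize`, `basis_and_permutation`, `by_hand`, and `via_delinearize` are hypothetical, and the sketch assumes the strides are dense, i.e. sorting dimensions by stride gives a layout where each stride equals the product of the sizes of all faster-varying dimensions. It also assumes `0 <= id < product(sizes)`.

```python
def delinearize(index, basis):
    """Model of a delinearization: split `index` by `basis`, outermost first."""
    results = []
    for size in reversed(basis):
        results.append(index % size)
        index //= size
    return list(reversed(results))


def basis_and_permutation(sizes, strides):
    """Hypothetical helper: turn (sizes, strides) into a basis and a result order."""
    # Dimensions ordered from slowest-varying (largest stride) to fastest.
    order = sorted(range(len(sizes)), key=lambda i: strides[i], reverse=True)
    # Assumption: strides are dense, so verify that before building the basis.
    expected = 1
    for dim in reversed(order):
        if strides[dim] != expected:
            raise ValueError("strides are not dense; this sketch does not apply")
        expected *= sizes[dim]
    return [sizes[dim] for dim in order], order


def by_hand(index, sizes, strides):
    """The original pattern: (id / stride) % size for each dimension."""
    return [(index // stride) % size for size, stride in zip(sizes, strides)]


def via_delinearize(index, sizes, strides):
    """Same values, computed as one delinearization followed by a permutation."""
    basis, order = basis_and_permutation(sizes, strides)
    parts = delinearize(index, basis)
    out = [0] * len(sizes)
    for position, dim in enumerate(order):
        out[dim] = parts[position]
    return out
```

For example, with `sizes = [4, 16]` and `strides = [1, 4]`, the basis is `[16, 4]` and the two results are swapped back into the original dimension order.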

To avoid breaking existing tests, the trivial-loop detector had to be manually taught to support `delinearize_index` (and I added support for `util.assume.int` while I was there). I suspect there are a few other cases, and that, long-term, the detector should use one of the bounds interfaces, but that is out of scope for this PR.
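
As a rough illustration of why the detector needs to understand delinearization, the Python sketch below shows one way a single-iteration loop could be recognized from the range of a delinearize result. This is not the detector's actual logic; `delinearize_result_range` and `runs_exactly_once` are hypothetical names, and the sketch assumes the input index is non-negative and smaller than the product of the basis.

```python
def delinearize_result_range(basis, position):
    """Inclusive (min, max) of result `position` of a delinearization by `basis`.

    Assumption: the linear index is in [0, product(basis)), so every result
    lies in [0, basis[position] - 1].
    """
    return 0, basis[position] - 1


def runs_exactly_once(lb_range, ub, step):
    """True if `for i = lb; i < ub; i += step` runs once for every lb in lb_range."""
    lb_min, lb_max = lb_range
    # Every possible lower bound enters the loop, and one step always reaches ub.
    return lb_max < ub and lb_min + step >= ub
```

With a basis of `[4, 16]`, the second result lies in `[0, 15]`, so a loop from that value to 16 with step 16 always runs exactly once and can be replaced by its body.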

@krzysz00 krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch from 5328767 to 5a8fa83 Compare November 21, 2024 20:54
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 829c3d5 to aaf6cf8 Compare November 21, 2024 22:30
@krzysz00 krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch from 5a8fa83 to 291f570 Compare November 26, 2024 19:24
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch from aaf6cf8 to 3ccf6a4 Compare November 26, 2024 19:28
Base automatically changed from users/krzysz00/gpu-distribute-with-linearize to main November 26, 2024 20:38
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch 4 times, most recently from ba7bc66 to d577300 Compare December 17, 2024 23:07
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 31e58d0 to 5592a62 Compare January 3, 2025 22:12
@krzysz00
Contributor Author

krzysz00 commented Jan 3, 2025

Update: looking through the tests showed that I should implement the value bounds op interfaces on `affine.delinearize_index` and `affine.linearize_index`, because some single-iteration loops were not being eliminated.

@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch from 8e5ebb0 to de7e313 Compare January 7, 2025 00:13
@krzysz00 krzysz00 marked this pull request as ready for review January 7, 2025 00:21
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch 2 times, most recently from 0277d07 to 3ff65f0 Compare January 13, 2025 21:14
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch 2 times, most recently from fef54fc to eeb3bb1 Compare January 22, 2025 19:28
@krzysz00 krzysz00 force-pushed the users/krzysz00/linearize-mma branch from eeb3bb1 to 12377f8 Compare January 24, 2025 19:26