Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPIRV validation error when using min16 types get d #6914

Open
castano opened this issue Sep 13, 2024 · 7 comments
Open

SPIRV validation error when using min16 types get d #6914

castano opened this issue Sep 13, 2024 · 7 comments
Labels
bug Bug, regression, crash spirv Work related to SPIR-V

Comments

@castano
Copy link

castano commented Sep 13, 2024

Description
Compiling one of our HLSL shaders with the following command line:

dxc.exe spark_bc1_rgb_q0.mp.i2i.comp.hlsl -nologo -T cs_6_0 -spirv -fspv-target-env=vulkan1.0 -DVULKAN -Fo spark_bc1_rgb_q0.mp.i2i.comp.hlsl.dxc.spv

Produces this error:

fatal error: generated SPIR-V is invalid: ID '51' decorated with RelaxedPrecision multiple times is not allowed.
  %51 = OpExtInst %float %1 NMax %49 %50

Which seems to be caused by the following spirv validation code:

https://github.com/KhronosGroup/SPIRV-Tools/blob/4451f6ab13dda98bf255a7cd7b4d120132dc0dfd/source/val/validate_decorations.cpp#L1562

I know this report would be much more useful with source code to reproduce the problem, but I'm not able to publish this shader openly and I haven't been able to produce a minimal repro yet.

Before I start debugging this myself, it would be helpful to know if there's any command line option to have DXC output the spv file even though validation failed. Inspecting the spv file may be helpful to understand what part of the code is causing the issue and possibly build a simplified repro.

Environment

  • DXC version: dxcompiler.dll: 1.8 - 1.8.2405.15 (fd7e54b); dxil.dll: 1.8(101.8.2405.17)
  • Host Operating System: Windows 10.
@castano castano added bug Bug, regression, crash needs-triage Awaiting triage spirv Work related to SPIR-V labels Sep 13, 2024
@s-perron
Copy link
Collaborator

You can add the -Vd to skip validation. This should emit the Spir-V

@s-perron
Copy link
Collaborator

To generate a reproducer, you could also try -fcgl. This will turn off all optimizations. If this passes validation, you can run the optimizations from the command line and use spirv-reduce to get a small reproducer.

I'll give more details on Monday.

@castano
Copy link
Author

castano commented Sep 13, 2024

Hi Steven.

This is very useful, thanks!

I can tell exactly what statement is introducing the error now, but it's a bit of a mystery why this is happening, because it's present in all variants of the same shader and not all of them have this issue.

    norm = max(max(abs(r), abs(g)), abs(b));
         %53 = OpExtInst %float %1 FAbs %41
         %54 = OpExtInst %float %1 FAbs %46
         %55 = OpExtInst %float %1 NMax %53 %54
         %56 = OpExtInst %float %1 FAbs %51
         %57 = OpExtInst %float %1 NMax %55 %56

The duplicate decoration appears conspicuously out of place:

       OpDecorate %210 RelaxedPrecision
       OpDecorate %57 RelaxedPrecision
       OpDecorate %211 RelaxedPrecision

I tried with -fcgl and that seems to eliminate the validation error, but triggers an exception:

SPIRV-Cross threw an exception: RayQuery requires Vulkan GLSL 460.

However, -O0 also triggers the same exception and also has the validation error.

I've also confirmed that the problem still exists in the latest dxc release:

dxcompiler.dll: 1.8 - 1.8.2407.7 (416fab6); dxil.dll: 1.8(101.8.2407.12)

Digging a bit more, it seems the problem is caused by that statement being in an unrolled loop where the iteration count depends on an expression that's known at compile time.

    if (fast) iteration_count = 1;
    [unroll] for (int i = 0; i < iteration_count; i++) {
    }

Removing the if (fast) statement or the [unroll] attribute fixes the problem.

I've got a workaround that avoid the issue, but I'd be happy to dig more in order to get to the root of the problem.

@s-perron
Copy link
Collaborator

Here are the step I would take to reduce the tests case:

  1. Dump the unoptimized spir-v from DXC. This is described above, and see to me working.
  2. Identify the pass and the input to the pass that generates the invalid code.
spirv-opt t.spv -o o.spv --validate-after-all --print-all 2>&1 | tee log  ; t.spv is the output from step 1.

You can look at the last line of the form ; IR before pass <pass>. The spir-v that follows is the input to the failing pass and failing pass is <pass>.

  1. Move the failing input into a new spir-v assembly file and assemble it:
spirv-as fail.spvasm -o fail.spv --target-env=<your environment>
  1. Write a script that will return 0 if the given spir-v fails if the same error occurs:
echo 'spirv-opt $1 -o o.spv --<pass> && spirv-val o.spv | grep "decorated with RelaxedPrecision multiple times"' > b.sh
  1. Run spirv-reduce:
spirv-reduce fail.spv -o reduced.spv -- ./b.sh

The final reduced.spv, should be a much smaller test case that hopefully you can provide me. I should be able to get a fix from that.

@llvm-beanz llvm-beanz moved this to For Google in HLSL Triage Sep 16, 2024
@castano
Copy link
Author

castano commented Sep 16, 2024

Here's a reduced repro: bugreport.zip

Let me know if you need anything else!

@s-perron s-perron moved this from For Google to Triaged in HLSL Triage Sep 17, 2024
@s-perron s-perron added this to the Next+1 Release milestone Sep 17, 2024
@s-perron
Copy link
Collaborator

This is a bug in the spirv-opt unroller, so it should be fixed. There is a workaround in most cases (do not mark the loop as unroll), so we will probably not get to it this release. I'll move to the next right away.

@s-perron s-perron removed the needs-triage Awaiting triage label Sep 20, 2024
@castano
Copy link
Author

castano commented Dec 4, 2024

What's the status of this issue? I recently noticed that similar problems also occur on all spirv-opt versions distributed with recent Vulkan SDKs. The most recent Vulkan SDK that doesn't have this problem is version 1.3.280, although in this case it seems to occur in other scenarios that do not involve loop unrolling.

Some of my clients have also run into the same problem independently, although I don't think they have reported it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug, regression, crash spirv Work related to SPIR-V
Projects
Status: New
Status: Triaged
Development

No branches or pull requests

2 participants