From 3bad1190cc4ec1e983f8c537095c03af8fd2c096 Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Fri, 20 Sep 2024 15:08:09 -0700 Subject: [PATCH 1/7] First Draft --- proposals/NNNN-max-records-per-node.md | 114 +++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 proposals/NNNN-max-records-per-node.md diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/NNNN-max-records-per-node.md new file mode 100644 index 00000000..20e65d19 --- /dev/null +++ b/proposals/NNNN-max-records-per-node.md @@ -0,0 +1,114 @@ + + +* Proposal: [MaxRecordsPerNode Attribute for NodeArrayOutput](NNNN-filename.md) +* Author(s): [Anupama Chandrasekhar](https://github.com/anupamachandra), [Mike Apodaca](https://github.com/mapodaca-nv) +* Sponsor: TBD +* Status: **Under Consideration** + +# [MaxRecordsPerNode(count)] Attribute for NodeOutputArray + +## Introduction + +This specification describes the HLSL and DXIL details for a new [NodeArrayOutput](https://microsoft.github.io/DirectX-Specs/d3d/WorkGraphs.html#node-output-attributes) attribute `[MaxRecordsPerNode(count)]` that specifies the maximum number of records that can be output to a specific node in a node output array. See the [MaxRecordsPerNode]() specifications for more details. + +## Motivation + +For `NodeArrayOutput`, the node output attribute `[MaxRecords(count)]` specifies the maximum number of records that can +be output across the entire node array. This attribute alone is insufficient for determining how records are +distributed across an output array. For example, consider an output node array specification of +`[MaxRecords(N)][NodeArraySize(N)]`. All N records could be sent to one node in the array, or one record could be +sent to each of the N nodes in the array, or the records could be spread in an arbitrary fashion across multiple nodes +in the array. An implementation cannot distinguish these different use cases. 
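To make the ambiguity concrete, the following is a minimal sketch of how a conservative sizing estimate changes once a per-node bound is known. It assumes, purely for illustration, that an implementation reserves record slots for each node in the array independently. The helper `WorstCaseRecordSlots` and its parameters are hypothetical and are not part of any D3D12 or DXC API; real implementations are free to size their backing store differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (illustration only, not a D3D12 or DXC API).
// Returns the number of record slots budgeted for one node output array
// when every node in the array is sized for its own worst case.
std::uint64_t WorstCaseRecordSlots(std::uint32_t nodeArraySize,
                                   std::uint32_t maxRecords,
                                   std::optional<std::uint32_t> maxRecordsPerNode)
{
    // Without a per-node bound, any single node may receive all maxRecords.
    const std::uint32_t perNode = maxRecordsPerNode.value_or(maxRecords);
    // A per-node bound larger than the array-wide bound is not meaningful.
    assert(perNode <= maxRecords);
    return static_cast<std::uint64_t>(nodeArraySize) * perNode;
}
```

Under this assumption, `[MaxRecords(64)][NodeArraySize(64)]` alone yields a budget of 4096 slots, while a per-node bound of 1 reduces it to 64.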
+ +When determining backing store memory requirements, an implementation must assume the worst case of `MaxRecords` written +to any single node in the output array. However, a common use-case is for a small number of records to be written to +select nodes in a very large array of nodes. Some implementations can take advantage of this knowledge to significantly +reduce the backing store memory requirements while maintaining peak performance. + +## Proposed solution + +We propose a new node output attribute called `MaxRecordsPerNode`. This parameter is only required for output node +arrays. This attribute specifies the maximum number of records that can be written to any single output node within a +node array. + +## Detailed design + +### HLSL Additions + +Add a new node output attribute: + +| Attribute | Required | Description | +|:--- |:--------:|:------------| +| `[MaxRecordsPerNode(count)]` | Y | For `NodeArrayOutput`, specifies the maximum number of records that can be output to a node within the array. Exceeding this results in undefined behavior. This attribute can be overridden via the `NumOutputOverrides / pOutputOverrides` option when constructing a work graph. This attribute has no impact on existing node output limits. | + +This attribute will be required starting with a future Shader Model version. +Since this may cause compilation failures with existing Work Graphs, a new compiler command line option will be +introduced to replace the compiler error with a warning and implicitly set the value of `MaxRecordsPerNode` +equal to `MaxRecords`. (TODO: Specify the command line option) + +The compiler will also generate an error if the `MaxRecordsPerNode` value is greater than the `MaxRecords` value in an HLSL shader. Note that `pMaxRecordsPerNode` may override this value and the runtime will validate the correctness in that case. See the feature [spec]() for more details. 
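The compile-time rules above can be modeled roughly as follows. This is only a sketch of the described behavior, not DXC code: the struct `Resolved`, the function `ResolveMaxRecordsPerNode`, the flag `allowMissing` (standing in for whatever mechanism downgrades the missing-attribute error), and the diagnostic strings are all hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical model of the rules described above; names are illustrative
// and do not correspond to DXC internals.
struct Resolved {
    bool ok;                 // false when compilation would fail
    std::uint32_t perNode;   // effective MaxRecordsPerNode when ok is true
    std::string diagnostic;  // empty when nothing is reported
};

Resolved ResolveMaxRecordsPerNode(std::uint32_t maxRecords,
                                  std::optional<std::uint32_t> declared,
                                  bool allowMissing)
{
    if (!declared) {
        // Attribute omitted: an error unless the error has been downgraded,
        // in which case the value implicitly equals MaxRecords.
        if (!allowMissing)
            return {false, 0, "error: MaxRecordsPerNode is required"};
        return {true, maxRecords,
                "warning: MaxRecordsPerNode defaults to MaxRecords"};
    }
    // A per-node limit above the array-wide limit is rejected.
    if (*declared > maxRecords)
        return {false, 0, "error: MaxRecordsPerNode exceeds MaxRecords"};
    return {true, *declared, ""};
}
```

For example, `ResolveMaxRecordsPerNode(64, std::nullopt, true)` resolves to an effective per-node limit of 64, while `ResolveMaxRecordsPerNode(8, 9, true)` fails. Values supplied through runtime overrides are validated by the runtime, which this sketch does not model.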
+ +**Developer's note**: Implementations that do not support or ignore this attribute will not be functionally impacted. + +### Usage + +The following trivial example demonstrates using `MaxRecordsPerNode` for a thread launch node which writes +one record to each node in an array of 64 consumer thread launch nodes. + +```cpp +[Shader("node")] +[NodeLaunch("thread")] +[NodeIsProgramEntry] +void DispatchNode( + [MaxRecords(64)] // a maximum of 64 records are written to output node array, + [MaxRecordsPerNode(1)] // but only 1 record is written to each node in the array + [NodeArraySize(64)] NodeOutputArray ConsumerNodes ) +{ + [unroll] for(uint i = 0; i < 64; ++i) + { + ThreadNodeOutputRecords outputRecord = ConsumerNodes[i].GetThreadNodeOutputRecords(1); + ... + outputRecord.OutputComplete(); + } +} +``` + +As mentioned above, some material shading algorithms have a similar pattern: a single node which makes a decision about +which node(s) in a node array (materials) to execute, where the number of possible materials is large, but the number of +records submitted to any specific node is small, relative to the size of the array. + +### Interchange Format Additions + +A new metadata tag is added for MaxRecordsPerNode. + +|Tag |Tag Encoding |Value Type |Default | +|:------------------ |:----------------|:--------------|:-----------| +|kDxilNodeMaxRecordsPerNodeTag |`7` |`i32` |Required, See [HLSL Additions](#hlsl-additions) section for backward compatibility with older Shader Models | + +### Runtime Additions + +The `MaxRecordsPerNode` information will be captured to RDAT. Similar to other Node attributes, add a `RDAT::NodeAttribKind` named `MaxRecordsPerNode`. 
+ +## Alternatives considered + +### Parameter of MaxRecords + +Modify the definition for `MaxRecords` node output attribute: + +| attribute | required | description | +|:--- |:--------:|:------------| +| `[MaxRecords(count)]` or `[MaxRecords(count, maxRecordsPerNode)]` | Y (this or below attribute) | Given uint `count` declaration, the thread group can output `0...count` records to this output. The variant with `maxRecordsPerNode` is required for `NodeArrayOutput`, where `count` applies across all the output nodes in the array and `maxRecordsPerNode` specifies the maximum number of records that can be written to a single output node within the array. Exceeding these limits results in undefined behavior. The value of `maxRecordsPerNode` must be less-than or equal to the value of `count`. These attributes can be overridden via the `NumOutputOverrides / pOutputOverrides` option when constructing a work graph as part of the [definition of a node](). See [Node output limits](). | + +Note: if the specification is `MaxRecords(count, maxRecordsPerNode)`, then multiple outputs that share budget using +`MaxRecordsSharedWith` **must** also share the same value for `maxRecordsPerNode`. While in many cases this might be +correct, this locks this requirement into the spec and restricts an implementation's ability to distinguish cases where +they are different. + +### Optional Attribute + +This attribute could be made optional, for maximum backward compatibility; i.e. existing SM6.8 Work Graphs compile with +the newer Shader Model. When `MaxRecordsPerNode` is _not_ specified, the implicit value of `MaxRecordsPerNode` is equal +to `MaxRecords`. This also avoids redundant attribute specifications for those usage models where the values of +`MaxRecords` and `MaxRecordsPerNode` are identical. 
+ + From b4c5208ef665b5f4db8995a79806cc7b836d17be Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Fri, 18 Oct 2024 11:37:46 -0700 Subject: [PATCH 2/7] Update proposals/NNNN-max-records-per-node.md Co-authored-by: Tex Riddell --- proposals/NNNN-max-records-per-node.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/NNNN-max-records-per-node.md index 20e65d19..7a9337b7 100644 --- a/proposals/NNNN-max-records-per-node.md +++ b/proposals/NNNN-max-records-per-node.md @@ -42,9 +42,10 @@ Add a new node output attribute: | `[MaxRecordsPerNode(count)]` | Y | For `NodeArrayOutput`, specifies the maximum number of records that can be output to a node within the array. Exceeding this results in undefined behavior. This attribute can be overridden via the `NumOutputOverrides / pOutputOverrides` option when constructing a work graph. This attribute has no impact on existing node output limits. | This attribute will be required starting with a future Shader Model version. -Since this may cause compilation failures with existing Work Graphs, a new compiler command line option will be -introduced to replace the compiler error with a warning and implicitly set the value of `MaxRecordsPerNode` -equal to `MaxRecords`. (TODO: Specify the command line option) +Since this may cause compilation failures with existing Work Graphs, this will +be a `DefaultError` warning assigned to a warning group named +`hlsl-require-max-records-per-node` to allow a command-line override. +The value of `MaxRecordsPerNode` will be set equal to `MaxRecords`. The compiler will also generate an error if the `MaxRecordsPerNode` value is greater than the `MaxRecords` in a HLSL shader. Note that `pMaxRecordsPerNode` may override this value and the runtime will validate the correctness in that case. See the feature [spec]() for more details. 
From 57c4f61bc04bb95636513484f54019ebb41b95df Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Fri, 18 Oct 2024 18:16:41 -0700 Subject: [PATCH 3/7] Update alternative solution, resolve ambiguity --- proposals/NNNN-max-records-per-node.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/NNNN-max-records-per-node.md index 7a9337b7..437f4175 100644 --- a/proposals/NNNN-max-records-per-node.md +++ b/proposals/NNNN-max-records-per-node.md @@ -98,12 +98,12 @@ Modify the definition for `MaxRecords` node output attribute: | attribute | required | description | |:--- |:--------:|:------------| -| `[MaxRecords(count)]` or `[MaxRecords(count, maxRecordsPerNode)]` | Y (this or below attribute) | Given uint `count` declaration, the thread group can output `0...count` records to this output. The variant with `maxRecordsPerNode` is required for `NodeArrayOutput`, where `count` applies across all the output nodes in the array and `maxRecordsPerNode` specifies the maximum number of records that can be written to a single output node within the array. Exceeding these limits results in undefined behavior. The value of `maxRecordsPerNode` must be less-than or equal to the value of `count`. These attributes can be overridden via the `NumOutputOverrides / pOutputOverrides` option when constructing a work graph as part of the [definition of a node](). See [Node output limits](). | +| `[MaxRecords(count, maxRecordsPerNode)]` | Y (this or below attribute) | Given uint `count` declaration, the thread group can output `0...count` records to this output. The variant with `maxRecordsPerNode` is required for `NodeArrayOutput`, where `count` applies across all the output nodes in the array and `maxRecordsPerNode` specifies the maximum number of records that can be written to a single output node within the array. Exceeding these limits results in undefined behavior. 
The value of `maxRecordsPerNode` must be less-than or equal to the value of `count`. These attributes can be overridden via the `NumOutputOverrides / pOutputOverrides` option when constructing a work graph as part of the [definition of a node](). See [Node output limits](). | Note: if the specification is `MaxRecords(count, maxRecordsPerNode)`, then multiple outputs that share budget using `MaxRecordsSharedWith` **must** also share the same value for `maxRecordsPerNode`. While in many cases this might be correct, this locks this requirement into the spec and restricts an implementation's ability to distinguish cases where -they are different. +they are different. We therefore prefer the option of specifying `MaxRecordsPerNode(count)` as a separate attribute. ### Optional Attribute From 2fd297703c3f47a2c48140767efb738fe96d3e13 Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Mon, 21 Oct 2024 14:10:46 -0700 Subject: [PATCH 4/7] Minor update to optional attribute section --- proposals/NNNN-max-records-per-node.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/NNNN-max-records-per-node.md index 437f4175..06647bc4 100644 --- a/proposals/NNNN-max-records-per-node.md +++ b/proposals/NNNN-max-records-per-node.md @@ -108,8 +108,9 @@ they are different. We therefore prefer the option of specifying `MaxRecordsPerN ### Optional Attribute This attribute could be made optional, for maximum backward compatibility; i.e. existing SM6.8 Work Graphs compile with -the newer Shader Model. When `MaxRecordsPerNode` is _not_ specified, the implicit value of `MaxRecordsPerNode` is equal -to `MaxRecords`. This also avoids redundant attribute specifications for those usage models where the values of -`MaxRecords` and `MaxRecordsPerNode` are identical. +the newer Shader Model. When `MaxRecordsPerNode` is _not_ specified, the implicit value of `MaxRecordsPerNode` is +equal to `MaxRecords`. 
This also avoids redundant attribute specifications for those usage models where the values of +`MaxRecords` and `MaxRecordsPerNode` are identical. However, for performance reasons, this was made a required +attribute with a compiler fallback for backward compatibility. From 03a843b4564eaa7e7811f761813359a34364c5e5 Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Wed, 30 Oct 2024 13:56:24 -0700 Subject: [PATCH 5/7] Update Sponsor and Proposal Number --- proposals/NNNN-max-records-per-node.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/NNNN-max-records-per-node.md index 06647bc4..ea140497 100644 --- a/proposals/NNNN-max-records-per-node.md +++ b/proposals/NNNN-max-records-per-node.md @@ -1,8 +1,8 @@ -* Proposal: [MaxRecordsPerNode Attribute for NodeArrayOutput](NNNN-filename.md) +* Proposal: [0025](0025-filename.md) * Author(s): [Anupama Chandrasekhar](https://github.com/anupamachandra), [Mike Apodaca](https://github.com/mapodaca-nv) -* Sponsor: TBD +* Sponsor: Damyan Pepper * Status: **Under Consideration** # [MaxRecordsPerNode(count)] Attribute for NodeOutputArray From 3a411c690a6ea10ff6d402e62db48d8a5f6666bd Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Wed, 30 Oct 2024 13:58:21 -0700 Subject: [PATCH 6/7] Update NNNN to 0025 --- ...{NNNN-max-records-per-node.md => 0025-max-records-per-node.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename proposals/{NNNN-max-records-per-node.md => 0025-max-records-per-node.md} (100%) diff --git a/proposals/NNNN-max-records-per-node.md b/proposals/0025-max-records-per-node.md similarity index 100% rename from proposals/NNNN-max-records-per-node.md rename to proposals/0025-max-records-per-node.md From 4f42207fbc734c09f0d1ba991ff69172557ad79a Mon Sep 17 00:00:00 2001 From: Anupama Chandrasekhar Date: Wed, 30 Oct 2024 14:00:18 -0700 Subject: [PATCH 7/7] Update filename in Proposal section --- 
proposals/0025-max-records-per-node.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0025-max-records-per-node.md b/proposals/0025-max-records-per-node.md index ea140497..7e279222 100644 --- a/proposals/0025-max-records-per-node.md +++ b/proposals/0025-max-records-per-node.md @@ -1,6 +1,6 @@ -* Proposal: [0025](0025-filename.md) +* Proposal: [0025](0025-max-records-per-node.md) * Author(s): [Anupama Chandrasekhar](https://github.com/anupamachandra), [Mike Apodaca](https://github.com/mapodaca-nv) * Sponsor: Damyan Pepper * Status: **Under Consideration**