cl_ext_buffer_device_address #1159

Open: pjaaskel wants to merge 8 commits into main from the cl_ext_buffer_device_address branch
Conversation

@pjaaskel (Contributor) commented May 7, 2024

The basic cl_mem buffer API does not expose the underlying raw pointers in device memory, which prevents its use in host-side data structures that need pointer references to objects. This API adds a minimal increment on top of cl_mem that provides such capabilities.

Version 0.1.0 is implemented in PoCL and rusticl for prototyping, but everything is still up for discussion. chipStar is the first client that uses the API.
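
A minimal sketch of how a client might use the proposed API, assuming the names discussed in this PR (CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT, CL_MEM_DEVICE_ADDRESS_EXT, cl_mem_device_address_ext, clSetKernelArgDevicePointerEXT) and an existing context, kernel and size; the exact enums and semantics are still in flux across the 0.x revisions:

```c
#include <CL/cl.h>
#include <CL/cl_ext.h>

/* Sketch only: assumes an existing cl_context context, cl_kernel kernel and size_t size. */
cl_mem_properties props[] = {
    CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT, CL_TRUE,  /* request a queryable per-device address */
    0
};
cl_int err = CL_SUCCESS;
cl_mem buf = clCreateBufferWithProperties(context, props, CL_MEM_READ_WRITE,
                                          size, NULL, &err);

/* Query the raw device address so it can be stored in host-side data structures. */
cl_mem_device_address_ext addr = 0;
err = clGetMemObjectInfo(buf, CL_MEM_DEVICE_ADDRESS_EXT, sizeof(addr), &addr, NULL);

/* Pass the raw device pointer (possibly plus an offset) as a kernel argument. */
err = clSetKernelArgDevicePointerEXT(kernel, 0, addr);
```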

@karolherbst (Contributor)

any motivation to get this merged? Or anything else needed to discuss before merging this? Could also try to bring it up at the CL WG if needed.

@pjaaskel (Contributor, Author)

> any motivation to get this merged? Or anything else needed to discuss before merging this? Could also try to bring it up at the CL WG if needed.

Yep, this is still being discussed in the WG. I personally think it's useful as is and shouldn't harm anything if merged, as it even has two implementations now.

@pjaaskel (Contributor, Author)

Thanks @SunSerega

@SunSerega (Contributor)

Alright, and now the problem I found in #1171 is visible here because the Presubmit workflow has been properly rerun.

@karolherbst (comment marked as resolved)

@pjaaskel (Contributor, Author) commented Sep 4, 2024

> One thing I'm wondering about is how clCreateSubBuffer should behave when being executed on a BDA memory object? Should it fail or succeed? I'm fine with either, just wondering if some of the behavior needs to be formalized.
>
> CL_MEM_DEVICE_PTR_EXT and CL_MEM_DEVICE_PTRS_EXT come to my mind, which should probably return the address + the sub-buffer offset, at least that's how I have implemented it so far.

Yes, this was the idea. I'll add a mention in the next update.
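
A hedged sketch of the sub-buffer behaviour agreed on here, assuming buf is a buffer created with the device-address property and using the later CL_MEM_DEVICE_ADDRESS_EXT query name; the expectation is that a sub-buffer's address is the parent address plus the sub-buffer origin:

```c
/* Sketch only: buf is an existing BDA buffer, err is a cl_int.
 * Note that the origin must satisfy the device's sub-buffer alignment rules. */
cl_buffer_region region = { 256 /* origin */, 1024 /* size */ };
cl_mem sub = clCreateSubBuffer(buf, CL_MEM_READ_WRITE,
                               CL_BUFFER_CREATE_TYPE_REGION, &region, &err);

cl_mem_device_address_ext base = 0, sub_addr = 0;
clGetMemObjectInfo(buf, CL_MEM_DEVICE_ADDRESS_EXT, sizeof(base), &base, NULL);
clGetMemObjectInfo(sub, CL_MEM_DEVICE_ADDRESS_EXT, sizeof(sub_addr), &sub_addr, NULL);
/* Expected: sub_addr == base + region.origin */
```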

@pjaaskel (Contributor, Author) commented Sep 9, 2024

Updated according to @karolherbst's comments.

@karolherbst (Contributor) left a comment

I've already implemented the extension fully in rusticl/mesa (including sharing the same address across devices) and I think it's fine; however, I'd still urge addressing the concerns I have about layered implementations built on top of Vulkan. I've already considered those constraints when implementing it, but I think it's better to give clients a way to query whether address sharing across multiple devices is supported or not.

pjaaskel force-pushed the cl_ext_buffer_device_address branch from b8df46b to 1931416 on September 24, 2024
@pjaaskel (Contributor, Author)

@SunSerega thanks!

@pjaaskel (Contributor, Author)

@karolherbst I asked about this in the CL/memory WG. We need to submit CTS tests, and then this might be good to go. Do you have good tests on the Rusticl side we could use? The test in PoCL is quite basic (and needs to be updated), but it can be used as a starting point.

@karolherbst (Contributor)

> @karolherbst I asked about this in the CL/memory WG. We need to submit CTS tests, and then this might be good to go. Do you have good tests on the Rusticl side we could use? The test in PoCL is quite basic (and needs to be updated), but it can be used as a starting point.

I haven't written any more tests and only used the one in pocl. But I can probably help out with writing tests here.

@kpet (Contributor) left a comment

Thanks for this proposal!

pjaaskel and others added 5 commits on December 6, 2024:

- The basic cl_mem buffer API doesn't enable access to the underlying raw pointers in the device memory, preventing its use in host side data structures that need pointer references to objects. This API adds a minimal increment on top of cl_mem that provides such capabilities. The version 0.1.0 is implemented in PoCL and rusticl for prototyping, but everything's still up for discussion. chipStar is the first client that uses the API.
- Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device cases to "all", not "any", covering a case where not all devices can ensure the same address across the context. In that case CL_INVALID_VALUE can be returned. Defined sub-buffer address computation to be 'base_addr + origin'. Added error conditions for clSetKernelExecInfo when the device doesn't support device pointers.
- ...and renamed them to CL_MEM_DEVICE_SHARED_ADDRESS_EXT and CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT. The first one guarantees the same address across all devices in the context, whereas the latter allows per-device addresses.
pjaaskel force-pushed the cl_ext_buffer_device_address branch from 9ad04fb to edcff73 on December 6, 2024
pjaaskel force-pushed the cl_ext_buffer_device_address branch from ba243d4 to c99804d on December 12, 2024
@karolherbst (Contributor)

Not really familiar with the build process and such, so I'm wondering how I'd get an updated cl_ext.h file I could use to update my implementation?

@pjaaskel (Contributor, Author)

> Not really familiar with the build process and such, so I'm wondering how I'd get an updated cl_ext.h file I could use to update my implementation?

Good question. I'm new to it too. @SunSerega?

@SunSerega (Contributor)

My intuition tells me to expect the build workflow, with its single job called "Build spec artifacts", to produce artifacts. But it doesn't ¯\_(ツ)_/¯

And even if it did, it would probably only produce HTML and PDF artifacts, because the header automation is in a separate repo: https://github.com/KhronosGroup/OpenCL-Headers

But my stake here is XML consumption for my own bindings; I've never used the C bindings, so I'll pass the baton to @bashbaug.

@pjaaskel (Contributor, Author)

PR for CTS.

@karolherbst (Contributor)

@pjaaskel do you have local changes? Because locally I had to change a few things to get things to compile:

diff --git a/xml/cl.xml b/xml/cl.xml
index d7bdb1a..47f52b0 100644
--- a/xml/cl.xml
+++ b/xml/cl.xml
@@ -3736,7 +3736,7 @@ server's OpenCL/api-docs repository.
             <proto><type>cl_int</type>                                  <name>clSetKernelArgDevicePointerEXT</name></proto>
             <param><type>cl_kernel</type>                               <name>kernel</name></param>
             <param><type>cl_uint</type>                                 <name>arg_index</name></param>
-            <param>const <type>void</type>*                             <name>arg_value</name></param>
+            <param>const <type>cl_mem_device_address_ext</type>         <name>arg_value</name></param>
         </command>
         <command suffix="CL_API_SUFFIX__VERSION_2_0">
             <proto><type>cl_int</type>                                  <name>clSetKernelExecInfo</name></proto>
@@ -7200,6 +7200,9 @@ server's OpenCL/api-docs repository.
             </require>
         </extension>
         <extension name="cl_ext_buffer_device_address" revision="0.9.1" supported="opencl" depends="CL_VERSION_3_0" provisional="true">
+            <require>
+                <type name="cl_mem_device_address_ext"/>
+            </require>
             <require>
                 <command name="clSetKernelArgDevicePointerEXT"/>
             </require>
@@ -7212,9 +7215,6 @@ server's OpenCL/api-docs repository.
             <require comment="cl_kernel_exec_info">
                 <enum name="CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT"/>
             </require>
-            <require>
-                <type name="cl_mem_device_address_ext"/>
-            </require>
         </extension>
         <extension name="cl_khr_command_buffer" revision="0.9.5" supported="opencl" depends="CL_VERSION_1_2" ratified="opencl" provisional="true">
             <require>

and the CTS PR still uses cl_mem_device_address_EXT not cl_mem_device_address_ext

@pjaaskel (Contributor, Author)

> @pjaaskel do you have local changes? Because locally I had to change a few things to get things to compile:

I didn't have local changes, but I didn't try to generate the header (I have yet to find out how), just built the docs. Fixed. Pinging @franz about the CTS for when he comes back to the office.

@bashbaug (Contributor)

> I didn't try to generate the header (I have yet to find out how)

The header generation scripts are in the headers repo, specifically:

https://github.com/KhronosGroup/OpenCL-Headers/tree/main/scripts

The CMake file for the headers defines two targets related to header generation: headers_generate and headers_copy. headers_generate generates the headers into a temporary output directory, and headers_copy copies the generated headers into the right place. You will need to tell the header generation scripts where to find the spec XML file, which you can do using the CMake variable OPENCL_HEADERS_XML_PATH.

If you want to do everything in one go, a good workflow is something like:

make headers_generate headers_copy all test -j

This generates the headers, copies them to the right place, builds all tests, and runs all tests.

@karolherbst (Contributor) commented Dec 18, 2024

Now with the CL_MEM_DEVICE_SHARED_ADDRESS option removed, I don't think the current interface of clSetKernelArgDevicePointerEXT makes much sense.

The good thing about binding buffer objects via clSetKernelArg or setting SVM pointers via clSetKernelArgSVMPointer is that you don't have to change the argument to enqueue the same kernel on a different device.

Now that each device has its own address, one would have to synchronize the set address with the queues launching those kernels, potentially involving a lot of flushing when it wouldn't be necessary.

Though using sub-buffers if one doesn't want to use the base address is also quite an annoying interface. Maybe we need a clSetKernelArgBuffer, which comes with some useful features, like an offset argument? Might also want to add a couple more things while at it.

Or we just stick with sub-buffers after all...

EDIT: same issue with CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT

@pjaaskel (Contributor, Author)

> Now with the CL_MEM_DEVICE_SHARED_ADDRESS option removed, I don't think the current interface of clSetKernelArgDevicePointerEXT makes much sense.

I noticed the same, that it's a bit cumbersome, but it maps to the HIP/CUDA programming model where one device is targeted at a time with the hipSetDevice() API.

> The good thing about binding buffer objects via clSetKernelArg or setting SVM pointers via clSetKernelArgSVMPointer is that you don't have to change the argument to enqueue the same kernel on a different device.

Yes, I agree.

> Now that each device has its own address, one would have to synchronize the set address with the queues launching those kernels, potentially involving a lot of flushing when it wouldn't be necessary.

In practice it means that if you submit the kernel to multiple devices' command queues, you have to set the arg for each submission. The client can still pass the cl_mem handle or sub-buffers using the core clSetKernelArg() API if they don't want to touch the per-device pointers or set them one at a time.
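
A minimal sketch of that pattern, assuming addr_dev0/addr_dev1 are the per-device addresses of the same buffer and queue_dev0/queue_dev1 are queues on the two devices (illustrative names only):

```c
/* Sketch only: the raw-pointer argument is re-set before each enqueue,
 * since arguments are captured at clEnqueueNDRangeKernel call time. */
size_t gws = 1024;

clSetKernelArgDevicePointerEXT(kernel, 0, addr_dev0);
clEnqueueNDRangeKernel(queue_dev0, kernel, 1, NULL, &gws, NULL, 0, NULL, NULL);

clSetKernelArgDevicePointerEXT(kernel, 0, addr_dev1);
clEnqueueNDRangeKernel(queue_dev1, kernel, 1, NULL, &gws, NULL, 0, NULL, NULL);
```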

> Though using sub-buffers if one doesn't want to use the base address is also quite an annoying interface. Maybe we need a clSetKernelArgBuffer, which comes with some useful features, like an offset argument? Might also want to add a couple more things while at it.
>
> Or we just stick with sub-buffers after all...

If you recall, we discussed and considered this option in the beginning, but it was not good enough for HIP/CUDA due to some sub-buffer restrictions (alignment at least, but perhaps something else I forgot) and because some APIs prefer to use the raw pointer address. It might be doable if we lift those restrictions with the extension, and then the runtime has to keep book of which raw address maps to which (sub-)buffer when interfacing from APIs with raw pointers, but is that a much better option than this one?

@karolherbst (Contributor)

> Now that each device has its own address, one would have to synchronize the set address with the queues launching those kernels, potentially involving a lot of flushing when it wouldn't be necessary.
>
> In practice it means that if you submit the kernel to multiple devices' command queues, you have to set the arg for each submission. The client can still pass the cl_mem handle or sub-buffers using the core clSetKernelArg() API if they don't want to touch the per-device pointers or set them one at a time.

Oh right, implementations have to look at the kernel arguments set at clEnqueueNDRangeKernel call time, not sure why I thought it's any different.

> Though using sub-buffers if one doesn't want to use the base address is also quite an annoying interface. Maybe we need a clSetKernelArgBuffer, which comes with some useful features, like an offset argument? Might also want to add a couple more things while at it.
> Or we just stick with sub-buffers after all...
>
> If you recall, we discussed and considered this option in the beginning, but it was not good enough for HIP/CUDA due to some sub-buffer restrictions (alignment at least, but perhaps something else I forgot) and because some APIs prefer to use the raw pointer address. It might be doable if we lift those restrictions with the extension, and then the runtime has to keep book of which raw address maps to which (sub-)buffer when interfacing from APIs with raw pointers, but is that a much better option than this one?

I think for alignment it would be much better if we could simply specify the alignment at buffer creation time. Not sure if anything is being worked on in this regard though.

clSetKernelArgBuffer could also get a device parameter added, and then the value is specific to the device. That might be feasible and usable? I think that also allows for better debuggability and error checking.

@pjaaskel (Contributor, Author)

> I think for alignment it would be much better if we could simply specify the alignment at buffer creation time. Not sure if anything is being worked on in this regard though.

I'm not aware of any. If you want to extend the buffer API, I think it should be a separate extension.

Another problem, in addition to the fixed alignment, that I recall with the approach of (ab)using sub-buffers for implementing the raw pointer passing is that sub-buffers have a size. We discussed using size-0 sub-buffers for this and it started to seem hacky, therefore we went with this raw pointer API plan, IIRC. I can try to dig up the old notes if you want to go back to the drawing board with this extension.

> clSetKernelArgBuffer could also get a device parameter added, and then the value is specific to the device. That might be feasible and usable? I think that also allows for better debuggability and error checking.

Do you mean that we could call clSetKernelArgBufferAddress() multiple times, once for each device holding the buffer? We could, but then again it doesn't differ much from calling it multiple times for each NDRange enqueue of that kernel (to different devices' queues), in which case the device is implied from the queue it is being pushed to. Perhaps I can just add a sentence that highlights the fact that the raw pointer's device is implied from the command queue the kernel is enqueued to?

@karolherbst (Contributor)

> I think for alignment it would be much better if we could simply specify the alignment at buffer creation time. Not sure if anything is being worked on in this regard though.
>
> I'm not aware of any. If you want to extend the buffer API, I think it should be a separate extension.
>
> Another problem, in addition to the fixed alignment, that I recall with the approach of (ab)using sub-buffers for implementing the raw pointer passing is that sub-buffers have a size. We discussed using size-0 sub-buffers for this and it started to seem hacky, therefore we went with this raw pointer API plan, IIRC. I can try to dig up the old notes if you want to go back to the drawing board with this extension.

Oh right, yeah, that's impossible to solve with sub-buffers properly without changing semantics.

> clSetKernelArgBuffer could also get a device parameter added, and then the value is specific to the device. That might be feasible and usable? I think that also allows for better debuggability and error checking.
>
> Do you mean that we could call clSetKernelArgBufferAddress() multiple times, once for each device holding the buffer? We could, but then again it doesn't differ much from calling it multiple times for each NDRange enqueue of that kernel (to different devices' queues), in which case the device is implied from the queue it is being pushed to. Perhaps I can just add a sentence that highlights the fact that the raw pointer's device is implied from the command queue the kernel is enqueued to?

It could allow you to skip calling clSetKernelArgBufferAddress when an argument won't have to change, but that might be niche enough not to matter.

However, it kind of matters more with this API, because it's probably not intended to be thread-safe, as clSetKernelArg isn't. And this extension might want to modify the "Multiple Host Threads" section as well. So if you have arguments which would be the same pointer for a given device, you could skip those calls if you set the pointer once for each device.

Like, I can see the pattern where an application simply binds all arguments once and then uses commands to update the content of those arguments, never touching the kernel objects themselves. And they can do that from multiple threads concurrently and safely as long as they don't modify the kernel object, e.g. by calling clSetKernelArg*.

If you require calling clSetKernelArgBufferAddress before each enqueue, applications will have to synchronize calls to it from multiple threads. Unless of course clSetKernelArgBufferAddress will be considered thread-safe, which then might make the problem a much smaller one.

@pjaaskel (Contributor, Author)

> Unless of course clSetKernelArgBufferAddress will be considered thread-safe, which then might make the problem a much smaller one.

I see. What if we simply declare the call thread-safe here and add a note about the problem in the spec? Somehow I'm reluctant to add the explicit device argument to the argument setter function, as it's not needed in the basic single-device use case.

@karolherbst (Contributor)

> I see. What if we simply declare the call thread-safe here and add a note about the problem in the spec? Somehow I'm reluctant to add the explicit device argument to the argument setter function, as it's not needed in the basic single-device use case.

My concern is that most implementations aren't set up to make it thread-safe easily, because it will require reworking how setting arguments is implemented, meaning it's a higher bar to implement this extension. In rusticl it's not an issue, because at the moment it is implemented in a thread-safe way, though I did consider dropping the thread-safety once I get to improve performance more, and if it's a significant overhead showing up in applications.

Though one could argue that all the other clSetKernelArg* calls are device-independent and the reason they don't have a device arg is that it was never needed, with clSetKernelArgBufferAddress just being the exception here. Though I don't think it's a huge issue to add it. Nor is it a huge issue not to.

I think the one benefit of adding it is that the runtime could verify whether the passed-in handle is indeed a valid one. It could also catch use-after-free cases where you set a handle but the buffer gets deallocated later on, and the same address becomes valid in a new allocation, hiding the bug pretty well.

But if adding it is too much of an issue for chipStar (and other users) then we could also not.

I think it makes sense to wait for what others think about this problem. I don't have any strong opinions here either way, just wanted to bring this issue up before the extension is finalized.

@pjaaskel (Contributor, Author)

I suspect the "others" do not have strong opinions here either, since the need for this extension originates from HIP/CUDA/chipStar. Somehow adding the device argument seems wrong, since so far cl_kernel has been device-independent. I suggest we just go with the current API, as it fulfills the original need and usage pattern, and focus on USVM.

@bashbaug (Contributor)

> I think for alignment it would be much better if we could simply specify the alignment at buffer creation time. Not sure if anything is being worked on in this regard though.

FYI, there is some work ongoing to specify an alignment at buffer creation time, see internal MR 198.

> clSetKernelArgBuffer could also get a device parameter added, and then the value is specific to the device. That might be feasible and usable? I think that also allows for better debuggability and error checking.

> Somehow I'm reluctant to add the explicit device argument to the argument setter function, as it's not needed in the basic single-device use case.

Perhaps consider doing something like clBuildProgram, which accepts a device list to specify which devices the program should be built for, but the device list can be NULL, in which case the program is built for all devices? I don't know if you'd need a device list or if a single argument would be sufficient.

@pjaaskel (Contributor, Author)

> I think for alignment it would be much better if we could simply specify the alignment at buffer creation time. Not sure if anything is being worked on in this regard though.
>
> FYI, there is some work ongoing to specify an alignment at buffer creation time, see internal MR 198.

Good to know!

> clSetKernelArgBuffer could also get a device parameter added, and then the value is specific to the device. That might be feasible and usable? I think that also allows for better debuggability and error checking.
>
> Somehow I'm reluctant to add the explicit device argument to the argument setter function, as it's not needed in the basic single-device use case.
>
> Perhaps consider doing something like clBuildProgram, which accepts a device list to specify which devices the program should be built for, but the device list can be NULL, in which case the program is built for all devices? I don't know if you'd need a device list or if a single argument would be sufficient.

Hmm, if we change the clSetKernelArgDevicePointerEXT() pointer arg to a list of pointers, with one pointer per device in the context, it would match the buffer info query's return value, meaning we could pass its return value directly to this API. The annoying part here is the potential discrepancy with clBuildProgram's device list: the kernel might not have been built for all devices in the context, whereas the buffers are by default associated with all devices. If we make the list passed to the arg setter match the list of devices passed to clBuildProgram, the client has to filter the pointers from the context superset. Likely not a big deal; I suspect the device list in clBuildProgram is rarely a strict subset of the context's devices - a rarely if ever used feature.

In this sense, Karol's proposal of setting the pointer separately for each device might actually be easier from the client perspective. But then again I don't see the "thread safety" issue of the current proposal as a major one, as cl_kernels can be cloned etc. In the end I don't have strong preferences and would just go with the current simple option, but can also change it to whatever you prefer.

FYI, Michal implemented the current 0.9.1 version in PoCL.

@karolherbst (Contributor)

I forgot that clCloneKernel existed... yeah that's definitely the cheapest option to get around the thread-safety problem.
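
A hedged sketch of that workaround, assuming illustrative addr_for_this_device, queue_for_this_device and gws variables: each thread or per-device submission path sets arguments on its own clone, so no clSetKernelArg* calls race on a shared cl_kernel:

```c
/* Sketch only: clone the kernel so argument setting stays thread-local. */
cl_int err = CL_SUCCESS;
cl_kernel clone = clCloneKernel(kernel, &err);

clSetKernelArgDevicePointerEXT(clone, 0, addr_for_this_device);
clEnqueueNDRangeKernel(queue_for_this_device, clone, 1, NULL, &gws, NULL,
                       0, NULL, NULL);
clReleaseKernel(clone);  /* arguments are captured at enqueue time */
```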

@karolherbst (Contributor) commented Dec 21, 2024

Anyway, updated my implementation to 0.9.1 as well, though I still treat CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT as a flag, because that's also what the current version of the CTS test is doing. Well.. it should also work as a property, but that's untested.

@karolherbst (Contributor)

Another thing I was wondering about today. This extension interacts with SVM in a weird way. One can clSVMAlloc memory and then pass it via a host_ptr into clCreateBuffer or clCreateBufferWithProperties. If CL_MEM_USE_HOST_PTR is also specified as a flag when doing so, the cl_mem object basically wraps the SVM allocation (the CTS has a couple of tests for this).

Now with this extension one can do a CL_MEM_USE_HOST_PTR buffer allocation with an SVM pointer and CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT set to CL_TRUE. That should mean that the addresses returned by CL_MEM_DEVICE_ADDRESS_EXT also match the SVM pointer, no? I wonder if this should explicitly be stated or not. And if the CTS should test for this.
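
A hedged sketch of that scenario, assuming an existing context and size; whether the queried address must equal the SVM pointer is exactly the open question raised here:

```c
/* Sketch only: wrap a coarse-grained SVM allocation in a BDA buffer. */
void *svm_ptr = clSVMAlloc(context, CL_MEM_READ_WRITE, size, 0);

cl_mem_properties props[] = { CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT, CL_TRUE, 0 };
cl_int err = CL_SUCCESS;
cl_mem wrapped = clCreateBufferWithProperties(
    context, props, CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, size, svm_ptr, &err);

cl_mem_device_address_ext dev_addr = 0;
clGetMemObjectInfo(wrapped, CL_MEM_DEVICE_ADDRESS_EXT,
                   sizeof(dev_addr), &dev_addr, NULL);
/* Open question: must dev_addr equal (cl_mem_device_address_ext)svm_ptr? */
```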

This extension could also just disallow this behavior if it sounds like too much of an edge case that nobody is going to care about.

Just wanted to point out that this is a real possibility, which implementations supporting both might have to face (as mine does now).

@pjaaskel (Contributor, Author) commented Jan 2, 2025

> This extension could also just disallow this behavior if it sounds like too much of an edge case that nobody is going to care about.
>
> Just wanted to point out that this is a real possibility, which implementations supporting both might have to face (as mine does now).

This is a good point. I'd just disallow this, since if (CG) SVM is supported by the implementation, then this extension is rather pointless, as it's supposed to be a simplification of CG SVM.

@pjaaskel (Contributor, Author) commented Jan 2, 2025

...but having said that, if the implementation does support CG SVM, it might still support BDA for legacy/compatibility reasons and in that case the other behavior (the "device ptr" = SVM pointer) would make sense. Other opinions?

@karolherbst (Contributor)

> ...but having said that, if the implementation does support CG SVM, it might still support BDA for legacy/compatibility reasons and in that case the other behavior (the "device ptr" = SVM pointer) would make sense. Other opinions?

Yeah, I mean it shouldn't be hard for the implementation to simply return the SVM pointer for those BDA allocations, because the cl_mem object wrapping an SVM allocation is probably 100x the amount of work compared to handling this edge case.

The normal host_ptr path can have a different address on the GPU side (e.g. if the host memory couldn't be mapped into the GPU's VM), which I think this extension will also have to clarify, but this guarantee also doesn't exist in the core spec (unless it's an SVM allocation).

@pjaaskel (Contributor, Author) commented Jan 3, 2025

Right. I'll add the other option, returning the SVM pointer in this case, in the next specification revision.
