Skip to content

Implemented external memory sample#90

Merged
bashbaug merged 13 commits intoKhronosGroup:mainfrom
StreamHPC:external-mem-sample
Apr 2, 2025
Merged

Implemented external memory sample#90
bashbaug merged 13 commits intoKhronosGroup:mainfrom
StreamHPC:external-mem-sample

Conversation

@mfep
Copy link
Contributor

@mfep mfep commented Oct 27, 2023

NOTE: this PR includes all commits from branch StreamHPC:release-cd. This is required, because the implemented sample requires additional updates to the CI scripts to pull its new dependencies. After #88 is merged, this should be trivial to rebase/merge. Only the last commit is relevant for the sample, all other commits are from the release-cd branch.

  • Implemented sample showcasing the interop of OpenCL and Vulkan, using external memory,
  • for details, please visit the newly added README file.

Copy link

@nikhiljnv nikhiljnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large change to review meaningfully:)
Approving based on our successful testing on NVIDIA implementation.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I only reviewed the last commit.

At some point it would be good to update this sample so it worked with other types of external memory as well, e.g. dma_buf file descriptors.

Comment on lines 423 to 461
// We need to get the pointer to the vkGetMemoryFdKHR function because
// it's from extension VK_KHR_external_memory_fd.
PFN_vkGetMemoryFdKHR vkGetMemoryFdKHR =
(PFN_vkGetMemoryFdKHR)vkGetDeviceProcAddr(vk_device,
"vkGetMemoryFdKHR");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the function that is needed for Linux and file descriptors, but a different function is needed for Windows and NT handles. Has this sample been tested on Windows?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this sample has been tested on Windows, but we got

Test #31: externalmemory_Release ...........Exit code 0xc0000135
***Exception:   0.01 sec

Test #32: externalmemorycpp_Release ........Exit code 0xc0000135
***Exception:   0.01 sec

In the latest commit, I have added the logic to get the function pointer and call the correct function on Windows, but the runtime fail is still there: https://github.com/StreamHPC/OpenCL-SDK/actions/runs/7216235279/job/19662376992. This runtime error also appears on MacOS, although for now both tests are disabled on CI (macos' here and windows' here).

Given that it works fine in Linux, we assumed this failure could be due to no appropriate Vulkan device present on the failing test machines from CI. Perhaps you have more insights about this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is still failing because the handle type needs to be updated for Windows:

            (cl_mem_properties)CL_EXTERNAL_MEMORY_HANDLE_OPAQUE_FD_KHR,
#ifdef _WIN32
            (cl_mem_properties)handle_y,
#else
            (cl_mem_properties)fd_y,
#endif

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this still needs to be fixed. Have these samples been tested on Windows?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to answer this, sorry. The handle type is allegedly fixed, but the example still fails on Windows. Right now is disabled on CI, but I tested it on my personal fork https://github.com/Beanavil/OpenCL-SDK/actions/runs/9563524333/job/26362215011.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I've left comments on the C++ sample, but many of the issues occur in the C sample also. Thanks!

@Beanavil Beanavil force-pushed the external-mem-sample branch 7 times, most recently from bf01f67 to aae074f Compare December 15, 2023 09:35
@bashbaug
Copy link
Contributor

There is a merge conflict here that needs to be resolved, also.

@Beanavil Beanavil force-pushed the external-mem-sample branch 5 times, most recently from 3f900c6 to 5dbeb7b Compare May 29, 2024 09:36
@Beanavil
Copy link
Member

Addressed the comments and rebased onto StreamHPC/release-cd. There are some test failures from the other tests of the repo, but I think those are due to release-cd still being worked on to solve some new issues after rebasing. It's probably better to wait until #88 is ready to merge and rebase this one onto it.

Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C sample is fixed and doesn't crash for me anymore, thanks!

I think the Windows sample is still broken, though.

@Beanavil Beanavil force-pushed the external-mem-sample branch 2 times, most recently from 90df63b to f220052 Compare June 5, 2024 08:05
@Beanavil Beanavil force-pushed the external-mem-sample branch from f220052 to 7cff095 Compare June 18, 2024 10:11
@Beanavil Beanavil force-pushed the external-mem-sample branch from 7cff095 to 4bf7ad4 Compare July 16, 2024 16:30
@Beanavil Beanavil force-pushed the external-mem-sample branch 5 times, most recently from fb99610 to 7101a17 Compare October 2, 2024 09:25
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, here are a few more issues I found.

Note that the sample still does not execute properly on Windows.

@Beanavil Beanavil force-pushed the external-mem-sample branch from 7101a17 to 294d0bc Compare October 3, 2024 07:37
@Beanavil Beanavil force-pushed the external-mem-sample branch from 294d0bc to 1c8c8db Compare October 4, 2024 19:39
@Beanavil
Copy link
Member

Beanavil commented Oct 8, 2024

@bashbaug about the Windows crash: unfortunately, I'm only able to run this sample in Windows through the CI and debugging it is rather slow. I checked the error that the sample produced, and it was the 0xc0000135, which means there was a missing DLL (the vulkan-1.dll one it seems). As expected, installing the Vulkan loader fixed that error, but then the example still fails with a 0xc0000139. I think this error is an "Entry point not found" (some DLL is being loaded but that cannot find the other DLLs it relies on), so it may be that other Vulkan-related dependencies are missing. I'll look further into this sometime this week if that's okay.

@Beanavil Beanavil force-pushed the external-mem-sample branch from 1c8c8db to 780296c Compare November 12, 2024 14:44
@Beanavil
Copy link
Member

@Beanavil Beanavil force-pushed the external-mem-sample branch from c91ebd6 to 5ca6f4c Compare November 12, 2024 16:29
Copy link
Contributor

@bashbaug bashbaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to fix the first issue below if possible, but otherwise I think this is ready to go. If we find any additional issues we can fix them after merging.

cl_device, CL_DEVICE_EXTERNAL_MEMORY_IMPORT_HANDLE_TYPES_KHR,
supported_handle_types_count, supported_handle_types, NULL),
error, err);
for (size_t i = 0; i < supported_handle_types_count; ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supported_handle_types_count is actually the size in bytes rather than a count, so this loop will read off the end of the allocation. There needs to be a divide by cl_external_memory_handle_type_khr somewhere to compute the count.

Comment on lines 369 to 372
if (opencl_version_is_major(&dev_versions[i], 3))
{
strcat(compiler_options, "-cl-std=CL3.0 ");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything in the external_saxpy.cl kernel that requires OpenCL C 3.0. Am I missing something? If not, we could remove this code block to shorten the sample, and it could run on more devices also (in theory, at least).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, if we remove this code block from the C sample, we should remove it from the C++ sample also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed! OpenCL 3.0 is certainly only necessary for the external memory extension.

@Beanavil Beanavil force-pushed the external-mem-sample branch from 5ca6f4c to ad13557 Compare March 17, 2025 20:18
@CLAassistant
Copy link

CLAassistant commented Mar 17, 2025

CLA assistant check
All committers have signed the CLA.

@Beanavil Beanavil force-pushed the external-mem-sample branch 2 times, most recently from e724350 to ad13557 Compare March 30, 2025 18:49
@Beanavil Beanavil force-pushed the external-mem-sample branch from ad13557 to 5f52d0d Compare April 2, 2025 12:57
@bashbaug bashbaug merged commit 89e0e06 into KhronosGroup:main Apr 2, 2025
82 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants