Migrate CUDA Driver Backend from Derelict to BindBC #96
Open
badnikhil wants to merge 6 commits into
{
    // Stub — intentionally left empty.
    // See module documentation for rationale.
    status = cast(Status)cuMemPrefetchAsync(cast(CUdeviceptr)raw, _length * T.sizeof, dev.raw, q.raw);
Collaborator
In what version of the driver/runtime is this defined? Or, more accurately, not defined? Does it make sense to rewrite this to check if (cuMemPrefetchAsync) first?
Contributor
Author
Yes. Unified Memory came out in CUDA 6, and the prefetch APIs were introduced in CUDA 8.0. I'll update that.
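The guard suggested in this thread could look roughly like the sketch below. It is only an illustration: the type aliases, the error constants, and the module-level `cuMemPrefetchAsync` pointer are local stand-ins so the example compiles without bindbc-cuda, and `prefetchIfAvailable` and `fakePrefetch` are hypothetical helpers. It also assumes the dynamic loader leaves the function pointer null when the installed driver (older than CUDA 8.0) does not export the symbol.

```d
import std.stdio : writeln;

// Hypothetical stand-ins so the sketch compiles without bindbc-cuda.
alias CUdeviceptr = size_t;
alias CUresult = int;
enum CUresult CUDA_SUCCESS = 0;
enum CUresult CUDA_ERROR_NOT_SUPPORTED = 801;

alias PrefetchFn = CUresult function(CUdeviceptr, size_t, int, void*);

// Assumption: a dynamic loader leaves this null when the driver lacks the symbol.
__gshared PrefetchFn cuMemPrefetchAsync = null;

// Prefetch only when the symbol was resolved; otherwise report "not supported".
CUresult prefetchIfAvailable(CUdeviceptr ptr, size_t bytes, int device, void* stream)
{
    if (cuMemPrefetchAsync is null)
        return CUDA_ERROR_NOT_SUPPORTED;
    return cuMemPrefetchAsync(ptr, bytes, device, stream);
}

// Fake driver entry point, used only to exercise the "symbol present" path.
CUresult fakePrefetch(CUdeviceptr, size_t, int, void*)
{
    return CUDA_SUCCESS;
}

void main()
{
    writeln(prefetchIfAvailable(0, 1024, 0, null)); // symbol missing
    cuMemPrefetchAsync = &fakePrefetch;
    writeln(prefetchIfAvailable(0, 1024, 0, null)); // symbol present
}
```

Returning an error code instead of silently doing nothing keeps the caller informed. Whether prefetch() should treat a missing symbol as an error or as a harmless no-op is a design decision for the PR.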
This PR replaces the CUDA driver backend derelict-cuda with the type-safe, dynamically loaded bindbc-cuda binding.

Major Changes
Replaced derelict-cuda with bindbc-cuda in dub.json and implemented library loading using loadCUDA() in Platform.initialise().
Replaced internal generic void* handle fields in driver wrapper structs (Context, Event, Kernel, Program, Queue) with their corresponding strongly-typed BindBC types (e.g. CUcontext, CUstream).
Fully implemented the previously stubbed UnifiedBuffer!T.prefetch() method using cuMemPrefetchAsync, which the upgraded binding now exposes.
Fixed a bug in Context where cuCtxSetLimit was erroneously used instead of cuCtxGetLimit during property querying.
Added explicit casts to CUdevice_attribute inside Device attribute queries to satisfy strict type checks.
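To illustrate why swapping void* fields for typed handles matters, here is a minimal sketch. The opaque structs and aliases are local stand-ins modelled on how the CUDA headers declare handles, not the bindbc-cuda definitions, and `Context`, `Queue`, `UntypedContext`, and the two helper functions are simplified hypothetical versions of the wrapper code.

```d
// Hypothetical opaque tags, modelled on the CUDA header declarations.
struct CUctx_st;
struct CUstream_st;
alias CUcontext = CUctx_st*;
alias CUstream = CUstream_st*;

// Simplified wrappers with strongly typed handle fields.
struct Context { CUcontext raw; }
struct Queue { CUstream raw; }

// The old style: every handle is just a void*.
struct UntypedContext { void* raw; }

// Accepts only a stream handle.
bool isDefaultStream(CUstream s) { return s is null; }

// Accepts any pointer, so handle mix-ups go unnoticed.
bool isDefaultStreamUntyped(void* s) { return s is null; }

void main()
{
    Queue q;
    Context c;
    UntypedContext uc;

    // A stream handle is accepted as expected.
    assert(isDefaultStream(q.raw));

    // Passing a context handle where a stream is expected is rejected at compile time.
    static assert(!__traits(compiles, isDefaultStream(c.raw)));

    // With void* fields the same mix-up compiles without complaint.
    static assert(__traits(compiles, isDefaultStreamUntyped(uc.raw)));
}
```

The same reasoning applies to the other wrapper structs listed above: once each field carries its own handle type, passing the wrong kind of handle becomes a compile error instead of a runtime failure.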