Fix non-release kernel builds via CudaBuilder #322
First, we weren't handling all the types. Fixing that exposed a `libnvvm` crash. I also saw a type issue in one of the warp APIs used by vecadd, so I fixed that. Fixes Rust-GPU#320
nnethercote left a comment
One problem, one question. It would be nice to have some kind of test added to avoid regressing, too, though I'm not sure what that would look like.
  extern "C" {
      #[link_name = "llvm.nvvm.match.any.sync.i64"]
-     fn __nvvm_warp_match_any_64(mask: u32, value: u64) -> u32;
+     fn __nvvm_warp_match_any_64(mask: u32, value: u64) -> u64;
This looks wrong. https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html has this:
declare i32 @llvm.nvvm.match.any.sync.i32(i32 %membermask, i32 %value)
declare i32 @llvm.nvvm.match.any.sync.i64(i32 %membermask, i64 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i32(i32 %membermask, i32 %value)
declare {i32, i1} @llvm.nvvm.match.all.sync.i64(i32 %membermask, i64 %value)
Not sure about the signed/unsigned mismatches, but the return value is definitely 32 bits.
Aside: The match_all_{32,64} functions below don't have link_name attributes the way the match_any_{32,64} functions do. Not sure if this is valid. I suspect these functions aren't tested at all!
Anyway, I think this change should be reverted.
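To illustrate why a 32-bit return value is the natural fit, here is a minimal host-side sketch of what a "match any" operation computes for one lane, assuming the usual semantics (the result is a mask of the participating lanes that hold the same value) and a 32-lane warp. The function `emulate_match_any` and its parameters are hypothetical names for illustration only; this is not the intrinsic and not code from this PR.

```rust
/// Host-side emulation of a warp-wide "match any" for a single lane.
/// ASSUMPTION: 32 lanes per warp; `values[i]` is the 64-bit value held by lane `i`.
/// `membermask` selects which lanes participate (bit i set means lane i takes part).
fn emulate_match_any(membermask: u32, values: &[u64; 32], lane: usize) -> u32 {
    let mut result = 0u32;
    for other in 0..32 {
        let participates = (membermask >> other) & 1 == 1;
        // A lane is reported when it participates and holds the same value.
        if participates && values[other] == values[lane] {
            result |= 1u32 << other;
        }
    }
    // One bit per lane: the result is 32 bits wide no matter how wide `value` is.
    result
}
```

The width of the compared value (32 or 64 bits) only affects the input; the output has one bit per lane, which matches the `i32` return type in the declarations quoted above.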
Oh, hm. Ok, will revert.
  TypeKind::Vector | TypeKind::ScalableVector => {
      // Recurse on element type for vector floats
      self.float_width(self.element_type(ty))
  }
Are all of Half/BFloat/Vector/ScalableVector needed to fix the issue? I see that rustc_codegen_llvm only has Half. Seems wise to only add code that's necessary for the fix (and thus has some level of testing).
I think we need at least BFloat as well but I'll double check.
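For readers following the discussion, the shape of the lookup being debated can be sketched roughly as below. This is a standalone toy, not the backend's code: `ToyType` is a hypothetical stand-in for the LLVM type kinds, and returning `Option` for non-float types is an assumption made so the sketch stays self-contained.

```rust
/// Hypothetical stand-in for the LLVM type kinds mentioned in the review.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Half,
    BFloat,
    Float,
    Double,
    Int,
    /// A vector of some element type (fixed or scalable, not distinguished here).
    Vector(Box<ToyType>),
}

/// Bit width of a float type, or of a vector's float element type.
/// ASSUMPTION: non-float types yield `None` in this sketch.
fn float_width(ty: &ToyType) -> Option<u32> {
    match ty {
        ToyType::Half | ToyType::BFloat => Some(16),
        ToyType::Float => Some(32),
        ToyType::Double => Some(64),
        // Recurse on the element type, as in the snippet under review.
        ToyType::Vector(element) => float_width(element),
        ToyType::Int => None,
    }
}
```

Each arm is a separate case that needs its own coverage, which is the concern behind adding only the arms the fix actually requires.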
What do you think about changing the default here to be keyed off of
sounds ok