Conversation
murmur.go
Outdated
| // us take over the default behavior of the compiler to have the byte slice | ||
| // share the underlying memory buffer of the string and avoid the extra heap | ||
| // allocation. | ||
| return d.Write(*(*[]byte)(unsafe.Pointer(&reflect.SliceHeader{ |
There was a problem hiding this comment.
This conversion is unsafe (see golang/go#40701)
There was a problem hiding this comment.
I believe this case is correct because it is covered by the rules 1, 3, and 6 of unsafe:
(1) Conversion of a *T1 to Pointer to *T2.
(3) Conversion of a Pointer to a uintptr and back, with arithmetic.
(6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data field to or from Pointer.
https://golang.org/pkg/unsafe/#Pointer
Let me know if I'm missing something.
There was a problem hiding this comment.
It's hard for me to actually tell and I wish this type of conversion were documented as explicitly safe or unsafe. I took this as violating rule 6 because the slice doesn't refer to an actual underlying slice, the slice is being made up out of thin air from a reflect.SliceHeader. What's more obviously safe to me is when the slice header is referring to an underlying slice beforehand.
It is entirely possible that this is safe, I just haven't trusted it since the six unsafe rules came out (this was my original go to for how to convert strings to slices).
As an aside, I think the 40701 I mentioned in the first ticket refers to other behavior that this is definitely not doing: taking in a byte pointer and making a slice out of that.
There was a problem hiding this comment.
https://github.com/jlauinger/go-safer
I believe this is effectively patterns 1 and 2.
There was a problem hiding this comment.
I don't think it is pattern 1 and 2, because the conversion to and from unsafe is done in a single expression.
There was a problem hiding this comment.
Actually, looks like the Go doc may have been updated to explicitly indicate that this conversion may not be safe:
In general, reflect.SliceHeader and reflect.StringHeader should be used only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual slices or strings, never as plain structs. A program should not declare or allocate variables of these struct types.
Here's an alternative approach we could try:
https://github.com/cespare/xxhash/blob/master/xxhash_unsafe.go#L52-L57
// sliceHeader is similar to reflect.SliceHeader, but it assumes that the layout
// of the first two words is the same as the layout of a string.
type sliceHeader struct {
s string
cap int
}What do you think?
There was a problem hiding this comment.
I think something like twmb/murmur3@610077f is safer, and afaict it's just as fast.
|
Hello @twmb, would you like to get this PR merged or would you prefer not to bring it in? |
|
I'm not a gate on this repo, I just saw the PR and decided to look since I fixed roughly the same thing in my own fork from way back. |
|
We're maintaining a fork as well which includes this optimization https://github.com/segmentio/murmur3 (in case it is useful to someone). |
Convert to go module
Return opaque types
Hello!
This PR adds a
WriteStringmethod to the internaldigesttype to implement theio.StringWriterinterface and avoid memory copies when hashing strings.I added a benchmark and ran it before and after the change to demonstrate that a heap allocation is removed by providing this method:
The use of the
unsafepackage is a bit unfortunate, but it seemed like the package already depended on it, so this change doesn't introduce a new dependency onunsafe.Let me know if you'd like to see anything changed!