Safe transmute_copy()#232
Conversation
…e generator instead of emitted unsafe { transmute_copy() }
… configurations is clearly overly ambitious.
|
Thanks, not expecting you to rebase, merge commit is fine! |
LaurenzV
left a comment
There was a problem hiding this comment.
LGTM, also nice so see the unsafe being more reduced to a single module.
We actually did use bytemuck initially, but got rid of it precisely to avoid pulling in bytemuck, with the argument that fearless_simd is low-level enough that we should just do our own thing here. It's a trade-off, but I think it's good to avoid extra dependencies here.
| #![cfg_attr(not(test), warn(unused_crate_dependencies))] | ||
| // These lints shouldn't apply to examples. | ||
| #![warn(clippy::print_stdout, clippy::print_stderr)] | ||
| #![cfg_attr(not(test), deny(clippy::disallowed_methods))] |
There was a problem hiding this comment.
Let's move this below // END LINEBENDER LINT SET since it's not part of the set to my understanding?
| Aligned512<[u32; 16]>, | ||
| ); | ||
|
|
||
| #[cfg(any(target_arch = "x86", target_arch = "x86_64"))] |
There was a problem hiding this comment.
Could you add a comment that the const is just to avoid repeating the cfg, that confused me a bit.
| // - A `#[repr(simd)]` type, which has the same layout as an array of scalars | ||
| // - An array of `#[repr(simd)]` types | ||
| // - For AArch64 specifically, a `#[repr(C)]` tuple of `#[repr(simd)]` types | ||
| // |
There was a problem hiding this comment.
I think it would be valuable to retain those comments for the SimdPod implementation of the types.
|
transient DNS resolution error, retrying |
Mirrors linebender#232 but for stores Performance is unchanged or potentially improved: the old codegen emitted `ptr::copy_nonoverlapping` which lowered into a `memcpy` in the LLVM IR, while this approach lowers into an unaligned store directly, which is the same thing as platform-specific SIMD instructions lower to. (Yes, platform-specific SIMD intrinsics lower into a generic op!)
@LaurenzV can you link to where that decision was made? I can only find #120, which removed the use of bytemuck only because we had the unsafe code for these cases already; in that thread, Valerie also would have been happy implementing this in terms of bytemuck. Alternatively, the mood in #simd > ✔ Using `bytemuck` also doesn't match your analysis. Fwiw, I'd be loosely in favour of moving to bytemuck completely, but equally I don't think it's urgent.
I don't see how that follows, fwiw. We would only depend on bytemuck without using the |
I'm not sure if it's written down anywhere, maybe I remember wrong. But I feel pretty confident that at some point I heard something along the lines of "fearless_simd is going to be a crate deep in the depenency tree, so we should avoid additional dependencies if possible" being said somewhere, but I can't back it up right now. I guess it wouldn't be a big deal, but I think for fearless_simd specifically it makes sense to just use our own solution here (unlike color or peniko), but if other's prefer bytemuck I'm not gonna block it! |
"we have
bytemuckat home"These are the
transmute_copy()safety improvements from #231 taken one step further: on top of checking the sizes match at compile time and that both types areCopy, we also add a marker trait to types that are safe to transmute and require it inchecked_transmute_copy(), making that function fully safe to call.The design follows the
bytemuckcrate closely. We could just usebytemuck, but it would pull insynand other proc macro machinery, and we would still have to write theimpl_aligned_simd_pod!macro because bytemuck rightfully refuses to derivePodon generic wrappers likeAligned128<T>because they may introduce padding depending on theT. So the amount of code it saves us is minimal.Rebasing #231 on top of this would be hellish, but a merge commit probably wouldn't be too bad.