Releases · linebender/fearless_simd

18 Jun 19:29

LaurenzV

v0.5.0

d241166


Crates.io | Docs

This release has an MSRV of 1.88.

Added

The kernel! macro, which creates safe wrappers around SIMD-level-specific kernels so platform intrinsics from core::arch or std::arch can be used safely when a token proves the required target features. (#214 by @Shnatsel)
The approximate_recip method on floating-point SIMD vector types. It uses fast hardware reciprocal estimates where available and exact division otherwise. (#204 by @tomcur)
SimdMask::from_bitmask, SimdMask::to_bitmask, SimdMask::test, and SimdMask::set, mirroring the std::simd mask API. (#226 by @Shnatsel)
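The bitmask conversions listed above can be pictured with a scalar model. The sketch below is not the fearless_simd API: it uses a plain `[bool; 4]` in place of a mask type, the free functions `to_bitmask` and `from_bitmask` are hypothetical stand-ins, and it assumes lane `i` maps to bit `i`, which is the convention std::simd uses.

```rust
/// Scalar model of a 4-lane mask. Hypothetical helper, not the crate's API.
/// Assumption: lane `i` corresponds to bit `i` of the bitmask.
fn to_bitmask(lanes: [bool; 4]) -> u64 {
    let mut bits = 0u64;
    for (i, &lane) in lanes.iter().enumerate() {
        if lane {
            bits |= 1u64 << i;
        }
    }
    bits
}

/// Inverse of `to_bitmask`. Bits above the lane count are ignored in this model.
fn from_bitmask(bits: u64) -> [bool; 4] {
    std::array::from_fn(|i| (bits >> i) & 1 == 1)
}

fn main() {
    let mask = [true, false, true, true];
    let bits = to_bitmask(mask);
    // Lanes 0, 2, and 3 are set, so bits 0, 2, and 3 are set.
    assert_eq!(bits, 0b1101);
    // Converting back recovers the original lanes.
    assert_eq!(from_bitmask(bits), mask);
}
```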

Changed

Breaking change: the crate's SIMD extension traits are now sealed, so external crates can no longer implement them for their own types. (#211 by @LaurenzV)
Breaking change: mask types now have opaque storage and use the new SimdMask trait instead of SimdBase. Masks no longer expose integer-vector APIs such as Deref, indexing, Bytes, public SimdSplit/SimdCombine, slide, slide_within_blocks, byte conversions, or scalar bit-operator overloads. (#218 by @Shnatsel)
Generated SIMD loads, stores, reference casts, transmute-like conversions, helpers, const-generic functions, and intrinsic calls now use checked wrappers or kernel!, removing most unsafe from generated code. (#232, #233, #234, #235, #236, #237, #238, #239, #244, #245 by @Shnatsel)
Documentation and examples have been expanded and cleaned up for SIMD level tokens, mask types, platform-specific intrinsics, custom transmute wrappers, README consistency, and docs.rs visibility for NEON and WebAssembly APIs. (#213, #221, #222, #230, #240, #243 by @Shnatsel, #224, #225 by @DJMcNab)

Removed

Breaking change: the core_arch wrapper module and the safe_wrappers feature have been removed. Use kernel! with core::arch or std::arch intrinsics instead. (#216 by @Shnatsel)

Full Changelog: v0.4.1...v0.5.0


16 May 09:52

LaurenzV

v0.4.1

0e8e242


Crates.io | Docs

This release has an MSRV of 1.88.

Added

The interleave and deinterleave methods on integer and floating-point SIMD vector types. (#206 by @Shnatsel)
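As a rough illustration of what interleaving means, here is a scalar model on 4-element arrays. It follows the lane order of the std::simd methods of the same name, which is an assumption: the release note does not spell out the crate's exact signatures or lane order, and the free functions below are hypothetical.

```rust
/// Scalar model: zip the lanes of `a` and `b`, returning the result as two arrays.
/// Assumed lane order, modeled on std::simd; not the crate's API.
fn interleave(a: [u32; 4], b: [u32; 4]) -> ([u32; 4], [u32; 4]) {
    let lo = [a[0], b[0], a[1], b[1]];
    let hi = [a[2], b[2], a[3], b[3]];
    (lo, hi)
}

/// Inverse of `interleave`: even positions go to the first array, odd to the second.
fn deinterleave(lo: [u32; 4], hi: [u32; 4]) -> ([u32; 4], [u32; 4]) {
    let evens = [lo[0], lo[2], hi[0], hi[2]];
    let odds = [lo[1], lo[3], hi[1], hi[3]];
    (evens, odds)
}

fn main() {
    let a = [0, 1, 2, 3];
    let b = [4, 5, 6, 7];
    let (lo, hi) = interleave(a, b);
    assert_eq!(lo, [0, 4, 1, 5]);
    assert_eq!(hi, [2, 6, 3, 7]);
    // Deinterleaving undoes the interleave.
    assert_eq!(deinterleave(lo, hi), (a, b));
}
```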

Fixed

Sse4_2 and Avx2 now consistently use the x86-64-v2 and x86-64-v3 feature sets for detection, dispatch, and generated target_feature attributes. (#208 by @Shnatsel)
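The idea of detecting a whole feature set rather than a single flag can be sketched with std's runtime detection macro. This is not the crate's detection code: `has_v3_subset` is a hypothetical helper, and it checks only a representative subset of the x86-64-v3 features.

```rust
/// Returns true when the CPU reports a representative subset of the
/// x86-64-v3 features. Hypothetical helper; the full level includes
/// further features that are not checked here.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_v3_subset() -> bool {
    std::arch::is_x86_feature_detected!("avx2")
        && std::arch::is_x86_feature_detected!("fma")
        && std::arch::is_x86_feature_detected!("bmi1")
        && std::arch::is_x86_feature_detected!("bmi2")
        && std::arch::is_x86_feature_detected!("lzcnt")
}

/// On other architectures there is nothing to detect.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn has_v3_subset() -> bool {
    false
}

fn main() {
    if has_v3_subset() {
        println!("AVX2-class code paths can be selected");
    } else {
        println!("falling back to a lower level");
    }
}
```

Checking the set as a unit means code compiled for that level can rely on every feature in it, not only on AVX2 itself.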

Full Changelog: v0.4.0...v0.4.1


13 Feb 19:31

LaurenzV

v0.4.0

c3632ab


Crates.io | Docs

This release has an MSRV of 1.88.

Added

All vector types now implement Index and IndexMut. (#112 by @Ralith)
256-bit vector types now use native AVX2 intrinsics on supported platforms. (#115 by @valadaptive)
8-bit integer multiplication is now implemented on x86. (#115 by @valadaptive)
New native-width associated types: f64s and mask64s. (#125 by @valadaptive)
The bitwise "not" operation on integer vector types. (#130 by @valadaptive)
The from_fn method on vector types. (#137 by @valadaptive)
The load_interleaved and store_interleaved operations now use native intrinsics on x86, instead of using the fallback implementations. (#140 by @valadaptive)
Support for relaxed_simd operations in WebAssembly. (#143 by @valadaptive)
The ceil and round_ties_even operations on floating-point vector types. (Rust's round operation rounds away from zero in the case of ties. Many architectures do not natively implement that behavior, so it's omitted.) (#145 by @valadaptive)
A prelude module, which exports all the traits in the library but not the types. (#149 by @valadaptive)
The any_true, all_true, any_false, and all_false methods on mask types. (#141 by @valadaptive)
Documentation for most traits, vector types, and operations. (#154 by @valadaptive)
A "shift left by vector" operation, to go with the existing "shift right by vector". (#155 by @valadaptive)
"Precise" float-to-integer conversions, which saturate out-of-bounds results and convert NaN to 0 across all platforms. (#167 by @valadaptive)
The slide and slide_within_blocks methods for shifting elements within a vector. (#164 by @valadaptive)
The Level::is_fallback method, which lets you check if the current SIMD level is the scalar fallback. This works even if Level::Fallback is not compiled in, always returning false in that case. (#168 by @valadaptive)
The store_array methods, which store SIMD vectors back to memory explicitly using intrinsics. (#181 by @LaurenzV)
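Two of the additions above have direct scalar counterparts in Rust's standard library, which makes their semantics easy to check. The sketch below uses only std: `f32::round_ties_even` for the tie-breaking rule, and the `as` cast, which saturates and maps NaN to 0, as a scalar model of the "precise" conversions. The helper `precise_to_i32` is a hypothetical name for that model, not a crate function.

```rust
/// Scalar model of a "precise" float-to-int conversion (hypothetical helper).
/// Rust's `as` cast saturates out-of-range values and converts NaN to 0.
fn precise_to_i32(x: f32) -> i32 {
    x as i32
}

fn main() {
    // Ties: `round` goes away from zero, `round_ties_even` picks the even neighbor.
    assert_eq!(2.5f32.round(), 3.0);
    assert_eq!(2.5f32.round_ties_even(), 2.0);
    assert_eq!(3.5f32.round_ties_even(), 4.0);

    // Out-of-range inputs saturate; NaN becomes 0.
    assert_eq!(precise_to_i32(1e10), i32::MAX);
    assert_eq!(precise_to_i32(-1e10), i32::MIN);
    assert_eq!(precise_to_i32(f32::NAN), 0);
    // In-range inputs truncate toward zero.
    assert_eq!(precise_to_i32(-7.9), -7);
}
```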

Fixed

Improved the performance of vector load/store operations. (#185 by @valadaptive)
Integer equality comparisons now function properly on x86. Previously, they performed "greater than" comparisons. (#115 by @valadaptive)
All float-to-integer and integer-to-float conversions are implemented properly on x86, including the precise versions. (#134 by @valadaptive)
The floating-point min_precise and max_precise operations now behave the same way on x86 and WebAssembly as they do on AArch64, returning the non-NaN operand if one operand is NaN and the other is not. Previously, they returned the second operand if either was NaN. (#136 by @valadaptive)
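The NaN behavior described for min_precise and max_precise matches what Rust's scalar `f32::min` and `f32::max` do, so the fix can be illustrated without the crate. In the sketch below, `old_min_model` is a hypothetical scalar model of the previous x86 and WebAssembly behavior as the note describes it (second operand returned if either is NaN); it is not code from the library.

```rust
/// Hypothetical model of the old behavior: if either operand is NaN,
/// the second operand is returned.
fn old_min_model(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() { b } else { a.min(b) }
}

/// Scalar model of the fixed behavior: `f32::min` returns the non-NaN
/// operand when exactly one operand is NaN.
fn min_precise_model(a: f32, b: f32) -> f32 {
    a.min(b)
}

fn main() {
    let nan = f32::NAN;

    // Old model: the result depends on operand order.
    assert_eq!(old_min_model(nan, 1.0), 1.0);
    assert!(old_min_model(1.0, nan).is_nan());

    // Fixed model: the non-NaN operand wins in either position.
    assert_eq!(min_precise_model(nan, 1.0), 1.0);
    assert_eq!(min_precise_model(1.0, nan), 1.0);
    // With two NaN operands the result is still NaN.
    assert!(min_precise_model(nan, nan).is_nan());
}
```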

Changed

Breaking change: The AVX2 level now requires all features from the x86-64-v3 baseline. (#188 by @Shnatsel)
Breaking change: Level::fallback has been removed, replaced with Level::baseline. (#105 by @DJMcNab) This corresponds with a change to avoid compiling in support for the fallback level on compilation targets which don't require it; this is most impactful for binary size on WASM, Apple Silicon Macs, or Android. A consequence of this is that the available variants on Level are now dependent on the target features you are compiling with. The fallback level can be restored with the force_support_fallback cargo feature. We don't expect this to be necessary outside of tests.
Code generation for select and unzip operations on x86 has been improved. (#115 by @valadaptive)
Breaking change: The native-width associated types (f32s, u8s, etc.) for the Avx2 struct have been widened from 128-bit types (like f32x4) to 256-bit types (like f32x8). (#123 by @valadaptive)
Breaking change: All the vector types' inherent methods have been removed. Any remaining functionality has been moved to trait methods. (#149 by @valadaptive)
Some functionality is exposed under different names:
- Instead of the reinterpret methods, use the bitcast method on the Bytes trait (e.g. foo.reinterpret_i32() -> foo.bitcast::<i32x4<_>>()).
- Instead of the cvt methods, use the to_int or to_float convenience methods on the SimdFloat and SimdInt traits (e.g. foo.cvt_u32() -> foo.to_int::<u32x4<_>>()).
Some functionality (such as split or combine) is exposed under new traits. You may use the new prelude module to conveniently import all of the traits.
Breaking change: The madd and msub methods have been renamed to mul_add and mul_sub, matching Rust's naming conventions. (#158 by @Shnatsel)
Breaking change: the val field on SIMD vector types is now private, and vector types are no longer represented as arrays internally. To access a vector type's elements, you can use the Into or Deref traits to obtain an array, or the as_slice/as_mut_slice methods to obtain a slice. (#159 by @valadaptive)
Breaking change: the Element type on the SimdBase trait is now an associated type instead of a type parameter. This should make it more pleasant to write code that's generic over different vector types. (#170 by @valadaptive)
The WasmSimd128 token type now wraps the new crate::core_arch::wasm32::WasmSimd128 type. This doesn't expose any new functionality as WASM SIMD128 can only be enabled statically, but matches all the other backend tokens. (#176 by @valadaptive)
Breaking change: the SimdFrom::simd_from method now takes the SIMD token as the first argument instead of the second. This matches the argument order of the from_slice, splat, and from_fn methods on SimdBase. (#180 by @valadaptive)

Removed

Breaking change: The (deprecated) simd_dispatch! macro. (#105 by @DJMcNab)

Full Changelog: v0.3.0...v0.4.0


14 Oct 15:10

raphlinus

v0.3.0

0a3ac74


Crates.io | Docs

This release has an MSRV of 1.86.

Added

SimdBase::witness to fetch the Simd implementation associated with a generic vector. (#76 by @Ralith)
Select is now available on native-width masks. (#77, #83 by @Ralith)
Simd::shrv_* performs a right shift with the shift amount specified per-lane. (#79 by @Ralith)
The >> operator is implemented for SIMD vectors. (#79 by @Ralith)
Assignment operator implementations. (#80 by @Ralith)
SimdFrom splatting is available on native-width vectors. (#84 by @Ralith)
Left shift by u32. (#86 by @Ralith)
Unary negation of signed integers. (#91 by @Ralith)
A simpler dispatch macro to replace simd_dispatch. (#96, #99 by @Ralith, @DJMcNab)

Fixed

Simd now requires consistent mask types for native-width vectors. (#75 by @Ralith)
Simd now requires consistent Bytes types for native-width vectors, enabling Bytes::bitcast in generic code. (#81 by @Ralith)
Scalar fallback now uses wrapping integer addition. (#85 by @Ralith)
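The wrapping-addition fix can be pictured with a lane-by-lane scalar loop. This is a sketch using std's `wrapping_add`, not the crate's fallback code, and `add_lanes` is a hypothetical helper. With a plain `+`, the same inputs would panic on overflow in debug builds, whereas hardware SIMD integer addition wraps.

```rust
/// Hypothetical scalar fallback for a 4-lane u8 addition.
/// `wrapping_add` wraps around on overflow, matching SIMD integer adds.
fn add_lanes(a: [u8; 4], b: [u8; 4]) -> [u8; 4] {
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}

fn main() {
    let sum = add_lanes([250, 1, 255, 0], [10, 2, 1, 0]);
    // 250 + 10 wraps to 4, and 255 + 1 wraps to 0.
    assert_eq!(sum, [4, 3, 0, 0]);
}
```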

Changed

Breaking: a.madd(b, c) and a.msub(b, c) now correspond to a * b + c and a * b - c, for consistency with mul_add in std. (#88 by @Ralith) Previously, madd was a + b * c, and msub was a - b * c. Therefore, if you previously had a.madd(b, c), that's now written as b.madd(c, a), and if you had a.msub(b, c), that's now written as b.madd(-c, a).
Constructors for static SIMD levels are now const. (#93 by @Ralith)
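The madd/msub migration rule in this release can be checked with plain scalar arithmetic. The four functions below are hypothetical scalar stand-ins for the old and new method semantics as the note states them; they are not the crate's vector methods.

```rust
// Old semantics, as described in the note.
fn old_madd(a: f32, b: f32, c: f32) -> f32 { a + b * c }
fn old_msub(a: f32, b: f32, c: f32) -> f32 { a - b * c }

// New semantics, consistent with `mul_add` in std.
fn new_madd(a: f32, b: f32, c: f32) -> f32 { a * b + c }
fn new_msub(a: f32, b: f32, c: f32) -> f32 { a * b - c }

fn main() {
    let (a, b, c) = (2.0f32, 3.0, 4.0);

    // Old a.madd(b, c) becomes b.madd(c, a).
    assert_eq!(old_madd(a, b, c), new_madd(b, c, a));
    // Old a.msub(b, c) becomes b.madd(-c, a).
    assert_eq!(old_msub(a, b, c), new_madd(b, -c, a));
    // The new msub computes a * b - c.
    assert_eq!(new_msub(a, b, c), 2.0);
}
```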