Releases: linebender/fearless_simd

v0.5.0

18 Jun 19:29
d241166

Crates.io | Docs

This release has an MSRV of 1.88.

Added

  • The kernel! macro, which creates safe wrappers around SIMD-level-specific kernels so platform intrinsics from core::arch or std::arch can be used safely when a token proves the required target features. (#214 by @Shnatsel)
  • The approximate_recip method on floating-point SIMD vector types. It uses fast hardware reciprocal estimates where available and exact division otherwise. (#204 by @tomcur)
  • SimdMask::from_bitmask, SimdMask::to_bitmask, SimdMask::test, and SimdMask::set, mirroring the std::simd mask API. (#226 by @Shnatsel)
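
As a rough scalar model of the bitmask round trip, the sketch below assumes that lane i maps to bit i, least significant bit first, as in std::simd. The function names and the fixed four-lane width are illustrative only; they are not the crate's API.

```rust
/// Pack four boolean lanes into a bitmask, with lane 0 in the least
/// significant bit. (Assumed bit order; hypothetical helper, not part of
/// fearless_simd.)
fn to_bitmask(lanes: [bool; 4]) -> u64 {
    lanes
        .iter()
        .enumerate()
        .fold(0u64, |bits, (i, &lane)| bits | ((lane as u64) << i))
}

/// Unpack the low four bits of a bitmask into boolean lanes.
fn from_bitmask(bits: u64) -> [bool; 4] {
    std::array::from_fn(|i| (bits >> i) & 1 == 1)
}

fn main() {
    let lanes = [true, false, true, true];
    let bits = to_bitmask(lanes);
    println!("bitmask = {bits:#b}");
    // Converting back recovers the original lanes.
    assert_eq!(from_bitmask(bits), lanes);
}
```

A plain integer is convenient for counting or scanning set lanes, which is the usual reason to convert a mask to a bitmask.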

Changed

  • Breaking change: the crate's SIMD extension traits are now sealed, so external crates can no longer implement them for their own types. (#211 by @LaurenzV)
  • Breaking change: mask types now have opaque storage and use the new SimdMask trait instead of SimdBase. Masks no longer expose integer-vector APIs such as Deref, indexing, Bytes, public SimdSplit/SimdCombine, slide, slide_within_blocks, byte conversions, or scalar bit-operator overloads. (#218 by @Shnatsel)
  • Generated SIMD loads, stores, reference casts, transmute-like conversions, helpers, const-generic functions, and intrinsic calls now use checked wrappers or kernel!, removing most unsafe from generated code. (#232, #233, #234, #235, #236, #237, #238, #239, #244, #245 by @Shnatsel)
  • Documentation and examples have been expanded and cleaned up for SIMD level tokens, mask types, platform-specific intrinsics, custom transmute wrappers, README consistency, and docs.rs visibility for NEON and WebAssembly APIs. (#213, #221, #222, #230, #240, #243 by @Shnatsel, #224, #225 by @DJMcNab)

Removed

  • Breaking change: the core_arch wrapper module and the safe_wrappers feature have been removed. Use kernel! with core::arch or std::arch intrinsics instead. (#216 by @Shnatsel)
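
For readers migrating off core_arch, the sketch below shows the hand-written pattern that a wrapper such as kernel! is meant to replace. It does not show the macro's own syntax. It uses only std::arch: a #[target_feature] function guarded by a runtime check. The names add_f32x8 and add_avx are hypothetical, and the runtime check stands in for the proof that a SIMD level token would carry.

```rust
/// Add two 8-lane arrays, using AVX when the CPU supports it.
/// (Hypothetical helper, not part of fearless_simd.)
fn add_f32x8(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was confirmed at runtime just above.
            return unsafe { add_avx(a, b) };
        }
    }
    // Portable scalar path for other CPUs and architectures.
    std::array::from_fn(|i| a[i] + b[i])
}

/// The AVX kernel. Calling it is only sound when AVX is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn add_avx(a: [f32; 8], b: [f32; 8]) -> [f32; 8] {
    use std::arch::x86_64::{_mm256_add_ps, _mm256_loadu_ps, _mm256_storeu_ps};

    let mut out = [0.0f32; 8];
    // SAFETY: both inputs and `out` hold exactly 8 f32 values, and the
    // unaligned load/store intrinsics have no alignment requirement.
    unsafe {
        let va = _mm256_loadu_ps(a.as_ptr());
        let vb = _mm256_loadu_ps(b.as_ptr());
        _mm256_storeu_ps(out.as_mut_ptr(), _mm256_add_ps(va, vb));
    }
    out
}

fn main() {
    let sum = add_f32x8([1.0; 8], [2.0; 8]);
    assert_eq!(sum, [3.0; 8]);
}
```

The two unsafe blocks are what a token-based wrapper aims to remove from user code: possession of the token replaces the ad hoc runtime check.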

Full Changelog: v0.4.1...v0.5.0

v0.4.1

16 May 09:52
0e8e242

Crates.io | Docs

This release has an MSRV of 1.88.

Added

  • The interleave and deinterleave methods on integer and floating-point SIMD vector types. (#206 by @Shnatsel)
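
The scalar sketch below illustrates what interleaving and deinterleaving usually mean for two four-lane vectors. It assumes the same lane order as std::simd's interleave and deinterleave; the crate's methods may differ in signature or return shape, and the free functions here are purely illustrative.

```rust
/// Interleave the lanes of `a` and `b`: the first result holds the low
/// halves (a0, b0, a1, b1), the second the high halves (a2, b2, a3, b3).
/// (Assumed lane order; hypothetical helper, not part of fearless_simd.)
fn interleave(a: [i32; 4], b: [i32; 4]) -> ([i32; 4], [i32; 4]) {
    ([a[0], b[0], a[1], b[1]], [a[2], b[2], a[3], b[3]])
}

/// Inverse of `interleave`: gather the even-indexed elements of the
/// concatenated input into the first result and the odd-indexed ones
/// into the second.
fn deinterleave(lo: [i32; 4], hi: [i32; 4]) -> ([i32; 4], [i32; 4]) {
    ([lo[0], lo[2], hi[0], hi[2]], [lo[1], lo[3], hi[1], hi[3]])
}

fn main() {
    let a = [0, 1, 2, 3];
    let b = [4, 5, 6, 7];
    let (lo, hi) = interleave(a, b);
    // Deinterleaving undoes the interleave.
    assert_eq!(deinterleave(lo, hi), (a, b));
}
```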

Fixed

  • Sse4_2 and Avx2 now consistently use the x86-64-v2 and x86-64-v3 feature sets for detection, dispatch, and generated target_feature attributes. (#208 by @Shnatsel)
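
To give a feel for what detecting a whole feature set involves, the sketch below checks a representative subset of the x86-64-v2 and x86-64-v3 features with the standard is_x86_feature_detected! macro. The exact list the crate checks is not reproduced here, and the detect module and its function names are hypothetical.

```rust
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
mod detect {
    /// True if a representative subset of x86-64-v2 features is present.
    pub fn v2_subset() -> bool {
        is_x86_feature_detected!("sse3")
            && is_x86_feature_detected!("ssse3")
            && is_x86_feature_detected!("sse4.1")
            && is_x86_feature_detected!("sse4.2")
            && is_x86_feature_detected!("popcnt")
    }

    /// True if a representative subset of x86-64-v3 features is present.
    /// v3 builds on v2, so the v2 check is included.
    pub fn v3_subset() -> bool {
        v2_subset()
            && is_x86_feature_detected!("avx")
            && is_x86_feature_detected!("avx2")
            && is_x86_feature_detected!("bmi1")
            && is_x86_feature_detected!("bmi2")
            && is_x86_feature_detected!("f16c")
            && is_x86_feature_detected!("fma")
            && is_x86_feature_detected!("lzcnt")
    }
}

#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
mod detect {
    /// Non-x86 targets never report x86 feature sets.
    pub fn v2_subset() -> bool {
        false
    }
    pub fn v3_subset() -> bool {
        false
    }
}

fn main() {
    println!("x86-64-v2 subset: {}", detect::v2_subset());
    println!("x86-64-v3 subset: {}", detect::v3_subset());
}
```

Checking the whole set, rather than only avx2 or sse4.2, matters because code compiled for the level may use any feature in it.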

Full Changelog: v0.4.0...v0.4.1

v0.4.0

13 Feb 19:31
c3632ab

Crates.io | Docs

This release has an MSRV of 1.88.

Added

  • All vector types now implement Index and IndexMut. (#112 by @Ralith)
  • 256-bit vector types now use native AVX2 intrinsics on supported platforms. (#115 by @valadaptive)
  • 8-bit integer multiplication is now implemented on x86. (#115 by @valadaptive)
  • New native-width associated types: f64s and mask64s. (#125 by @valadaptive)
  • The bitwise "not" operation on integer vector types. (#130 by @valadaptive)
  • The from_fn method on vector types. (#137 by @valadaptive)
  • The load_interleaved and store_interleaved operations now use native intrinsics on x86, instead of using the fallback implementations. (#140 by @valadaptive)
  • Support for relaxed_simd operations in WebAssembly. (#143 by @valadaptive)
  • The ceil and round_ties_even operations on floating-point vector types. (Rust's round operation rounds away from zero in the case of ties. Many architectures do not natively implement that behavior, so it's omitted.) (#145 by @valadaptive)
  • A prelude module, which exports all the traits in the library but not the types. (#149 by @valadaptive)
  • The any_true, all_true, any_false, and all_false methods on mask types. (#141 by @valadaptive)
  • Documentation for most traits, vector types, and operations. (#154 by @valadaptive)
  • A "shift left by vector" operation, to go with the existing "shift right by vector". (#155 by @valadaptive)
  • "Precise" float-to-integer conversions, which saturate out-of-bounds results and convert NaN to 0 across all platforms. (#167 by @valadaptive)
  • The slide and slide_within_blocks methods for shifting elements within a vector. (#164 by @valadaptive)
  • The Level::is_fallback method, which lets you check if the current SIMD level is the scalar fallback. This works even if Level::Fallback is not compiled in, always returning false in that case. (#168 by @valadaptive)
  • The store_array methods, which store SIMD vectors back to memory explicitly using intrinsics. (#181 by @LaurenzV)
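
Two of the items above have exact scalar counterparts in Rust's standard library, which the sketch below uses as a per-lane model: f32::round_ties_even for the rounding mode, and the `as` cast for conversions that saturate and map NaN to 0. The four-lane helper functions are hypothetical and only model the described behavior; they are not the crate's API.

```rust
/// Per-lane model of round_ties_even: ties go to the nearest even integer.
fn round_ties_even_model(v: [f32; 4]) -> [f32; 4] {
    v.map(f32::round_ties_even)
}

/// Per-lane model of a "precise" float-to-integer conversion. Rust's `as`
/// cast saturates out-of-range values and converts NaN to 0.
fn to_i32_precise_model(v: [f32; 4]) -> [i32; 4] {
    v.map(|x| x as i32)
}

fn main() {
    // f32::round rounds ties away from zero; round_ties_even does not.
    assert_eq!(2.5f32.round(), 3.0);
    assert_eq!(2.5f32.round_ties_even(), 2.0);

    println!("{:?}", round_ties_even_model([0.5, 1.5, 2.5, -2.5]));
    println!("{:?}", to_i32_precise_model([f32::NAN, 1e10, -1e10, 1.9]));
}
```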

Fixed

  • Load and store operations on vectors are now faster. (#185 by @valadaptive)
  • Integer equality comparisons now function properly on x86. Previously, they performed "greater than" comparisons.
    (#115 by @valadaptive)
  • All float-to-integer and integer-to-float conversions are implemented properly on x86, including the precise versions. (#134 by @valadaptive)
  • The floating-point min_precise and max_precise operations now behave the same way on x86 and WebAssembly as they do on AArch64, returning the non-NaN operand if one operand is NaN and the other is not. Previously, they returned the second operand if either was NaN. (#136 by @valadaptive)
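
The NaN behavior described for min_precise can be modeled on scalars. In the sketch below, f32::min stands in for the fixed behavior (it returns the non-NaN operand), and a bare comparison stands in for the earlier behavior of returning the second operand whenever either input is NaN. Both function names are hypothetical.

```rust
/// Model of the earlier behavior: any comparison with NaN is false, so the
/// second operand is returned whenever either input is NaN.
fn min_second_operand_on_nan(a: f32, b: f32) -> f32 {
    if a < b { a } else { b }
}

/// Model of the fixed behavior: f32::min returns the non-NaN operand when
/// exactly one input is NaN.
fn min_precise_model(a: f32, b: f32) -> f32 {
    a.min(b)
}

fn main() {
    // With NaN as the second operand, the two models disagree.
    assert!(min_second_operand_on_nan(1.0, f32::NAN).is_nan());
    assert_eq!(min_precise_model(1.0, f32::NAN), 1.0);

    // With NaN as the first operand, both return the other value.
    assert_eq!(min_second_operand_on_nan(f32::NAN, 1.0), 1.0);
    assert_eq!(min_precise_model(f32::NAN, 1.0), 1.0);
}
```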

Changed

  • Breaking change: The AVX2 level now requires all features from the x86-64-v3 baseline. (#188 by @Shnatsel)

  • Breaking change: Level::fallback has been removed, replaced with Level::baseline. (#105 by @DJMcNab)
    This corresponds with a change to avoid compiling in support for the fallback level on compilation targets which don't
    require it; this is most impactful for binary size on WASM, Apple Silicon Macs or Android.
    A consequence of this is that the available variants on Level are now dependent on the target features you are compiling with.
    The fallback level can be restored with the force_support_fallback cargo feature. We don't expect this to be necessary outside
    of tests.

  • Code generation for select and unzip operations on x86 has been improved. (#115 by @valadaptive)

  • Breaking change: The native-width associated types (f32s, u8s, etc.) for the Avx2 struct have been widened from 128-bit
    types (like f32x4) to 256-bit types (like f32x8). (#123 by @valadaptive)

  • Breaking change: All the vector types' inherent methods have been removed. Any remaining functionality has been moved
    to trait methods. (#149 by @valadaptive)

    Some functionality is exposed under different names:

    • Instead of the reinterpret methods, use the bitcast method on the Bytes trait. (e.g. foo.reinterpret_i32()
      -> foo.bitcast::<i32x4<_>>())
    • Instead of the cvt methods, use the to_int or to_float convenience methods on the SimdFloat and SimdInt
      traits (e.g. foo.cvt_u32() -> foo.to_int::<u32x4<_>>())

    Some functionality (such as split or combine) is exposed under new traits. You may use the new prelude module to
    conveniently import all of the traits.

  • Breaking change: The madd and msub methods have been renamed to mul_add and mul_sub, matching Rust's naming conventions.
    (#158 by @Shnatsel)

  • Breaking change: the val field on SIMD vector types is now private, and vector types are no longer represented as arrays internally. To access a vector type's elements, you can use the Into or Deref traits to obtain an array, or the as_slice/as_mut_slice methods to obtain a slice. (#159 by @valadaptive)

  • Breaking change: the Element type on the SimdBase trait is now an associated type instead of a type parameter. This should make it more pleasant to write code that's generic over different vector types. (#170 by @valadaptive)

  • The WasmSimd128 token type now wraps the new crate::core_arch::wasm32::WasmSimd128 type. This doesn't expose any new functionality as WASM SIMD128 can only be enabled statically, but matches all the other backend tokens. (#176 by @valadaptive)

  • Breaking change: the SimdFrom::simd_from method now takes the SIMD token as the first argument instead of the second. This matches the argument order of the from_slice, splat, and from_fn methods on SimdBase. (#180 by @valadaptive)

Removed

  • Breaking change: The (deprecated) simd_dispatch! macro. (#105 by @DJMcNab)

Full Changelog: v0.3.0...v0.4.0

v0.3.0

14 Oct 15:10
0a3ac74

Crates.io | Docs

This release has an MSRV of 1.86.

Added

  • SimdBase::witness to fetch the Simd implementation associated with a
    generic vector. (#76 by @Ralith)
  • Select is now available on native-width masks. (#77, #83 by @Ralith)
  • Simd::shrv_* performs a right shift with the shift amount specified
    per lane. (#79 by @Ralith)
  • The >> operator is implemented for SIMD vectors. (#79 by @Ralith)
  • Assignment operator implementations. (#80 by @Ralith)
  • SimdFrom splatting is available on native-width vectors. (#84 by @Ralith)
  • Left shift by u32. (#86 by @Ralith)
  • Unary negation of signed integers. (#91 by @Ralith)
  • A simpler dispatch macro to replace simd_dispatch. (#96, #99 by @Ralith, @DJMcNab)
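
A per-lane shift differs from a shift by a single scalar in that each lane has its own shift amount. The scalar sketch below models this for four u32 lanes. The function name is hypothetical, and shift amounts are kept below 32 because the crate's behavior for larger amounts is not stated here.

```rust
/// Scalar model of a right shift where lane i is shifted by `shifts[i]`.
/// Each shift amount must be below 32 in this sketch.
fn shr_per_lane(values: [u32; 4], shifts: [u32; 4]) -> [u32; 4] {
    std::array::from_fn(|i| values[i] >> shifts[i])
}

fn main() {
    let shifted = shr_per_lane([16, 16, 16, 16], [0, 1, 2, 4]);
    println!("{shifted:?}");
}
```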

Fixed

  • Simd now requires consistent mask types for native-width
    vectors. (#75 by @Ralith)
  • Simd now requires consistent Bytes types for native-width vectors,
    enabling Bytes::bitcast in generic code. (#81 by @Ralith)
  • Scalar fallback now uses wrapping integer addition. (#85 by @Ralith)
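
Wrapping matters in a scalar fallback because plain `+` on Rust integers panics on overflow in debug builds, whereas hardware SIMD addition wraps. The sketch below is a minimal four-lane model of that behavior; the function name is hypothetical.

```rust
/// Lane-wise u8 addition that wraps on overflow, as SIMD hardware does.
fn add_u8x4_wrapping(a: [u8; 4], b: [u8; 4]) -> [u8; 4] {
    std::array::from_fn(|i| a[i].wrapping_add(b[i]))
}

fn main() {
    // 250 + 10 overflows u8 and wraps around to 4 instead of panicking.
    let sum = add_u8x4_wrapping([250, 1, 255, 0], [10, 2, 1, 0]);
    println!("{sum:?}");
}
```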

Changed

  • Breaking change: a.madd(b, c) and a.msub(b, c) now compute a * b + c and a * b - c, for consistency with mul_add in
    std. (#88 by @Ralith)
    Previously, madd computed a + b * c and msub computed a - b * c.
    Code that called a.madd(b, c) should now call b.madd(c, a),
    and code that called a.msub(b, c) should now call b.madd(-c, a).
  • Constructors for static SIMD levels are now const. (#93 by @Ralith)
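
The madd and msub migration can be checked on scalars. The sketch below writes the old and new definitions as plain functions (hypothetical names, with the receiver as the first argument) and confirms that the suggested rewrites give the same results. The inputs are small integer-valued floats, so the comparisons are exact.

```rust
// Old semantics, before v0.3.0.
fn old_madd(a: f32, b: f32, c: f32) -> f32 {
    a + b * c
}
fn old_msub(a: f32, b: f32, c: f32) -> f32 {
    a - b * c
}

// New semantics, matching the argument order of mul_add in std.
fn new_madd(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

fn main() {
    let (a, b, c) = (2.0, 3.0, 4.0);

    // Old a.madd(b, c) becomes b.madd(c, a).
    assert_eq!(old_madd(a, b, c), new_madd(b, c, a));

    // Old a.msub(b, c) becomes b.madd(-c, a).
    assert_eq!(old_msub(a, b, c), new_madd(b, -c, a));
}
```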

Full Changelog: v0.2.0...v0.3.0

v0.2.0

26 Aug 14:30
5eafe36

Crates.io | Docs

This release has an MSRV of 1.86.

There has been a complete rewrite of Fearless SIMD.
For some details of the ideas used, see our blog post Towards fearless SIMD, 7 years later.

The repository has also been moved into the Linebender organisation.

Full Changelog: https://github.com/linebender/fearless_simd/commits/v0.2.0