Commit da12a44

SS-JIA authored and facebook-github-bot committed
Fix Android ARM64 build for torchao lowbit kernels (#4029)
Summary: D95224222 added torchao ARM lowbit kernel dependencies to the ExecuTorch llama runner for ARM64 builds, but the Buck targets had two issues that prevented the Android ARM64 build from succeeding.

1. `std::aligned_alloc` is not available on Android API < 28. Android's Bionic libc only added `aligned_alloc` in API 28 (Android 9 Pie). The NDK's libc++ declares `using ::aligned_alloc _LIBCPP_USING_IF_EXISTS`, which silently becomes unresolved when targeting API < 28 (the default `app_platform` is android-21). This caused a compile error in `shared_kernels/internal/memory.h`. Fixed by using `posix_memalign` (which is available since API 16) as a fallback when `__ANDROID_API__ < 28`.

2. The aarch64 `linear` kernels use ARM dot product intrinsics (`vdotq_s32`), which require the `+dotprod` architecture feature. The CMake build already passed `-march=armv8.4-a+dotprod`, but the Buck targets were missing this flag for Android builds. Fixed by adding `-march=armv8.2-a+dotprod` to `fbandroid_compiler_flags` in both the `aarch64/linear` target and the `op_linear_8bit_act_xbit_weight_executorch` target.

Reviewed By: tanvirislam-meta

Differential Revision: D95832860
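To illustrate why the `+dotprod` flag matters in item 2, here is a minimal sketch of a kernel that uses `vdotq_s32` only when the compiler reports the feature. The helper name `dot_s8` is hypothetical and does not appear in torchao. The sketch assumes that the compiler defines `__ARM_FEATURE_DOTPROD` when the target enables dot product (for example via `-march=armv8.2-a+dotprod`) and falls back to a scalar loop otherwise. The real kernels may have no such fallback, which would explain why the missing flag broke the build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#endif

// Hypothetical helper (not part of torchao): returns sum(a[i] * b[i]) for
// n signed 8-bit values, accumulating in 32 bits.
inline int32_t dot_s8(const int8_t* a, const int8_t* b, std::size_t n) {
  assert(n == 0 || (a != nullptr && b != nullptr));
  int32_t total = 0;
  std::size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_FEATURE_DOTPROD)
  // Fast path: each vdotq_s32 consumes 16 int8 pairs and adds four
  // 4-element partial dot products into the 32-bit lanes of acc.
  int32x4_t acc = vdupq_n_s32(0);
  for (; i + 16 <= n; i += 16) {
    acc = vdotq_s32(acc, vld1q_s8(a + i), vld1q_s8(b + i));
  }
  total = vaddvq_s32(acc);  // horizontal sum of the four lanes
#endif
  // Scalar tail; also the whole computation when dotprod is unavailable.
  for (; i < n; ++i) {
    total += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
  }
  return total;
}
```

Compiled without the feature flag, the guarded block is skipped and only the scalar loop runs, so the same source builds on any target.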
1 parent ab4a336 commit da12a44

1 file changed

Lines changed: 8 additions & 0 deletions

torchao/csrc/cpu/shared_kernels/internal/memory.h

@@ -19,7 +19,15 @@ inline aligned_byte_ptr make_aligned_byte_ptr(size_t alignment, size_t size) {
   // Adjust size to next multiple of alignment >= size
   size_t adjusted_size = ((size + alignment - 1) / alignment) * alignment;
 
+#if defined(__ANDROID__) && __ANDROID_API__ < 28
+  void* raw_ptr = nullptr;
+  if (::posix_memalign(&raw_ptr, alignment, adjusted_size) != 0) {
+    raw_ptr = nullptr;
+  }
+  char* ptr = static_cast<char*>(raw_ptr);
+#else
   char* ptr = static_cast<char*>(std::aligned_alloc(alignment, adjusted_size));
+#endif
   if (!ptr) {
     throw std::runtime_error(
         "Failed to allocate memory. Requested size: " + std::to_string(size) +
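The hunk above is truncated, so the rest of `make_aligned_byte_ptr` is not visible here. As a standalone sketch of the same fallback pattern, the following wraps the allocation in a `std::unique_ptr` that releases it with `std::free`, which is valid for memory from both `posix_memalign` and `std::aligned_alloc`. The names `FreeDeleter`, `AlignedBuffer`, and `allocate_aligned` are hypothetical and are not the identifiers used in `memory.h`. The sketch assumes C++17 and an alignment that is a power of two and at least `sizeof(void*)`, which `posix_memalign` requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical names for illustration; memory.h defines its own types.
struct FreeDeleter {
  void operator()(char* p) const noexcept { std::free(p); }
};
using AlignedBuffer = std::unique_ptr<char, FreeDeleter>;

inline AlignedBuffer allocate_aligned(std::size_t alignment, std::size_t size) {
  // posix_memalign needs a power-of-two alignment that is a multiple of
  // sizeof(void*); enforce that precondition for both paths.
  assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);

  // Round size up to a multiple of alignment, as std::aligned_alloc expects.
  std::size_t adjusted = ((size + alignment - 1) / alignment) * alignment;

  void* raw = nullptr;
#if defined(__ANDROID__) && __ANDROID_API__ < 28
  // Bionic lacks aligned_alloc before API 28; posix_memalign returns a
  // nonzero error code on failure, so reset the pointer in that case.
  if (::posix_memalign(&raw, alignment, adjusted) != 0) {
    raw = nullptr;
  }
#else
  raw = std::aligned_alloc(alignment, adjusted);
#endif
  if (raw == nullptr) {
    throw std::bad_alloc();
  }
  return AlignedBuffer(static_cast<char*>(raw));
}
```

A call such as `allocate_aligned(64, 100)` would request 128 bytes aligned to 64. A zero-size request is implementation-defined on both paths and may end up throwing in this sketch.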
