Introduce `count{l,r}_{zero,one}` for `batch_bool` by onalante-ebay · Pull Request #1269 · xtensor-stack/xsimd

onalante-ebay · 2026-03-04T14:22:05Z

In #1236, it was mentioned that variable-sized bit groups for certain
`batch_bool` reductions would be slightly more efficient than extracting
a proper bitmask. To achieve this, the xsimd API is extended with the
functions `xsimd::count{l,r}_{zero,one}`, and `count` is revised to
allow per-platform kernels. The default implementation of each
function simply applies the corresponding scalar operation (for which
`__cpp_lib_bitops == 201907L` is partially backported) to
`batch_bool::mask`. This is specialized for NEON(64) by instead
applying the scalar operation to the narrowed batch, then scaling the
result by the bit group size.
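To illustrate the idea in the description, here is a minimal scalar sketch, not the PR's actual code. It compares a one-bit-per-lane bitmask with a "grouped" mask in which every lane contributes several identical bits, as a narrowed batch would. All names (`countr_zero_sketch`, `first_true_lane_bitmask`, `first_true_lane_grouped`, `expand_to_groups`) are hypothetical, and the group size of 4 bits per lane used in the usage note is an assumption chosen purely for illustration.

```cpp
#include <cstdint>

// Count trailing zero bits within the low `width` bits.
// Returns `width` when none of those bits is set.
inline int countr_zero_sketch(std::uint64_t bits, int width)
{
    int n = 0;
    while (n < width && ((bits >> n) & 1u) == 0u)
        ++n;
    return n;
}

// Proper bitmask: one bit per lane, so the scalar result is already a lane index.
inline int first_true_lane_bitmask(std::uint64_t mask, int lanes)
{
    return countr_zero_sketch(mask, lanes);
}

// Grouped mask: each lane contributes `group_bits` identical bits
// (assumes lanes * group_bits <= 64). The scalar result is a bit index,
// so it is scaled back to a lane index by dividing by the group size.
inline int first_true_lane_grouped(std::uint64_t grouped, int lanes, int group_bits)
{
    return countr_zero_sketch(grouped, lanes * group_bits) / group_bits;
}

// Illustration helper: widen a one-bit-per-lane mask into a grouped mask.
inline std::uint64_t expand_to_groups(std::uint64_t mask, int lanes, int group_bits)
{
    const std::uint64_t group = (std::uint64_t(1) << group_bits) - 1u;
    std::uint64_t out = 0;
    for (int i = 0; i < lanes; ++i)
        if ((mask >> i) & 1u)
            out |= group << (i * group_bits);
    return out;
}
```

With 16 lanes and an assumed group size of 4, a mask whose only true lane is lane 3 yields lane index 3 under both representations; the grouped variant simply skips the step of compressing each group down to a single bit.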

serge-sans-paille · 2026-03-04T14:57:27Z

I'm fine with the overall approach, but I think it means those operations should live in the kernel namespace with the appropriate dispatch, as we do for other operations.

Please ping me once you reach a green CI, and thanks for working on this 🙇

onalante-ebay · 2026-03-04T15:29:29Z

Right, the public xsimd::count{l,r}_{zero,one} functions call kernel::count{l,r}_{zero,one} as is done for other operations. Have I made a mistake with the implementation?
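For readers unfamiliar with the pattern being discussed, the following is a rough, self-contained sketch of a public function forwarding to a kernel overload selected by an architecture tag. It is not xsimd's real code: the namespace `dispatch_sketch`, the tags `common_arch` and `neon64_like_arch`, the type `bool_batch`, and the 4-bits-per-lane layout are all assumptions made up for this example.

```cpp
#include <cstdint>

namespace dispatch_sketch
{
    // Hypothetical architecture tags; the specialized tag derives from the
    // common one so that it can fall back to the generic kernel.
    struct common_arch {};
    struct neon64_like_arch : common_arch {};

    // Hypothetical boolean batch: for common_arch, one bit per lane;
    // for neon64_like_arch, an assumed 4 identical bits per lane.
    template <class Arch>
    struct bool_batch
    {
        std::uint64_t bits;
    };

    // Portable population count (clears the lowest set bit each iteration).
    inline int popcount64(std::uint64_t x)
    {
        int n = 0;
        while (x != 0)
        {
            x &= x - 1;
            ++n;
        }
        return n;
    }

    namespace kernel
    {
        // Generic kernel: usable by any architecture deriving from common_arch.
        template <class Arch>
        int count(const bool_batch<Arch>& b, common_arch)
        {
            return popcount64(b.bits);
        }

        // Specialized kernel: counts grouped bits, then scales by the group size.
        inline int count(const bool_batch<neon64_like_arch>& b, neon64_like_arch)
        {
            return popcount64(b.bits) / 4;
        }
    }

    // Public entry point: forwards to the best-matching kernel overload.
    template <class Arch>
    int count(const bool_batch<Arch>& b)
    {
        return kernel::count(b, Arch {});
    }
}
```

Overload resolution prefers the exact tag match, so a `bool_batch<neon64_like_arch>` reaches the specialized kernel, while any other architecture deriving from `common_arch` falls back to the generic one.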

DiamonDinoia · 2026-03-04T16:46:41Z

I would check for __cpp_lib_bitops and if it fails provide a custom popcount.

onalante-ebay · 2026-03-04T17:08:47Z

Done. I was concerned that just trusting __cpp_lib_bitops might be problematic since libstdc++<13 would return some bit operations' results as the argument type rather than int (e.g. bit_width)¹. Thankfully, this does not appear to apply to the count operations.

¹ Though this worry is admittedly overblown, in that the result could simply be cast to int.
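The suggestion above (check `__cpp_lib_bitops`, otherwise provide a custom popcount) could look roughly like the sketch below. It is an illustration under assumptions, not the code merged in this PR: the names `SKETCH_HAS_STD_BITOPS`, `fallback_popcount`, and `sketch_popcount` are hypothetical. The explicit cast to `int` mirrors the footnote's point that a differing return type can simply be cast.

```cpp
#include <cstdint>

// Pull in the feature-test macros when the <version> header is available.
#if defined(__has_include)
#if __has_include(<version>)
#include <version>
#endif
#endif

#if defined(__cpp_lib_bitops) && __cpp_lib_bitops >= 201907L
#include <bit>
#define SKETCH_HAS_STD_BITOPS 1
#else
#define SKETCH_HAS_STD_BITOPS 0
#endif

// Portable fallback: clear the lowest set bit until nothing is left.
inline int fallback_popcount(std::uint64_t x)
{
    int n = 0;
    while (x != 0)
    {
        x &= x - 1;
        ++n;
    }
    return n;
}

// Use the standard library when it advertises <bit> support, else the fallback.
inline int sketch_popcount(std::uint64_t x)
{
#if SKETCH_HAS_STD_BITOPS
    // The cast keeps the return type uniform regardless of the library version.
    return static_cast<int>(std::popcount(x));
#else
    return fallback_popcount(x);
#endif
}
```

Both paths are expected to agree on every input, so the fallback can also serve as a reference when testing the standard-library path.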

onalante-ebay · 2026-03-04T17:31:36Z

@serge-sans-paille CI is passing.

include/xsimd/arch/common/xsimd_common_logical.hpp

include/xsimd/arch/xsimd_neon.hpp

include/xsimd/arch/xsimd_neon64.hpp

serge-sans-paille · 2026-03-06T08:39:05Z

LGTM! Please just squash the history and we're good.

Thanks a lot for your effort and... a question, if you don't mind: in which context are you using xsimd, and what for?

onalante-ebay · 2026-03-06T14:10:34Z

Squashed. Sorry, I am not at liberty to discuss the context at this time.

serge-sans-paille · 2026-03-06T19:33:09Z

That's totally fine!
Thanks for this cool PR.

onalante-ebay force-pushed the batch_countl_zero branch 2 times, most recently from 9cf4926 to 8e73330, on March 4, 2026 16:09

serge-sans-paille requested changes Mar 4, 2026

include/xsimd/arch/common/xsimd_common_logical.hpp (outdated, resolved)

include/xsimd/arch/xsimd_neon.hpp (outdated, resolved)

include/xsimd/arch/xsimd_neon.hpp (outdated, resolved)

onalante-ebay force-pushed the batch_countl_zero branch 2 times, most recently from 06dac6b to a3765aa, on March 5, 2026 16:49

onalante-ebay requested a review from serge-sans-paille March 5, 2026 17:15

serge-sans-paille reviewed Mar 5, 2026

include/xsimd/arch/xsimd_neon64.hpp (outdated, resolved)

onalante-ebay force-pushed the batch_countl_zero branch from 06f0171 to 70ff59e on March 6, 2026 04:19

onalante-ebay requested a review from serge-sans-paille March 6, 2026 04:22

onalante-ebay force-pushed the batch_countl_zero branch from 70ff59e to 83914de on March 6, 2026 13:21

serge-sans-paille merged commit 923b986 into xtensor-stack:master Mar 6, 2026
70 checks passed

serge-sans-paille merged 1 commit into xtensor-stack:master from onalante-ebay:batch_countl_zero
