Skip to content

perf/store sync#3725

Open
d-v-b wants to merge 10 commits intozarr-developers:mainfrom
d-v-b:perf/store-sync
Open

perf/store sync#3725
d-v-b wants to merge 10 commits intozarr-developers:mainfrom
d-v-b:perf/store-sync

Conversation

@d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Feb 26, 2026

This PR adds synchronous methods to stores that benefit from executing IO outside of an event loop, i.e. memory storage and local storage. These methods are not defined on the ABC, and are instead defined as protocols that stores can opt into.

These methods are needed to unlock a hot path for chunk encoding / decoding that avoids event loop and to_thread overhead.

part of #3720

builds on #3722 and #3721

d-v-b and others added 5 commits February 25, 2026 18:33
Introduces CodecChain, a frozen dataclass that chains array-array,
array-bytes, and bytes-bytes codecs with synchronous encode/decode
methods. Pure compute only -- no IO, no threading, no batching.

Also adds sync roundtrip tests for individual codecs (blosc, gzip,
zstd, crc32c, bytes, transpose, vlen) and CodecChain integration tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Feb 26, 2026
@d-v-b d-v-b mentioned this pull request Feb 27, 2026
chunk_bytes: Buffer,
chunk_spec: ArraySpec,
) -> Buffer | None:
# Since blosc only support host memory, we convert the input and output of the encoding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this shouldn't be deleted


def test_zstd_is_not_fixed_size() -> None:
assert ZstdCodec.is_fixed_size is False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these seem useless

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, looks like claude didn't have faith in our ability to avoid making the same mistake twice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed it

return fill_value


@dataclass(frozen=True, slots=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but this is straight up unreviewable. I cant tell if there's a one character typo that breaks things here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, this got included by accident -- it's from a separate PR that needs to happen after this one.


if aa_out is None:
return None
bb_out: Any = cast("SupportsSyncCodec", self._ab_codec)._encode_sync(aa_out, self._ab_spec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😵‍💫

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is ugly but to set expectations, I don't think we are going to win a beauty contests with any of the stuff we have to bolt on to our codec, store, and chunk encoding routines. We are starting from a performance-unfriendly design, and trying to fix that design from the outside. In any case, this class will appear in a later PR -- I'm removing it from this one.

# Synchronous store methods
# -------------------------------------------------------------------

def _ensure_open_sync(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. in our current Store design, creating an instance of a store doesn't necessarily "open" it, so we have an async open method that actually opens the store. Our async get/set methods guard against the store being un-open:

if not self._is_open:
await self._open()
, and our sync methods have to do the same thing here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs release notes Automatically applied to PRs which haven't added release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants