Conversation
Introduces CodecChain, a frozen dataclass that chains array-array, array-bytes, and bytes-bytes codecs with synchronous encode/decode methods. Pure compute only -- no IO, no threading, no batching. Also adds sync roundtrip tests for individual codecs (blosc, gzip, zstd, crc32c, bytes, transpose, vlen) and CodecChain integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7995764 to
4e262b1
Compare
| chunk_bytes: Buffer, | ||
| chunk_spec: ArraySpec, | ||
| ) -> Buffer | None: | ||
| # Since blosc only support host memory, we convert the input and output of the encoding |
There was a problem hiding this comment.
seems like this shouldn't be deleted
tests/test_codecs/test_zstd.py
Outdated
|
|
||
| def test_zstd_is_not_fixed_size() -> None: | ||
| assert ZstdCodec.is_fixed_size is False | ||
|
|
There was a problem hiding this comment.
agreed, looks like claude didn't have faith in our ability to avoid making the same mistake twice
| return fill_value | ||
|
|
||
|
|
||
| @dataclass(frozen=True, slots=True) |
There was a problem hiding this comment.
Sorry but this is straight up unreviewable. I cant tell if there's a one character typo that breaks things here.
There was a problem hiding this comment.
good catch, this got included by accident -- it's from a separate PR that needs to happen after this one.
|
|
||
| if aa_out is None: | ||
| return None | ||
| bb_out: Any = cast("SupportsSyncCodec", self._ab_codec)._encode_sync(aa_out, self._ab_spec) |
There was a problem hiding this comment.
Yeah this is ugly but to set expectations, I don't think we are going to win a beauty contests with any of the stuff we have to bolt on to our codec, store, and chunk encoding routines. We are starting from a performance-unfriendly design, and trying to fix that design from the outside. In any case, this class will appear in a later PR -- I'm removing it from this one.
| # Synchronous store methods | ||
| # ------------------------------------------------------------------- | ||
|
|
||
| def _ensure_open_sync(self) -> None: |
There was a problem hiding this comment.
yes. in our current Store design, creating an instance of a store doesn't necessarily "open" it, so we have an async open method that actually opens the store. Our async get/set methods guard against the store being un-open:
zarr-python/src/zarr/storage/_local.py
Lines 199 to 200 in b6d3ae2
This PR adds synchronous methods to stores that benefit from executing IO outside of an event loop, i.e. memory storage and local storage. These methods are not defined on the ABC, and are instead defined as protocols that stores can opt into.
These methods are needed to unlock a hot path for chunk encoding / decoding that avoids event loop and
to_threadoverhead.part of #3720
builds on #3722 and #3721