chore: Convert SliceKernel impls to SliceReduce #6727
Conversation
Merging this PR will improve performance by 42.12%
Where they are constant time, they should indeed be reduce rules. RunEnd cannot be a reduce rule since it needs access to the data. In either case, it should have no impact on execution time or available optimizations, provided the other compute kernels are written not to recurse incorrectly. @joseph-isaacs and I are planning an API change to make it harder to do the wrong thing here.
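A minimal sketch of the distinction above, using hypothetical simplified types (not the actual Vortex encodings): for encodings like bit-packed, a slice only adjusts offset/length metadata, so it can safely run as an optimizer reduce rule; run-end encoding must search its run-ends buffer to find the new boundaries, so it has to stay an execution-time kernel.

```rust
// Hypothetical simplified encodings; not the actual Vortex types.
struct BitPacked {
    offset: usize,
    len: usize,
}

impl BitPacked {
    // Metadata-only: O(1), no buffer access -> safe as an optimizer reduce rule.
    fn slice(&self, start: usize, stop: usize) -> BitPacked {
        BitPacked { offset: self.offset + start, len: stop - start }
    }
}

struct RunEnd {
    run_ends: Vec<u32>,
}

impl RunEnd {
    // Needs the run_ends buffer to locate the first run covering `start`:
    // it reads data, so it cannot be a plan-time reduce rule.
    fn first_run(&self, start: u32) -> usize {
        self.run_ends.partition_point(|&end| end <= start)
    }
}
```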
Thanks for clarifying!
Benchmarks (detailed results tables collapsed): PolarSignals Profiling, TPC-H SF=1 on NVMe, FineWeb NVMe.
Resolving `SliceArray` nodes at optimization time rather than execution time lets subsequent passes apply further compute pushdown.

Changed for: `ALPVTable`, `ALPRDVTable`, `BitPackedVTable`, `RLEVTable`, `SparseVTable`, `ChunkedVTable`.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
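As an illustration of the `ChunkedVTable` case, here is a hedged sketch (hypothetical helper, not the Vortex API): chunk lengths are known from metadata, so a slice over a chunked array can be rewritten at plan time into per-chunk slices without reading any data, which lets later optimizer passes push compute down into each chunk.

```rust
// Hypothetical sketch of resolving a slice over a chunked array at plan time.
// Returns (chunk_index, local_start, local_stop) for each chunk that overlaps
// the requested range, using only chunk-length metadata.
fn slice_chunked(chunk_lens: &[usize], start: usize, stop: usize) -> Vec<(usize, usize, usize)> {
    let mut out = Vec::new();
    let mut offset = 0;
    for (i, &len) in chunk_lens.iter().enumerate() {
        let (chunk_start, chunk_stop) = (offset, offset + len);
        // Keep chunks that intersect [start, stop), rebased to chunk-local indices.
        if chunk_stop > start && chunk_start < stop {
            out.push((
                i,
                start.saturating_sub(chunk_start),
                (stop - chunk_start).min(len),
            ));
        }
        offset = chunk_stop;
    }
    out
}
```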
Benchmarks (detailed results tables collapsed): TPC-H SF=1 on S3, TPC-DS SF=1 on NVMe, TPC-H SF=10 on NVMe, FineWeb S3.
Dropped this change for RunEnd.
SliceKernel impls to SliceReduce

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
joseph-isaacs left a comment:
This was intentionally marked as a kernel; we need to change the way we handle patches before moving it over.
array
    .patches()
    .map(|p| p.slice(range))
    .transpose()?
    .flatten(),
This uses buffers.
Yeah, if we added a `PatchesArray` I think this could work by wrapping it with `SliceArray`.
With galp-style patches, all patch slicing is metadata-only.
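A toy sketch of why patch slicing can avoid touching the base array's buffers (hypothetical representation, not the Vortex `Patches` type): if patches are sorted `(index, value)` exceptions over a base array, slicing is two binary searches over the patch indices plus a rebase, independent of the base array's data.

```rust
// Hypothetical sparse patches: sorted (index, value) exceptions over a base array.
// Slicing keeps only in-range patches and rebases their indices to the new start;
// the base array's buffers are never read.
fn slice_patches(patches: &[(usize, i64)], start: usize, stop: usize) -> Vec<(usize, i64)> {
    let lo = patches.partition_point(|&(i, _)| i < start);
    let hi = patches.partition_point(|&(i, _)| i < stop);
    patches[lo..hi].iter().map(|&(i, v)| (i - start, v)).collect()
}
```

This copies the surviving pairs; a zero-copy variant would carry the `lo..hi` range and the `start` offset as metadata instead.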
To do this we have to handle patches correctly, maybe using a custom array.
Do we still want this PR?
I have a separate PR (#6815) in progress which adds the new array for patches and implements the slice rule.
I'm still not sure we want a patches array. At least, not yet. Unless we have some way of passing down an "is_defined" mask into execution, there's no way for the child array to know whether some checked operation should fail or not. E.g., the child of a Patches array may overflow on some arithmetic, but the patch would have succeeded. We need a way to tell the child to ignore the overflow for the invalid positions.
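A toy illustration of the hazard described above (assumed semantics, not Vortex code): if the child of a patches array runs checked arithmetic over every position, it can fail at a position whose result the patch would have replaced anyway; with an "is_defined"-style mask, the child can skip those positions.

```rust
// Toy illustration: doubling with checked_mul over a base array whose patched
// slots hold placeholder values. Without the mask, a placeholder like i64::MAX
// overflows and fails the whole operation even though the patch would have
// supplied the real value; with the mask, patched positions are skipped.
fn checked_double(base: &[i64], patched: &[bool]) -> Result<Vec<Option<i64>>, &'static str> {
    base.iter()
        .zip(patched)
        .map(|(&v, &is_patched)| {
            if is_patched {
                Ok(None) // masked out: the patch supplies the real value later
            } else {
                v.checked_mul(2).map(Some).ok_or("overflow")
            }
        })
        .collect()
}
```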