
parquet bench + cuda changes to scan clickbench#6739

Draft
onursatici wants to merge 14 commits into os/gpu-scan-bench from os/parquet-bench

Conversation

@onursatici
Contributor

Summary

Closes: #000

Testing

Signed-off-by: Onur Satici <onur@spiraldb.com>
# This serves as the baseline for comparing against Vortex GPU scans.
#
# Usage:
# uv run bench_parquet.py dataset.parquet --iterations 5
Contributor Author


this is a standalone uv script that does the same scan we do on gpu-scan-bench above, but over Parquet instead of Vortex
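The actual `bench_parquet.py` is not shown in this diff; a minimal sketch of what such a standalone timing harness might look like is below. The `run_benchmark` helper and the stand-in workload are hypothetical — in the real script the timed callable would materialize the Parquet scan (e.g. via pyarrow), per the usage line quoted above.

```python
# Hypothetical sketch of a standalone scan-benchmark harness; the real
# bench_parquet.py is not shown in this diff. The workload here is a
# stand-in so the harness runs without a Parquet file.
import statistics
import time


def run_benchmark(scan, iterations=5):
    """Time `scan()` over several iterations; report min/median seconds."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        scan()  # in the real script: materialize the Parquet scan
        timings.append(time.perf_counter() - start)
    return {"min": min(timings), "median": statistics.median(timings)}


# Stand-in workload in place of an actual Parquet read.
result = run_benchmark(lambda: sum(range(100_000)), iterations=5)
print(sorted(result))  # ['median', 'min']
```

Reporting the minimum alongside the median is a common choice for scan benchmarks, since the minimum approximates the noise-free cost of an iteration.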

materialize_constant_decimal::<D>(array, decimal_dtype, validity, ctx).await
})
}
DType::Extension(ext_dtype) => {

I needed this to fix a panic on DateTimeParts


// Components may decompress as unsigned (e.g. from BitPacked). Reinterpret
// as signed since the CUDA kernel only has signed variants and casts
// everything to int64_t anyway — the bit pattern is identical.

I don't know why I was getting an unsigned type here on DateTimeParts, so I worked around it with this reinterpret
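The comment in the hunk above notes that the unsigned-to-signed reinterpret is safe because the bit pattern is identical. A tiny illustration of that property, using the stdlib `struct` module on a single 64-bit value (the Rust code does the equivalent reinterpret on the whole buffer without copying):

```python
# Reinterpreting an unsigned 64-bit value as signed: the bytes are
# unchanged, only the interpretation of the top bit flips.
import struct


def reinterpret_u64_as_i64(x: int) -> int:
    """Round-trip the bits of an unsigned 64-bit int through a signed view."""
    return struct.unpack("<q", struct.pack("<Q", x))[0]


assert reinterpret_u64_as_i64(5) == 5            # small values unchanged
assert reinterpret_u64_as_i64(2**64 - 1) == -1   # all-ones pattern reads as -1
```

This is why the CUDA kernel can get away with only signed variants: casting everything to `int64_t` preserves the stored bits regardless of the source signedness.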

// the same for signed/unsigned pairs (e.g. i16/u16).
if let Some(bitpacked) = array.encoded().as_opt::<BitPackedVTable>() {
match_each_integer_ptype!(bitpacked.ptype(), |P| {
match_each_integer_ptype!(array.ptype(), |P| {

this is an unrelated fix I made first; the real culprit turned out to be DateTimeParts

));

#[cfg(debug_assertions)]
validate_decompress_results(&plan, device_actual_sizes, device_statuses).await?;

skip validation in release builds because it forces a host copy
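The `#[cfg(debug_assertions)]` gate above compiles the check out of release builds entirely. A rough Python analogue of the same pattern, using `__debug__` to skip an expensive validation pass (`decompress` and its check are hypothetical stand-ins, not the PR's actual API):

```python
# Python analogue of gating an expensive validation behind debug builds.
# `decompress` is a stand-in; the real check in the PR forces a
# device-to-host copy, which is why it is skipped in release builds.

def decompress(data, validate_results=__debug__):
    result = [x * 2 for x in data]  # stand-in for the real decompression
    if validate_results:
        # expensive verification pass, only run in debug-style executions
        assert all(r == x * 2 for r, x in zip(result, data))
    return result


print(decompress([1, 2, 3]))  # [2, 4, 6]
```

In Rust the release/debug split happens at compile time, so the validation code (and its host copy) is absent from the release binary rather than skipped at runtime.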

@codspeed-hq

codspeed-hq bot commented Mar 2, 2026

Merging this PR will degrade performance by 10.2%

⚡ 3 improved benchmarks
❌ 3 regressed benchmarks
✅ 948 untouched benchmarks
⏩ 1466 skipped benchmarks¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| Simulation | chunked_opt_bool_canonical_into[(10, 1000)] | 1.6 ms | 1.4 ms | +16.12% |
| Simulation | bench_many_nulls[0.5] | 379.5 µs | 342.8 µs | +10.72% |
| Simulation | bench_many_nulls[0.9] | 547.5 µs | 481.6 µs | +13.7% |
| Simulation | patched_take_200k_dispersed | 5.1 ms | 5.6 ms | -10.14% |
| Simulation | map_each[BufferMut<i32>, 128] | 770.6 ns | 858.1 ns | -10.2% |
| Simulation | true_count_vortex_buffer[128] | 1 µs | 1.2 µs | -10.06% |

Comparing os/parquet-bench (4be081e) with os/gpu-scan-bench (f08430d)

Open in CodSpeed

Footnotes

  1. 1466 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.
