Agg bench#20761

Closed
Dandandan wants to merge 5 commits into apache:main from Dandandan:emit_aggregation

@Dandandan
Contributor

@Dandandan Dandandan commented Mar 6, 2026

Which issue does this PR close?

  • Closes #.

Rationale for this change

Trying morsel-paper-like aggregations (not really yet)

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@Dandandan
Contributor Author

run benchmarks

@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Mar 6, 2026
@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing emit_aggregation (40463ea) to d025869 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and emit_aggregation
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ emit_aggregation ┃       Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0 │  2499.50 ms │       2470.55 ms │    no change │
│ QQuery 1 │   894.64 ms │        947.64 ms │ 1.06x slower │
│ QQuery 2 │  1841.54 ms │       1891.27 ms │    no change │
│ QQuery 3 │  1123.94 ms │       1129.20 ms │    no change │
│ QQuery 4 │  2416.71 ms │       3180.67 ms │ 1.32x slower │
│ QQuery 5 │ 26998.73 ms │      28069.69 ms │    no change │
│ QQuery 6 │  3966.93 ms │       4253.29 ms │ 1.07x slower │
│ QQuery 7 │  2840.86 ms │       4163.18 ms │ 1.47x slower │
└──────────┴─────────────┴──────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 42582.84ms │
│ Total Time (emit_aggregation)   │ 46105.50ms │
│ Average Time (HEAD)             │  5322.86ms │
│ Average Time (emit_aggregation) │  5763.19ms │
│ Queries Faster                  │          0 │
│ Queries Slower                  │          4 │
│ Queries with No Change          │          4 │
│ Queries with Failure            │          0 │
└─────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.56 ms │          2.59 ms │     no change │
│ QQuery 1  │    50.87 ms │         50.22 ms │     no change │
│ QQuery 2  │   166.70 ms │        165.06 ms │     no change │
│ QQuery 3  │   172.20 ms │        175.42 ms │     no change │
│ QQuery 4  │  1059.63 ms │       1057.57 ms │     no change │
│ QQuery 5  │  1299.82 ms │       1447.44 ms │  1.11x slower │
│ QQuery 6  │     6.79 ms │          6.58 ms │     no change │
│ QQuery 7  │    55.93 ms │         57.30 ms │     no change │
│ QQuery 8  │  1519.99 ms │       1547.00 ms │     no change │
│ QQuery 9  │  1958.43 ms │       1617.90 ms │ +1.21x faster │
│ QQuery 10 │   343.11 ms │        366.85 ms │  1.07x slower │
│ QQuery 11 │   387.29 ms │        417.07 ms │  1.08x slower │
│ QQuery 12 │  1209.45 ms │       1380.69 ms │  1.14x slower │
│ QQuery 13 │  1950.36 ms │       2374.22 ms │  1.22x slower │
│ QQuery 14 │  1244.64 ms │       1349.61 ms │  1.08x slower │
│ QQuery 15 │  1255.78 ms │       1211.67 ms │     no change │
│ QQuery 16 │  2612.56 ms │       2553.54 ms │     no change │
│ QQuery 17 │  2577.02 ms │       2554.60 ms │     no change │
│ QQuery 18 │  5966.33 ms │       4788.67 ms │ +1.25x faster │
│ QQuery 19 │   127.24 ms │        136.02 ms │  1.07x slower │
│ QQuery 20 │  1869.25 ms │       1974.33 ms │  1.06x slower │
│ QQuery 21 │  2147.48 ms │       2275.35 ms │  1.06x slower │
│ QQuery 22 │  3837.04 ms │       3935.72 ms │     no change │
│ QQuery 23 │ 19571.86 ms │      11785.90 ms │ +1.66x faster │
│ QQuery 24 │   195.41 ms │        201.40 ms │     no change │
│ QQuery 25 │   432.76 ms │        457.24 ms │  1.06x slower │
│ QQuery 26 │   198.78 ms │        198.17 ms │     no change │
│ QQuery 27 │  2739.23 ms │       2766.00 ms │     no change │
│ QQuery 28 │ 21905.89 ms │      23949.92 ms │  1.09x slower │
│ QQuery 29 │  1025.81 ms │       1062.56 ms │     no change │
│ QQuery 30 │  1215.25 ms │       1335.38 ms │  1.10x slower │
│ QQuery 31 │  1376.27 ms │       1611.61 ms │  1.17x slower │
│ QQuery 32 │  4462.74 ms │       6227.06 ms │  1.40x slower │
│ QQuery 33 │  5481.40 ms │       6197.44 ms │  1.13x slower │
│ QQuery 34 │  6493.63 ms │       6394.94 ms │     no change │
│ QQuery 35 │  1939.21 ms │       1968.84 ms │     no change │
│ QQuery 36 │   180.22 ms │        186.61 ms │     no change │
│ QQuery 37 │    73.77 ms │         71.08 ms │     no change │
│ QQuery 38 │   113.08 ms │        113.18 ms │     no change │
│ QQuery 39 │   330.67 ms │        364.40 ms │  1.10x slower │
│ QQuery 40 │    37.74 ms │         45.16 ms │  1.20x slower │
│ QQuery 41 │    33.44 ms │         35.31 ms │  1.06x slower │
│ QQuery 42 │    30.95 ms │         31.53 ms │     no change │
└───────────┴─────────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 99658.56ms │
│ Total Time (emit_aggregation)   │ 96449.14ms │
│ Average Time (HEAD)             │  2317.64ms │
│ Average Time (emit_aggregation) │  2243.00ms │
│ Queries Faster                  │          3 │
│ Queries Slower                  │         18 │
│ Queries with No Change          │         22 │
│ Queries with Failure            │          0 │
└─────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 128.99 ms │        129.51 ms │     no change │
│ QQuery 2  │  31.61 ms │         32.09 ms │     no change │
│ QQuery 3  │  36.81 ms │         40.71 ms │  1.11x slower │
│ QQuery 4  │  35.42 ms │         34.57 ms │     no change │
│ QQuery 5  │  90.44 ms │         91.01 ms │     no change │
│ QQuery 6  │  24.54 ms │         27.66 ms │  1.13x slower │
│ QQuery 7  │ 162.87 ms │        154.66 ms │ +1.05x faster │
│ QQuery 8  │  41.83 ms │         39.84 ms │     no change │
│ QQuery 9  │ 107.66 ms │        105.73 ms │     no change │
│ QQuery 10 │  73.62 ms │         73.13 ms │     no change │
│ QQuery 11 │  19.19 ms │         18.66 ms │     no change │
│ QQuery 12 │  66.69 ms │         67.83 ms │     no change │
│ QQuery 13 │  53.04 ms │         55.33 ms │     no change │
│ QQuery 14 │  15.78 ms │         15.93 ms │     no change │
│ QQuery 15 │  32.99 ms │         34.04 ms │     no change │
│ QQuery 16 │  31.37 ms │         30.32 ms │     no change │
│ QQuery 17 │ 166.68 ms │        195.32 ms │  1.17x slower │
│ QQuery 18 │ 307.22 ms │        297.66 ms │     no change │
│ QQuery 19 │  52.84 ms │         51.92 ms │     no change │
│ QQuery 20 │  62.38 ms │         64.57 ms │     no change │
│ QQuery 21 │ 203.50 ms │        197.80 ms │     no change │
│ QQuery 22 │  25.17 ms │         24.59 ms │     no change │
└───────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary               ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 1770.65ms │
│ Total Time (emit_aggregation)   │ 1782.88ms │
│ Average Time (HEAD)             │   80.48ms │
│ Average Time (emit_aggregation) │   81.04ms │
│ Queries Faster                  │         1 │
│ Queries Slower                  │         3 │
│ Queries with No Change          │        18 │
│ Queries with Failure            │         0 │
└─────────────────────────────────┴───────────┘
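A note on reading the "Change" column: the labels in the tables above are consistent with a simple ratio of branch time to HEAD time, with anything inside a noise band reported as "no change". The sketch below reproduces that reading. The function name `classify_change` is hypothetical and the 5% band is an assumption inferred from the rows above, not taken from the benchmark script itself.

```rust
/// Classify one benchmark row from its two timings.
/// `threshold` is the assumed noise band (0.05 means 5%).
fn classify_change(head_ms: f64, branch_ms: f64, threshold: f64) -> String {
    // Ratio > 1.0 means the branch took longer than HEAD.
    let ratio = branch_ms / head_ms;
    if ratio >= 1.0 - threshold && ratio <= 1.0 + threshold {
        "no change".to_string()
    } else if ratio < 1.0 {
        // Report speedups as the inverse ratio, with a leading '+'.
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        format!("{:.2}x slower", ratio)
    }
}
```

For example, clickbench_extended query 4 above (2416.71 ms vs 3180.67 ms) gives a ratio of about 1.316, which matches the reported "1.32x slower".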

@Dandandan
Contributor Author

Hmm looks a bit mixed (different than on my machine). But I think the DuckDB approach might give more improvements.

@Dandandan Dandandan closed this Mar 6, 2026
@Dandandan
Contributor Author

run benchmarks

@Dandandan Dandandan reopened this Mar 6, 2026
@Dandandan Dandandan changed the title from "Test early emitting" to "Test aggregations" Mar 6, 2026
@Dandandan Dandandan changed the title from "Test aggregations" to "Agg bench" Mar 6, 2026
Introduce PartitionAggState to support multiple internal hash tables
in partial aggregation. When enabled via AggregateExec::with_num_agg_partitions(),
input rows are hashed by group keys (using the same hash as RepartitionExec)
and routed to separate smaller hash tables for better cache locality.

Defaults to 1 partition (no behavior change). The optimizer can set
higher values when a hash repartition follows the partial aggregate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
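
The idea in this commit message can be illustrated with a minimal, standalone sketch: hash each group key once, use the hash to pick one of several small hash tables, and update only that table. This is not the PR's code. `PartitionedCounts` is a hypothetical stand-in for `PartitionAggState`, it aggregates a plain count over string keys, and it uses the standard library's `DefaultHasher` rather than the hash shared with `RepartitionExec`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a partitioned aggregation state:
/// one small hash table per partition instead of a single large one.
struct PartitionedCounts {
    tables: Vec<HashMap<String, u64>>,
}

impl PartitionedCounts {
    fn new(num_partitions: usize) -> Self {
        assert!(num_partitions > 0, "need at least one partition");
        Self { tables: vec![HashMap::new(); num_partitions] }
    }

    /// Map a group key to the index of the table that owns it.
    fn partition_of(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() % self.tables.len() as u64) as usize
    }

    /// Route one input row to its table and bump the group's count.
    fn update(&mut self, key: &str) {
        let p = self.partition_of(key);
        *self.tables[p].entry(key.to_string()).or_insert(0) += 1;
    }

    /// Look up the current count for a group (0 if never seen).
    fn count(&self, key: &str) -> u64 {
        let p = self.partition_of(key);
        self.tables[p].get(key).copied().unwrap_or(0)
    }

    /// Total number of distinct groups across all tables.
    fn total_groups(&self) -> usize {
        self.tables.iter().map(|t| t.len()).sum()
    }
}
```

With `new(1)` this behaves like a single hash table, which mirrors the "defaults to 1 partition (no behavior change)" note above. With more partitions, every group still lives in exactly one table, so results are unchanged while each table stays smaller.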
@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing emit_aggregation (b793a93) to d025869 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

…tial aggregation

When num_agg_partitions > 1, the partial aggregate now acts as a
repartitioning operator, producing T output partitions directly via
channels. Each input task runs a GroupedHashAggregateStream with T
internal hash tables, then routes emitted batches to the correct
output channel using last_emitted_partition (no re-hashing needed
since internal tables use the same REPARTITION_RANDOM_STATE hash).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
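
The routing step described here can be sketched in isolation: because each emitted batch already comes from a known internal table, the partition index travels with the batch and selects the output channel directly, so nothing is hashed a second time. The sketch below uses `std::sync::mpsc` channels and a toy `Batch` type; `make_outputs` and `route_emitted` are hypothetical names, and the real operator works on Arrow record batches with async streams.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Toy batch: (group key, partial count) pairs. Hypothetical type.
type Batch = Vec<(String, u64)>;

/// Create one channel per output partition.
fn make_outputs(num_partitions: usize) -> (Vec<Sender<Batch>>, Vec<Receiver<Batch>>) {
    (0..num_partitions).map(|_| channel()).unzip()
}

/// Forward each emitted batch to the channel for its partition.
/// The index was fixed when the rows were first hashed into an
/// internal table, so no re-hashing happens here.
fn route_emitted(emitted: Vec<(usize, Batch)>, outputs: &[Sender<Batch>]) {
    for (partition, batch) in emitted {
        // A send error only means the receiver was dropped; this sketch ignores it.
        let _ = outputs[partition].send(batch);
    }
}
```

Each receiver then plays the role of one output partition of the partial aggregate.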
@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and emit_aggregation
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ emit_aggregation ┃    Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0 │  2440.48 ms │       2459.58 ms │ no change │
│ QQuery 1 │   891.01 ms │        899.48 ms │ no change │
│ QQuery 2 │  1834.30 ms │       1745.94 ms │ no change │
│ QQuery 3 │  1133.07 ms │       1099.29 ms │ no change │
│ QQuery 4 │  2421.85 ms │       2443.52 ms │ no change │
│ QQuery 5 │ 27133.12 ms │      27319.98 ms │ no change │
│ QQuery 6 │  3940.38 ms │       4116.71 ms │ no change │
│ QQuery 7 │  3056.37 ms │       2981.92 ms │ no change │
└──────────┴─────────────┴──────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 42850.58ms │
│ Total Time (emit_aggregation)   │ 43066.43ms │
│ Average Time (HEAD)             │  5356.32ms │
│ Average Time (emit_aggregation) │  5383.30ms │
│ Queries Faster                  │          0 │
│ Queries Slower                  │          0 │
│ Queries with No Change          │          8 │
│ Queries with Failure            │          0 │
└─────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.57 ms │          2.61 ms │     no change │
│ QQuery 1  │    51.67 ms │         51.69 ms │     no change │
│ QQuery 2  │   164.74 ms │        166.00 ms │     no change │
│ QQuery 3  │   171.47 ms │        174.36 ms │     no change │
│ QQuery 4  │  1123.14 ms │       1191.12 ms │  1.06x slower │
│ QQuery 5  │  1305.32 ms │       1396.23 ms │  1.07x slower │
│ QQuery 6  │     6.42 ms │          6.91 ms │  1.08x slower │
│ QQuery 7  │    56.15 ms │         56.72 ms │     no change │
│ QQuery 8  │  1542.79 ms │       1580.15 ms │     no change │
│ QQuery 9  │  1909.74 ms │       1991.22 ms │     no change │
│ QQuery 10 │   337.50 ms │        353.80 ms │     no change │
│ QQuery 11 │   388.74 ms │        401.65 ms │     no change │
│ QQuery 12 │  1215.68 ms │       1301.20 ms │  1.07x slower │
│ QQuery 13 │  2023.27 ms │       2097.86 ms │     no change │
│ QQuery 14 │  1259.58 ms │       1318.93 ms │     no change │
│ QQuery 15 │  1312.30 ms │       1320.23 ms │     no change │
│ QQuery 16 │  2690.43 ms │       2719.49 ms │     no change │
│ QQuery 17 │  2612.64 ms │       2712.60 ms │     no change │
│ QQuery 18 │  5831.12 ms │       5194.65 ms │ +1.12x faster │
│ QQuery 19 │   127.09 ms │        129.42 ms │     no change │
│ QQuery 20 │  1902.32 ms │       1950.30 ms │     no change │
│ QQuery 21 │  2140.98 ms │       2237.94 ms │     no change │
│ QQuery 22 │  3737.72 ms │       3927.23 ms │  1.05x slower │
│ QQuery 23 │ 28406.92 ms │      11802.77 ms │ +2.41x faster │
│ QQuery 24 │   198.50 ms │        218.08 ms │  1.10x slower │
│ QQuery 25 │   434.79 ms │        447.79 ms │     no change │
│ QQuery 26 │   191.42 ms │        198.25 ms │     no change │
│ QQuery 27 │  2634.56 ms │       2780.21 ms │  1.06x slower │
│ QQuery 28 │ 22039.37 ms │      25077.70 ms │  1.14x slower │
│ QQuery 29 │  1048.08 ms │       1050.17 ms │     no change │
│ QQuery 30 │  1295.74 ms │       1286.61 ms │     no change │
│ QQuery 31 │  1468.83 ms │       1390.48 ms │ +1.06x faster │
│ QQuery 32 │  5013.84 ms │       4680.44 ms │ +1.07x faster │
│ QQuery 33 │  5576.51 ms │       5856.25 ms │  1.05x slower │
│ QQuery 34 │  5858.81 ms │       6206.05 ms │  1.06x slower │
│ QQuery 35 │  2050.11 ms │       2077.04 ms │     no change │
│ QQuery 36 │   185.21 ms │        192.74 ms │     no change │
│ QQuery 37 │    71.84 ms │         74.55 ms │     no change │
│ QQuery 38 │   108.07 ms │        112.21 ms │     no change │
│ QQuery 39 │   324.00 ms │        348.12 ms │  1.07x slower │
│ QQuery 40 │    38.95 ms │         39.10 ms │     no change │
│ QQuery 41 │    36.03 ms │         39.15 ms │  1.09x slower │
│ QQuery 42 │    30.38 ms │         31.48 ms │     no change │
└───────────┴─────────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 108925.32ms │
│ Total Time (emit_aggregation)   │  96191.54ms │
│ Average Time (HEAD)             │   2533.15ms │
│ Average Time (emit_aggregation) │   2237.01ms │
│ Queries Faster                  │           4 │
│ Queries Slower                  │          12 │
│ Queries with No Change          │          27 │
│ Queries with Failure            │           0 │
└─────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 128.37 ms │        129.31 ms │     no change │
│ QQuery 2  │  31.98 ms │         33.04 ms │     no change │
│ QQuery 3  │  38.73 ms │         41.52 ms │  1.07x slower │
│ QQuery 4  │  34.72 ms │         34.78 ms │     no change │
│ QQuery 5  │  90.42 ms │         95.64 ms │  1.06x slower │
│ QQuery 6  │  24.76 ms │         24.79 ms │     no change │
│ QQuery 7  │ 154.11 ms │        165.02 ms │  1.07x slower │
│ QQuery 8  │  45.70 ms │         40.57 ms │ +1.13x faster │
│ QQuery 9  │ 119.16 ms │        110.38 ms │ +1.08x faster │
│ QQuery 10 │  80.84 ms │         71.38 ms │ +1.13x faster │
│ QQuery 11 │  18.88 ms │         18.93 ms │     no change │
│ QQuery 12 │  63.92 ms │         66.10 ms │     no change │
│ QQuery 13 │  53.09 ms │         53.39 ms │     no change │
│ QQuery 14 │  16.68 ms │         15.63 ms │ +1.07x faster │
│ QQuery 15 │  33.52 ms │         32.99 ms │     no change │
│ QQuery 16 │  30.09 ms │         30.51 ms │     no change │
│ QQuery 17 │ 174.36 ms │        171.00 ms │     no change │
│ QQuery 18 │ 292.07 ms │        300.11 ms │     no change │
│ QQuery 19 │  53.86 ms │         54.42 ms │     no change │
│ QQuery 20 │  63.36 ms │         63.29 ms │     no change │
│ QQuery 21 │ 201.63 ms │        204.42 ms │     no change │
│ QQuery 22 │  25.47 ms │         24.48 ms │     no change │
└───────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary               ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 1775.71ms │
│ Total Time (emit_aggregation)   │ 1781.71ms │
│ Average Time (HEAD)             │   80.71ms │
│ Average Time (emit_aggregation) │   80.99ms │
│ Queries Faster                  │         4 │
│ Queries Slower                  │         3 │
│ Queries with No Change          │        15 │
│ Queries with Failure            │         0 │
└─────────────────────────────────┴───────────┘

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate labels Mar 6, 2026
@Dandandan
Contributor Author

run benchmarks

- Fix ProducingOutput to carry partition index alongside batch, ensuring
  correct routing when emit_next_partition eagerly advances the index
- Add use_channels() method to centralize the decision of when to use
  the channel-based multi-output path
- Add update_cache_partitioning() to keep output partitioning in sync
  when limit_options or num_agg_partitions change
- Fix with_new_limit_options to recalculate output partitioning (prevents
  TopK aggregation from claiming Hash partitioning when channels won't
  be used)
- Guard CombinePartialFinalAggregate from combining when Partial has
  num_agg_partitions > 1
- Set num_agg_partitions in physical planner when repartitioning is enabled
- Keep spawned task references alive via Arc in output streams to prevent
  abort-on-drop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
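
The first fix in the list above can be illustrated with a small state machine. If the emit step advances its cursor eagerly, reading the cursor later to decide where a batch goes would point at the wrong partition, so the batch is stored together with the index it came from. This is a simplified sketch under that reading of the commit message. `EmitState` and `Emitter` are hypothetical, and only the names `ProducingOutput` and `emit_next_partition` come from the text.

```rust
/// Simplified emit state. `ProducingOutput` keeps the source
/// partition next to the batch so routing never depends on the cursor.
enum EmitState {
    Idle,
    ProducingOutput { partition: usize, batch: Vec<u64> },
}

/// Hypothetical emitter over per-partition buffers of values.
struct Emitter {
    tables: Vec<Vec<u64>>,
    next_partition: usize,
    state: EmitState,
}

impl Emitter {
    fn new(tables: Vec<Vec<u64>>) -> Self {
        Self { tables, next_partition: 0, state: EmitState::Idle }
    }

    /// Drain the next partition into the state; returns false when done.
    fn emit_next_partition(&mut self) -> bool {
        if self.next_partition >= self.tables.len() {
            return false;
        }
        let partition = self.next_partition;
        let batch = std::mem::take(&mut self.tables[partition]);
        // Eager advance: after this line the cursor no longer
        // identifies the batch that was just produced.
        self.next_partition += 1;
        self.state = EmitState::ProducingOutput { partition, batch };
        true
    }

    /// Hand out the pending batch with the partition it belongs to.
    fn take_output(&mut self) -> Option<(usize, Vec<u64>)> {
        match std::mem::replace(&mut self.state, EmitState::Idle) {
            EmitState::ProducingOutput { partition, batch } => Some((partition, batch)),
            EmitState::Idle => None,
        }
    }
}
```

After the first `emit_next_partition` call the cursor already reads 1, yet `take_output` still reports partition 0 for the batch it returns.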
@alamb-ghbot

🤖 ./gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing emit_aggregation (f76efe6) to 02ce571 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and emit_aggregation
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ emit_aggregation ┃       Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 0 │  2492.44 ms │       2412.17 ms │    no change │
│ QQuery 1 │   868.07 ms │        978.50 ms │ 1.13x slower │
│ QQuery 2 │  1728.10 ms │       2568.87 ms │ 1.49x slower │
│ QQuery 3 │  1128.56 ms │       1643.68 ms │ 1.46x slower │
│ QQuery 4 │  2402.03 ms │       7815.27 ms │ 3.25x slower │
│ QQuery 5 │ 27648.92 ms │     248181.67 ms │ 8.98x slower │
│ QQuery 6 │  3966.12 ms │       4263.17 ms │ 1.07x slower │
│ QQuery 7 │  3020.18 ms │      19349.35 ms │ 6.41x slower │
└──────────┴─────────────┴──────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)               │  43254.43ms │
│ Total Time (emit_aggregation)   │ 287212.69ms │
│ Average Time (HEAD)             │   5406.80ms │
│ Average Time (emit_aggregation) │  35901.59ms │
│ Queries Faster                  │           0 │
│ Queries Slower                  │           7 │
│ Queries with No Change          │           1 │
│ Queries with Failure            │           0 │
└─────────────────────────────────┴─────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.60 ms │          2.59 ms │     no change │
│ QQuery 1  │    51.86 ms │         52.37 ms │     no change │
│ QQuery 2  │   168.26 ms │        165.32 ms │     no change │
│ QQuery 3  │   166.90 ms │        175.62 ms │  1.05x slower │
│ QQuery 4  │  1129.45 ms │       1499.56 ms │  1.33x slower │
│ QQuery 5  │  1333.16 ms │       1703.62 ms │  1.28x slower │
│ QQuery 6  │     6.69 ms │          7.17 ms │  1.07x slower │
│ QQuery 7  │    55.33 ms │         59.50 ms │  1.08x slower │
│ QQuery 8  │  1518.88 ms │       2114.26 ms │  1.39x slower │
│ QQuery 9  │  1878.36 ms │       2323.07 ms │  1.24x slower │
│ QQuery 10 │   340.45 ms │        392.53 ms │  1.15x slower │
│ QQuery 11 │   399.70 ms │        451.50 ms │  1.13x slower │
│ QQuery 12 │  1222.48 ms │       1458.46 ms │  1.19x slower │
│ QQuery 13 │  2040.01 ms │       4518.06 ms │  2.21x slower │
│ QQuery 14 │  1233.76 ms │       1451.43 ms │  1.18x slower │
│ QQuery 15 │  1300.48 ms │       1740.70 ms │  1.34x slower │
│ QQuery 16 │  2622.57 ms │       3282.89 ms │  1.25x slower │
│ QQuery 17 │  2578.32 ms │       3154.57 ms │  1.22x slower │
│ QQuery 18 │  5916.43 ms │       5924.96 ms │     no change │
│ QQuery 19 │   127.60 ms │        131.71 ms │     no change │
│ QQuery 20 │  1896.26 ms │       1973.74 ms │     no change │
│ QQuery 21 │  2182.31 ms │       2225.55 ms │     no change │
│ QQuery 22 │  3851.69 ms │       3939.70 ms │     no change │
│ QQuery 23 │ 29897.66 ms │      12280.16 ms │ +2.43x faster │
│ QQuery 24 │   198.53 ms │        208.68 ms │  1.05x slower │
│ QQuery 25 │   452.75 ms │        455.83 ms │     no change │
│ QQuery 26 │   196.49 ms │        203.71 ms │     no change │
│ QQuery 27 │  2697.64 ms │       3120.52 ms │  1.16x slower │
│ QQuery 28 │ 24567.96 ms │      24389.24 ms │     no change │
│ QQuery 29 │  1050.83 ms │       1096.42 ms │     no change │
│ QQuery 30 │  1283.41 ms │       1397.11 ms │  1.09x slower │
│ QQuery 31 │  1405.27 ms │       2474.23 ms │  1.76x slower │
│ QQuery 32 │  5202.11 ms │      20593.28 ms │  3.96x slower │
│ QQuery 33 │  6342.26 ms │       6498.98 ms │     no change │
│ QQuery 34 │  6742.91 ms │       6575.12 ms │     no change │
│ QQuery 35 │  1273.77 ms │       1624.11 ms │  1.28x slower │
│ QQuery 36 │   192.71 ms │        222.67 ms │  1.16x slower │
│ QQuery 37 │    72.75 ms │         84.32 ms │  1.16x slower │
│ QQuery 38 │   113.15 ms │        118.60 ms │     no change │
│ QQuery 39 │   354.61 ms │        425.43 ms │  1.20x slower │
│ QQuery 40 │    40.02 ms │         41.85 ms │     no change │
│ QQuery 41 │    34.45 ms │         34.98 ms │     no change │
│ QQuery 42 │    33.54 ms │         39.18 ms │  1.17x slower │
└───────────┴─────────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary               ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 114176.41ms │
│ Total Time (emit_aggregation)   │ 120633.30ms │
│ Average Time (HEAD)             │   2655.27ms │
│ Average Time (emit_aggregation) │   2805.43ms │
│ Queries Faster                  │           1 │
│ Queries Slower                  │          25 │
│ Queries with No Change          │          17 │
│ Queries with Failure            │           0 │
└─────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ emit_aggregation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 133.42 ms │        177.86 ms │  1.33x slower │
│ QQuery 2  │  32.69 ms │         30.96 ms │ +1.06x faster │
│ QQuery 3  │  40.07 ms │         37.33 ms │ +1.07x faster │
│ QQuery 4  │  35.83 ms │         35.61 ms │     no change │
│ QQuery 5  │  90.62 ms │         93.15 ms │     no change │
│ QQuery 6  │  24.85 ms │         25.29 ms │     no change │
│ QQuery 7  │ 161.42 ms │        152.93 ms │ +1.06x faster │
│ QQuery 8  │  41.06 ms │         40.35 ms │     no change │
│ QQuery 9  │ 114.86 ms │        112.77 ms │     no change │
│ QQuery 10 │  73.00 ms │         71.33 ms │     no change │
│ QQuery 11 │  18.95 ms │         17.43 ms │ +1.09x faster │
│ QQuery 12 │  59.23 ms │         68.48 ms │  1.16x slower │
│ QQuery 13 │  55.09 ms │         57.64 ms │     no change │
│ QQuery 14 │  15.58 ms │         15.91 ms │     no change │
│ QQuery 15 │  33.74 ms │         33.14 ms │     no change │
│ QQuery 16 │  30.97 ms │         29.97 ms │     no change │
│ QQuery 17 │ 172.11 ms │        177.33 ms │     no change │
│ QQuery 18 │ 301.82 ms │        318.37 ms │  1.05x slower │
│ QQuery 19 │  63.61 ms │         51.78 ms │ +1.23x faster │
│ QQuery 20 │  76.33 ms │         62.64 ms │ +1.22x faster │
│ QQuery 21 │ 205.63 ms │        197.74 ms │     no change │
│ QQuery 22 │  24.47 ms │         26.21 ms │  1.07x slower │
└───────────┴───────────┴──────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary               ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)               │ 1805.34ms │
│ Total Time (emit_aggregation)   │ 1834.24ms │
│ Average Time (HEAD)             │   82.06ms │
│ Average Time (emit_aggregation) │   83.37ms │
│ Queries Faster                  │         6 │
│ Queries Slower                  │         4 │
│ Queries with No Change          │        12 │
│ Queries with Failure            │         0 │
└─────────────────────────────────┴───────────┘

@Dandandan Dandandan closed this Mar 7, 2026
Labels

core Core DataFusion crate optimizer Optimizer rules physical-plan Changes to the physical-plan crate
