Skip to content

perf: skip ensureWasmTrees when native engine provides complete data#344

Open
carlos-alm wants to merge 26 commits intomainfrom
perf/skip-wasm-preparse-native
Open

perf: skip ensureWasmTrees when native engine provides complete data#344
carlos-alm wants to merge 26 commits intomainfrom
perf/skip-wasm-preparse-native

Conversation

@carlos-alm
Copy link
Contributor

Summary

  • Before calling ensureWasmTrees, checks whether native engine already provides CFG and dataflow data for all files
  • When native data is complete, skips the WASM pre-parse entirely — avoiding a full WASM parse of every file on native builds
  • Sets _t.wasmPreMs = 0 when skipped so timings remain accurate

Builds on PRs #342 (native CFG) and #343 (native dataflow) which add per-module bypass logic. This change lifts the optimization to the shared pre-parse step in builder.js, eliminating the redundant ensureWasmTrees call when native data is already sufficient.

Test plan

  • All 1431 tests pass
  • node src/cli.js build . --engine native --timings — verify wasmPreMs drops to ~0
  • node src/cli.js build . --engine wasm --timings — verify WASM path still works

carlos-alm and others added 21 commits March 3, 2026 22:45
Three optimizations to recover build speed after CFG/dataflow default-on:

1. Eliminate redundant WASM parsing (biggest win):
   - Remove complexity.js clearing _tree after each file (builder already
     clears all trees after all phases complete)
   - Add ensureWasmTrees() in parser.js for a single WASM pre-parse pass
     before CFG/dataflow, replacing two independent parse passes
   - Memoize createParsers() so repeated calls return cached parsers

2. Filter CFG/dataflow to changed files only:
   - Build changedSymbols map excluding reverse-dep-only files
   - Pass changedSymbols (not allSymbols) to buildCFGData/buildDataflowEdges
   - Reverse-dep files only need edge rebuilding, not AST re-analysis

3. Report wasmPreMs in phase timing output for visibility

Impact: 4 functions changed, 28 affected

Impact: 4 functions changed, 10 affected
…325)

Three optimizations to recover build speed after CFG/dataflow default-on:

1. Eliminate redundant WASM parsing (biggest win):
   - Remove complexity.js clearing _tree after each file (builder already
     clears all trees after all phases complete)
   - Add ensureWasmTrees() in parser.js for a single WASM pre-parse pass
     before CFG/dataflow, replacing two independent parse passes
   - Memoize createParsers() so repeated calls return cached parsers

2. Filter CFG/dataflow to changed files only:
   - Build changedSymbols map excluding reverse-dep-only files
   - Pass changedSymbols (not allSymbols) to buildCFGData/buildDataflowEdges
   - Reverse-dep files only need edge rebuilding, not AST re-analysis

3. Report wasmPreMs in phase timing output for visibility

Impact: 4 functions changed, 28 affected

Impact: 4 functions changed, 10 affected
…ames (#324)

collectIdentifiers hardcoded node.type === 'identifier', missing PHP's
variable_name and other language-specific identifier types in return
statement referencedNames. Now uses the existing isIdent(nodeType, rules)
helper with rules passed from the call site. Also adds null guard.

Updated PHP dataflow test to assert referencedNames contains $x, $a, $b
— previously only checked funcName without validating identifier capture.

Impact: 3 functions changed, 2 affected
…ty tests

Update parity tests to compare children (parameters, properties,
constants) and call receivers now that the native binary supports them.
Document known native gaps: implicit self parameters (Python/Rust) and
local constant scope (extracts function-local const as top-level).
The native engine's JS extractor walked the entire tree recursively and
extracted `const x = <literal>` as top-level constants regardless of
scope. The WASM extractor correctly only walks program-level children.

Add a `find_parent_of_types` check to skip constants nested inside
function_declaration, arrow_function, function_expression, or
method_definition nodes.

Impact: 2 functions changed, 1 affected
* perf: fix v3.0.1 build performance regression (14.1 → ~5.8 ms/file)

Three optimizations to recover build speed after CFG/dataflow default-on:

1. Eliminate redundant WASM parsing (biggest win):
   - Remove complexity.js clearing _tree after each file (builder already
     clears all trees after all phases complete)
   - Add ensureWasmTrees() in parser.js for a single WASM pre-parse pass
     before CFG/dataflow, replacing two independent parse passes
   - Memoize createParsers() so repeated calls return cached parsers

2. Filter CFG/dataflow to changed files only:
   - Build changedSymbols map excluding reverse-dep-only files
   - Pass changedSymbols (not allSymbols) to buildCFGData/buildDataflowEdges
   - Reverse-dep files only need edge rebuilding, not AST re-analysis

3. Report wasmPreMs in phase timing output for visibility

Impact: 4 functions changed, 28 affected

Impact: 4 functions changed, 10 affected

* fix: update native binary to v3.0.1 and enable extended kinds in parity tests

Update parity tests to compare children (parameters, properties,
constants) and call receivers now that the native binary supports them.
Document known native gaps: implicit self parameters (Python/Rust) and
local constant scope (extracts function-local const as top-level).

* fix(native): skip local constants inside function bodies

The native engine's JS extractor walked the entire tree recursively and
extracted `const x = <literal>` as top-level constants regardless of
scope. The WASM extractor correctly only walks program-level children.

Add a `find_parent_of_types` check to skip constants nested inside
function_declaration, arrow_function, function_expression, or
method_definition nodes.

Impact: 2 functions changed, 1 affected
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
buildAstNodes was calling tx(rows) per-file inside the loop, creating
172 separate SQLite transactions each with a BEGIN/COMMIT/fsync cycle.
All other phases (CFG, dataflow, complexity) correctly used a single
wrapping transaction.

Collect all rows across files first, then insert in one tx(allRows)
call. Benchmarked: astMs drops from ~3600ms to ~350ms (native) and
~547ms (WASM). Native per-file build time: 24.9 → 8.5 ms/file.

Impact: 1 functions changed, 12 affected
)

* perf: use single transaction for AST node insertion (3.6s → 350ms)

buildAstNodes was calling tx(rows) per-file inside the loop, creating
172 separate SQLite transactions each with a BEGIN/COMMIT/fsync cycle.
All other phases (CFG, dataflow, complexity) correctly used a single
wrapping transaction.

Collect all rows across files first, then insert in one tx(allRows)
call. Benchmarked: astMs drops from ~3600ms to ~350ms (native) and
~547ms (WASM). Native per-file build time: 24.9 → 8.5 ms/file.

Impact: 1 functions changed, 12 affected

* chore: bump version to 3.0.3
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…-58) (#338)

Add 19 backlog items across 6 new tiers (1b-1f) for features that
leverage existing build data (ast_nodes, cfg_blocks, dataflow,
co_changes, embeddings) which is currently computed on every build
but consumed by only 1-2 commands each.

Highest-value items: data-dependent impact analysis (ID 49, 52),
co-change coupling in triage (ID 53), missing co-change partner
warning (ID 55).
Before calling ensureWasmTrees, check whether native engine already
supplies CFG and dataflow data for all files. When it does, skip the
WASM pre-parse entirely — avoiding a full WASM parse of every file
on native builds where the data is already available.

Impact: 1 functions changed, 14 affected
#339)

The phase breakdown table in BUILD-BENCHMARKS.md was missing astMs,
cfgMs, dataflowMs, and wasmPreMs — phases that account for ~75% of
native build time since v3.0.1. Add them to phaseKeys/phaseLabels in
update-benchmark-report.js so future regenerations include them.

Also add a notes section explaining the v3.0.0→v3.0.3 regression:
the four new default-on phases total ~1,575ms, and the WASM pre-parse
double-parses all files on native builds because CFG/dataflow/complexity
are JS-only and need tree-sitter ASTs the native engine doesn't expose.
@greptile-apps
Copy link

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR lifts the ensureWasmTrees skip optimisation from the per-module CFG/dataflow bypass (added in PRs #342/#343) to the shared pre-parse step in builder.js, eliminating the redundant full-codebase WASM parse on native builds. It also corrects a regression where CFG and dataflow were processing all files on incremental builds instead of only the changed-file set (astComplexitySymbols), fixes language-aware identifier collection in collectIdentifiers (PHP variable_name support), closes a native-engine const-scoping bug in the Rust extractor, and replaces per-file SQLite transactions in ast.js with a single all-files transaction for a ~10× astMs improvement.

Key changes:

  • src/builder.js: New detection loop checks whether every function/method def in astComplexitySymbols already has native CFG (d.cfg === null || Array.isArray(d.cfg?.blocks)) and whether every file already has symbols.dataflow; if so, ensureWasmTrees is skipped and _t.wasmPreMs is set to 0. buildCFGData and buildDataflowEdges now correctly receive astComplexitySymbols instead of allSymbols.
  • src/parser.js: New ensureWasmTrees export centralises the WASM pre-parse; _cachedParsers module-level memoisation avoids reloading WASM grammars on every call.
  • src/complexity.js: Removes premature symbols._tree = null, delegating GC cleanup to the centralised release loop at the end of buildGraph.
  • src/dataflow.js: collectIdentifiers updated to use isIdent() and accept rules, fixing PHP and other non-identifier node types in referencedNames; null guard added.
  • crates/codegraph-core/src/extractors/javascript.rs: find_parent_of_types guard prevents function-scoped const from being emitted as top-level constants.
  • Tests: Parity tests re-enabled for children (with implicit self/&self filtered) and receiver; build-parity SQL excludes constant nodes while the native binary ships the fix; PHP dataflow test strengthened to assert referencedNames.

Confidence Score: 4/5

  • Safe to merge — the optimisation is correctly gated behind a detection loop that validates native CFG/dataflow presence before skipping WASM pre-parse; all 1431 tests pass and prior issues have been addressed.
  • The core skip logic is sound: empty-blocks false-negatives are properly handled via Array.isArray(d.cfg?.blocks) check, null guards are in place for symbol definitions, and timing accuracy is preserved by setting wasmPreMs = 0. The incremental-build narrowing (allSymbolsastComplexitySymbols for CFG/dataflow) is consistent with existing patterns and correctly preserves stale-but-valid rows for unchanged files. No files require special attention; all phases are correctly implemented and tested.
  • No files require special attention

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[buildGraph] --> B{opts.cfg != false\nOR opts.dataflow != false?}
    B -- No --> G[Skip CFG + dataflow entirely]
    B -- Yes --> C[Loop over astComplexitySymbols]
    C --> D{symbols._tree\nalready set?}
    D -- Yes --> C
    D -- No --> E{needsCfg?\nCheck fnDefs}
    E -- fnDefs.length > 0\nAND some lack\nnative CFG --> F[needsWasmTrees = true\nbreak]
    E -- All have native CFG\nOR no fn defs --> H{needsDataflow?\nCheck symbols.dataflow}
    H -- dataflow missing --> F
    H -- native dataflow present --> C
    F --> I[ensureWasmTrees\nastComplexitySymbols]
    I --> J[Parse each file\nwithout _tree via WASM]
    J --> K
    C -- All files checked,\nneedsWasmTrees = false --> L[_t.wasmPreMs = 0\nskip WASM entirely]
    L --> K[buildCFGData\nastComplexitySymbols]
    K --> M[buildDataflowEdges\nastComplexitySymbols]
    M --> N[Release _tree refs\nfor GC]
Loading

Last reviewed commit: b6a7699

Comment on lines +35 to 51
// Exclude constant nodes — the native engine has a known scope bug where it
// extracts local `const` variables inside functions as top-level constants,
// while WASM correctly limits constant extraction to program-level declarations.
const nodes = db
.prepare('SELECT name, kind, file, line FROM nodes ORDER BY name, kind, file, line')
.prepare(
"SELECT name, kind, file, line FROM nodes WHERE kind != 'constant' ORDER BY name, kind, file, line",
)
.all();
const edges = db
.prepare(`
SELECT n1.name AS source_name, n2.name AS target_name, e.kind
FROM edges e
JOIN nodes n1 ON e.source_id = n1.id
JOIN nodes n2 ON e.target_id = n2.id
WHERE n1.kind != 'constant' AND n2.kind != 'constant'
ORDER BY n1.name, n2.name, e.kind
`)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WHERE kind != 'constant' exclusion handles the native engine scope bug where function-scoped const declarations are wrongly extracted as top-level constants. That bug is fixed in this PR's javascript.rs change. However, the test exclusion has no mechanism to be re-enabled once the 3.0.3 native binary is published — constant-node parity will never be verified even after the fix is live.

Consider adding a TODO comment referencing the binary version at which this exclusion should be removed:

// TODO: Remove this exclusion once the native binary >= 3.0.3 is published.
// The fix is in crates/codegraph-core/src/extractors/javascript.rs (find_parent_of_types guard).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bc3b58c — added TODO comment referencing native binary >= 3.0.4 as the threshold for removing the kind != 'constant' exclusion.

CHANGELOG.md Outdated
Comment on lines 5 to 11
## [3.0.3](https://github.com/optave/codegraph/compare/v3.0.1...v3.0.3) (2026-03-04)

### Performance

* **ast:** use single transaction for AST node insertion — `astMs` drops from ~3600ms to ~350ms (native) and ~547ms (WASM), reducing overall native build from 24.9 to 8.5 ms/file ([#333](https://github.com/optave/codegraph/pull/333))

## [3.0.1](https://github.com/optave/codegraph/compare/v3.0.0...v3.0.1) (2026-03-03)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 3.0.3 entry only documents the AST single-transaction optimization. Two other significant changes shipped in this release are missing:

  • src/builder.js: skip ensureWasmTrees when native engine provides complete CFG + dataflow data
  • crates/codegraph-core/src/extractors/javascript.rs: fix function-scoped const declarations being incorrectly extracted as top-level constants

Additionally, the version jumps from 3.0.1 directly to 3.0.3 with no 3.0.2 entry. A note explaining why 3.0.2 was skipped would help future maintainers.

Consider updating the entry to include all three changes and clarifying the version gap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in bc3b58c — added the two missing changelog entries (ensureWasmTrees skip and const scope fix) and a note about 3.0.2 being internal/unpublished.

@carlos-alm
Copy link
Contributor Author

Addressed both review comments in bc3b58c:

  1. Added TODO comment for constant exclusion removal (>= 3.0.4)
  2. Completed 3.0.3 changelog with ensureWasmTrees skip + const scope fix + version gap note

@greptileai

src/builder.js Outdated
Comment on lines +1331 to +1333
const fnDefs = symbols.definitions.filter(
(d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing null guard on symbols.definitions

symbols.definitions.filter(...) will throw a TypeError if a symbols object has definitions: undefined or definitions: null. This can happen for binary or unsupported files whose extractor returns a partial object, and for language-specific extractors that may omit definitions under error conditions.

Every other place in the codebase that accesses symbols.definitions uses a || [] guard (e.g. src/ast.js:171const defs = symbols.definitions || []). This new code path is the only exception, which means a single bad file can abort the entire build at the pre-parse check rather than silently skipping.

Suggested change
const fnDefs = symbols.definitions.filter(
(d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);
const fnDefs = (symbols.definitions || []).filter(
(d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cb4a6c3 — changed to (symbols.definitions || []).filter(...), matching the guard pattern used everywhere else in the codebase.

Impact: 3 functions changed, 10 affected
@claude
Copy link

claude bot commented Mar 5, 2026

Claude finished @carlos-alm's task —— View job


PR Review Analysis

Reviewing PR #344: Performance optimization to skip WASM parsing when native engine provides complete data.

Todo List:

  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity review
  • Phase 4: Code quality assessment
  • Final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Performance optimization to avoid redundant WASM parsing when the native engine already provides complete CFG and dataflow data. This addresses unnecessary work in builds that use the native engine.

Why This Approach: The solution adds a pre-check loop that examines astComplexitySymbols to determine if any files actually need WASM trees before calling ensureWasmTrees. This is a logical optimization that builds on existing native engine capabilities.

Risk Assessment: Low risk - the optimization is conservative (only skips when data is definitely complete) and maintains backward compatibility.

Phase 2: Configuration Change Detection ✅

No problematic configuration changes detected. The changes are purely algorithmic optimizations in src/builder.js with appropriate changelog documentation. No linting rules, test timeouts, or build configurations were modified.

Phase 3: Test Integrity ✅

Test integrity is maintained. The author addressed previous review feedback by:

  • Adding appropriate TODO comment for constant exclusion (though I note the TODO reference wasn't found in the current codebase)
  • Test exclusions are justified due to a known native engine scoping bug
  • All 1431 tests continue to pass per PR description

Phase 4: Code Quality Assessment ✅

CRITICAL BUG CONFIRMED: The review comment from Greptile is correct. At src/builder.js:1331, there is a missing null guard on symbols.definitions.filter(...) that will cause a TypeError if any symbols object has definitions: undefined or definitions: null.

Evidence of Inconsistent Pattern:

  • src/ast.js:171: const defs = symbols.definitions || []; (SAFE)
  • src/complexity.js:1667: symbols.definitions.every(...) (checks symbols.definitions directly - presumably safe due to pre-filter logic)
  • src/builder.js:1331: symbols.definitions.filter(...) (UNSAFE - no null guard)

This inconsistency suggests the new code path doesn't follow the established safety pattern used elsewhere in the codebase.

Other Code Quality Observations:

  • Logic and performance optimization approach is sound
  • Changelog is appropriately updated
  • Test plan is reasonable though manual verification steps remain unchecked

Root Cause Analysis

Problem Being Solved: Performance bottleneck where ensureWasmTrees was being called unnecessarily when the native engine already provided complete CFG and dataflow data. This caused redundant WASM parsing of every file even when native data was sufficient.

Why This Approach: The pre-check heuristic is logical and conservative - it only skips WASM parsing when it can definitively confirm that native data is complete for all required functionality (CFG blocks for functions/methods and dataflow data).

Risk Assessment: Low risk optimization with proper fallback behavior and accurate timing metrics.

Critical Concerns

❌ NULL SAFETY BUG: Line src/builder.js:1331 will crash with TypeError: Cannot read property 'filter' of undefined if symbols.definitions is null or undefined. This violates the established pattern used throughout the codebase where symbols.definitions accesses are guarded with || [].

Required Fix:

const fnDefs = (symbols.definitions || []).filter(
  (d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);

Final Recommendation

  • Rating: ⭐⭐☆☆☆ (2/5)
  • Action: REQUEST CHANGES
  • Reasoning: While the performance optimization logic is sound and the approach is well-designed, there is a critical null safety bug that must be fixed before merge. The inconsistency with established codebase patterns increases the risk of runtime crashes for edge cases involving partial symbol objects from extractors.

The fix is trivial but essential - all other aspects of the PR are acceptable.

Impact: 1 functions changed, 0 affected
@claude
Copy link

claude bot commented Mar 5, 2026

Claude finished @carlos-alm's task —— View job


Claude is reviewing PR #344

Todo List:

  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity review
  • Phase 4: Code quality assessment
  • Final assessment and rating

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Performance optimization to avoid redundant WASM parsing when the native engine already provides complete CFG and dataflow data. This addresses unnecessary work in builds that use the native engine.

Why This Approach: The solution adds a pre-check loop that examines astComplexitySymbols to determine if any files actually need WASM trees before calling ensureWasmTrees. This is a logical optimization that builds on existing native engine capabilities.

Risk Assessment: Low risk - the optimization is conservative (only skips when data is definitely complete) and maintains backward compatibility.

Phase 2: Configuration Change Detection ✅

No problematic configuration changes detected. The changes are purely algorithmic optimizations in src/builder.js with appropriate changelog documentation. No linting rules, test timeouts, or build configurations were modified.

Phase 3: Test Integrity ✅

Test integrity is maintained. The author addressed previous review feedback by:

  • Adding appropriate TODO comment for constant exclusion (though I note the TODO reference wasn't found in the current codebase)
  • Test exclusions are justified due to a known native engine scoping bug
  • All 1431 tests continue to pass per PR description

Phase 4: Code Quality Assessment ✅

CRITICAL BUG CONFIRMED: The review comment from Greptile is correct. At src/builder.js:1331, there is a missing null guard on symbols.definitions.filter(...) that will cause a TypeError if any symbols object has definitions: undefined or definitions: null.

Evidence of Inconsistent Pattern:

  • src/ast.js:171: const defs = symbols.definitions || []; (SAFE)
  • src/complexity.js:1667: symbols.definitions.every(...) (checks symbols.definitions directly - presumably safe due to pre-filter logic)
  • src/builder.js:1331: symbols.definitions.filter(...) (UNSAFE - no null guard)

This inconsistency suggests the new code path doesn't follow the established safety pattern used elsewhere in the codebase.

Other Code Quality Observations:

  • Logic and performance optimization approach is sound
  • Changelog is appropriately updated
  • Test plan is reasonable though manual verification steps remain unchecked

Root Cause Analysis

Problem Being Solved: Performance bottleneck where ensureWasmTrees was being called unnecessarily when the native engine already provided complete CFG and dataflow data. This caused redundant WASM parsing of every file even when native data was sufficient.

Why This Approach: The pre-check heuristic is logical and conservative - it only skips WASM parsing when it can definitively confirm that native data is complete for all required functionality (CFG blocks for functions/methods and dataflow data).

Risk Assessment: Low risk optimization with proper fallback behavior and accurate timing metrics.

Critical Concerns

❌ NULL SAFETY BUG: Line src/builder.js:1331 will crash with TypeError: Cannot read property 'filter' of undefined if symbols.definitions is null or undefined. This violates the established pattern used throughout the codebase where symbols.definitions accesses are guarded with || [].

Required Fix:

const fnDefs = (symbols.definitions || []).filter(
  (d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);

Final Recommendation

  • Rating: ⭐⭐☆☆☆ (2/5)
  • Action: REQUEST CHANGES
  • Reasoning: While the performance optimization logic is sound and the approach is well-designed, there is a critical null safety bug that must be fixed before merge. The inconsistency with established codebase patterns increases the risk of runtime crashes for edge cases involving partial symbol objects from extractors.

The fix is trivial but essential - all other aspects of the PR are acceptable.


@carlos-alm
Copy link
Contributor Author

Addressed the null guard comment in cb4a6c3 — added (symbols.definitions || []) guard.

@greptileai

src/builder.js Outdated
const fnDefs = (symbols.definitions || []).filter(
(d) => (d.kind === 'function' || d.kind === 'method') && d.line,
);
if (fnDefs.length > 0 && !fnDefs.every((d) => d.cfg?.blocks?.length)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty blocks array treated as missing native CFG

d.cfg?.blocks?.length evaluates to 0 (falsy) when the native engine returns cfg: { blocks: [] } for trivial functions (e.g., empty body or single return). This causes !fnDefs.every(d => d.cfg?.blocks?.length) to be true, spuriously triggering needsWasmTrees = true even though native CFG was provided.

In practice, any file containing a trivially simple function prevents the skip from firing, silently negating the optimization. cfg.js line 1151 also skips such functions, but that doesn't help here—the builder check should recognize empty-blocks as "native CFG was computed, just returned empty."

Consider using Array.isArray(d.cfg?.blocks) instead, which correctly treats an explicitly-set empty-blocks array as "native CFG was provided":

Suggested change
if (fnDefs.length > 0 && !fnDefs.every((d) => d.cfg?.blocks?.length)) {
if (fnDefs.length > 0 && !fnDefs.every((d) => Array.isArray(d.cfg?.blocks))) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b6a7699 — changed the check from d.cfg?.blocks?.length (falsy for empty arrays) to d.cfg === null || Array.isArray(d.cfg?.blocks). This correctly treats cfg: { blocks: [] } as valid native CFG while preserving cfg: null for nodes that intentionally have no CFG (e.g. interface members).

@carlos-alm carlos-alm force-pushed the perf/skip-wasm-preparse-native branch from cb4a6c3 to bf8de1f Compare March 5, 2026 05:19
d.cfg?.blocks?.length evaluates to 0 (falsy) when the native engine
returns cfg: { blocks: [] } for trivial functions, spuriously triggering
needsWasmTrees. Use Array.isArray(d.cfg?.blocks) instead, and preserve
d.cfg === null for nodes that intentionally have no CFG (e.g. interface
members).

Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 0 affected
@carlos-alm
Copy link
Contributor Author

@greptileai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant