perf: fix v3.0.1 build regression (14.1 → ~5.8 ms/file) #325
Three optimizations to recover build speed after CFG/dataflow default-on:
1. Eliminate redundant WASM parsing (biggest win):
- Remove complexity.js clearing _tree after each file (builder already
clears all trees after all phases complete)
- Add ensureWasmTrees() in parser.js for a single WASM pre-parse pass
before CFG/dataflow, replacing two independent parse passes
- Memoize createParsers() so repeated calls return cached parsers
2. Filter CFG/dataflow to changed files only:
- Build changedSymbols map excluding reverse-dep-only files
- Pass changedSymbols (not allSymbols) to buildCFGData/buildDataflowEdges
- Reverse-dep files only need edge rebuilding, not AST re-analysis
3. Report wasmPreMs in phase timing output for visibility
Impact: 4 functions changed, 28 affected
Impact: 4 functions changed, 10 affected
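The changed-files filter in item 2 can be illustrated with a minimal sketch. The data shapes here are assumptions made for illustration, not the project's actual structures: `allSymbols` is taken to be a Map from file path to extracted symbols, and `reverseDepOnly` a Set of files that were pulled into the build only because they depend on a changed file. The helper name `buildChangedSymbols` and the file paths are hypothetical.

```javascript
// Hypothetical helper: keep only files whose own content changed.
// allSymbols: Map<filePath, symbols>; reverseDepOnly: Set<filePath>.
// Both shapes are assumed for this sketch.
function buildChangedSymbols(allSymbols, reverseDepOnly) {
  const changedSymbols = new Map();
  for (const [file, symbols] of allSymbols) {
    // Reverse-dep-only files need their edges rebuilt, not AST re-analysis,
    // so they are left out of the map handed to the CFG/dataflow phases.
    if (!reverseDepOnly.has(file)) {
      changedSymbols.set(file, symbols);
    }
  }
  return changedSymbols;
}

// Illustrative input: two edited files plus one unchanged importer.
const allSymbols = new Map([
  ['src/parser.js', { definitions: ['ensureWasmTrees'] }],
  ['src/complexity.js', { definitions: ['exampleMetric'] }],
  ['src/cli.js', { definitions: ['main'] }],
]);
const reverseDepOnly = new Set(['src/cli.js']);

const changedSymbols = buildChangedSymbols(allSymbols, reverseDepOnly);
console.log([...changedSymbols.keys()]);
```

Under these assumptions, the expensive phases would receive `changedSymbols` instead of `allSymbols`, while edge rebuilding can still iterate over every file.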
Claude finished @carlos-alm's task
Code Review Complete
Reviewing PR: perf: fix v3.0.1 build regression (14.1 → ~5.8 ms/file)
Todo List:
Root Cause Analysis
Problem Being Solved: a performance regression from 5.8 ms to 14.1 ms per file (about 2.4x slower), introduced when CFG/dataflow analysis became default. Root causes identified:
Why This Approach: The solution directly targets each wasteful operation:
Risk Assessment: LOW RISK - Changes are surgical optimizations that preserve existing behavior while eliminating waste.
Technical Implementation Review
Strengths:
Code Quality:
Performance Logic:
Critical Concerns
None identified. This is a well-executed performance optimization with:
Final Recommendation
The implementation is technically sound, well-documented, and delivers the promised performance improvements through targeted waste elimination.
Greptile Summary
This PR delivers a substantial 59% performance improvement (14.1 → 5.8 ms/file) by eliminating redundant WASM parsing across build phases. Key changes:
The optimization is well-designed and maintains correctness while significantly reducing build times. Trees are still properly released at the end of the build process for garbage collection.
Confidence Score: 5/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph "Before (v3.0.1)"
A1[Complexity Phase] --> B1[Parse files via WASM]
B1 --> C1[Compute metrics]
C1 --> D1[Clear _tree]
D1 --> E1[CFG Phase]
E1 --> F1[Re-parse files via WASM]
F1 --> G1[Build CFG]
G1 --> H1[Dataflow Phase]
H1 --> I1[Re-parse files via WASM again]
I1 --> J1[Build dataflow]
end
subgraph "After (v3.0.1 fix)"
A2[ensureWasmTrees] --> B2[Parse once via memoized parsers]
B2 --> C2[Complexity Phase]
C2 --> D2[Use cached _tree]
D2 --> E2[CFG Phase]
E2 --> F2[Reuse _tree from Complexity]
F2 --> G2[Dataflow Phase]
G2 --> H2[Reuse _tree from Complexity]
H2 --> I2[Clear _tree after all phases]
end
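The "After" flow can be sketched in a few lines of JavaScript. This is a simplified, synchronous illustration, not the code from parser.js: the parser is a stand-in that counts whitespace-separated tokens instead of loading WASM grammars, and the `source` field, the per-phase result fields, and `runPhases` are hypothetical. Only the names `createParsers`, `ensureWasmTrees`, `_tree`, and `wasmPreMs` come from the PR.

```javascript
// Counters make the memoization and the single parse pass observable.
let grammarLoads = 0;
let parseCalls = 0;
let cachedParsers = null;

// Memoized factory: the (stand-in) grammar load happens only on the first call.
function createParsers() {
  if (cachedParsers) return cachedParsers;
  grammarLoads += 1;
  cachedParsers = {
    // Fake parse: a real parser would return a syntax tree.
    parse(source) {
      parseCalls += 1;
      return { nodeCount: source.split(/\s+/).filter(Boolean).length };
    },
  };
  return cachedParsers;
}

// Parse only files that do not already carry a tree.
function ensureWasmTrees(fileSymbols) {
  const parsers = createParsers();
  for (const symbols of fileSymbols.values()) {
    if (!symbols._tree) symbols._tree = parsers.parse(symbols.source);
  }
}

function runPhases(fileSymbols) {
  const timings = {};
  const start = performance.now();
  ensureWasmTrees(fileSymbols); // single shared pre-parse pass
  timings.wasmPreMs = performance.now() - start;

  for (const phase of ['complexity', 'cfg', 'dataflow']) {
    ensureWasmTrees(fileSymbols); // no-op here: every tree is already present
    for (const symbols of fileSymbols.values()) {
      symbols[phase] = symbols._tree.nodeCount; // placeholder analysis
    }
  }

  // Release trees only after every phase has finished.
  for (const symbols of fileSymbols.values()) symbols._tree = null;
  return timings;
}

const files = new Map([
  ['a.js', { source: 'const a = 1' }],
  ['b.js', { source: 'function b() {}' }],
]);
const timings = runPhases(files);
console.log({ grammarLoads, parseCalls, wasmPreMs: timings.wasmPreMs });
```

The point of the sketch is the ordering: no phase clears `_tree` on its own, so later phases can reuse the tree, and cleanup happens once at the end.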
Last reviewed commit: 5915af4
Summary
- complexity.js nullified _tree after each file, forcing CFG and dataflow to each re-create parsers and re-parse all files via WASM. Removed that nullification (builder already clears trees after all phases), added ensureWasmTrees() for a single shared pre-parse pass, and memoized createParsers() so WASM grammars only load once.
- The astComplexitySymbols filter (which excludes reverse-dep-only files) is now also applied to CFG and dataflow phases, skipping unchanged files that only needed edge rebuilding.
- Reports wasmPreMs in phase timing for visibility into the pre-parse cost.
Benchmarks (self-build, 172 files, native engine)
Test plan
- CFG (cfg buildGraph --format mermaid) and dataflow (dataflow buildGraph)
- diff-impact --staged -T shows clean 4-function impact