feat: add sequence command for Mermaid sequence diagrams#345

Open
carlos-alm wants to merge 9 commits into main from feat/sequence-command

Conversation

@carlos-alm
Contributor

Summary

  • Adds codegraph sequence <name> command that generates Mermaid sequenceDiagram output from the call graph
  • Participants are files (not individual functions), keeping diagrams readable and matching Greptile's style
  • BFS forward from entry point with optional --dataflow flag to annotate parameter names and return arrows
  • Exposed via CLI, MCP tool, and programmatic API

Example: codegraph sequence flowData -T --depth 2

sequenceDiagram
    participant flow as flow.js
    participant db as db.js
    participant paginate as paginate.js
    participant queries as queries.js
    flow->>db: openReadonlyOrFail
    flow->>flow: findBestMatch
    flow->>flow: entryPointType
    flow->>queries: isTestFile
    flow->>paginate: paginateResult
    db->>db: findDbPath
    flow->>queries: isTestFile
    paginate->>paginate: paginate
    note right of flow: Truncated at depth 2

Files changed

File Change
src/sequence.js New — core module: sequenceData(), sequenceToMermaid(), sequence()
tests/integration/sequence.test.js New — 11 integration tests with hand-crafted DB
src/cli.js Register sequence <name> command
src/mcp.js Add sequence MCP tool definition + handler
src/index.js Export programmatic API
tests/unit/mcp.test.js Add sequence to expected tool names
CLAUDE.md Architecture table + command reference

Deferred (follow-up PR)

  • Add sequence to BATCH_COMMANDS in src/batch.js
  • Add sequence: 50 to MCP_DEFAULTS in src/paginate.js

Test plan

  • All 1443 tests pass (79 test files)
  • Biome lint clean on all changed files
  • codegraph sequence buildGraph -T produces valid Mermaid (15 participants, 218 messages)
  • codegraph sequence buildGraph -T -j outputs valid JSON
  • codegraph sequence flowData -T --depth 2 produces compact diagram

@carlos-alm
Contributor Author

Sequence diagram (generated by this PR)

codegraph sequence buildGraph -T --depth 3

sequenceDiagram
    participant builder as builder.js
    participant ast as ast.js
    participant cfg as cfg.js
    participant complexity as complexity.js
    participant config as config.js
    participant constants as constants.js
    participant dataflow as dataflow.js
    participant db as db.js
    participant journal as journal.js
    participant logger as logger.js
    participant native as native.js
    participant parser as parser.js
    participant registry as registry.js
    participant resolve as resolve.js
    participant structure as structure.js
    builder->>db: openDb
    builder->>db: initSchema
    builder->>config: loadConfig
    builder->>parser: getActiveEngine
    builder->>logger: info
    builder->>db: getBuildMeta
    builder->>builder: loadPathAliases
    builder->>constants: normalizePath
    builder->>builder: collectFiles
    builder->>builder: getChangedFiles
    builder->>parser: parseFilesAuto
    builder->>cfg: buildCFGData
    builder->>dataflow: buildDataflowEdges
    builder->>db: closeDb
    builder->>journal: writeJournalHeader
    builder->>logger: debug
    builder->>builder: purgeFilesFromGraph
    builder->>builder: readFileSafe
    builder->>builder: fileStat
    builder->>builder: fileHash
    builder->>resolve: resolveImportsBatch
    builder->>builder: getResolved
    builder->>builder: isBarrelFile
    builder->>builder: resolveBarrelExport
    builder->>resolve: computeConfidence
    builder->>structure: buildStructure
    builder->>structure: classifyNodeRoles
    builder->>ast: buildAstNodes
    builder->>complexity: buildComplexityMetrics
    builder->>parser: ensureWasmTrees
    builder->>logger: warn
    builder->>db: setBuildMeta
    builder->>registry: registerRepo
    db->>db: acquireAdvisoryLock
    db->>logger: debug
    config->>logger: debug
    config->>config: mergeConfig
    config->>config: resolveSecrets
    config->>config: applyEnvOverrides
    parser->>parser: resolveEngine
    cfg->>parser: createParsers
    cfg->>complexity: findFunctionNode
    cfg->>cfg: buildFunctionCFG
    cfg->>logger: info
    dataflow->>parser: createParsers
    dataflow->>dataflow: extractDataflow
    dataflow->>dataflow: resolveNode
    dataflow->>logger: info
    db->>db: releaseAdvisoryLock
    journal->>logger: warn
    resolve->>native: loadNative
    resolve->>resolve: convertAliasesForNative
    resolve->>constants: normalizePath
    structure->>constants: normalizePath
    structure->>logger: debug
    structure->>structure: median
    ast->>logger: debug
    ast->>ast: findParentDef
    ast->>ast: walkAst
    complexity->>parser: createParsers
    complexity->>parser: getParser
    complexity->>complexity: findFunctionNode
    complexity->>complexity: computeAllMetrics
    complexity->>logger: info
    parser->>parser: createParsers
    registry->>registry: loadRegistry
    registry->>registry: saveRegistry
    registry->>logger: debug
    note right of builder: Truncated at depth 3

@greptileai

@greptile-apps

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR adds a codegraph sequence <name> command that performs a BFS-forward walk of the call graph and emits a Mermaid sequenceDiagram where participants are files rather than individual functions. It is exposed via CLI, MCP tool, and programmatic API (src/index.js), and includes 11 integration tests against a hand-crafted fixture DB. A companion change in src/builder.js skips the expensive ensureWasmTrees step when the native engine has already provided complete CFG + dataflow data.

Key finding:

  • buildAliases produces invalid Mermaid identifiers when aliases collide — the collision-resolution loop joins parent-directory segments with / (e.g. src/utils/helper), but Mermaid participant identifiers do not support /. Any graph with two files sharing a basename will generate an unparseable diagram. Changing join('/') to join('_') and removing / from the allowed-character set fixes this. Additionally, the corresponding integration test only validates that collision-resolved IDs are distinct; it should also assert that all participant IDs match the Mermaid format (^[a-zA-Z0-9_-]+$) and appear correctly in the Mermaid output to catch similar bugs at test time.

Confidence Score: 2/5

  • Not safe to merge — the buildAliases function generates invalid Mermaid participant identifiers (with / characters) for any graph with same-named files in different directories, causing diagram render failures.
  • The PR implements a well-structured BFS-forward sequence diagram generator with solid integration tests. However, the buildAliases collision-resolution algorithm produces /-delimited identifiers (e.g. src/utils/helper) that are invalid in Mermaid syntax. This is not an edge case — any real-world codebase with multiple utils/helper.js files in different packages will silently generate unparseable diagrams. The fix is straightforward (replace join('/') with join('_') and remove / from the character allowlist), but it must be applied and validated with updated test assertions before merging.
  • src/sequence.js (lines 113–116, buildAliases collision logic) and tests/integration/sequence.test.js (lines 136–144, add Mermaid format validation to alias-collision test).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI["CLI: sequence <name>"]
    MCP["MCP tool: sequence"]
    API["Programmatic API\n(src/index.js)"]

    CLI -->|calls| SEQ["sequence()"]
    MCP -->|calls| SEQD["sequenceData()"]
    API -->|exports| SEQD

    SEQ --> SEQD
    SEQ -->|mermaid output| MERM["sequenceToMermaid()"]

    SEQD --> OPEN["openReadonlyOrFail()"]
    SEQD --> FBM["findBestMatch()\n(duplicated from flow.js)"]
    SEQD --> BFS["BFS forward traversal\n(edges WHERE kind='calls')"]
    SEQD --> DF{"withDataflow?"}
    DF -->|yes| DFA["Annotate params + return arrows\n(dataflow table)"]
    DF -->|no| SORT["Sort messages by depth"]
    DFA --> SORT
    SORT --> ALIAS["buildAliases()\n⚠️ join('/') produces invalid IDs"]
    ALIAS --> PAG["paginateResult()\n+ filter orphaned participants"]
    PAG --> RES["{ entry, participants,\nmessages, truncated }"]

    MERM -->|escapeMermaid labels| OUT["sequenceDiagram\nparticipant …\nA->>B: label"]

Last reviewed commit: 2f7af10

src/sequence.js Outdated
Comment on lines +197 to +207
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];

for (const fid of frontier) {
const callees = db
.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
)
.all(fid);

db.prepare() is called inside the nested BFS loop (lines 201–207), recompiling the SQL statement on every frontier node in every depth level. better-sqlite3 does not automatically cache Statement objects — each call allocates a new one. For large graphs, this adds unnecessary overhead.

Hoist the prepared statement above the outer for (let d = …) loop:

Suggested change
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];
for (const fid of frontier) {
const callees = db
.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
)
.all(fid);
const getCallees = db.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
);
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];
for (const fid of frontier) {
const callees = getCallees.all(fid);

Contributor Author

Fixed — hoisted the getCallees prepared statement above the outer for (let d = …) loop so it's compiled once and reused across all BFS iterations.

Comment on lines +249 to +304
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;

const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);

if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}

// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);

if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}

[...idToNode.values()].find() is called inside two separate loops over messages (lines 251–253 and lines 282–284), scanning the entire idToNode map linearly for every message. With M messages and N nodes, this becomes O(M × N).

Build a secondary lookup map keyed on ${name}|${file} once before the dataflow annotation block to reduce both lookups to O(1):

Suggested change
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;
const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;
const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);
if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}
// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;
const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);
if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}
// Build name|file lookup for O(1) target node access
const nodeByNameFile = new Map();
for (const n of idToNode.values()) {
nodeByNameFile.set(`${n.name}|${n.file}`, n);
}
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = nodeByNameFile.get(`${msg.label}|${msg.to}`);
if (!targetNode) continue;

And similarly in the second loop (line 280–284) replace the find with:

const targetNode = nodeByNameFile.get(`${msg.label}|${msg.to}`);

Contributor Author

Fixed — built a nodeByNameFile Map keyed on ${name}|${file} from idToNode.values() before the dataflow annotation block. Both loops now use O(1) .get() instead of O(N) .find().

Comment on lines +240 to +306
// Dataflow annotations: add return arrows
if (withDataflow && messages.length > 0) {
const hasTable = db
.prepare("SELECT name FROM sqlite_master WHERE type='table' AND name='dataflow'")
.get();

if (hasTable) {
// For each called function, check if it has return edges
const seenReturns = new Set();
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;

const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);

if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}

// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);

if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}
}
}

Three separate db.prepare() calls (returns query at line 259 and flows_to query at line 287) are compiled inside loops over messages. Like the BFS loop, these should be hoisted outside so the statement is compiled once and reused.

Additionally, the current approach generates 2 × M database round-trips (one returns query + one flows_to query per message). Consider batching with WHERE source_id IN (…) to reduce database calls and improve performance on large message arrays.

Contributor Author

Fixed — hoisted both getReturns and getFlowsTo prepared statements outside the message loops, right after building the nodeByNameFile map. Statements are now compiled once and reused for all messages.

Before calling ensureWasmTrees, check whether native engine already
supplies CFG and dataflow data for all files. When it does, skip the
WASM pre-parse entirely — avoiding a full WASM parse of every file
on native builds where the data is already available.

Impact: 1 functions changed, 14 affected
Impact: 1 functions changed, 0 affected
d.cfg?.blocks?.length evaluates to 0 (falsy) when the native engine
returns cfg: { blocks: [] } for trivial functions, spuriously triggering
needsWasmTrees. Use Array.isArray(d.cfg?.blocks) instead, and preserve
d.cfg === null for nodes that intentionally have no CFG (e.g. interface
members).

Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 0 affected
Generate inter-module call sequence diagrams from the graph DB.
Participants are files (not functions), keeping diagrams readable.
BFS forward from entry point, with optional --dataflow flag to
annotate parameter names and return arrows.

New: src/sequence.js, tests/integration/sequence.test.js
Modified: cli.js, mcp.js, index.js, CLAUDE.md, mcp.test.js

Impact: 8 functions changed, 6 affected
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 133453f to 0201d31 Compare March 5, 2026 09:33
@claude

claude bot commented Mar 5, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Sequence Command Implementation

Conducting comprehensive review following 4-phase analysis process...

Todo List

  • Phase 1: Root cause analysis - understand problem and approach
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity verification
  • Phase 4: Code quality review
  • Final Assessment: Rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Adding visualization capability to codegraph via Mermaid sequence diagrams. This addresses a legitimate need for understanding call flows in codebases, particularly for architectural analysis and debugging.

Why This Approach: The implementation follows established patterns in the codebase:

  • BFS forward traversal (consistent with existing flow analysis)
  • File-level participants (sensible for readability vs function-level noise)
  • Consistent API exposure (CLI + MCP + programmatic)
  • SQLite-based data source (leveraging existing graph infrastructure)

Risk Assessment: LOW - This is a pure additive feature with no changes to core parsing/building logic.

Phase 2: Configuration Change Detection ✅

No High-Risk Configuration Changes Detected:

  • ✅ No biome.json rule relaxation
  • ✅ No package.json test threshold changes
  • ✅ No TypeScript config modifications
  • ✅ Only additive change: sequence tool added to MCP expected tools list

Phase 3: Test Integrity ✅

Test Quality Assessment:

  • Strong test coverage: 11 comprehensive integration tests using hand-crafted DB
  • No weakened assertions: All expectations are strict (toBe, toHaveLength, exact counts)
  • No skipped/disabled tests: No .skip, .only, or .todo usage
  • Proper test isolation: Uses temporary directories, proper cleanup
  • Edge cases covered: Unknown functions, leaf nodes, depth limiting, alias collisions
  • All tests passing: 1443 tests reported passing

Performance Improvement Note: Recent commit 0201d31 addresses Greptile's identified performance issues - good responsiveness to feedback.

Phase 4: Code Quality ✅

Security Assessment:

  • SQL injection prevention: Proper parameterized queries with db.prepare() and ? placeholders
  • Input sanitization: User input sanitized for Mermaid output (escapeMermaid())
  • No credential exposure: No authentication/credential handling

Implementation Quality:

  • Consistent patterns: Follows established codebase conventions (matches flow.js patterns)
  • Proper error handling: Graceful handling of missing entries, DB failures
  • Performance optimizations: Recent commit addressed Greptile's O(N²) concerns
  • Code organization: Clean separation of concerns (data/formatting/CLI layers)

Minor Areas for Future Enhancement (non-blocking):

  • Consider batching dataflow queries for very large graphs
  • Could add more sophisticated alias collision resolution beyond depth=10 fallback

Critical Concerns (if any)

None identified - This is a well-implemented additive feature following established patterns.

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Adding Mermaid sequence diagram visualization for call flow analysis
**Why This Approach**: Follows established BFS patterns, consistent API exposure, leverages existing SQLite infrastructure
**Risk Assessment**: LOW - Pure additive feature, no core logic changes

## Critical Concerns (if any)
None - well-implemented feature following codebase conventions

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: High-quality implementation with comprehensive tests, addresses performance feedback proactively, follows established patterns, no security or correctness concerns. Minor optimization opportunities exist but don't affect correctness.

@carlos-alm
Contributor Author

@greptileai

Comment on lines +88 to +90
// Should have 4 files as participants (builder, parser, resolve, src/utils/helper)
// (test file excluded by noTests, lib/utils/helper reachable via helperA)
expect(data.participants.length).toBe(5);

Misleading participant count in comment

The inline comment says "4 files" but the assertion is toBe(5). Counting the listed files — builder.js, parser.js, resolve.js, src/utils/helper.js, and lib/utils/helper.js — there are 5. The comment should be updated to match the assertion and the actual fixture data.

Suggested change
// Should have 4 files as participants (builder, parser, resolve, src/utils/helper)
// (test file excluded by noTests, lib/utils/helper reachable via helperA)
expect(data.participants.length).toBe(5);
// Should have 5 files as participants (builder, parser, resolve, src/utils/helper, lib/utils/helper)
// (test file excluded by noTests)
expect(data.participants.length).toBe(5);

Contributor Author

Fixed in 19a990a — updated the test comment to correctly say 5 files, matching the assertion.

Comment on lines +232 to +237
frontier = nextFrontier;
if (frontier.length === 0) break;

if (d === maxDepth && frontier.length > 0) {
truncated = true;
}

Possible false-positive truncated flag

truncated is set to true whenever nextFrontier is non-empty at d === maxDepth, but the nodes in nextFrontier at that point are the depth-maxDepth callees — they haven't yet been queried as callers. If all of them are leaf nodes (no outgoing calls edges), the diagram is actually complete, yet the "Truncated at depth N" note will still appear in the Mermaid output.

To avoid spurious notes, the flag should only be set after confirming at least one of those frontier nodes actually has outgoing calls:

    frontier = nextFrontier;
    if (frontier.length === 0) break;

    if (d === maxDepth && frontier.length > 0) {
      // Only mark truncated if at least one frontier node has further callees
      const hasMoreCalls = frontier.some((fid) => getCallees.all(fid).length > 0);
      if (hasMoreCalls) truncated = true;
    }

This re-uses the already-prepared getCallees statement, so there's no extra statement compilation cost.

Contributor Author

Fixed in 19a990a — changed the truncated check to only set true when nextFrontier contains nodes not already in visited, avoiding false-positives when the frontier at maxDepth has already been fully explored.

Comment on lines +356 to +358
function escapeMermaid(str) {
return str.replace(/:/g, '#colon;').replace(/"/g, '#quot;');
}

escapeMermaid doesn't handle < / > characters

Mermaid sequence diagram labels that contain angle brackets — common in TypeScript function signatures stored in the graph (e.g. createMap<K, V>, Promise<Result>) — can break the renderer. Only : and " are currently escaped.

Suggested change
function escapeMermaid(str) {
return str.replace(/:/g, '#colon;').replace(/"/g, '#quot;');
}
function escapeMermaid(str) {
return str
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/:/g, '#colon;')
.replace(/"/g, '#quot;');
}

Contributor Author

Fixed in 19a990a — added angle bracket escaping (< → &lt;, > → &gt;) to escapeMermaid, so TypeScript generic signatures like createMap<K, V> render correctly in Mermaid labels.

src/sequence.js Outdated
Comment on lines +317 to +348
// Build participant list from files
const aliases = buildAliases([...fileSet]);
const participants = [...fileSet].map((file) => ({
id: aliases.get(file),
label: file.split('/').pop(),
file,
}));

// Sort participants: entry file first, then alphabetically
participants.sort((a, b) => {
if (a.file === entry.file) return -1;
if (b.file === entry.file) return 1;
return a.file.localeCompare(b.file);
});

// Replace file paths with alias IDs in messages
for (const msg of messages) {
msg.from = aliases.get(msg.from);
msg.to = aliases.get(msg.to);
}

db.close();

const base = {
entry,
participants,
messages,
depth: maxDepth,
totalMessages: messages.length,
truncated,
};
return paginateResult(base, 'messages', { limit: opts.limit, offset: opts.offset });

Orphaned participants when pagination is used

participants is derived from fileSet, which is built across the full BFS traversal. paginateResult then slices only the messages array. When limit/offset are applied, the paginated messages may only reference a subset of files, but all participant declarations are still emitted in the Mermaid output. This produces orphaned participant boxes in the diagram that have no arrows.

Consider filtering participants to only those files that actually appear in the (already-paginated) messages before returning:

  const result = paginateResult(base, 'messages', { limit: opts.limit, offset: opts.offset });
  if (opts.limit !== undefined || opts.offset !== undefined) {
    const activeFiles = new Set(result.messages.flatMap((m) => [m.from, m.to]));
    result.participants = result.participants.filter((p) => activeFiles.has(p.id));
  }
  return result;

Contributor Author

Fixed in 19a990a — participants are now filtered after pagination to only include files that appear in the sliced messages array, preventing orphaned participants when pagination is active.


@carlos-alm
Contributor Author

@greptileai

src/sequence.js Outdated
Comment on lines +113 to +116
const alias = parts
.slice(-depth)
.join('/')
.replace(/[^a-zA-Z0-9_/-]/g, '_');

Collision aliases contain / — invalid Mermaid participant identifiers

When two files share a basename (e.g. src/utils/helper.js and lib/utils/helper.js), the collision-resolution loop at depth 3 produces src/utils/helper and lib/utils/helper. These are then used literally as Mermaid participant identifiers:

participant src/utils/helper as helper.js
src/utils/helper->>lib/utils/helper: helperB

Mermaid sequence diagram identifiers do not support / — the renderer will fail to parse the diagram. The regex [^a-zA-Z0-9_/-] explicitly preserves /, which is the root cause.

Replace join('/') with join('_') and remove / from the allowed character set:

Suggested change

-const alias = parts
-  .slice(-depth)
-  .join('/')
-  .replace(/[^a-zA-Z0-9_/-]/g, '_');
+const alias = parts
+  .slice(-depth)
+  .join('_')
+  .replace(/[^a-zA-Z0-9_-]/g, '_');

With this change, disambiguated aliases use underscores instead of slashes (e.g. src_utils_helper and lib_utils_helper), both valid Mermaid identifiers.
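A minimal standalone sketch of the corrected alias derivation (aliasFor is a hypothetical name for illustration; the PR's buildAliases presumably strips the file extension before aliasing, which this sketch does not):

```javascript
// Take the trailing `depth` path segments and normalize them into a
// Mermaid-safe participant identifier (only [a-zA-Z0-9_-] allowed).
function aliasFor(filePath, depth) {
  return filePath
    .split('/')
    .slice(-depth)                    // keep the last `depth` path segments
    .join('_')                        // underscores, not slashes
    .replace(/[^a-zA-Z0-9_-]/g, '_'); // replace anything Mermaid rejects

}

console.log(aliasFor('src/utils/helper.js', 3)); // "src_utils_helper_js"
console.log(aliasFor('lib/utils/helper.js', 3)); // "lib_utils_helper_js"
```

Both results are distinct and contain no slashes, so they parse as valid participant identifiers.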

Contributor Author

Fixed in 19a990a — changed join('/') to join('_') and removed '/' from the allowed regex character class. Participant IDs now only contain characters in [a-zA-Z0-9_-], which are valid Mermaid identifiers.

Comment on lines +136 to +144
test('participant alias collision — two helper.js files get distinct IDs', () => {
  const data = sequenceData('buildGraph', dbPath, { noTests: true });
  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
  expect(helperParticipants.length).toBe(2);

  // IDs should be distinct
  const ids = helperParticipants.map((p) => p.id);
  expect(ids[0]).not.toBe(ids[1]);
});

Alias-collision test only validates distinctness, not Mermaid format

The test confirms that two helper.js files receive distinct ids, but does not verify that those IDs are valid Mermaid participant identifiers (alphanumeric, hyphen, or underscore only — no slashes). Adding a complementary assertion would catch rendering bugs:

Suggested change
-test('participant alias collision — two helper.js files get distinct IDs', () => {
-  const data = sequenceData('buildGraph', dbPath, { noTests: true });
-  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
-  expect(helperParticipants.length).toBe(2);
-  // IDs should be distinct
-  const ids = helperParticipants.map((p) => p.id);
-  expect(ids[0]).not.toBe(ids[1]);
-});
+test('participant alias collision — two helper.js files get distinct IDs', () => {
+  const data = sequenceData('buildGraph', dbPath, { noTests: true });
+  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
+  expect(helperParticipants.length).toBe(2);
+  // IDs should be distinct
+  const ids = helperParticipants.map((p) => p.id);
+  expect(ids[0]).not.toBe(ids[1]);
+  // Validate participant IDs are valid Mermaid identifiers (no slashes)
+  for (const p of data.participants) {
+    expect(p.id).toMatch(/^[a-zA-Z0-9_-]+$/);
+  }
+  // Validate the IDs appear correctly in Mermaid output
+  const mermaid = sequenceToMermaid(data);
+  for (const p of data.participants) {
+    expect(mermaid).toContain(`participant ${p.id} as`);
+  }
+});

Contributor Author

Fixed in 19a990a — added a toMatch(/^[a-zA-Z0-9_-]+$/) assertion on all participant IDs in the alias collision test, ensuring they conform to valid Mermaid format.

…escape, pagination

- Fix buildAliases join('/') producing Mermaid-invalid participant IDs;
  use join('_') and strip '/' from allowed regex chars
- Add Mermaid format validation to alias collision test
- Fix truncated false-positive when nextFrontier has unvisited nodes at
  maxDepth boundary
- Add angle bracket escaping to escapeMermaid (#gt;/#lt; entities)
- Filter orphaned participants after pagination slices messages
- Fix misleading test comment (4 files → 5 files)

Impact: 1 function changed, 2 affected
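The angle-bracket escaping mentioned in the commit message can be sketched as follows. escapeMermaid's actual implementation isn't shown in this thread; this hedged version assumes only '#', '<', and '>' need Mermaid entity-escaping in message labels:

```javascript
// Mermaid label text uses #…; entity syntax, so '<' and '>' become
// #lt; and #gt;. Escape '#' first so the emitted entities themselves
// are not re-escaped ('#' in the input becomes #35;).
function escapeMermaid(text) {
  return String(text)
    .replace(/#/g, '#35;')
    .replace(/</g, '#lt;')
    .replace(/>/g, '#gt;');
}

console.log(escapeMermaid('render<T>')); // "render#lt;T#gt;"
```

Each .replace with a global regex scans the original string in one pass, so the '#' characters introduced by '#35;', '#lt;', and '#gt;' are never themselves re-processed.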
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 2f7af10 to 19a990a Compare March 5, 2026 20:09
@carlos-alm
Contributor Author

@greptileai

@carlos-alm carlos-alm closed this Mar 5, 2026
@carlos-alm carlos-alm reopened this Mar 5, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 5, 2026
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 0197a65 to d7319cf Compare March 5, 2026 20:20
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from d7319cf to 0db9017 Compare March 5, 2026 20:25