feat: add sequence command for Mermaid sequence diagrams#345

Open
carlos-alm wants to merge 9 commits into main from feat/sequence-command

Conversation

@carlos-alm
Contributor

Summary

  • Adds codegraph sequence <name> command that generates Mermaid sequenceDiagram output from the call graph
  • Participants are files (not individual functions), keeping diagrams readable and matching Greptile's style
  • BFS forward from entry point with optional --dataflow flag to annotate parameter names and return arrows
  • Exposed via CLI, MCP tool, and programmatic API

Example: codegraph sequence flowData -T --depth 2

sequenceDiagram
    participant flow as flow.js
    participant db as db.js
    participant paginate as paginate.js
    participant queries as queries.js
    flow->>db: openReadonlyOrFail
    flow->>flow: findBestMatch
    flow->>flow: entryPointType
    flow->>queries: isTestFile
    flow->>paginate: paginateResult
    db->>db: findDbPath
    flow->>queries: isTestFile
    paginate->>paginate: paginate
    note right of flow: Truncated at depth 2

Files changed

File Change
src/sequence.js New — core module: sequenceData(), sequenceToMermaid(), sequence()
tests/integration/sequence.test.js New — 11 integration tests with hand-crafted DB
src/cli.js Register sequence <name> command
src/mcp.js Add sequence MCP tool definition + handler
src/index.js Export programmatic API
tests/unit/mcp.test.js Add sequence to expected tool names
CLAUDE.md Architecture table + command reference

Deferred (follow-up PR)

  • Add sequence to BATCH_COMMANDS in src/batch.js
  • Add sequence: 50 to MCP_DEFAULTS in src/paginate.js

Test plan

  • All 1443 tests pass (79 test files)
  • Biome lint clean on all changed files
  • codegraph sequence buildGraph -T produces valid Mermaid (15 participants, 218 messages)
  • codegraph sequence buildGraph -T -j outputs valid JSON
  • codegraph sequence flowData -T --depth 2 produces compact diagram

@carlos-alm
Contributor Author

Sequence diagram (generated by this PR)

codegraph sequence buildGraph -T --depth 3

sequenceDiagram
    participant builder as builder.js
    participant ast as ast.js
    participant cfg as cfg.js
    participant complexity as complexity.js
    participant config as config.js
    participant constants as constants.js
    participant dataflow as dataflow.js
    participant db as db.js
    participant journal as journal.js
    participant logger as logger.js
    participant native as native.js
    participant parser as parser.js
    participant registry as registry.js
    participant resolve as resolve.js
    participant structure as structure.js
    builder->>db: openDb
    builder->>db: initSchema
    builder->>config: loadConfig
    builder->>parser: getActiveEngine
    builder->>logger: info
    builder->>db: getBuildMeta
    builder->>builder: loadPathAliases
    builder->>constants: normalizePath
    builder->>builder: collectFiles
    builder->>builder: getChangedFiles
    builder->>parser: parseFilesAuto
    builder->>cfg: buildCFGData
    builder->>dataflow: buildDataflowEdges
    builder->>db: closeDb
    builder->>journal: writeJournalHeader
    builder->>logger: debug
    builder->>builder: purgeFilesFromGraph
    builder->>builder: readFileSafe
    builder->>builder: fileStat
    builder->>builder: fileHash
    builder->>resolve: resolveImportsBatch
    builder->>builder: getResolved
    builder->>builder: isBarrelFile
    builder->>builder: resolveBarrelExport
    builder->>resolve: computeConfidence
    builder->>structure: buildStructure
    builder->>structure: classifyNodeRoles
    builder->>ast: buildAstNodes
    builder->>complexity: buildComplexityMetrics
    builder->>parser: ensureWasmTrees
    builder->>logger: warn
    builder->>db: setBuildMeta
    builder->>registry: registerRepo
    db->>db: acquireAdvisoryLock
    db->>logger: debug
    config->>logger: debug
    config->>config: mergeConfig
    config->>config: resolveSecrets
    config->>config: applyEnvOverrides
    parser->>parser: resolveEngine
    cfg->>parser: createParsers
    cfg->>complexity: findFunctionNode
    cfg->>cfg: buildFunctionCFG
    cfg->>logger: info
    dataflow->>parser: createParsers
    dataflow->>dataflow: extractDataflow
    dataflow->>dataflow: resolveNode
    dataflow->>logger: info
    db->>db: releaseAdvisoryLock
    journal->>logger: warn
    resolve->>native: loadNative
    resolve->>resolve: convertAliasesForNative
    resolve->>constants: normalizePath
    structure->>constants: normalizePath
    structure->>logger: debug
    structure->>structure: median
    ast->>logger: debug
    ast->>ast: findParentDef
    ast->>ast: walkAst
    complexity->>parser: createParsers
    complexity->>parser: getParser
    complexity->>complexity: findFunctionNode
    complexity->>complexity: computeAllMetrics
    complexity->>logger: info
    parser->>parser: createParsers
    registry->>registry: loadRegistry
    registry->>registry: saveRegistry
    registry->>logger: debug
    note right of builder: Truncated at depth 3

@greptileai

@greptile-apps

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR adds a codegraph sequence <name> command that performs a BFS-forward walk of the call graph and emits a Mermaid sequenceDiagram where participants are files rather than individual functions. It is exposed via CLI, MCP tool, and programmatic API (src/index.js), and includes 11 integration tests against a hand-crafted fixture DB. A companion change in src/builder.js skips the expensive ensureWasmTrees step when the native engine has already provided complete CFG + dataflow data.

Key finding:

  • buildAliases produces invalid Mermaid identifiers when aliases collide — the collision-resolution loop joins parent-directory segments with / (e.g. src/utils/helper), but Mermaid participant identifiers do not support /. Any graph with two files sharing a basename will generate an unparseable diagram. Changing join('/') to join('_') and removing / from the allowed-character set fixes this. Additionally, the corresponding integration test only validates that collision-resolved IDs are distinct; it should also assert that all participant IDs match the Mermaid format (^[a-zA-Z0-9_-]+$) and appear correctly in the Mermaid output to catch similar bugs at test time.

Confidence Score: 2/5

  • Not safe to merge — the buildAliases function generates invalid Mermaid participant identifiers (with / characters) for any graph with same-named files in different directories, causing diagram render failures.
  • The PR implements a well-structured BFS-forward sequence diagram generator with solid integration tests. However, the buildAliases collision-resolution algorithm produces /-delimited identifiers (e.g. src/utils/helper) that are invalid in Mermaid syntax. This is not an edge case — any real-world codebase with multiple utils/helper.js files in different packages will silently generate unparseable diagrams. The fix is straightforward (replace join('/') with join('_') and remove / from the character allowlist), but it must be applied and validated with updated test assertions before merging.
  • src/sequence.js (lines 113–116, buildAliases collision logic) and tests/integration/sequence.test.js (lines 136–144, add Mermaid format validation to alias-collision test).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI["CLI: sequence <name>"]
    MCP["MCP tool: sequence"]
    API["Programmatic API\n(src/index.js)"]

    CLI -->|calls| SEQ["sequence()"]
    MCP -->|calls| SEQD["sequenceData()"]
    API -->|exports| SEQD

    SEQ --> SEQD
    SEQ -->|mermaid output| MERM["sequenceToMermaid()"]

    SEQD --> OPEN["openReadonlyOrFail()"]
    SEQD --> FBM["findBestMatch()\n(duplicated from flow.js)"]
    SEQD --> BFS["BFS forward traversal\n(edges WHERE kind='calls')"]
    SEQD --> DF{"withDataflow?"}
    DF -->|yes| DFA["Annotate params + return arrows\n(dataflow table)"]
    DF -->|no| SORT["Sort messages by depth"]
    DFA --> SORT
    SORT --> ALIAS["buildAliases()\n⚠️ join('/') produces invalid IDs"]
    ALIAS --> PAG["paginateResult()\n+ filter orphaned participants"]
    PAG --> RES["{ entry, participants,\nmessages, truncated }"]

    MERM -->|escapeMermaid labels| OUT["sequenceDiagram\nparticipant …\nA->>B: label"]

Last reviewed commit: 2f7af10

src/sequence.js Outdated
Comment on lines +197 to +207
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];

for (const fid of frontier) {
const callees = db
.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
)
.all(fid);

db.prepare() is called inside the nested BFS loop (lines 201–207), recompiling the SQL statement on every frontier node in every depth level. better-sqlite3 does not automatically cache Statement objects — each call allocates a new one. For large graphs, this adds unnecessary overhead.

Hoist the prepared statement above the outer for (let d = …) loop:

Suggested change
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];
for (const fid of frontier) {
const callees = db
.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
)
.all(fid);
const getCallees = db.prepare(
`SELECT DISTINCT n.id, n.name, n.kind, n.file, n.line
FROM edges e JOIN nodes n ON e.target_id = n.id
WHERE e.source_id = ? AND e.kind = 'calls'`,
);
for (let d = 1; d <= maxDepth; d++) {
const nextFrontier = [];
for (const fid of frontier) {
const callees = getCallees.all(fid);

Contributor Author

Fixed — hoisted the getCallees prepared statement above the outer for (let d = …) loop so it's compiled once and reused across all BFS iterations.

Comment on lines +249 to +304
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;

const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);

if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}

// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);

if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}

[...idToNode.values()].find() is called inside two separate loops over messages (lines 251–253 and lines 282–284), scanning the entire idToNode map linearly for every message. With M messages and N nodes, this becomes O(M × N).

Build a secondary lookup map keyed on ${name}|${file} once before the dataflow annotation block to reduce both lookups to O(1):

Suggested change
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;
const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;
const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);
if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}
// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;
const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);
if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}
// Build name|file lookup for O(1) target node access
const nodeByNameFile = new Map();
for (const n of idToNode.values()) {
nodeByNameFile.set(`${n.name}|${n.file}`, n);
}
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = nodeByNameFile.get(`${msg.label}|${msg.to}`);
if (!targetNode) continue;

And similarly in the second loop (line 280–284) replace the find with:

const targetNode = nodeByNameFile.get(`${msg.label}|${msg.to}`);

Contributor Author

Fixed — built a nodeByNameFile Map keyed on ${name}|${file} from idToNode.values() before the dataflow annotation block. Both loops now use O(1) .get() instead of O(N) .find().

Comment on lines +240 to +306
// Dataflow annotations: add return arrows
if (withDataflow && messages.length > 0) {
const hasTable = db
.prepare("SELECT name FROM sqlite_master WHERE type='table' AND name='dataflow'")
.get();

if (hasTable) {
// For each called function, check if it has return edges
const seenReturns = new Set();
for (const msg of [...messages]) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const returnKey = `${msg.to}->${msg.from}:${msg.label}`;
if (seenReturns.has(returnKey)) continue;

const returns = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.source_id = ? AND d.kind = 'returns'`,
)
.all(targetNode.id);

if (returns.length > 0) {
seenReturns.add(returnKey);
const expr = returns[0].expression || 'result';
messages.push({
from: msg.to,
to: msg.from,
label: expr,
type: 'return',
depth: msg.depth,
});
}
}

// Annotate call messages with parameter names
for (const msg of messages) {
if (msg.type !== 'call') continue;
const targetNode = [...idToNode.values()].find(
(n) => n.name === msg.label && n.file === msg.to,
);
if (!targetNode) continue;

const params = db
.prepare(
`SELECT d.expression FROM dataflow d
WHERE d.target_id = ? AND d.kind = 'flows_to'
ORDER BY d.param_index`,
)
.all(targetNode.id);

if (params.length > 0) {
const paramNames = params
.map((p) => p.expression)
.filter(Boolean)
.slice(0, 3);
if (paramNames.length > 0) {
msg.label = `${msg.label}(${paramNames.join(', ')})`;
}
}
}
}
}

Three separate db.prepare() calls (returns query at line 259 and flows_to query at line 287) are compiled inside loops over messages. Like the BFS loop, these should be hoisted outside so the statement is compiled once and reused.

Additionally, the current approach generates 2 × M database round-trips (one returns query + one flows_to query per message). Consider batching with WHERE source_id IN (…) to reduce database calls and improve performance on large message arrays.

Contributor Author

Fixed — hoisted both getReturns and getFlowsTo prepared statements outside the message loops, right after building the nodeByNameFile map. Statements are now compiled once and reused for all messages.

Before calling ensureWasmTrees, check whether native engine already
supplies CFG and dataflow data for all files. When it does, skip the
WASM pre-parse entirely — avoiding a full WASM parse of every file
on native builds where the data is already available.

Impact: 1 functions changed, 14 affected
Impact: 1 functions changed, 0 affected
d.cfg?.blocks?.length evaluates to 0 (falsy) when the native engine
returns cfg: { blocks: [] } for trivial functions, spuriously triggering
needsWasmTrees. Use Array.isArray(d.cfg?.blocks) instead, and preserve
d.cfg === null for nodes that intentionally have no CFG (e.g. interface
members).

Impact: 1 functions changed, 1 affected
Impact: 1 functions changed, 0 affected
Generate inter-module call sequence diagrams from the graph DB.
Participants are files (not functions), keeping diagrams readable.
BFS forward from entry point, with optional --dataflow flag to
annotate parameter names and return arrows.

New: src/sequence.js, tests/integration/sequence.test.js
Modified: cli.js, mcp.js, index.js, CLAUDE.md, mcp.test.js

Impact: 8 functions changed, 6 affected
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 133453f to 0201d31 Compare March 5, 2026 09:33
@claude

claude bot commented Mar 5, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Sequence Command Implementation

Conducting comprehensive review following 4-phase analysis process...

Todo List

  • Phase 1: Root cause analysis - understand problem and approach
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity verification
  • Phase 4: Code quality review
  • Final Assessment: Rating and recommendation

Phase 1: Root Cause Analysis ✅

Problem Being Solved: Adding visualization capability to codegraph via Mermaid sequence diagrams. This addresses a legitimate need for understanding call flows in codebases, particularly for architectural analysis and debugging.

Why This Approach: The implementation follows established patterns in the codebase:

  • BFS forward traversal (consistent with existing flow analysis)
  • File-level participants (sensible for readability vs function-level noise)
  • Consistent API exposure (CLI + MCP + programmatic)
  • SQLite-based data source (leveraging existing graph infrastructure)

Risk Assessment: LOW - This is a pure additive feature with no changes to core parsing/building logic.

Phase 2: Configuration Change Detection ✅

No High-Risk Configuration Changes Detected:

  • ✅ No biome.json rule relaxation
  • ✅ No package.json test threshold changes
  • ✅ No TypeScript config modifications
  • ✅ Only additive change: sequence tool added to MCP expected tools list

Phase 3: Test Integrity ✅

Test Quality Assessment:

  • Strong test coverage: 11 comprehensive integration tests using hand-crafted DB
  • No weakened assertions: All expectations are strict (toBe, toHaveLength, exact counts)
  • No skipped/disabled tests: No .skip, .only, or .todo usage
  • Proper test isolation: Uses temporary directories, proper cleanup
  • Edge cases covered: Unknown functions, leaf nodes, depth limiting, alias collisions
  • All tests passing: 1443 tests reported passing

Performance Improvement Note: Recent commit 0201d31 addresses Greptile's identified performance issues - good responsiveness to feedback.

Phase 4: Code Quality ✅

Security Assessment:

  • SQL injection prevention: Proper parameterized queries with db.prepare() and ? placeholders
  • Input sanitization: User input sanitized for Mermaid output (escapeMermaid())
  • No credential exposure: No authentication/credential handling

Implementation Quality:

  • Consistent patterns: Follows established codebase conventions (matches flow.js patterns)
  • Proper error handling: Graceful handling of missing entries, DB failures
  • Performance optimizations: Recent commit addressed Greptile's O(N²) concerns
  • Code organization: Clean separation of concerns (data/formatting/CLI layers)

Minor Areas for Future Enhancement (non-blocking):

  • Consider batching dataflow queries for very large graphs
  • Could add more sophisticated alias collision resolution beyond depth=10 fallback

Critical Concerns (if any)

None identified - This is a well-implemented additive feature following established patterns.

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Adding Mermaid sequence diagram visualization for call flow analysis
**Why This Approach**: Follows established BFS patterns, consistent API exposure, leverages existing SQLite infrastructure
**Risk Assessment**: LOW - Pure additive feature, no core logic changes

## Critical Concerns (if any)
None - well-implemented feature following codebase conventions

## Final Recommendation
- Rating: ⭐⭐⭐⭐☆ (4/5)
- Action: APPROVE
- Reasoning: High-quality implementation with comprehensive tests, addresses performance feedback proactively, follows established patterns, no security or correctness concerns. Minor optimization opportunities exist but don't affect correctness.

@carlos-alm
Contributor Author

@greptileai

Comment on lines +88 to +90
// Should have 4 files as participants (builder, parser, resolve, src/utils/helper)
// (test file excluded by noTests, lib/utils/helper reachable via helperA)
expect(data.participants.length).toBe(5);

Misleading participant count in comment

The inline comment says "4 files" but the assertion is toBe(5). Counting the listed files — builder.js, parser.js, resolve.js, src/utils/helper.js, and lib/utils/helper.js — there are 5. The comment should be updated to match the assertion and the actual fixture data.

Suggested change
// Should have 4 files as participants (builder, parser, resolve, src/utils/helper)
// (test file excluded by noTests, lib/utils/helper reachable via helperA)
expect(data.participants.length).toBe(5);
// Should have 5 files as participants (builder, parser, resolve, src/utils/helper, lib/utils/helper)
// (test file excluded by noTests)
expect(data.participants.length).toBe(5);

Contributor Author

Fixed in 19a990a — updated the test comment to correctly say 5 files, matching the assertion.

Comment on lines +232 to +237
frontier = nextFrontier;
if (frontier.length === 0) break;

if (d === maxDepth && frontier.length > 0) {
truncated = true;
}

Possible false-positive truncated flag

truncated is set to true whenever nextFrontier is non-empty at d === maxDepth, but the nodes in nextFrontier at that point are the depth-maxDepth callees — they haven't yet been queried as callers. If all of them are leaf nodes (no outgoing calls edges), the diagram is actually complete, yet the "Truncated at depth N" note will still appear in the Mermaid output.

To avoid spurious notes, the flag should only be set after confirming at least one of those frontier nodes actually has outgoing calls:

    frontier = nextFrontier;
    if (frontier.length === 0) break;

    if (d === maxDepth && frontier.length > 0) {
      // Only mark truncated if at least one frontier node has further callees
      const hasMoreCalls = frontier.some((fid) => getCallees.all(fid).length > 0);
      if (hasMoreCalls) truncated = true;
    }

This re-uses the already-prepared getCallees statement, so there's no extra statement compilation cost.

Contributor Author

Fixed in 19a990a — changed the truncated check to only set true when nextFrontier contains nodes not already in visited, avoiding false-positives when the frontier at maxDepth has already been fully explored.

Comment on lines +356 to +358
function escapeMermaid(str) {
return str.replace(/:/g, '#colon;').replace(/"/g, '#quot;');
}

escapeMermaid doesn't handle < / > characters

Mermaid sequence diagram labels that contain angle brackets — common in TypeScript function signatures stored in the graph (e.g. createMap<K, V>, Promise<Result>) — can break the renderer. Only : and " are currently escaped.

Suggested change
function escapeMermaid(str) {
return str.replace(/:/g, '#colon;').replace(/"/g, '#quot;');
}
function escapeMermaid(str) {
return str
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/:/g, '#colon;')
.replace(/"/g, '#quot;');
}

Contributor Author

Fixed in 19a990a — added angle bracket escaping (< → &lt;, > → &gt;) to escapeMermaid, so TypeScript generic signatures like createMap<K, V> render correctly in Mermaid labels.

src/sequence.js Outdated
Comment on lines +317 to +348
// Build participant list from files
const aliases = buildAliases([...fileSet]);
const participants = [...fileSet].map((file) => ({
id: aliases.get(file),
label: file.split('/').pop(),
file,
}));

// Sort participants: entry file first, then alphabetically
participants.sort((a, b) => {
if (a.file === entry.file) return -1;
if (b.file === entry.file) return 1;
return a.file.localeCompare(b.file);
});

// Replace file paths with alias IDs in messages
for (const msg of messages) {
msg.from = aliases.get(msg.from);
msg.to = aliases.get(msg.to);
}

db.close();

const base = {
entry,
participants,
messages,
depth: maxDepth,
totalMessages: messages.length,
truncated,
};
return paginateResult(base, 'messages', { limit: opts.limit, offset: opts.offset });

Orphaned participants when pagination is used

participants is derived from fileSet, which is built across the full BFS traversal. paginateResult then slices only the messages array. When limit/offset are applied, the paginated messages may only reference a subset of files, but all participant declarations are still emitted in the Mermaid output. This produces orphaned participant boxes in the diagram that have no arrows.

Consider filtering participants to only those files that actually appear in the (already-paginated) messages before returning:

  const result = paginateResult(base, 'messages', { limit: opts.limit, offset: opts.offset });
  if (opts.limit !== undefined || opts.offset !== undefined) {
    const activeFiles = new Set(result.messages.flatMap((m) => [m.from, m.to]));
    result.participants = result.participants.filter((p) => activeFiles.has(p.id));
  }
  return result;

Contributor Author

Fixed in 19a990a — participants are now filtered after pagination to only include files that appear in the sliced messages array, preventing orphaned participants when pagination is active.


@carlos-alm
Contributor Author

@greptileai

src/sequence.js Outdated
Comment on lines +113 to +116
const alias = parts
.slice(-depth)
.join('/')
.replace(/[^a-zA-Z0-9_/-]/g, '_');

Collision aliases contain / — invalid Mermaid participant identifiers

When two files share a basename (e.g. src/utils/helper.js and lib/utils/helper.js), the collision-resolution loop at depth 3 produces src/utils/helper and lib/utils/helper. These are then used literally as Mermaid participant identifiers:

participant src/utils/helper as helper.js
src/utils/helper->>lib/utils/helper: helperB

Mermaid sequence diagram identifiers do not support / — the renderer will fail to parse the diagram. The regex [^a-zA-Z0-9_/-] explicitly preserves /, which is the root cause.

Replace join('/') with join('_') and remove / from the allowed character set:

Suggested change

-const alias = parts
-  .slice(-depth)
-  .join('/')
-  .replace(/[^a-zA-Z0-9_/-]/g, '_');
+const alias = parts
+  .slice(-depth)
+  .join('_')
+  .replace(/[^a-zA-Z0-9_-]/g, '_');

With this change, disambiguated aliases use underscores instead of slashes (e.g. src_utils_helper and lib_utils_helper), both valid Mermaid identifiers.
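A minimal standalone sketch of the corrected alias derivation (aliasFor is a hypothetical name for illustration; the PR's buildAliases presumably strips the file extension before aliasing, which this sketch does not):

```javascript
// Take the trailing `depth` path segments and normalize them into a
// Mermaid-safe participant identifier (only [a-zA-Z0-9_-] allowed).
function aliasFor(filePath, depth) {
  return filePath
    .split('/')
    .slice(-depth)                    // keep the last `depth` path segments
    .join('_')                        // underscores, not slashes
    .replace(/[^a-zA-Z0-9_-]/g, '_'); // replace anything Mermaid rejects

}

console.log(aliasFor('src/utils/helper.js', 3)); // "src_utils_helper_js"
console.log(aliasFor('lib/utils/helper.js', 3)); // "lib_utils_helper_js"
```

Both results are distinct and contain no slashes, so they parse as valid participant identifiers.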

Contributor Author

Fixed in 19a990a — changed join('/') to join('_') and removed '/' from the allowed regex character class. Participant IDs now only contain characters in [a-zA-Z0-9_-], which are valid Mermaid identifiers.

Comment on lines +136 to +144
test('participant alias collision — two helper.js files get distinct IDs', () => {
  const data = sequenceData('buildGraph', dbPath, { noTests: true });
  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
  expect(helperParticipants.length).toBe(2);

  // IDs should be distinct
  const ids = helperParticipants.map((p) => p.id);
  expect(ids[0]).not.toBe(ids[1]);
});

Alias-collision test only validates distinctness, not Mermaid format

The test confirms that two helper.js files receive distinct ids, but does not verify that those IDs are valid Mermaid participant identifiers (alphanumeric, hyphen, or underscore only — no slashes). Adding a complementary assertion would catch rendering bugs:

Suggested change
-test('participant alias collision — two helper.js files get distinct IDs', () => {
-  const data = sequenceData('buildGraph', dbPath, { noTests: true });
-  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
-  expect(helperParticipants.length).toBe(2);
-  // IDs should be distinct
-  const ids = helperParticipants.map((p) => p.id);
-  expect(ids[0]).not.toBe(ids[1]);
-});
+test('participant alias collision — two helper.js files get distinct IDs', () => {
+  const data = sequenceData('buildGraph', dbPath, { noTests: true });
+  const helperParticipants = data.participants.filter((p) => p.label === 'helper.js');
+  expect(helperParticipants.length).toBe(2);
+  // IDs should be distinct
+  const ids = helperParticipants.map((p) => p.id);
+  expect(ids[0]).not.toBe(ids[1]);
+  // Validate participant IDs are valid Mermaid identifiers (no slashes)
+  for (const p of data.participants) {
+    expect(p.id).toMatch(/^[a-zA-Z0-9_-]+$/);
+  }
+  // Validate the IDs appear correctly in Mermaid output
+  const mermaid = sequenceToMermaid(data);
+  for (const p of data.participants) {
+    expect(mermaid).toContain(`participant ${p.id} as`);
+  }
+});

Contributor Author

Fixed in 19a990a — added a toMatch(/^[a-zA-Z0-9_-]+$/) assertion on all participant IDs in the alias collision test, ensuring they conform to valid Mermaid format.

…escape, pagination

- Fix buildAliases join('/') producing Mermaid-invalid participant IDs;
  use join('_') and strip '/' from allowed regex chars
- Add Mermaid format validation to alias collision test
- Fix truncated false-positive when nextFrontier has unvisited nodes at
  maxDepth boundary
- Add angle bracket escaping to escapeMermaid (#gt;/#lt; entities)
- Filter orphaned participants after pagination slices messages
- Fix misleading test comment (4 files → 5 files)

Impact: 1 function changed, 2 affected
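The angle-bracket escaping mentioned in the commit message can be sketched as follows. escapeMermaid's actual implementation isn't shown in this thread; this hedged version assumes only '#', '<', and '>' need Mermaid entity-escaping in message labels:

```javascript
// Mermaid label text uses #…; entity syntax, so '<' and '>' become
// #lt; and #gt;. Escape '#' first so the emitted entities themselves
// are not re-escaped ('#' in the input becomes #35;).
function escapeMermaid(text) {
  return String(text)
    .replace(/#/g, '#35;')
    .replace(/</g, '#lt;')
    .replace(/>/g, '#gt;');
}

console.log(escapeMermaid('render<T>')); // "render#lt;T#gt;"
```

Each .replace with a global regex scans the original string in one pass, so the '#' characters introduced by '#35;', '#lt;', and '#gt;' are never themselves re-processed.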
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 2f7af10 to 19a990a Compare March 5, 2026 20:09
@carlos-alm
Contributor Author

@greptileai

@carlos-alm carlos-alm closed this Mar 5, 2026
@carlos-alm carlos-alm reopened this Mar 5, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Mar 5, 2026
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from 0197a65 to d7319cf Compare March 5, 2026 20:20
@carlos-alm carlos-alm force-pushed the feat/sequence-command branch from d7319cf to 0db9017 Compare March 5, 2026 20:25