Add vMCP optimizer guide for Kubernetes #588

Merged
aponcedeleonch merged 2 commits into main from add-vmcp-optimizer-guide
Mar 9, 2026

@aponcedeleonch
Member

@aponcedeleonch aponcedeleonch commented Mar 5, 2026

Summary

  • Add a new how-to guide for enabling the vMCP optimizer via the
    Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
  • Cover quick start, parameter tuning reference, ARM64 compatibility,
    and a full production example
  • Add cross-links from 6 existing pages (vMCP concepts, intro,
    configuration, tool aggregation, tool optimization concepts,
    and the MCP Optimizer tutorial)
  • Add optimizer as a key benefit in the vMCP concept page and
    expand the vMCP intro with a new "Optimize tool discovery" section

Test plan

  • Verify npm run build passes
  • Review page renders correctly in local dev server
  • Confirm all internal links resolve

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 5, 2026 15:22
@vercel

vercel bot commented Mar 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs-website | Ready | Preview, Comment | Mar 9, 2026 10:56am |

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a new how-to guide (docs/toolhive/guides-vmcp/optimizer.mdx) for enabling the optimizer feature in Virtual MCP Server (vMCP) on Kubernetes. The guide covers deploying an EmbeddingServer, referencing it from a VirtualMCPServer to enable the optimizer, tuning optimizer parameters, and a production-ready advanced example.

Changes:

  • New documentation page covering EmbeddingServer setup, VirtualMCPServer configuration with optimizer, parameter tuning guidance, and an advanced production example
  • Updated sidebars.ts to include the new optimizer guide between composite-tools and failure-handling in the Virtual MCP Server sidebar section

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docs/toolhive/guides-vmcp/optimizer.mdx | New how-to guide for enabling and configuring the vMCP optimizer feature with EmbeddingServer and VirtualMCPServer CRDs |
| sidebars.ts | Adds the new optimizer guide to the Virtual MCP Server sidebar category |

@danbarr
Collaborator

danbarr commented Mar 5, 2026

@claude use the docs-review skill to perform an editorial review of this PR. Also consider user journey and discoverability of this important new capability for the vMCP server.

@claude
Contributor

claude bot commented Mar 5, 2026

Claude finished @danbarr's task in 2m 27s —— View job


Editorial review

  • Gather context and read changed files
  • Run docs-review skill on the new guide
  • Compile and post full editorial review

Summary

The guide is well-structured and delivers genuinely useful content: the quick-start flow is clean, the parameter table is excellent, and the tuning guidance admonition is practical and actionable. The main issues are a missing first-use expansion of "vMCP," an incomplete verification step, an unexplained production config detail, and — most importantly — several discoverability gaps that leave this feature invisible to users who aren't already reading the optimizer guide directly.

Primary issues

1. Missing first-use expansion of "vMCP" (line 8)

The opening sentence reads "When vMCP aggregates many backend MCP servers…" but per the project style guide, the first use in the body text must be "Virtual MCP Server (vMCP)." Every other guide in the guides-vmcp/ section follows this convention (see authentication.mdx line 6, scaling-and-performance.mdx line 7).

Fix: Change line 8 to: "When Virtual MCP Server (vMCP) aggregates many backend MCP servers…"

2. Discoverability gaps — optimizer isn't mentioned in key parent/sibling docs

This is the most impactful issue. A user learning about vMCP has several natural entry points, none of which surface the optimizer:

  • guides-vmcp/intro.mdx — "Core capabilities" (lines 32–38): Lists multi-server aggregation, tool conflict resolution, centralized authentication, and composite workflows. The optimizer — arguably a compelling fifth capability — is absent. A user assessing whether vMCP meets their needs will miss it entirely.
  • concepts/vmcp.mdx — "Core value propositions" and "Key capabilities": Same gap. The optimizer isn't listed among the things vMCP can do.
  • concepts/tool-optimization.mdx — "Related information" (lines 251–259): This page explains exactly the context pollution problem that the optimizer solves, and currently links to tool filtering guides. Adding a link to this guide here would give users a direct path from problem → solution.
  • tutorials/mcp-optimizer.mdx — "Related information" (lines 328–331): The tutorial covers the desktop/CLI approach and links to guides-ui/mcp-optimizer.mdx but not to this new Kubernetes guide. Users finishing that tutorial and wanting the K8s equivalent have no obvious next step.

Fix: Add brief references in each of the above docs pointing to this guide. For intro.mdx, add the optimizer to the Core capabilities list with a link. For concepts/tool-optimization.mdx and tutorials/mcp-optimizer.mdx, add an entry in "Related information." For concepts/vmcp.mdx, add a mention under "Key capabilities."

3. Step 3 "Verify" has no success criteria (lines 113–123)

The step tells users to run kubectl get virtualmcpserver my-vmcp -n toolhive-system but gives no guidance on what a successful output looks like. What column shows readiness? What status value confirms the optimizer is active? Users are left guessing whether the command succeeded.

Fix: Add expected output or a description of what to look for, e.g.: "Look for READY: True in the output. Once ready, clients connecting to the vMCP endpoint will see only find_tool and call_tool." Alternatively, pattern-match on the tip in Step 1, which does a good job explaining what to wait for.
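As a rough illustration of the verification step discussed here, the sketch below wraps the guide's `kubectl get` command in a small script. The resource name `my-vmcp` and namespace `toolhive-system` come from the comment above; the `READY` column is the one named in the suggested wording and has not been confirmed against the CRD here, so treat it as an assumption.

```shell
#!/usr/bin/env bash
# Resource name and namespace as quoted in the review comment above.
VMCP_NAME="my-vmcp"
NAMESPACE="toolhive-system"

# Build the verification command once so it can be printed or executed.
VERIFY_CMD="kubectl get virtualmcpserver ${VMCP_NAME} -n ${NAMESPACE}"
echo "Run: ${VERIFY_CMD}"

# Execute only when kubectl is installed. The READY column mentioned in the
# review is an assumption about the CRD's printed output, not verified here.
if command -v kubectl >/dev/null 2>&1; then
  ${VERIFY_CMD} --request-timeout=10s \
    || echo "Lookup failed; check cluster access and the resource name."
fi
```

If the resource reports ready, the review's suggested follow-up check is that connected clients list only find_tool and call_tool.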

4. Unexplained production config decision in advanced example (line 226)

The advanced example sets embeddingServiceTimeout: 15s — a shorter timeout than the default 30s. This seems counterintuitive for a "production-ready" configuration. Without explanation, users copying this example may end up with a config that causes more failures under load than the default.

Fix: Add a brief inline comment or a sentence explaining the rationale. For example: "A shorter timeout (15s) is appropriate when the EmbeddingServer is co-located in the same namespace and low-latency responses are expected. Increase this if the embedding service is remote or under high load."

Secondary issues

| Issue | Location | Recommendation |
| --- | --- | --- |
| "A client sends a prompt" is ambiguous | Line 32 | Change to "An AI client sends a prompt" for clarity |
| Step 2 prose says "add embeddingServerRef to your existing VirtualMCPServer" but shows a full spec | Lines 88–111 | Clarify that the YAML shows the full resource with the new field highlighted, not just the diff, or rephrase to "update your VirtualMCPServer to include embeddingServerRef" |
| Advanced example has two YAML blocks with almost no prose between them | Lines 191–236 | Add 1–2 sentences before the second YAML block explaining what this config achieves differently from the quick start and why (e.g. "This configuration enables HA for the embedding server with persistent model caching…") |
| Overview > Benefits (lines 22–28) partially re-states the intro paragraph | Lines 8–12 vs 22–28 | The intro already mentions reduced token usage and find_tool/call_tool. The Benefits list mostly repeats this. Consider cutting the intro paragraph's detail and letting the Benefits list carry it, or folding the Benefits list content into the intro prose. |

Inline suggestions

Line 34 (Step count): up to 8 by default — the maxToolsToReturn default is confirmed in the parameter table, so this is consistent. Good.

Lines 63–75 (Step 1 YAML): Consider adding a code comment # Uses BAAI/bge-small-en-v1.5 by default inside the YAML for readers who skim code blocks and miss the prose above it. This is a minor nicety.

Lines 168–172 (string-encoded floats note): This is a well-executed design decision explanation — keeps it in a :::note admonition, not buried in prose. No change needed.

Lines 241–245 (Related information): The links are accurate and well-chosen. One addition worth considering: a link to the concepts/vmcp.mdx page for readers who want the broader conceptual context before diving in.


Overall this is solid, publish-ready content with targeted fixes needed. The discoverability issue (#2 above) is the most important to address before merge, as it affects whether users can find this capability at all.

@aponcedeleonch
Member Author

@claude check if your comments have been addressed in the latest commit

@claude
Contributor

claude bot commented Mar 5, 2026

Claude finished @aponcedeleonch's task in 1m 14s —— View job


Review comment status

  • Gather context
  • Read current state of optimizer.mdx
  • Check discoverability fixes (intro.mdx, vmcp.mdx, tool-optimization.mdx, mcp-optimizer.mdx)
  • Compare against all previous review issues
  • Post findings

All four primary issues and three of the four secondary issues have been addressed. Here's the full breakdown:

Primary issues — all fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| Missing first-use expansion of "vMCP" | ✅ Fixed | Line 8 now reads "When Virtual MCP Server (vMCP) aggregates…" |
| Discoverability gaps | ✅ Fixed | All four locations updated: intro.mdx Core capabilities list (+ Related info link), concepts/vmcp.mdx Core value propositions (+ Related info link), concepts/tool-optimization.mdx Related info, tutorials/mcp-optimizer.mdx Related info |
| Step 3 "Verify": no success criteria | ✅ Fixed | Lines 122–124 now describe expected output: "Look for READY: True…clients…see only find_tool and call_tool" |
| Unexplained embeddingServiceTimeout: 15s in advanced example | ✅ Fixed | Lines 217–219 explain the co-location rationale and advise increasing the value for remote/high-load scenarios |

Secondary issues — 3 of 4 fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| "A client sends a prompt" is ambiguous | ✅ Fixed | Line 33 now reads "An AI client sends a prompt that requires tool assistance" |
| Step 2 prose vs. full spec | ✅ Fixed | Line 91 now reads "Update your existing VirtualMCPServer to include embeddingServerRef" |
| Advanced example: insufficient prose between YAML blocks | ✅ Fixed | Two prose paragraphs now separate the EmbeddingServer and VirtualMCPServer YAML blocks, each explaining what the config does and why |
| Overview > Benefits partially restates intro paragraph | ⚠️ Not addressed | The intro paragraph (lines 8–13) and Benefits list (lines 24–29) still overlap on token usage and find_tool/call_tool. This was a soft "consider" suggestion, not a blocker |

Minor open item (optional)

The "Related information" section of the optimizer guide (lines 246–253) links to concepts/tool-optimization.mdx but not to concepts/vmcp.mdx. Adding that link was suggested as a nice-to-have for readers wanting broader conceptual context before diving in — this is entirely optional and not a blocker.


Overall this is in great shape. The PR addresses all blocking feedback and is ready to merge.

@aponcedeleonch
Member Author

@claude check again if all comments have been addressed

@claude
Contributor

claude bot commented Mar 5, 2026

Claude finished @aponcedeleonch's task in 1m 9s —— View job


Re-checking review comments

  • Read current state of all changed files
  • Compare against all previous review issues
  • Post findings

The status is unchanged from the previous check. Here's the full breakdown:

Primary issues — all fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| Missing first-use expansion of "vMCP" | ✅ Fixed | Line 8: "When Virtual MCP Server (vMCP) aggregates…" |
| Discoverability gaps | ✅ Fixed | All four locations updated: intro.mdx Core capabilities + Related info, concepts/vmcp.mdx Core value propositions + Related info, concepts/tool-optimization.mdx Related info, tutorials/mcp-optimizer.mdx Related info |
| Step 3 "Verify": no success criteria | ✅ Fixed | Lines 120–122 describe expected output and confirm optimizer is active |
| Unexplained embeddingServiceTimeout: 15s | ✅ Fixed | Lines 215–217 explain co-location rationale and advise increasing for remote/high-load scenarios |

Secondary issues — 3 of 4 fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| "A client sends a prompt" is ambiguous | ✅ Fixed | Line 31: "An AI client sends a prompt that requires tool assistance" |
| Step 2 prose vs. full spec | ✅ Fixed | Line 89: "Update your existing VirtualMCPServer to include embeddingServerRef" |
| Advanced example: insufficient prose between YAML blocks | ✅ Fixed | Prose before each YAML block explains what each config does and why |
| Overview > Benefits partially restates intro paragraph | ⚠️ Not addressed | The intro (lines 8–11) and the Benefits list (lines 22–27) still overlap on "relevant tools" and "semantic search"; this was flagged as a soft "consider" suggestion, not a blocker |

Optional item (not addressed)

The optimizer guide's "Related information" section (lines 244–251) still doesn't include a link to concepts/vmcp.mdx. This was noted as a nice-to-have for readers wanting broader conceptual context — not a blocker.


All blocking feedback has been addressed. The PR is ready to merge.

@aponcedeleonch
Member Author

@claude check again if all your comments have been addressed

@claude
Contributor

claude bot commented Mar 5, 2026

Claude finished @aponcedeleonch's task in 55s —— View job


Re-checking review comments

  • Read current state of all changed files
  • Compare against all previous review issues
  • Post findings

One new item has been addressed since the last check. Here's the full updated status:

Primary issues — all fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| Missing first-use expansion of "vMCP" | ✅ Fixed | Line 8: "When Virtual MCP Server (vMCP) aggregates…" |
| Discoverability gaps | ✅ Fixed | All four locations updated: intro.mdx, concepts/vmcp.mdx, concepts/tool-optimization.mdx, tutorials/mcp-optimizer.mdx |
| Step 3 "Verify": no success criteria | ✅ Fixed | Lines 120–122 describe expected output and confirm optimizer is active |
| Unexplained embeddingServiceTimeout: 15s | ✅ Fixed | Lines 215–217 explain co-location rationale and advise increasing for remote/high-load scenarios |

Secondary issues — 3 of 4 fixed ✅

| Issue | Status | Notes |
| --- | --- | --- |
| "A client sends a prompt" is ambiguous | ✅ Fixed | Line 31: "An AI client sends a prompt that requires tool assistance" |
| Step 2 prose vs. full spec | ✅ Fixed | Line 89: "Update your existing VirtualMCPServer to include embeddingServerRef" |
| Advanced example: insufficient prose between YAML blocks | ✅ Fixed | Prose before each YAML block explains what each config does and why |
| Overview > Benefits partially restates intro paragraph | ⚠️ Not addressed | The intro (lines 8–11) and Benefits list (lines 22–27) still overlap on token usage and tool selection. This was a soft "consider" suggestion, not a blocker. |

Optional item — now fixed ✅

The optimizer guide's "Related information" section (lines 251–252) now includes a link to concepts/vmcp.mdx. This was the optional item left open in the previous check.


All blocking feedback has been addressed. The one remaining soft item (intro/Benefits overlap) is not a blocker. This PR is ready to merge.

Contributor

@jerm-dro jerm-dro left a comment

LGTM, but let's nail the Benefits section. It's the first and potentially only thing people will read, so it's important.

@aponcedeleonch aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from e0166c3 to eeadace Compare March 6, 2026 14:38
@aponcedeleonch aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch 2 times, most recently from f98e33f to 74385c3 Compare March 6, 2026 14:48
@aponcedeleonch aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from 74385c3 to 78cfdd0 Compare March 6, 2026 14:52
@aponcedeleonch aponcedeleonch changed the title from "Add vMCP optimizer guide" to "Add vMCP optimizer guide for Kubernetes" Mar 6, 2026
@aponcedeleonch aponcedeleonch requested a review from danbarr March 6, 2026 15:58
danbarr previously approved these changes Mar 6, 2026
Collaborator

@danbarr danbarr left a comment

Good guide overall. One substantive issue to address before or shortly after merge:

ARM64 workaround is incomplete

The ARM64 warning block shows how to pre-pull a pinned image (cpu-1.7) and load it into a kind cluster, but never shows how to configure the EmbeddingServer to actually use that image. Without setting spec.image, the EmbeddingServer will still default to cpu-latest (the amd64-only image the workaround is trying to avoid), making the workaround a no-op.

The block needs a follow-up YAML snippet showing how to pin the image in the EmbeddingServer resource:

spec:
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7

Secondary concern in the same block: kind load docker-image is kind-specific. Since ARM64 also applies to EKS on Graviton and other distributions, at minimum add a note like "adjust the load command for your cluster distribution."
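To make the two points above concrete, here is a minimal sketch of the complete workaround: pin the image in the EmbeddingServer resource, then pre-pull it and load it into the cluster. Only the image reference, `spec.image`, and the `toolhive-system` namespace come from this thread. The `apiVersion`, the resource name `my-embedding`, the output file name, the `LOAD_INTO_KIND` switch, and the `--platform linux/amd64` flag are assumptions for illustration, and the manifest is a fragment rather than a full resource.

```shell
#!/usr/bin/env bash
# Pinned image tag named in the review comment above.
IMAGE="ghcr.io/huggingface/text-embeddings-inference:cpu-1.7"

# Write a manifest fragment that pins spec.image so the pre-pulled image is
# the one the EmbeddingServer actually uses.
cat > embeddingserver-pinned.yaml <<EOF
apiVersion: toolhive.stacklok.dev/v1alpha1   # assumed API group and version
kind: EmbeddingServer
metadata:
  name: my-embedding                         # hypothetical resource name
  namespace: toolhive-system
spec:
  image: ${IMAGE}
EOF

# Pre-pull and load the image. This part is kind-specific; adjust the load
# command for your cluster distribution (for example EKS on Graviton).
# Guarded by LOAD_INTO_KIND (a hypothetical switch) so nothing is pulled by default.
if [ "${LOAD_INTO_KIND:-0}" = "1" ]; then
  docker pull --platform linux/amd64 "${IMAGE}"   # platform flag is an assumption
  kind load docker-image "${IMAGE}"
fi
```

Running the script with `LOAD_INTO_KIND=1` performs the pull and load; the generated file can then be completed and applied with `kubectl apply -f embeddingserver-pinned.yaml`.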

Minor

In intro.mdx, the new "Tool optimization" bullet is missing a period at the end of the sentence.

aponcedeleonch and others added 2 commits March 9, 2026 12:23
- Add a new how-to guide for enabling the vMCP optimizer via the
  Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
- Cover quick start, parameter tuning reference, ARM64 compatibility,
  and a full production example
- Add cross-links from 6 existing pages (vMCP concepts, intro,
  configuration, tool aggregation, tool optimization concepts,
  and the MCP Optimizer tutorial)
- Add optimizer as a key benefit in the vMCP concept page and
  expand the vMCP intro with a new "Optimize tool discovery" section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add spec.image pin to EmbeddingServer YAML so the pre-pulled amd64
image is actually used, explain root cause (Intel MKL), and note
that native ARM64 support is in progress upstream.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aponcedeleonch aponcedeleonch merged commit 39f5b63 into main Mar 9, 2026
4 checks passed
@aponcedeleonch aponcedeleonch deleted the add-vmcp-optimizer-guide branch March 9, 2026 13:10

Development

Successfully merging this pull request may close these issues.

Document how to deploy vMCP with optimizer in Kubernetes using CRDs

5 participants