Add vMCP optimizer guide for Kubernetes by aponcedeleonch · Pull Request #588 · stacklok/docs-website

aponcedeleonch · 2026-03-05T15:22:49Z

Summary

- Add a new how-to guide for enabling the vMCP optimizer via the Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
- Cover quick start, parameter tuning reference, ARM64 compatibility, and a full production example
- Add cross-links from 6 existing pages (vMCP concepts, intro, configuration, tool aggregation, tool optimization concepts, and the MCP Optimizer tutorial)
- Add optimizer as a key benefit in the vMCP concept page and expand the vMCP intro with a new "Optimize tool discovery" section

Test plan

- Verify `npm run build` passes
- Review page renders correctly in local dev server
- Confirm all internal links resolve

🤖 Generated with Claude Code

vercel · 2026-03-05T15:22:57Z

The latest updates on your projects.

Project	Deployment	Actions	Updated (UTC)
docs-website	Ready	Preview, Comment	Mar 9, 2026 10:56am

Copilot

Pull request overview

This PR adds a new how-to guide (`docs/toolhive/guides-vmcp/optimizer.mdx`) for enabling the optimizer feature in Virtual MCP Server (vMCP) on Kubernetes. The guide covers deploying an EmbeddingServer, referencing it from a VirtualMCPServer to enable the optimizer, tuning optimizer parameters, and a production-ready advanced example.

Changes:

- New documentation page covering EmbeddingServer setup, VirtualMCPServer configuration with optimizer, parameter tuning guidance, and an advanced production example
- Updated `sidebars.ts` to include the new optimizer guide between composite-tools and failure-handling in the Virtual MCP Server sidebar section

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`docs/toolhive/guides-vmcp/optimizer.mdx`	New how-to guide for enabling and configuring the vMCP optimizer feature with EmbeddingServer and VirtualMCPServer CRDs
`sidebars.ts`	Adds the new optimizer guide to the Virtual MCP Server sidebar category

danbarr · 2026-03-05T20:02:01Z

@claude use the docs-review skill to perform an editorial review of this PR. Also consider user journey and discoverability of this important new capability for the vMCP server.

claude · 2026-03-05T20:02:19Z

Claude finished @danbarr's task in 2m 27s

Editorial review

Gather context and read changed files
Run docs-review skill on the new guide
Compile and post full editorial review

Summary

The guide is well-structured and delivers genuinely useful content: the quick-start flow is clean, the parameter table is excellent, and the tuning guidance admonition is practical and actionable. The main issues are a missing first-use expansion of "vMCP," an incomplete verification step, an unexplained production config detail, and — most importantly — several discoverability gaps that leave this feature invisible to users who aren't already reading the optimizer guide directly.

Primary issues

1. Missing first-use expansion of "vMCP" (line 8)

The opening sentence reads "When vMCP aggregates many backend MCP servers…" but per the project style guide, the first use in the body text must be "Virtual MCP Server (vMCP)." Every other guide in the guides-vmcp/ section follows this convention (see authentication.mdx line 6, scaling-and-performance.mdx line 7).

Fix: Change line 8 to: "When Virtual MCP Server (vMCP) aggregates many backend MCP servers…"

2. Discoverability gaps — optimizer isn't mentioned in key parent/sibling docs

This is the most impactful issue. A user learning about vMCP has several natural entry points, none of which surface the optimizer:

- `guides-vmcp/intro.mdx`, "Core capabilities" (lines 32–38): Lists multi-server aggregation, tool conflict resolution, centralized authentication, and composite workflows. The optimizer, arguably a compelling fifth capability, is absent. A user assessing whether vMCP meets their needs will miss it entirely.
- `concepts/vmcp.mdx`, "Core value propositions" and "Key capabilities": Same gap. The optimizer isn't listed among the things vMCP can do.
- `concepts/tool-optimization.mdx`, "Related information" (lines 251–259): This page explains exactly the context pollution problem that the optimizer solves, and currently links to tool filtering guides. Adding a link to this guide here would give users a direct path from problem to solution.
- `tutorials/mcp-optimizer.mdx`, "Related information" (lines 328–331): The tutorial covers the desktop/CLI approach and links to `guides-ui/mcp-optimizer.mdx` but not to this new Kubernetes guide. Users finishing that tutorial and wanting the K8s equivalent have no obvious next step.

Fix: Add brief references in each of the above docs pointing to this guide. For intro.mdx, add the optimizer to the Core capabilities list with a link. For concepts/tool-optimization.mdx and tutorials/mcp-optimizer.mdx, add an entry in "Related information." For concepts/vmcp.mdx, add a mention under "Key capabilities."

3. Step 3 "Verify" has no success criteria (lines 113–123)

The step tells users to run `kubectl get virtualmcpserver my-vmcp -n toolhive-system` but gives no guidance on what a successful output looks like. What column shows readiness? What status value confirms the optimizer is active? Users are left guessing whether the command succeeded.

Fix: Add expected output or a description of what to look for, e.g.: "Look for `READY: True` in the output. Once ready, clients connecting to the vMCP endpoint will see only `find_tool` and `call_tool`." Alternatively, follow the pattern of the tip in Step 1, which does a good job of explaining what to wait for.

4. Unexplained production config decision in advanced example (line 226)

The advanced example sets `embeddingServiceTimeout: 15s`, a shorter timeout than the default 30s. This seems counterintuitive for a "production-ready" configuration. Without explanation, users copying this example may end up with a config that causes more failures under load than the default.

Fix: Add a brief inline comment or a sentence explaining the rationale. For example: "A shorter timeout (15s) is appropriate when the EmbeddingServer is co-located in the same namespace and low-latency responses are expected. Increase this if the embedding service is remote or under high load."
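
One way to act on this suggestion is an inline YAML comment next to the field. The fragment below is only a sketch: the field name and both timeout values come from the review above, but the parent keys that enclose `embeddingServiceTimeout` in the VirtualMCPServer spec are not shown in this thread, so they are deliberately left out.

```yaml
# Sketch only: place this under whichever optimizer settings block the guide's
# advanced example already uses; the enclosing keys are omitted on purpose.
# Default is 30s. 15s assumes the EmbeddingServer runs in the same namespace
# and answers with low latency; raise it for remote or heavily loaded services.
embeddingServiceTimeout: 15s
```

Keeping the rationale in the YAML itself means readers who copy only the code block still see why the value differs from the default.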

Secondary issues

Issue	Location	Recommendation
"A client sends a prompt" is ambiguous	Line 32	Change to "An AI client sends a prompt" for clarity
Step 2 prose says "add `embeddingServerRef` to your existing VirtualMCPServer" but shows a full spec	Lines 88–111	Clarify that the YAML shows the full resource with the new field highlighted, not just the diff — or rephrase to "update your VirtualMCPServer to include `embeddingServerRef`"
Advanced example has two YAML blocks with almost no prose between them	Lines 191–236	Add 1–2 sentences before the second YAML block explaining what this config achieves differently from the quick start and why (e.g. "This configuration enables HA for the embedding server with persistent model caching…")
Overview > Benefits (lines 22–28) partially re-states the intro paragraph	Lines 8–12 vs 22–28	The intro already mentions reduced token usage and `find_tool`/`call_tool`. The Benefits list mostly repeats this. Consider cutting the intro paragraph's detail and letting the Benefits list carry it, or folding the Benefits list content into the intro prose.

Inline suggestions

Line 34 (Step count): "up to 8 by default". The `maxToolsToReturn` default is confirmed in the parameter table, so this is consistent. Good.

Lines 63–75 (Step 1 YAML): Consider adding a code comment `# Uses BAAI/bge-small-en-v1.5 by default` inside the YAML for readers who skim code blocks and miss the prose above it. This is a minor nicety.

Lines 168–172 (string-encoded floats note): This explains the design decision well and keeps it in a `:::note` admonition rather than burying it in prose. No change needed.

Lines 241–245 (Related information): The links are accurate and well-chosen. One addition worth considering: a link to the concepts/vmcp.mdx page for readers who want the broader conceptual context before diving in.

Overall this is solid, publish-ready content with targeted fixes needed. The discoverability issue (#2 above) is the most important to address before merge, as it affects whether users can find this capability at all.

aponcedeleonch · 2026-03-05T20:18:22Z

@claude check if your comments have been addressed in the latest commit

claude · 2026-03-05T20:18:38Z

Claude finished @aponcedeleonch's task in 1m 14s

Review comment status

Gather context
Read current state of optimizer.mdx
Check discoverability fixes (intro.mdx, vmcp.mdx, tool-optimization.mdx, mcp-optimizer.mdx)
Compare against all previous review issues
Post findings

All four primary issues and three of the four secondary issues have been addressed. Here's the full breakdown:

Primary issues — all fixed ✅

Issue	Status	Notes
Missing first-use expansion of "vMCP"	✅ Fixed	Line 8 now reads "When Virtual MCP Server (vMCP) aggregates…"
Discoverability gaps	✅ Fixed	All four locations updated: `intro.mdx` Core capabilities list (+ Related info link), `concepts/vmcp.mdx` Core value propositions (+ Related info link), `concepts/tool-optimization.mdx` Related info, `tutorials/mcp-optimizer.mdx` Related info
Step 3 "Verify" — no success criteria	✅ Fixed	Lines 122–124 now describe expected output: "Look for `READY: True`…clients…see only `find_tool` and `call_tool`"
Unexplained `embeddingServiceTimeout: 15s` in advanced example	✅ Fixed	Lines 217–219 explain the co-location rationale and advise increasing the value for remote/high-load scenarios

Secondary issues — 3 of 4 fixed ✅

Issue	Status	Notes
"A client sends a prompt" is ambiguous	✅ Fixed	Line 33 now reads "An AI client sends a prompt that requires tool assistance"
Step 2 prose vs. full spec	✅ Fixed	Line 91 now reads "Update your existing VirtualMCPServer to include `embeddingServerRef`"
Advanced example — insufficient prose between YAML blocks	✅ Fixed	Two prose paragraphs now separate the EmbeddingServer and VirtualMCPServer YAML blocks, each explaining what the config does and why
Overview > Benefits partially restates intro paragraph	⚠️ Not addressed	The intro paragraph (lines 8–13) and Benefits list (lines 24–29) still overlap on token usage and `find_tool`/`call_tool`. This was a soft "consider" suggestion, not a blocker

Minor open item (optional)

The "Related information" section of the optimizer guide (lines 246–253) links to concepts/tool-optimization.mdx but not to concepts/vmcp.mdx. Adding that link was suggested as a nice-to-have for readers wanting broader conceptual context before diving in — this is entirely optional and not a blocker.

Overall this is in great shape. The PR addresses all blocking feedback and is ready to merge.

aponcedeleonch · 2026-03-05T20:26:59Z

@claude check again if all comments have been addressed

claude · 2026-03-05T20:27:11Z

Claude finished @aponcedeleonch's task in 1m 9s

Re-checking review comments

Read current state of all changed files
Compare against all previous review issues
Post findings

The status is unchanged from the previous check. Here's the full breakdown:

Primary issues — all fixed ✅

Issue	Status	Notes
Missing first-use expansion of "vMCP"	✅ Fixed	Line 8: "When Virtual MCP Server (vMCP) aggregates…"
Discoverability gaps	✅ Fixed	All four locations updated: `intro.mdx` Core capabilities + Related info, `concepts/vmcp.mdx` Core value propositions + Related info, `concepts/tool-optimization.mdx` Related info, `tutorials/mcp-optimizer.mdx` Related info
Step 3 "Verify" — no success criteria	✅ Fixed	Lines 120–122 describe expected output and confirm optimizer is active
Unexplained `embeddingServiceTimeout: 15s`	✅ Fixed	Lines 215–217 explain co-location rationale and advise increasing for remote/high-load scenarios

Secondary issues — 3 of 4 fixed ✅

Issue	Status	Notes
"A client sends a prompt" is ambiguous	✅ Fixed	Line 31: "An AI client sends a prompt that requires tool assistance"
Step 2 prose vs. full spec	✅ Fixed	Line 89: "Update your existing VirtualMCPServer to include `embeddingServerRef`"
Advanced example — insufficient prose between YAML blocks	✅ Fixed	Prose before each YAML block explains what each config does and why
Overview > Benefits partially restates intro paragraph	⚠️ Not addressed	The intro (lines 8–11) and the Benefits list (lines 22–27) still overlap on "relevant tools" and "semantic search" — this was flagged as a soft "consider" suggestion, not a blocker

Optional item (not addressed)

The optimizer guide's "Related information" section (lines 244–251) still doesn't include a link to concepts/vmcp.mdx. This was noted as a nice-to-have for readers wanting broader conceptual context — not a blocker.

All blocking feedback has been addressed. The PR is ready to merge.

aponcedeleonch · 2026-03-05T20:40:08Z

@claude check again if all your comments have been addressed

claude · 2026-03-05T20:40:28Z

Claude finished @aponcedeleonch's task in 55s

Re-checking review comments

Read current state of all changed files
Compare against all previous review issues
Post findings

One new item has been addressed since the last check. Here's the full updated status:

Primary issues — all fixed ✅

Issue	Status	Notes
Missing first-use expansion of "vMCP"	✅ Fixed	Line 8: "When Virtual MCP Server (vMCP) aggregates…"
Discoverability gaps	✅ Fixed	All four locations updated: `intro.mdx`, `concepts/vmcp.mdx`, `concepts/tool-optimization.mdx`, `tutorials/mcp-optimizer.mdx`
Step 3 "Verify" — no success criteria	✅ Fixed	Lines 120–122 describe expected output and confirm optimizer is active
Unexplained `embeddingServiceTimeout: 15s`	✅ Fixed	Lines 215–217 explain co-location rationale and advise increasing for remote/high-load scenarios

Secondary issues: 3 of 4 fixed ✅

Issue	Status	Notes
"A client sends a prompt" is ambiguous	✅ Fixed	Line 31: "An AI client sends a prompt that requires tool assistance"
Step 2 prose vs. full spec	✅ Fixed	Line 89: "Update your existing VirtualMCPServer to include `embeddingServerRef`"
Advanced example — insufficient prose between YAML blocks	✅ Fixed	Prose before each YAML block explains what each config does and why
Overview > Benefits partially restates intro paragraph	⚠️ Not addressed	The intro (lines 8–11) and Benefits list (lines 22–27) still overlap on token usage and tool selection. This was a soft "consider" suggestion, not a blocker.

Optional item — now fixed ✅

The optimizer guide's "Related information" section (lines 251–252) now includes a link to concepts/vmcp.mdx. This was the only item not addressed in the previous check.

All blocking feedback has been addressed. The one remaining soft item (intro/Benefits overlap) is not a blocker. This PR is ready to merge.

jerm-dro

LGTM, but let's nail the Benefits section. It's the first and potentially only thing people will read, so it's important.

danbarr

Good guide overall. One substantive issue to address before or shortly after merge:

ARM64 workaround is incomplete

The ARM64 warning block shows how to pre-pull a pinned image (`cpu-1.7`) and load it into a kind cluster, but never shows how to configure the EmbeddingServer to actually use that image. Without setting `spec.image`, the EmbeddingServer will still default to `cpu-latest` (the amd64-only image the workaround is trying to avoid), making the workaround a no-op.

The block needs a follow-up YAML snippet showing how to pin the image in the EmbeddingServer resource:

spec:
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7

Secondary concern in the same block: `kind load docker-image` is kind-specific. Since ARM64 also applies to EKS on Graviton and other distributions, at minimum add a note like "adjust the load command for your cluster distribution."
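
Taken together, the two points above suggest a pinned EmbeddingServer along the lines of the sketch below. The image reference and namespace come from this thread; the `apiVersion` and the resource name `my-embedding` are assumptions added for illustration and should be checked against the CRDs installed in the cluster.

```yaml
# Sketch, not the guide's final manifest.
apiVersion: toolhive.stacklok.dev/v1alpha1  # assumed API group/version
kind: EmbeddingServer
metadata:
  name: my-embedding          # hypothetical name
  namespace: toolhive-system
spec:
  # Pin the pre-pulled tag explicitly; without this the EmbeddingServer
  # defaults to cpu-latest and the pre-pull has no effect.
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
```

The pre-pull step itself stays distribution-specific: `kind load docker-image` only applies to kind, so other clusters need their own way of making the image available to nodes.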

Minor

In intro.mdx, the new "Tool optimization" bullet is missing a period at the end of the sentence.

- Add a new how-to guide for enabling the vMCP optimizer via the Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
- Cover quick start, parameter tuning reference, ARM64 compatibility, and a full production example
- Add cross-links from 6 existing pages (vMCP concepts, intro, configuration, tool aggregation, tool optimization concepts, and the MCP Optimizer tutorial)
- Add optimizer as a key benefit in the vMCP concept page and expand the vMCP intro with a new "Optimize tool discovery" section

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add `spec.image` pin to EmbeddingServer YAML so the pre-pulled amd64 image is actually used, explain root cause (Intel MKL), and note that native ARM64 support is in progress upstream.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings March 5, 2026 15:22

Copilot started reviewing on behalf of aponcedeleonch March 5, 2026 15:23

vercel bot deployed to Preview March 5, 2026 15:23

Copilot AI reviewed Mar 5, 2026

vercel bot deployed to Preview March 5, 2026 20:18

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from 0bc2e4f to 6b82116 March 5, 2026 20:26

vercel bot deployed to Preview March 5, 2026 20:27

vercel bot deployed to Preview March 5, 2026 20:40

aponcedeleonch linked an issue Mar 5, 2026 that may be closed by this pull request

Document how to deploy vMCP with optimizer in Kubernetes using CRDs #546

Closed

4 tasks

jerm-dro reviewed Mar 6, 2026

lorr1 reviewed Mar 6, 2026

vercel bot deployed to Preview March 6, 2026 09:01

danbarr reviewed Mar 6, 2026

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from e0166c3 to eeadace March 6, 2026 14:38

vercel bot deployed to Preview March 6, 2026 14:39

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch 2 times, most recently from f98e33f to 74385c3 March 6, 2026 14:48

vercel bot deployed to Preview March 6, 2026 14:52

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from 74385c3 to 78cfdd0 March 6, 2026 14:52

vercel bot deployed to Preview March 6, 2026 14:55

aponcedeleonch changed the title ~~Add vMCP optimizer guide~~ Add vMCP optimizer guide for Kubernetes Mar 6, 2026

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from 78cfdd0 to c972b44 March 6, 2026 14:55

vercel bot deployed to Preview March 6, 2026 14:57

aponcedeleonch requested review from danbarr, jerm-dro and lorr1 March 6, 2026 15:00

danbarr reviewed Mar 6, 2026

aponcedeleonch requested a review from danbarr March 6, 2026 15:58

danbarr previously approved these changes Mar 6, 2026

aponcedeleonch and others added 2 commits March 9, 2026 12:23

aponcedeleonch dismissed danbarr’s stale review via 3410623 March 9, 2026 10:53

aponcedeleonch force-pushed the add-vmcp-optimizer-guide branch from c972b44 to 3410623 March 9, 2026 10:53

aponcedeleonch requested a review from danbarr March 9, 2026 10:53

vercel bot deployed to Preview March 9, 2026 10:56

danbarr approved these changes Mar 9, 2026

aponcedeleonch merged commit 39f5b63 into main Mar 9, 2026
4 checks passed

aponcedeleonch deleted the add-vmcp-optimizer-guide branch March 9, 2026 13:10
