Add vMCP optimizer guide for Kubernetes #588
Pull request overview
This PR adds a new how-to guide (docs/toolhive/guides-vmcp/optimizer.mdx) for enabling the optimizer feature in Virtual MCP Server (vMCP) on Kubernetes. The guide covers deploying an EmbeddingServer, referencing it from a VirtualMCPServer to enable the optimizer, tuning optimizer parameters, and a production-ready advanced example.
Changes:
- New documentation page covering EmbeddingServer setup, VirtualMCPServer configuration with optimizer, parameter tuning guidance, and an advanced production example
- Updated sidebars.ts to include the new optimizer guide between composite-tools and failure-handling in the Virtual MCP Server sidebar section
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/toolhive/guides-vmcp/optimizer.mdx | New how-to guide for enabling and configuring the vMCP optimizer feature with EmbeddingServer and VirtualMCPServer CRDs |
| sidebars.ts | Adds the new optimizer guide to the Virtual MCP Server sidebar category |
|
@claude use the docs-review skill to perform an editorial review of this PR. Also consider user journey and discoverability of this important new capability for the vMCP server.
|
Claude finished @danbarr's task in 2m 27s
Editorial review
Summary
The guide is well-structured and delivers genuinely useful content: the quick-start flow is clean, the parameter table is excellent, and the tuning guidance admonition is practical and actionable. The main issues are a missing first-use expansion of "vMCP," an incomplete verification step, an unexplained production config detail, and, most importantly, several discoverability gaps that leave this feature invisible to users who aren't already reading the optimizer guide directly.
Primary issues
1. Missing first-use expansion of "vMCP" (line 8)
The opening sentence reads "When vMCP aggregates many backend MCP servers…" but per the project style guide, the first use in the body text must be "Virtual MCP Server (vMCP)." Every other guide in the
Fix: Change line 8 to: "When Virtual MCP Server (vMCP) aggregates many backend MCP servers…"
2. Discoverability gaps: optimizer isn't mentioned in key parent/sibling docs
This is the most impactful issue. A user learning about vMCP has several natural entry points, none of which surface the optimizer:
Fix: Add brief references in each of the above docs pointing to this guide. For
3. Step 3 "Verify" has no success criteria (lines 113–123)
The step tells users to run
Fix: Add expected output or a description of what to look for, e.g.: "Look for
4. Unexplained production config decision in advanced example (line 226)
The advanced example sets
Fix: Add a brief inline comment or a sentence explaining the rationale. For example: "A shorter timeout (15s) is appropriate when the EmbeddingServer is co-located in the same namespace and low-latency responses are expected. Increase this if the embedding service is remote or under high load."
Secondary issues
Inline suggestions
Line 34 (Step count):
Lines 63–75 (Step 1 YAML): Consider adding a code comment
Lines 168–172 (string-encoded floats note): This is a well-executed design decision explanation; keeps it in a
Lines 241–245 (Related information): The links are accurate and well-chosen. One addition worth considering: a link to the
Overall this is solid, publish-ready content with targeted fixes needed. The discoverability issue (#2 above) is the most important to address before merge, as it affects whether users can find this capability at all.
|
@claude check if your comments have been addressed in the latest commit |
|
Claude finished @aponcedeleonch's task in 1m 14s
Review comment status
All four primary issues and three of the four secondary issues have been addressed. Here's the full breakdown:
Primary issues: all fixed ✅
Secondary issues: 3 of 4 fixed ✅
Minor open item (optional)
The "Related information" section of the optimizer guide (lines 246–253) links to
Overall this is in great shape. The PR addresses all blocking feedback and is ready to merge.
0bc2e4f to 6b82116
|
@claude check again if all comments have been addressed |
|
Claude finished @aponcedeleonch's task in 1m 9s
Re-checking review comments
The status is unchanged from the previous check. Here's the full breakdown:
Primary issues: all fixed ✅
Secondary issues: 3 of 4 fixed ✅
Optional item (not addressed)
The optimizer guide's "Related information" section (lines 244–251) still doesn't include a link to
All blocking feedback has been addressed. The PR is ready to merge.
|
@claude check again if all your comments have been addressed |
|
Claude finished @aponcedeleonch's task in 55s
Re-checking review comments
One new item has been addressed since the last check. Here's the full updated status:
Primary issues: all fixed ✅
Secondary issues: all fixed ✅
Optional item: now fixed ✅
The optimizer guide's "Related information" section (lines 251–252) now includes a link to
All blocking feedback has been addressed. The one remaining soft item (intro/Benefits overlap) is not a blocker. This PR is ready to merge.
jerm-dro left a comment
LGTM, but let's nail the Benefits section. It's the first and potentially only thing people will read, so it's important.
e0166c3 to eeadace
f98e33f to 74385c3
74385c3 to 78cfdd0
78cfdd0 to c972b44
danbarr left a comment
Good guide overall. One substantive issue to address before or shortly after merge:
ARM64 workaround is incomplete
The ARM64 warning block shows how to pre-pull a pinned image (cpu-1.7) and load it into a kind cluster, but never shows how to configure the EmbeddingServer to actually use that image. Without setting spec.image, the EmbeddingServer will still default to cpu-latest (the amd64-only image the workaround is trying to avoid), making the workaround a no-op.
The block needs a follow-up YAML snippet showing how to pin the image in the EmbeddingServer resource:
spec:
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
Secondary concern in the same block: kind load docker-image is kind-specific. Since ARM64 also applies to EKS on Graviton and other distributions, at minimum add a note like "adjust the load command for your cluster distribution."
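For context, here is a minimal sketch of where that pin might sit in a complete resource. Only `spec.image` and the image tag come from this review. The `apiVersion`, resource name, and namespace are assumptions for illustration, and any other fields the EmbeddingServer CRD requires are left out.

```yaml
# Sketch only: apiVersion, name, and namespace are assumptions, not taken from the guide.
apiVersion: toolhive.stacklok.dev/v1alpha1  # assumed API group and version
kind: EmbeddingServer
metadata:
  name: optimizer-embedding     # hypothetical name
  namespace: toolhive-system    # hypothetical namespace
spec:
  # Pin the pre-pulled tag so the resource does not fall back to cpu-latest.
  image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.7
  # Other EmbeddingServer fields (model selection, resources, and so on) omitted.
```

With the tag pinned, the pre-pull and load steps in the warning block apply to the image the pod actually requests.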
Minor
In intro.mdx, the new "Tool optimization" bullet is missing a period at the end of the sentence.
- Add a new how-to guide for enabling the vMCP optimizer via the Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
- Cover quick start, parameter tuning reference, ARM64 compatibility, and a full production example
- Add cross-links from 6 existing pages (vMCP concepts, intro, configuration, tool aggregation, tool optimization concepts, and the MCP Optimizer tutorial)
- Add optimizer as a key benefit in the vMCP concept page and expand the vMCP intro with a new "Optimize tool discovery" section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add spec.image pin to EmbeddingServer YAML so the pre-pulled amd64 image is actually used, explain root cause (Intel MKL), and note that native ARM64 support is in progress upstream.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c972b44 to 3410623
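Summary
- Add a new how-to guide for enabling the vMCP optimizer via the Kubernetes operator (EmbeddingServer + VirtualMCPServer CRDs)
- Cover quick start, parameter tuning reference, ARM64 compatibility, and a full production example
- Add cross-links from 6 existing pages (vMCP concepts, intro, configuration, tool aggregation, tool optimization concepts, and the MCP Optimizer tutorial)
- Add optimizer as a key benefit in the vMCP concept page and expand the vMCP intro with a new "Optimize tool discovery" section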
Test plan
npm run build passes
🤖 Generated with Claude Code