Description
Problem
When accessing the voice app remotely over plain HTTP (e.g., http://monad:8400), navigator.mediaDevices is undefined because the page is not in a secure context. The navigator.mediaDevices.getUserMedia() call at voice/static/index.html:729 therefore throws a TypeError, and the user sees no guidance on what went wrong.
Console errors observed
[useWebRTC] getUserMedia failed: TypeError: Cannot read properties of undefined (reading 'getUserMedia')
at voice/:729:47
[VoiceApp] Connection failed: TypeError: Cannot read properties of undefined (reading 'getUserMedia')
at voice/:729:47
Root cause
Browsers restrict navigator.mediaDevices.getUserMedia to secure contexts:
- http://localhost / http://127.0.0.1 -- always secure (works)
- https://* -- secure (works)
- http://<any-other-hostname> -- not secure (fails silently -- navigator.mediaDevices is undefined)
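The rule above could be mirrored on the server so it can warn at startup when the URL it advertises will not give the browser a secure context. This is a minimal sketch and only an approximation of what browsers do: the function name is hypothetical, and it covers just the three cases listed here (https, localhost, loopback IPs).

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def is_probably_secure_context(url: str) -> bool:
    """Approximate the browser's secure-context rule for a page URL."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    host = (parts.hostname or "").lower()
    if host == "localhost":
        return True
    try:
        # 127.0.0.0/8 and ::1 count as loopback.
        return ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a LAN hostname): plain HTTP is not secure.
        return False


if __name__ == "__main__":
    for url in ("http://localhost:8400", "http://monad:8400"):
        print(url, is_probably_secure_context(url))
```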
Since 0.0.0.0 binding was recently made the default, remote access is now a common scenario. Accessing via a LAN hostname or Tailscale IP over plain HTTP hits this wall.
Attempted workaround
Accessing via https://<tailscale-hostname>:8400 results in Invalid HTTP request received from uvicorn because the server only listens on plain HTTP -- the browser sends a TLS ClientHello that uvicorn can't parse. Tailscale's WireGuard tunnel encrypts at the network layer but does not provide HTTP-level TLS that the browser recognizes as a secure context.
Analysis
Three issues compound here:
1. No secure context detection in the browser
The connect() function at index.html:729 calls navigator.mediaDevices.getUserMedia() without first checking window.isSecureContext. The catch block logs and rethrows but provides no user-facing message explaining why it failed or how to fix it.
2. CSRF origin check is too restrictive for remote access
voice/__init__.py:133-139 only allows localhost/127.0.0.1 origins:
    async def _check_origin(origin: str | None = Header(default=None)) -> None:
        if origin is None:
            return
        if "localhost" in origin or "127.0.0.1" in origin:
            return
        raise HTTPException(status_code=403, detail="CSRF: origin not allowed")

Even with a working HTTPS reverse proxy (e.g., Tailscale serve), the browser would send an origin like https://monad.tail-abc.ts.net and receive a 403.
3. Tailscale serve integration exists but is not discoverable
tailscale.py already implements start_serve(port) which runs tailscale serve --bg <port> and returns a proper https://<dns_name> URL with real Let's Encrypt certs. However, this isn't surfaced to users who need remote access, so they don't know it's available.
Recommendations
Browser-side secure context guard (UX improvement)
Before attempting getUserMedia, check window.isSecureContext. If false, display a clear message explaining the situation and how to resolve it (e.g., use a reverse proxy, access via localhost, or set up HTTPS). This is valuable regardless of which TLS approach is used.
Fix CSRF origin check for remote access
Update _check_origin to also allow known-safe remote origins (e.g., Tailscale domains, configured allowed origins). This is required for any remote access scenario to work.
Surface HTTPS/reverse proxy options in settings
Rather than hardcoding Tailscale as the only path, consider:
- Supporting Tailscale alternatives -- other reverse proxy / tunnel solutions (e.g., Cloudflare Tunnel, ngrok, Caddy, nginx with Let's Encrypt) should be equally viable
- Surfacing these options in settings -- let users configure their preferred HTTPS approach through the settings UI/config rather than requiring CLI flags or code knowledge
- Auto-detection with guidance -- detect available tools (Tailscale, Caddy, etc.) and suggest the appropriate setup
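Auto-detection could start as a simple check for which of these tools are on PATH. The sketch below assumes the usual executable names for each tool; the function name and the mapping are hypothetical, and what guidance to show for each detected tool is left to the settings UI.

```python
import shutil
from typing import Callable, Optional

# Executable name -> label shown to the user (assumed names, not project config).
CANDIDATE_BINARIES = {
    "tailscale": "Tailscale",
    "cloudflared": "Cloudflare Tunnel",
    "ngrok": "ngrok",
    "caddy": "Caddy",
    "nginx": "nginx",
}


def detect_https_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return labels of the HTTPS/tunnel tools found on PATH."""
    return [
        label
        for binary, label in CANDIDATE_BINARIES.items()
        if which(binary) is not None
    ]


if __name__ == "__main__":
    found = detect_https_tools()
    print("Available:", ", ".join(found) if found else "none detected")
```

The which parameter exists only so the lookup can be replaced in tests.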
Reconsider 0.0.0.0 default binding
The recent change to bind 0.0.0.0 by default makes remote access the common case, but voice (and potentially other features requiring secure contexts) breaks without HTTPS. This tension should be considered -- either:
- Keep 0.0.0.0 default but make HTTPS setup frictionless and well-guided
- Or bind to localhost by default with --remote opting into 0.0.0.0 + HTTPS setup
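The second option could look roughly like the sketch below. Only the --remote flag name and port 8400 come from this issue; the program name, the --port flag, and the helper functions are hypothetical, and the HTTPS setup step is left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="voice")  # hypothetical program name
    parser.add_argument(
        "--remote",
        action="store_true",
        help="bind 0.0.0.0 instead of loopback (HTTPS setup is then required for voice)",
    )
    parser.add_argument("--port", type=int, default=8400)
    return parser


def resolve_bind_host(remote: bool) -> str:
    """Loopback by default; all interfaces only when explicitly requested."""
    return "0.0.0.0" if remote else "127.0.0.1"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"binding {resolve_bind_host(args.remote)}:{args.port}")
```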
Context
- Branch fix/voice-remote-getusermedia was created for initial investigation
- The tailscale.py integration at lines 43-78 already provides the serve/reverse-proxy machinery
- The voice architecture is browser-direct-to-OpenAI for audio (server is not in the audio path), so HTTPS is only needed for the initial page load and API calls