-
Notifications
You must be signed in to change notification settings - Fork 3k
Description
Summary
Code review of the current main file at bigquery_agent_analytics_plugin.py identified 6 issues, all validated against HEAD (dd0851ac).
Priority Findings
1. High — LLM_ERROR and TOOL_ERROR events logged with status="OK" instead of "ERROR"
Evidence:
EventData.statusdefaults to"OK"(L1798)_log_eventwritesevent_data.statusto the row (L2539)on_model_error_callbacksetserror_messagebut notstatus(L2987–2996)on_tool_error_callbacksetserror_messagebut notstatus(L3107–3118)
Impact: Dashboards/queries filtering on status='ERROR' will undercount failures.
Suggested fix: Set status="ERROR" in both error callbacks, and optionally enforce in _log_event when event_type.endswith("_ERROR").
2. Medium — Auto schema upgrade only checks top-level columns; nested schema evolution not handled
Evidence:
_maybe_upgrade_schemadiffs only top-level field names (L2157–2158)- Version label is stamped unconditionally, even if full expected schema is not reached (L2169–2172)
Impact: Future nested field additions can cause Write API schema mismatch while the version label says "upgraded", preventing retries.
Suggested fix: Recursive schema diff for nested RECORD fields, or avoid stamping the new version if the full expected schema is not reached.
3. Medium — Multi-loop shutdown drains only current loop queue; other loops may lose buffered events
Evidence:
- Only the current loop calls
batch_processor.shutdown()(L2243–2244) - Other loops get raw
transport.close()without draining (L2247–2253)
Impact: Cross-loop runs can silently drop pending rows at shutdown.
Suggested fix: Coordinate per-loop shutdown on owning loops (e.g. with run_coroutine_threadsafe), then close transports.
4. Medium — Session metadata/state logged without truncation or redaction, enabled by default
Evidence:
log_session_metadatadefaults toTrue(L494)session.stateis directly included in_enrich_attributeswithout any truncation (L2444–2446), unlikeusage_metadatawhich goes through_recursive_smart_truncate
Impact: Potential PII leakage and oversized rows in BigQuery.
Suggested fix: Add caps/redaction controls (max_session_state_bytes, redact_keys, allowlist_keys) and apply _recursive_smart_truncate before serialization.
5. Low — Large system instruction strings not truncated in LlmRequest parsing path
Evidence:
- String
system_instructionis directly assigned without truncation (L1409–1410) - Other content paths apply truncation: dict/list via
_recursive_smart_truncate(L1422–1425), plain strings viaprocess_text()(L1426–1427), Content objects via_parse_content_object(L1412–1416)
Impact: Inconsistent size controls; string system prompts bypass all truncation/offloading, increasing row-size risk.
Suggested fix: Pass the string through _truncate (and optionally offload if configured) like other text content.
6. Low — _HITL_TOOL_NAMES is unused dead code
Evidence:
- Declared at L81 with zero references elsewhere in the file
- The related
_HITL_EVENT_MAP(L86) is used, but_HITL_TOOL_NAMESitself is never referenced
Suggested fix: Remove, or wire into validation/filtering logic.
Feature Requests That Would Improve Reliability
- Plugin health telemetry counters:
queue_dropped_count,batch_retry_count,batch_drop_count,offload_fail_count - Dead-letter sink for failed batches (GCS/BQ table) with sampled payload/error context
- Conformance tests for invariants:
_ERRORevents must producestatus='ERROR', schema upgrade handles nested additive changes, and multi-loop shutdown drains all queues