
Adding SGP-related extensible configuration to the model engine helm chart#762

Merged
arniechops merged 8 commits into main from arnavchopra/extensibility-hooks
Mar 6, 2026

Conversation


@arniechops arniechops commented Feb 20, 2026

Pull Request Summary

What is this PR changing? Why is this change being made? Any caveats you'd like to highlight? Link any relevant documents, links, or screenshots here if applicable.

Test Plan and Usage Guide

How did you validate that your PR works correctly? How do you run or demo the code? Provide enough detail so a reviewer can reasonably reproduce the testing procedure. Paste example command line invocations if applicable.

Greptile Summary

This PR adds a broad set of SGP/on-prem extensibility hooks to the model-engine Helm chart, bumping the chart version to 0.2.0. The changes make the chart deployable in airgapped and GCP-based environments without forking.

Key changes:

  • Service account token control: automountServiceAccountToken value gates a new modelEngine.tokenVolume/tokenVolumeMount helper that manually projects the SA token when auto-mounting is disabled. Applied to most pods — but populate_fine_tuning_repository_job.yaml and restart_keda_operator.yaml are missing the pod-level automountServiceAccountToken field (see comments).
  • Extensibility hooks: extraEnvVars, extraVolumes, extraVolumeMounts (for main service containers) and extraPodEnvFrom (for inference pods via envFrom) enable wrapper charts to inject credentials and config without patching core templates.
  • Configurable utility images: utilityImages.busybox, .curl, .ubuntu replace hardcoded public.ecr.aws / curlimages references, enabling airgapped deployments.
  • Helm hook improvements: All hook jobs now append .Release.Revision to their names and use before-hook-creation,hook-succeeded delete policy, preventing stale job conflicts on re-deploy. Hook weights are correctly ordered (database_init_job at -2, database_migration_job at -1).
  • Boolean config map fix: kindIs "bool" guards in service_config_map.yaml prevent boolean values from being quoted as strings.
  • Redis auth-secret reference: New redis.authSecretName/redis.authSecretKey fields allow the auth token to be sourced from a Kubernetes Secret rather than a plain value.
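Taken together, a wrapper chart could exercise these hooks from its own values file. The key names below come from this PR's summary; the concrete values are illustrative placeholders, not defaults from the chart:

```yaml
# Illustrative overrides for an airgapped deployment; values are assumptions.
automountServiceAccountToken: false

utilityImages:
  busybox: my-registry.internal/mirror/busybox:1.36
  curl: my-registry.internal/mirror/curl:8.5.0
  ubuntu: my-registry.internal/mirror/ubuntu:22.04

redis:
  authSecretName: redis-auth      # auth token sourced via secretKeyRef
  authSecretKey: auth-token       # instead of a plain redis.auth value

extraEnvVars:
  - name: HTTPS_PROXY
    value: http://proxy.internal:3128
extraVolumes:
  - name: internal-ca
    secret:
      secretName: internal-ca-bundle
extraVolumeMounts:
  - name: internal-ca
    mountPath: /etc/ssl/private-ca
    readOnly: true
extraPodEnvFrom:
  - secretRef:
      name: inference-credentials
```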

Minor fix needed: add automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} to the pod specs of populate_fine_tuning_repository_job.yaml and restart_keda_operator.yaml for consistency with other modified pods.

Confidence Score: 4/5

  • Safe to merge with one minor fix — two job templates are missing the pod-level automountServiceAccountToken field.
  • The overall approach is well-structured and the core mechanics (token projection, extensibility hooks, hook weight ordering) are correct. The missing automountServiceAccountToken on populate_fine_tuning_repository_job.yaml and restart_keda_operator.yaml is a genuine inconsistency affecting configuration clarity, but not a runtime failure since the ServiceAccount-level setting still governs auto-mounting. All other changes are clean and backward-compatible.
  • charts/model-engine/templates/populate_fine_tuning_repository_job.yaml and charts/model-engine/templates/restart_keda_operator.yaml need automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} added to the pod spec.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[values.yaml\nautomountServiceAccountToken] -->|false| B[modelEngine.tokenVolume\nprojected SA token]
    A -->|true| C[K8s auto-mounts token\nno extra volume needed]

    B --> D[modelEngine.volumes\nappends token-volume]
    B --> E[modelEngine.volumeMounts\nappends token-volume mount]

    D --> F[gateway / builder / cacher\ndeployments]
    D --> G[database_init_job\ndatabase_migration_job]
    D --> H[spellbook_init_job]
    D --> I[celery_autoscaler\nstateful set]
    D --> J[populate_fine_tuning_job\nrestart_keda_job\n⚠️ missing pod-level field]

    K[values.yaml\nredis.authSecretName] -->|set| L[REDIS_AUTH_TOKEN\nfrom secretKeyRef]
    K -->|not set| M[redis.auth plain value\nfallback]

    N[values.yaml\nextraEnvVars] --> F
    O[values.yaml\nextraVolumes / extraVolumeMounts] --> F
    P[values.yaml\nextraPodEnvFrom] --> Q[inference pods\nenvFrom in service templates]

    R[values.yaml\nutilityImages] --> S[busybox / curl / ubuntu\nconfigurable for airgap]
    T[values.yaml\ngpuTolerations] --> U[balloon + proportional\nautoscaler + service templates]
```

Comments Outside Diff (2)

  1. charts/model-engine/templates/populate_fine_tuning_repository_job.yaml, line 44-45 (link)

    Missing automountServiceAccountToken on pod spec.

    Every other job/deployment modified in this PR (database_migration_job, database_init_job, spellbook_init_job, gateway_deployment, endpoint_builder_deployment, cacher_deployment, celery_autoscaler_stateful_set) explicitly sets automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} on the pod spec. This job was missed.

    Without the pod-level field, when automountServiceAccountToken: false in values, the ServiceAccount-level setting prevents auto-mounting — but the projected token volume/mount ARE added via modelEngine.volumes/modelEngine.volumeMounts, creating an inconsistency: the token is mounted manually but there's no pod-level declaration that auto-mounting is disabled.

  2. charts/model-engine/templates/restart_keda_operator.yaml, line 43-44 (link)

    Missing automountServiceAccountToken on pod spec.

    Same issue as populate_fine_tuning_repository_job.yaml: every other job/deployment modified in this PR sets automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} at the pod spec level, but this job was missed. The projected token volume and mount are still injected via modelEngine.volumes/modelEngine.volumeMounts when automountServiceAccountToken: false, so the pod-level field should be set consistently.
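For context, a manually projected ServiceAccount token of the kind the `modelEngine.tokenVolume`/`tokenVolumeMount` helpers describe typically renders along these lines. This is a sketch of the standard Kubernetes projected-token pattern, not the chart's actual helper output, and the names and expiry are assumptions:

```yaml
# Hypothetical rendering of a projected SA token volume; all values assumed.
volumes:
  - name: token-volume
    projected:
      sources:
        - serviceAccountToken:
            path: token
            expirationSeconds: 3600
        - configMap:
            name: kube-root-ca.crt
            items:
              - key: ca.crt
                path: ca.crt
        - downwardAPI:
            items:
              - path: namespace
                fieldRef:
                  fieldPath: metadata.namespace
volumeMounts:
  - name: token-volume
    mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    readOnly: true
```

This is why the pod-level `automountServiceAccountToken: false` matters: both the projected volume and the default auto-mounted token target the same mount path.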

Last reviewed commit: 85a7735


@greptile-apps greptile-apps bot left a comment


22 files reviewed, 5 comments



greptile-apps bot commented Feb 20, 2026

Additional Comments (1)

charts/model-engine/templates/proportional_a100_autoscaler_deployment.yaml
Hardcoded image breaks airgapped deployments

This PR parameterizes ubuntu, busybox, and curl images via utilityImages for airgapped/on-prem environments, but the cluster-proportional-autoscaler image from registry.k8s.io remains hardcoded here and in the a10/t4 autoscaler deployments. In an airgapped environment, registry.k8s.io would be unreachable just like the other registries that were parameterized.

Consider adding a utilityImages.clusterProportionalAutoscaler entry (or similar) to values.yaml and referencing it here:

        - image: {{ .Values.utilityImages.clusterProportionalAutoscaler.repository }}:{{ .Values.utilityImages.clusterProportionalAutoscaler.tag }}

Comment on charts/model-engine/templates/service_account.yaml, lines +24 to +26
{{- with $.Values.automountServiceAccountToken }}
automountServiceAccountToken: {{ . }}
{{- end }}

with skips block when value is false

Helm's with treats false as falsy, so when automountServiceAccountToken is set to false (the exact scenario where you want to disable automount on the ServiceAccount), this block will be skipped entirely and automountServiceAccountToken won't be rendered into the YAML. Only when the value is true will it appear — which is the opposite of the critical behavior.

In contrast, the pod-level specs in other templates correctly use {{ .Values.automountServiceAccountToken }} directly without a with gate.

Suggested change
{{- with $.Values.automountServiceAccountToken }}
automountServiceAccountToken: {{ . }}
{{- end }}
automountServiceAccountToken: {{ $.Values.automountServiceAccountToken }}
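Helm templates are Go `text/template` underneath, so the falsy-`with` behavior is easy to reproduce outside Helm. A minimal Go sketch (plain `text/template`, no sprig; the template strings mirror the chart lines above):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render executes a template string against a single value passed as dot.
func render(tmplText string, value interface{}) string {
	tmpl := template.Must(template.New("t").Parse(tmplText))
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, value); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	gated := `{{with .}}automountServiceAccountToken: {{.}}{{end}}`
	direct := `automountServiceAccountToken: {{.}}`

	// Go's `with` treats false as an empty value and skips the block,
	// so the gated form drops the field exactly when it is set to false.
	fmt.Printf("gated,  false: %q\n", render(gated, false))
	fmt.Printf("gated,  true:  %q\n", render(gated, true))
	fmt.Printf("direct, false: %q\n", render(direct, false))
}
```

Running this shows the gated template emits nothing for `false` while the direct form renders `automountServiceAccountToken: false`, which is the behavior the suggested change restores.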

Comment on charts/model-engine/templates/database_init_job.yaml, line 10

{{- include "modelEngine.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": pre-install,pre-upgrade
"helm.sh/hook-weight": "-1"

Hook weight collision with migration job

Both database_init_job.yaml and database_migration_job.yaml use "helm.sh/hook-weight": "-1" on pre-install,pre-upgrade. This means Helm does not guarantee execution order between them. Since init_database.py creates schemas (create schema if not exists) and tables (Base.metadata.create_all), and the migration job applies Alembic migrations that depend on those schemas existing, the migration could run first and fail.

Consider setting this job's hook weight to "-2" (or lower) so it is guaranteed to run before the migration job at weight "-1".

Suggested change
"helm.sh/hook-weight": "-1"
"helm.sh/hook-weight": "-2"
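With the suggested weight applied, the two jobs' hook annotations would be ordered as the merged summary describes (init at -2, migration at -1, both with the stale-job delete policy). A sketch with metadata abbreviated; exact surrounding fields are assumptions:

```yaml
# database_init_job.yaml — runs first (lower weight)
annotations:
  "helm.sh/hook": pre-install,pre-upgrade
  "helm.sh/hook-weight": "-2"
  "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded

# database_migration_job.yaml — runs after init
annotations:
  "helm.sh/hook": pre-install,pre-upgrade
  "helm.sh/hook-weight": "-1"
  "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
```

Helm sorts hooks of the same event by ascending weight, so the schema-creating init job is guaranteed to complete before Alembic migrations run.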


greptile-apps bot commented Feb 25, 2026

Additional Comments (2)

charts/model-engine/templates/populate_fine_tuning_repository_job.yaml
Missing automountServiceAccountToken setting

All other pod specs in this PR (gateway, builder, cacher, celery autoscaler, database migration, database init, spellbook init, and the proportional autoscalers) explicitly set automountServiceAccountToken: {{ .Values.automountServiceAccountToken }}. This job and restart_keda_operator.yaml are the only ones missing it.

When automountServiceAccountToken is false, the modelEngine.volumes helper will inject the projected token-volume, but without disabling the default automount on the pod spec, Kubernetes will also mount its own default token — resulting in a conflict at /var/run/secrets/kubernetes.io/serviceaccount.

      automountServiceAccountToken: {{ .Values.automountServiceAccountToken }}
      serviceAccountName: {{ include "modelEngine.fullname" . }}

charts/model-engine/templates/restart_keda_operator.yaml
Missing automountServiceAccountToken setting

Same inconsistency as populate_fine_tuning_repository_job.yaml — this job is missing automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} which every other pod spec in this PR has. This will cause issues when automountServiceAccountToken is set to false.

      automountServiceAccountToken: {{ .Values.automountServiceAccountToken }}
      serviceAccountName: {{ include "modelEngine.fullname" . }}

@arniechops arniechops merged commit a6070ec into main Mar 6, 2026
8 checks passed
@arniechops arniechops deleted the arnavchopra/extensibility-hooks branch March 6, 2026 23:25
