fix: remediate app-layer CVEs in agentex services #164
Open
scale-ballen wants to merge 11 commits into main
Migrate from Docker Hub base images to ECR-mirrored Chainguard golden images:
- agentex: python:3.12-slim → golden/chainguard/python:3.12-dev
- agentex-ui: node:20 → golden/chainguard/node:20-dev
Mirrors the pattern established in the FIPS Dockerfiles (PR #308).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace blanket COPY --from=base /usr/bin with targeted copies of only the console_scripts needed at runtime (uvicorn, ddtrace-run, python3, python3.12), preventing build tools (gcc, make) from leaking into the production image
- Switch docs-builder from uv sync --group docs to uv pip install --system --group docs for deterministic builds and consistency with the rest of the Dockerfile
- Use mkdocs build directly instead of uv run mkdocs build since packages are now installed to system Python
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The golden Chainguard base image requires ECR authentication, which is unavailable in integration test CI (the scale-agentex repo lacks the IAM role). Add a configurable BASE_IMAGE ARG defaulting to the golden image for production builds, with docker-compose overriding to python:3.12-alpine for local dev and CI. Also adds bash to system deps for docker-compose command compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the BASE_IMAGE workaround and instead properly authenticate with ECR in CI. Adds AWS credentials config (github-action-agentex role) and egp-prod ECR login to integration-tests.yml so docker compose can pull golden Chainguard base images. Requires a Terracode-Infra change to add scaleapi/scale-agentex:* to the github-action-agentex IAM role OIDC subjects.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from the overprivileged github-action-agentex role to the new github-action-scale-agentex-ecr-read role, which only grants ECR read access to golden/* repos. Addresses the Greptile review finding about excessive permissions for a public repository.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…atibility
The Dockerfile defaults SOURCE_DIR=public/agentex (for CI builds from repo root), but docker-compose builds from the scale-agentex repo root, where the path is agentex/. Override the arg so integration tests can find source files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Chainguard Python images set ENTRYPOINT ["python"], so docker-compose commands like `bash -c "..."` get interpreted as `python bash -c "..."`. Clear the entrypoint on the dev stage so shell commands work correctly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
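The entrypoint problem described in this commit can be illustrated with a small sketch. It models how an exec-form ENTRYPOINT is prepended to the command that compose supplies. The helper name `effective_argv` is hypothetical, and `shlex.split` only approximates how compose tokenizes a string command.

```python
import shlex


def effective_argv(entrypoint, command):
    """Model the argv a container runs: exec-form ENTRYPOINT followed by the command.

    Hypothetical helper for illustration; not part of Docker or compose.
    """
    # A string command is split into words; a list is used as given.
    cmd = shlex.split(command) if isinstance(command, str) else list(command)
    # An empty or missing entrypoint contributes nothing to the final argv.
    return list(entrypoint or []) + cmd


# With the image default, bash becomes an argument to python.
with_default = effective_argv(["python"], 'bash -c "echo hi"')
# With the entrypoint cleared, bash is the program that runs.
with_cleared = effective_argv([], 'bash -c "echo hi"')

print(with_default)
print(with_cleared)
```

In the first case the interpreter would try to open a script named `bash`, which is why the dev stage clears the entrypoint.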
- Remove busybox from base stage apk install: Chainguard deliberately excludes it to minimize attack surface; bash alone is sufficient
- Move id-token: write from workflow-level to run-integration-tests job only, following principle of least privilege (Greptile review)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `uv sync --no-dev` in base stage so dev-only packages (test runners, linters, debug tools) don't leak into production via the COPY --from=base of /usr/lib/python3.12. Dev stage still gets them via `uv sync --group dev`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
agentex backend:
- python-multipart 0.0.12 → 0.0.22 (CVE-2024-53981, CVE-2026-24486)
- temporalio 1.18.0 → 1.23.0 (CVE-2026-31812 quinn-proto, pending re-scan)
agentex-ui:
- next 15.5.9 → 15.5.10 (GHSA-h25m-26qc-wcjf)
- minimatch, tar, rollup, flatted transitive deps (13 CVEs via npm audit fix)
Remaining:
- starlette CVE-2025-62727: blocked on agentex-sdk widening fastapi constraint (scaleapi/scale-agentex-python#285)
- libvips x3 + python-3.12: auto-fix on next rebuild against latest Chainguard golden base
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
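One way to confirm that an environment actually carries the patched backend versions listed in this commit is a small check like the following. It is a minimal sketch, not part of the PR: the `PATCHED` table is copied from the commit message, the helper names are hypothetical, and the comparison assumes plain dotted numeric versions (no pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum patched versions, taken from the commit message above.
PATCHED = {"python-multipart": "0.0.22", "temporalio": "1.23.0"}


def as_tuple(v):
    """Turn '1.23.0' into (1, 23, 0). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in v.split("."))


def is_patched(installed, minimum):
    """Compare numerically, so '15.5.10' correctly sorts after '15.5.9'."""
    return as_tuple(installed) >= as_tuple(minimum)


def audit(minimums=PATCHED):
    """Return {package: status} for the current Python environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        if is_patched(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"vulnerable ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in audit().items():
        print(f"{name}: {status}")
```

Comparing integer tuples rather than strings matters here because several of the bumps cross a digit boundary (for example 0.0.12 to 0.0.22 and 15.5.9 to 15.5.10).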
Summary
Upgrades vulnerable dependencies across agentex backend and agentex-ui to remediate 17 HIGH-severity CVEs.
Changes
Blocked — requires follow-up
`agentex-sdk` pins `fastapi<0.116`, which caps starlette. Fix: scale-agentex-python#285
Once SDK PR merges
Bump `agentex-sdk` version in `pyproject.toml`, re-lock → starlette resolves to 0.52.1, closing CVE-2025-62727.
Runtime test evidence
Package versions installed
agentex backend — FastAPI runtime tests (15/15 PASS)
agentex backend — Temporal SDK tests (16/16 PASS)
agentex backend — Module imports (9/9 PASS)
agentex backend — Unit tests: 79 passed (identical to main)
108 errors are pre-existing (require Docker/Postgres testcontainers), verified same on `main`.
agentex-ui — npm audit: 0 vulnerabilities
agentex-ui — TypeScript typecheck: PASS
agentex-ui — Unit tests: 37/37 passed
agentex-ui — Production build: PASS
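The "identical to main" comparison reported for the backend unit tests can be sketched as a comparison of two pytest summary lines. This is an illustration only: the helper names are hypothetical, and the sample lines (including the timings) are made up to match the counts quoted above rather than copied from real CI output.

```python
import re


def parse_summary(line):
    """Parse a pytest-style summary such as '79 passed, 108 errors in 41.20s'."""
    counts = {}
    for number, label in re.findall(r"(\d+) (passed|failed|errors?|skipped)", line):
        # Normalize 'error' and 'errors' to one key.
        key = "error" if label.startswith("error") else label
        counts[key] = int(number)
    return counts


def same_outcome(branch_line, main_line):
    """True when both runs report the same counts, ignoring timing."""
    return parse_summary(branch_line) == parse_summary(main_line)


# Hypothetical summary lines for the PR branch and for main.
branch = "79 passed, 108 errors in 41.20s"
main = "79 passed, 108 errors in 39.87s"
print(parse_summary(branch))
print(same_outcome(branch, main))
```

Matching counts do not prove that the same tests failed on both branches; comparing the sets of failing test IDs would be a stricter check.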
Test plan
🤖 Generated with Claude Code
Greptile Summary
This PR remediates 17 HIGH-severity CVEs across the `agentex` Python backend and `agentex-ui` Next.js frontend, while simultaneously migrating both service Dockerfiles from public base images (`python:3.12-slim`, `node:20`) to hardened Chainguard images hosted in a private AWS ECR registry.
Key changes:
- `agentex/pyproject.toml` / `uv.lock`: `python-multipart` pinned to `0.0.22` (fixes CVE-2024-53981 / CVE-2026-24486); `temporalio` locked to `1.23.0` in `uv.lock` (fixes CVE-2026-31812). However, the lower bound in `pyproject.toml` for `temporalio` is still `>=1.18.0`, meaning fresh dependency resolution without the lock file could install a vulnerable version; the constraint should be advanced to `>=1.23.0` to be a proper safety net.
- `agentex/Dockerfile`: Migrated to Chainguard's alpine-based Python image; production stage copies only the necessary Python packages and console-script binaries from the build stage instead of running `uv sync` again, which avoids having `uv` or build tools in production.
- `agentex-ui/Dockerfile`: Migrated to Chainguard's Node 20 image; introduces a `SOURCE_DIR` build arg (defaulting to `public/agentex-ui` for CI) to support multi-repo layout. The existing `agentex-ui/.dockerignore` that excludes `node_modules` will not apply if the build context is the repository root, which could affect local developer builds.
- `agentex/docker-compose.yml`: Adds `SOURCE_DIR: agentex` build arg so local compose builds resolve correctly after the Dockerfile default changed.
- `.github/workflows/integration-tests.yml`: Adds OIDC-based AWS credential exchange and ECR login step to the `run-integration-tests` job so the CI runner can pull the now-private Chainguard base images.
- `agentex-ui/package.json` / `package-lock.json`: `next` bumped to 15.5.10; transitive deps `rollup`, `flatted`, `minimatch`, and `tar` patched via lock-file update.
Confidence Score: 4/5
`uv.lock` and `package-lock.json` correctly pin every patched package, 79 Python unit tests pass, npm audit reports 0 vulnerabilities, and the Chainguard base-image migration is well-structured. The only concrete issue is that the `temporalio` lower bound in `pyproject.toml` still permits vulnerable versions if the lock file is bypassed, which is a low-likelihood but real security gap in a CVE-remediation PR.
`temporalio` version constraint lower bound should be advanced from `>=1.18.0` to `>=1.23.0`.
Important Files Changed
python-multipartto>=0.0.22(CVE fix) and removes its upper bound;temporaliois only updated inuv.lock— the lower bound inpyproject.tomlstill permits vulnerable versions<1.23.0.temporalioto 1.23.0,python-multipartto 0.0.22, and transitively bumpsnexus-rpc(1.1.0→1.3.0) andtyping-inspection(0.4.1→0.4.2).python:3.12-slim(apt-based) to private Chainguard ECR image (apk-based); production stage manually copies Python packages and select console scripts from the build stage rather than runninguv syncagain — functional but requiresuv.lockto be copied for fully deterministic builds.node:20to Chainguard ECR image; introducesSOURCE_DIRbuild arg; existing.dockerignoreatagentex-ui/won't apply if the build context is the repo root, potentially including localnode_modules.run-integration-testsjob so the Chainguard base images (now hosted in private ECR) can be pulled during CI runs.Sequence Diagram
sequenceDiagram
    participant GHA as GitHub Actions CI
    participant ECR as AWS ECR<br/>(022465994601)
    participant GHCR as GitHub GHCR
    participant Docker as Docker Build
    GHA->>GHA: OIDC token exchange (role 307185671274)
    GHA->>ECR: ECR login
    ECR-->>GHA: credentials
    GHA->>ECR: Pull golden/chainguard/python:3.12-dev (base + production)
    ECR-->>GHA: image layers
    GHA->>ECR: Pull golden/chainguard/node:20-dev
    ECR-->>GHA: image layers
    GHA->>GHCR: Pull uv:0.6.9 (COPY --from)
    GHCR-->>GHA: uv binary
    GHA->>Docker: docker build agentex/Dockerfile
    Docker->>Docker: uv sync --no-dev (resolves from pyproject.toml)
    Docker->>Docker: COPY /usr/lib/python3.12 → production stage
    Docker-->>GHA: agentex image
    GHA->>Docker: docker build agentex-ui/Dockerfile
    Docker->>Docker: npm ci --omit=dev
    Docker->>Docker: npm run build
    Docker-->>GHA: agentex-ui image
Comments Outside Diff (1)
agentex/pyproject.toml, line 12
`temporalio` lower bound still permits vulnerable versions
The CVE-2026-31812 fix requires `temporalio>=1.23.0`, but `pyproject.toml` still declares `>=1.18.0`. The `uv.lock` pins the resolved version to 1.23.0, so normal locked builds are safe. However, any scenario that bypasses the lock file, such as `uv lock --upgrade`, generating a fresh environment from scratch, or the existing Dockerfile pattern that copies only `pyproject.toml` (not `uv.lock`) into the build stage and runs `uv sync`, could resolve to any version in the `[1.18.0, 1.23.0)` range and silently re-introduce the vulnerability.
For security CVE fixes, it's best practice to advance the lower bound to the patched version so the `pyproject.toml` itself acts as a safety net.
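The review point above can be turned into a simple guard that flags any declared lower bound sitting below the patched version. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the requirement strings are examples rather than the exact lines in `pyproject.toml`, and only `>=` and `==` specifiers with plain numeric versions are recognized.

```python
import re


def as_tuple(v):
    """Turn '1.23.0' into (1, 23, 0). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in v.split("."))


def lower_bound(requirement):
    """Extract the version after '>=' or '==' from a requirement string, or None."""
    match = re.search(r"(?:>=|==)\s*([0-9]+(?:\.[0-9]+)*)", requirement)
    return match.group(1) if match else None


def permits_vulnerable(requirement, patched):
    """True if the declared floor is missing or below the patched version."""
    floor = lower_bound(requirement)
    return floor is None or as_tuple(floor) < as_tuple(patched)


# Example requirement strings; the real file may carry extras or upper bounds.
print(permits_vulnerable("temporalio>=1.18.0", "1.23.0"))
print(permits_vulnerable("temporalio>=1.23.0", "1.23.0"))
```

A check like this could run in CI against the dependency list so that a fresh resolution cannot fall back to a vulnerable release even when the lock file is not used.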
Last reviewed commit: ba4db7d