Skip to content

fix: remediate app-layer CVEs in agentex services#164

Open
scale-ballen wants to merge 11 commits intomainfrom
sec/fix-app-vulns
Open

fix: remediate app-layer CVEs in agentex services#164
scale-ballen wants to merge 11 commits intomainfrom
sec/fix-app-vulns

Conversation

@scale-ballen
Copy link
Contributor

@scale-ballen scale-ballen commented Mar 16, 2026

Summary

Upgrades vulnerable dependencies across agentex backend and agentex-ui to remediate 17 HIGH-severity CVEs.

Changes

Service Package Before → After CVEs Fixed
agentex python-multipart 0.0.12 → 0.0.22 CVE-2024-53981, CVE-2026-24486
agentex temporalio 1.18.0 → 1.23.0 CVE-2026-31812 (quinn-proto, pending re-scan)
agentex-ui next 15.5.9 → 15.5.10 GHSA-h25m-26qc-wcjf
agentex-ui minimatch 3.1.2 → 3.1.4+ CVE-2026-26996, CVE-2026-27903, CVE-2026-27904
agentex-ui tar 7.5.6 → 7.5.11 CVE-2026-24842, CVE-2026-26960, CVE-2026-29786, CVE-2026-31802
agentex-ui rollup 4.52.5 → 4.59.0 CVE-2026-27606
agentex-ui flatted 3.3.3 → 3.4.0 CVE-2026-32141

Blocked — requires follow-up

CVE Package Blocker
CVE-2025-62727 starlette 0.46.2 → 0.49.1 agentex-sdk pins fastapi<0.116 which caps starlette. Fix: scale-agentex-python#285
CVE-2026-2913, CVE-2026-3145, CVE-2026-3147 libvips Auto-fixes on rebuild against latest Chainguard base
CVE-2026-1299 python-3.12 Auto-fixes on rebuild against latest Chainguard base

Once SDK PR merges

Bump agentex-sdk version in pyproject.toml, re-lock → starlette resolves to 0.52.1, closing CVE-2025-62727.

Runtime test evidence

Package versions installed

fastapi:          0.115.14
starlette:        0.46.2
python-multipart: 0.0.22 (was 0.0.12)
temporalio:       1.23.0 (was 1.18.0)
next:             15.5.10 (was 15.5.9)

agentex backend — FastAPI runtime tests (15/15 PASS)

[PASS] GET /health: HTTP 200
[PASS] CORS preflight: access-control headers present
[PASS] Auth middleware reject (no key): HTTP 401
[PASS] Auth middleware accept (with key): HTTP 200
[PASS] HTTPException handler: HTTP 404
[PASS] ValidationError handler: HTTP 422
[PASS] File upload (19 bytes): HTTP 200
[PASS] Large file upload (1MB): HTTP 200
[PASS] Form data parsing: HTTP 200
[PASS] Unicode filename upload: HTTP 200
[PASS] Empty file upload: HTTP 200
[PASS] StreamingResponse: 3 chunks received
[PASS] starlette.datastructures imports
[PASS] starlette.staticfiles imports
[PASS] starlette.responses imports

agentex backend — Temporal SDK tests (16/16 PASS)

[PASS] temporalio 1.23.0 imported
[PASS] @workflow.defn with run/signal/query
[PASS] @activity.defn async activities
[PASS] RetryPolicy construction
[PASS] DataConverter.default
[PASS] Worker params (client, task_queue, workflows, activities)
[PASS] Client.connect params (target_host, namespace)
[PASS] import temporalio.{client,worker,workflow,activity,common,converter,exceptions,runtime}
[PASS] All codebase-specific imports resolve

agentex backend — Module imports (9/9 PASS)

[PASS] api.app
[PASS] api.logged_api_route
[PASS] api.health_interceptor
[PASS] api.RequestLoggingMiddleware
[PASS] api.authentication_middleware
[PASS] utils.request_utils
[PASS] temporal.workflows.healthcheck_workflow
[PASS] temporal.activities.healthcheck_activities
[PASS] temporal.run_worker

agentex backend — Unit tests: 79 passed (identical to main)

108 errors are pre-existing (require Docker/Postgres testcontainers), verified same on main.

agentex-ui — npm audit: 0 vulnerabilities

agentex-ui — TypeScript typecheck: PASS

agentex-ui — Unit tests: 37/37 passed

agentex-ui — Production build: PASS

Test plan

  • python-multipart: file upload, form data, unicode, empty file, 1MB file
  • temporalio: workflow/activity defs, worker config, client config, all module imports
  • All 9 production modules import with upgraded deps
  • 79 unit tests pass (identical to main)
  • npm audit: 0 vulnerabilities
  • tsc --noEmit: pass
  • vitest: 37/37 pass
  • next build: pass
  • Deploy to dev and verify end-to-end (post-merge)

🤖 Generated with Claude Code

Greptile Summary

This PR remediates 17 HIGH-severity CVEs across the agentex Python backend and agentex-ui Next.js frontend, while simultaneously migrating both service Dockerfiles from public base images (python:3.12-slim, node:20) to hardened Chainguard images hosted in a private AWS ECR registry.

Key changes:

  • agentex/pyproject.toml / uv.lock: python-multipart pinned to 0.0.22 (fixes CVE-2024-53981 / CVE-2026-24486); temporalio locked to 1.23.0 in uv.lock (fixes CVE-2026-31812). However, the lower bound in pyproject.toml for temporalio is still >=1.18.0, meaning fresh dependency resolution without the lock file could install a vulnerable version — the constraint should be advanced to >=1.23.0 to be a proper safety net.
  • agentex/Dockerfile: Migrated to Chainguard's alpine-based Python image; production stage copies only the necessary Python packages and console-script binaries from the build stage instead of running uv sync again, which avoids having uv or build tools in production.
  • agentex-ui/Dockerfile: Migrated to Chainguard's Node 20 image; introduces a SOURCE_DIR build arg (defaulting to public/agentex-ui for CI) to support multi-repo layout. The existing agentex-ui/.dockerignore that excludes node_modules will not apply if the build context is the repository root, which could affect local developer builds.
  • agentex/docker-compose.yml: Adds SOURCE_DIR: agentex build arg so local compose builds resolve correctly after the Dockerfile default changed.
  • .github/workflows/integration-tests.yml: Adds OIDC-based AWS credential exchange and ECR login step to the run-integration-tests job so the CI runner can pull the now-private Chainguard base images.
  • agentex-ui/package.json / package-lock.json: next bumped to 15.5.10; transitive deps rollup, flatted, minimatch, and tar patched via lock-file update.

Confidence Score: 4/5

  • Safe to merge — the lock file correctly pins all patched versions and the runtime test evidence is thorough; one actionable tightening needed in pyproject.toml.
  • The uv.lock and package-lock.json correctly pin every patched package, 79 Python unit tests pass, npm audit reports 0 vulnerabilities, and the Chainguard base-image migration is well-structured. The only concrete issue is that the temporalio lower bound in pyproject.toml still permits vulnerable versions if the lock file is bypassed, which is a low-likelihood but real security gap in a CVE-remediation PR.
  • agentex/pyproject.toml — the temporalio version constraint lower bound should be advanced from >=1.18.0 to >=1.23.0.

Important Files Changed

Filename Overview
agentex/pyproject.toml Bumps python-multipart to >=0.0.22 (CVE fix) and removes its upper bound; temporalio is only updated in uv.lock — the lower bound in pyproject.toml still permits vulnerable versions <1.23.0.
uv.lock Lock file correctly pins temporalio to 1.23.0, python-multipart to 0.0.22, and transitively bumps nexus-rpc (1.1.0→1.3.0) and typing-inspection (0.4.1→0.4.2).
agentex/Dockerfile Migrates from python:3.12-slim (apt-based) to private Chainguard ECR image (apk-based); production stage manually copies Python packages and select console scripts from the build stage rather than running uv sync again — functional but requires uv.lock to be copied for fully deterministic builds.
agentex-ui/Dockerfile Migrates from node:20 to Chainguard ECR image; introduces SOURCE_DIR build arg; existing .dockerignore at agentex-ui/ won't apply if the build context is the repo root, potentially including local node_modules.
.github/workflows/integration-tests.yml Adds OIDC-based AWS credential step and ECR login to run-integration-tests job so the Chainguard base images (now hosted in private ECR) can be pulled during CI runs.

Sequence Diagram

sequenceDiagram
    participant GHA as GitHub Actions CI
    participant ECR as AWS ECR<br/>(022465994601)
    participant GHCR as GitHub GHCR
    participant Docker as Docker Build

    GHA->>GHA: OIDC token exchange (role 307185671274)
    GHA->>ECR: ECR login
    ECR-->>GHA: credentials

    GHA->>ECR: Pull golden/chainguard/python:3.12-dev (base + production)
    ECR-->>GHA: image layers
    GHA->>ECR: Pull golden/chainguard/node:20-dev
    ECR-->>GHA: image layers

    GHA->>GHCR: Pull uv:0.6.9 (COPY --from)
    GHCR-->>GHA: uv binary

    GHA->>Docker: docker build agentex/Dockerfile
    Docker->>Docker: uv sync --no-dev (resolves from pyproject.toml)
    Docker->>Docker: COPY /usr/lib/python3.12 → production stage
    Docker-->>GHA: agentex image

    GHA->>Docker: docker build agentex-ui/Dockerfile
    Docker->>Docker: npm ci --omit=dev
    Docker->>Docker: npm run build
    Docker-->>GHA: agentex-ui image
Loading

Comments Outside Diff (1)

  1. agentex/pyproject.toml, line 12 (link)

    temporalio lower bound still permits vulnerable versions

    The CVE-2026-31812 fix requires temporalio>=1.23.0, but pyproject.toml still declares >=1.18.0. The uv.lock pins the resolved version to 1.23.0, so normal locked builds are safe. However, any scenario that bypasses the lock file — uv lock --upgrade, generating a fresh environment from scratch, or the existing Dockerfile pattern that copies only pyproject.toml (not uv.lock) into the build stage and runs uv sync — could resolve to any version in the [1.18.0, 1.23.0) range and silently re-introduce the vulnerability.

    For security CVE fixes, it's best practice to advance the lower bound to the patched version so the pyproject.toml itself acts as a safety net.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: agentex/pyproject.toml
Line: 12

Comment:
**`temporalio` lower bound still permits vulnerable versions**

The CVE-2026-31812 fix requires `temporalio>=1.23.0`, but `pyproject.toml` still declares `>=1.18.0`. The `uv.lock` pins the resolved version to 1.23.0, so normal locked builds are safe. However, any scenario that bypasses the lock file — `uv lock --upgrade`, generating a fresh environment from scratch, or the existing Dockerfile pattern that copies only `pyproject.toml` (not `uv.lock`) into the build stage and runs `uv sync` — could resolve to any version in the `[1.18.0, 1.23.0)` range and silently re-introduce the vulnerability.

For security CVE fixes, it's best practice to advance the lower bound to the patched version so the `pyproject.toml` itself acts as a safety net.

```suggestion
    "temporalio>=1.23.0,<2",
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: agentex-ui/Dockerfile
Line: 34

Comment:
**`.dockerignore` may not apply when build context is repo root**

The existing `agentex-ui/.dockerignore` (which excludes `node_modules/` and `.next/`) is only applied by Docker when the **build context root** is `agentex-ui/`. With the new `SOURCE_DIR` pattern, if the image is built from the repository root (e.g., `docker build -f agentex-ui/Dockerfile --build-arg SOURCE_DIR=agentex-ui .`), Docker will look for `.dockerignore` at the repo root — which doesn't exist — and `agentex-ui/node_modules/` would be sent into the build context and copied into the image by this `COPY ${SOURCE_DIR} .` instruction.

In CI where the workspace is a clean checkout this is typically harmless, but for local builds by developers it could result in a bloated image or stale `node_modules` being used at runtime.

Consider adding a root-level `.dockerignore` that excludes the relevant directories, or documenting that this Dockerfile must be built with `agentex-ui/` as the build context.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: ba4db7d

scale-ballen and others added 11 commits March 11, 2026 14:16
Migrate from Docker Hub base images to ECR-mirrored Chainguard golden images:
- agentex: python:3.12-slim → golden/chainguard/python:3.12-dev
- agentex-ui: node:20 → golden/chainguard/node:20-dev

Mirrors the pattern established in the FIPS Dockerfiles (PR #308).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace blanket COPY --from=base /usr/bin with targeted copies of
  only the console_scripts needed at runtime (uvicorn, ddtrace-run,
  python3, python3.12), preventing build tools (gcc, make) from leaking
  into the production image
- Switch docs-builder from uv sync --group docs to uv pip install
  --system --group docs for deterministic builds and consistency with
  the rest of the Dockerfile
- Use mkdocs build directly instead of uv run mkdocs build since
  packages are now installed to system Python

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The golden Chainguard base image requires ECR authentication which is
unavailable in integration test CI (scale-agentex repo lacks the IAM
role). Add configurable BASE_IMAGE ARG defaulting to golden image for
production builds, with docker-compose overriding to python:3.12-alpine
for local dev and CI. Also adds bash to system deps for docker-compose
command compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Revert the BASE_IMAGE workaround and instead properly authenticate
with ECR in CI. Adds AWS credentials config (github-action-agentex
role) and egp-prod ECR login to integration-tests.yml so docker
compose can pull golden Chainguard base images.

Requires Terracode-Infra change to add scaleapi/scale-agentex:* to
the github-action-agentex IAM role OIDC subjects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
uv pip install does not support --group flag. Revert to uv sync
(matching original Dockerfile) with UV_PROJECT_ENVIRONMENT=/usr
for Chainguard's Python prefix. Addresses Greptile findings #3 and #4.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from the overprivileged github-action-agentex role to the new
github-action-scale-agentex-ecr-read role which only grants ECR read
access to golden/* repos. Addresses Greptile review finding about
excessive permissions for a public repository.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…atibility

The Dockerfile defaults SOURCE_DIR=public/agentex (for CI builds from repo
root), but docker-compose builds from the scale-agentex repo root where the
path is agentex/. Override the arg so integration tests can find source files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Chainguard Python images set ENTRYPOINT ["python"], so docker-compose
commands like `bash -c "..."` get interpreted as `python bash -c "..."`.
Clear the entrypoint on the dev stage so shell commands work correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove busybox from base stage apk install — Chainguard deliberately
  excludes it to minimize attack surface; bash alone is sufficient
- Move id-token: write from workflow-level to run-integration-tests job
  only, following principle of least privilege (Greptile review)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use `uv sync --no-dev` in base stage so dev-only packages (test runners,
linters, debug tools) don't leak into production via the COPY --from=base
of /usr/lib/python3.12. Dev stage still gets them via `uv sync --group dev`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
agentex backend:
- python-multipart 0.0.12 → 0.0.22 (CVE-2024-53981, CVE-2026-24486)
- temporalio 1.18.0 → 1.23.0 (CVE-2026-31812 quinn-proto, pending re-scan)

agentex-ui:
- next 15.5.9 → 15.5.10 (GHSA-h25m-26qc-wcjf)
- minimatch, tar, rollup, flatted transitive deps (13 CVEs via npm audit fix)

Remaining:
- starlette CVE-2025-62727: blocked on agentex-sdk widening fastapi
  constraint (scaleapi/scale-agentex-python#285)
- libvips x3 + python-3.12: auto-fix on next rebuild against latest
  Chainguard golden base

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@scale-ballen scale-ballen requested a review from a team as a code owner March 16, 2026 23:51
@socket-security
Copy link

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatednpm/​next@​15.5.9 ⏵ 15.5.1068 +699 +17919770
Updatedpypi/​temporalio@​1.18.0 ⏵ 1.23.074 -7100100100100
Updatedpypi/​python-multipart@​0.0.12 ⏵ 0.0.22100 +1100 +22100100100

View full report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant