fix: use numeric UID in agentex-ui Dockerfile to fix CreateContainerError #171
sayakmaity merged 1 commit into main
agentex-ui/Dockerfile
Outdated
  # Install dependencies with better reliability settings
  RUN npm config set registry https://registry.npmjs.org/ && \
-     npm ci --verbose
+     npm ci --omit=dev --verbose
  # Copy source code (node_modules and .next excluded by .dockerignore)
  COPY ${SOURCE_DIR} .
  COPY LICENSE /app/LICENSE
  # Build the application (creates fresh .next directory)
  ENV NODE_ENV=production
  RUN npm run build
**Dev dependencies omitted before build step: may break `npm run build`**
The previous Dockerfile explicitly commented "Install all dependencies (including dev) needed for build" and ran `npm ci --verbose` (without `--omit=dev`) *before* setting `NODE_ENV=production`. The build step then had access to TypeScript, ESLint plugins, CSS processors, and any other build-time devDependencies.
This PR moves `ENV NODE_ENV=production` to line 20 (before the install step) and adds `--omit=dev` to `npm ci`. Both changes together mean devDependencies are **never installed**, yet `RUN npm run build` (line 38) still expects them. If any devDependency is consumed during the Next.js build (e.g. `typescript`, `@types/*`, Sass/CSS-in-JS processors, custom ESLint configs), the build will silently fail or error out.
The correct pattern for a multi-stage-style single-stage build is:
1. Install **all** deps (omit `NODE_ENV=production` / `--omit=dev` at this step)
2. Run the build
3. Prune dev deps after the build
```suggestion
# Install all dependencies (including dev) needed for build
RUN npm config set registry https://registry.npmjs.org/ && \
    npm ci --verbose
# Copy source code (node_modules and .next excluded by .dockerignore)
COPY ${SOURCE_DIR} .
COPY LICENSE /app/LICENSE
# Build the application (creates fresh .next directory)
RUN npm run build
# Remove dev dependencies after build
RUN npm prune --omit=dev
```
If the project has already confirmed that no devDependencies are consumed during `next build` (e.g. Next.js SWC handles all transpilation and all type/lint checks are CI-only), this can be left as-is, but that should be validated against the test plan item "Image builds successfully."
The golden base image has user 'node' at UID 65532, while the public Chainguard image uses 'nonroot' at the same UID. Using `USER nonroot` causes "no users found" CreateContainerError on the golden image. Switch to `USER 65532` which works with both base images.
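The failure mode in this commit message can be sketched with a small shell model. This is a simplified illustration, not the container runtime's actual code: `resolve_uid` is a hypothetical helper, and the passwd contents are invented apart from the `node`/`nonroot` names and UID 65532 stated in the PR. It assumes that a named user needs a matching passwd entry, while a numeric UID can be used without one.

```shell
#!/bin/sh
# Simplified model of how a USER value is resolved; NOT the runtime's real code.
# resolve_uid PASSWD_FILE USER_SPEC   (hypothetical helper for illustration)
resolve_uid() {
    passwd_file=$1
    user_spec=$2
    case $user_spec in
        ''|*[!0-9]*)
            # Named user: requires a matching entry in the passwd file.
            uid=$(awk -F: -v name="$user_spec" '$1 == name { print $3; exit }' "$passwd_file")
            if [ -z "$uid" ]; then
                echo "no users found" >&2
                return 1
            fi
            echo "$uid"
            ;;
        *)
            # Numeric UID: usable as-is, no matching name needed.
            echo "$user_spec"
            ;;
    esac
}

# Illustrative passwd files; only the name/UID pairing comes from the PR.
golden=$(mktemp)
public=$(mktemp)
printf 'root:x:0:0:root:/root:/bin/sh\nnode:x:65532:65532::/home/node:/bin/sh\n' > "$golden"
printf 'root:x:0:0:root:/root:/bin/sh\nnonroot:x:65532:65532::/home/nonroot:/bin/sh\n' > "$public"

resolve_uid "$public" nonroot   # prints 65532
resolve_uid "$golden" nonroot || echo "USER nonroot fails on the golden image"
resolve_uid "$golden" 65532     # prints 65532
```

Under this model, `USER 65532` resolves identically against both passwd files, which is why the numeric form is portable across the two base images.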
Force-pushed from 74721e3 to f2a90ef
Summary
The public Chainguard base image change (#170) uses `USER nonroot`, but the golden base image has the user named `node` (not `nonroot`) at UID 65532. This causes `CreateContainerError: no users found` on the dev cluster.
Switches to `USER 65532` (numeric UID), which works with both base images.
This unblocks deployment of the SGPINF-1217 fix (#165).
Test plan
Greptile Summary
This PR fixes the `CreateContainerError: no users found` regression on the dev cluster by changing `USER nonroot` to `USER 65532` (numeric UID) in `agentex-ui/Dockerfile`. Using a numeric UID avoids a `/etc/passwd` lookup, which is the correct approach for minimal/distroless-style images like Chainguard where named users may not be registered.
Important discrepancy: The PR title and description say this restores the golden ECR base image (`022465994601.dkr.ecr.us-west-2.amazonaws.com/golden/chainguard/node:20-dev`), but the `FROM` line is not changed; the image remains `cgr.dev/chainguard/node:latest-dev`. The only actual code change is the `USER` directive on line 53. This should be clarified to avoid misleading git history.
Key points:
- The `USER 65532` numeric-UID fix directly addresses the `no users found` error and is technically sound.
- The base image (`cgr.dev/chainguard/node:latest-dev`) uses a floating `latest-dev` tag, so builds remain non-reproducible; this was already the case before and is not introduced by this PR.
- If restoring the golden ECR image is the intent, the `FROM` line still needs to be updated.
Confidence Score: 4/5
- The code change (`USER nonroot` → `USER 65532`) directly resolves the described `CreateContainerError: no users found`. Numeric UIDs are the idiomatic fix for Chainguard distroless images. The only concern is that the PR description claims a base-image revert that did not actually happen, which could cause confusion. Once that intent is confirmed/clarified, the risk is very low.
- The `FROM` image is still the public Chainguard image, not the golden ECR image described in the PR.
Important Files Changed
- `agentex-ui/Dockerfile`: changes `USER nonroot` to `USER 65532` (numeric UID) to fix `CreateContainerError: no users found`; the base image (public `cgr.dev/chainguard/node:latest-dev`) is unchanged despite the PR description claiming a revert to the golden ECR image.
Sequence Diagram
sequenceDiagram
    participant Docker as Docker Build
    participant Image as cgr.dev/chainguard/node:latest-dev
    participant App as /app (Node.js)
    Docker->>Image: FROM cgr.dev/chainguard/node:latest-dev
    Docker->>App: USER root → apk add libvips-dev, python3, etc.
    Docker->>App: npm ci (all deps incl. dev)
    Docker->>App: npm run build
    Docker->>App: npm prune --omit=dev
    Docker->>App: chown -R 65532:65532 /app
    Note over Docker,App: PR #171 change: USER nonroot → USER 65532
    Docker->>App: USER 65532 (numeric UID, no /etc/passwd lookup)
    App-->>Docker: EXPOSE 3000, CMD ["npm", "start"]
Last reviewed commit: "fix: use numeric UID..."