feat: Migrate to UBI 9 base images for enterprise compliance #1707

Open
gnPy wants to merge 8 commits into langflow-ai:main from gnPy:fix_ubi9-migration

@gnPy gnPy commented May 29, 2026

This PR migrates OpenRAG's Docker infrastructure from Debian/Alpine-based images to Red Hat Universal Base Images (UBI) 9 to meet enterprise compliance requirements and to benefit from Red Hat's official support and security updates. The migration affects all three core services: backend, frontend, and Langflow.

Summary by CodeRabbit

  • Chores
    • Migrated backend, frontend, and Langflow containers to Red Hat UBI9 base images.
    • Backend: standardized on Python 3.11 UBI runtime, pinned a runtime dependency, and created a non-root app user.
    • Frontend: moved to UBI9 Node.js 20 and standardized working directory and runtime user.
    • Langflow: rebuilt on UBI9 Python 3.11 and enforced non-root runtime.
    • Entrypoints: improved privilege-drop behavior for safer non-root startup.

@github-actions github-actions Bot added community docker enhancement 🔵 New feature or request and removed community labels May 29, 2026

coderabbitai Bot commented May 29, 2026

Walkthrough

Migrate the frontend, backend, and Langflow Dockerfiles to Red Hat UBI9 images, replace apt-get with dnf, pin the Python and Node.js base images, create explicit non-root users, and update the entrypoints to use runuser or guarded privilege-drop logic.

Changes

UBI9 Container Image Migration

| Layer / File(s) | Summary |
|---|---|
| Frontend: UBI9 Node.js 20 and standardized paths (Dockerfile.frontend) | Switch base to UBI9 Node.js 20, set WORKDIR /opt/app-root/src, change COPY --chown to 1001:0, and set USER 1001; build and start steps preserved. |
| Backend: base, builder, and runtime stage changes (Dockerfile.backend) | Base stage moves to UBI9 python-311 (pinned), uses dnf and pip install uv==${UV_VERSION}; builder installs git, gcc, gcc-c++, make via dnf; runtime stage creates appuser (UID/GID 1000), installs runtime deps via dnf, and copies /app from builder. |
| Backend: final USER and metadata (Dockerfile.backend) | Final stage sets USER appuser before EXPOSE 8000, retains ENTRYPOINT ["/entrypoint.sh"] and CMD ["python","src/main.py"], and removes gosu-related comments. |
| Backend entrypoint: runuser privilege-drop (scripts/backend-entrypoint.sh) | Header comments updated; root branch now uses runuser -u appuser --preserve-environment -- "$@" instead of gosu; chown and non-root exec paths unchanged. |
| Langflow: UBI9 Python 3.11, pip-installed langflow, and explicit user (Dockerfile.langflow) | Replace langflow base with UBI9 python-311 (pinned), install OS deps via dnf, pip install langflow==1.9.0 and uv, create /app/langflow-data, install entrypoint, ensure UID/GID 1000 user/group if missing, chown /app, and switch to USER 1000. |
| Langflow entrypoint: guard root-only setup (scripts/langflow-entrypoint.py) | Docstring updated; root-only permission fixes, UID lookup, and setgid/setuid privilege-drop now run only when os.getuid() == 0; otherwise the script execs the requested command directly. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

refactor, containers

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main change: migrating from Debian/Alpine to UBI 9 base images for enterprise compliance, which is reflected across all three Dockerfiles and entrypoint scripts. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/backend-entrypoint.sh (1)

15-23: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve PATH when switching to appuser.

exec runuser -u appuser -- "$@" (no --preserve-environment) can reset PATH to a system default, which would drop Dockerfile.backend’s ENV PATH="/app/.venv/bin:$PATH". With CMD ["python", "src/main.py"], this can run the system python instead of the uv-synced /app/.venv/bin/python. Use --preserve-environment (or invoke the venv python by absolute path).

Suggested fix
-    exec runuser -u appuser -- "$@"
+    exec runuser --preserve-environment -u appuser -- "$@"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/backend-entrypoint.sh` around lines 15 - 23, The entrypoint resets
PATH when switching to appuser: update the exec runuser invocation in
scripts/backend-entrypoint.sh (the line calling exec runuser -u appuser -- "$@")
to preserve the virtualenv PATH by either adding the --preserve-environment flag
to runuser (exec runuser -u appuser --preserve-environment -- "$@") or by
invoking the venv Python explicitly (replace "$@" with the absolute venv python
path, e.g., /app/.venv/bin/python src/main.py) so the Dockerfile.backend ENV
PATH="/app/.venv/bin:$PATH" is honored.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile.backend`:
- Around line 61-68: The Dockerfile currently leaves the final stage as USER 0
and relies on runtime chown, so change the image to create and switch to a
non-root user during the build (e.g., add a fixed UID/GID user/group and replace
USER 0 with that user), ensure all application files and directories the
container needs are chowned/chmodded at build time to that UID/GID (remove or
guard any runtime chown logic in the entrypoint), and update the
entrypoint/startup code to avoid root-only operations (eliminate runtime
chown/drop-privilege steps and rely on build-time ownership or orchestrator
volume-mount options). Specifically, remove the final USER 0, add user/group
creation and chown steps in the Dockerfile build stage, and remove or
conditionalize any runtime chown calls referenced in the entrypoint so the
container starts as the non-root user.
- Around line 13-18: The Dockerfile currently pipes
https://astral.sh/uv/install.sh directly into sh (the RUN that installs uv),
which allows mutable remote content to be executed; instead pin the release and
verify its checksum before running: add explicit UV version variables (e.g.,
UV_VERSION/UV_SHA256) and replace the piped curl with a download-to-file (curl
-fSL -o /tmp/install.sh https://astral.sh/uv/install.sh or better, download the
specific release archive/binary), compute and compare its SHA256 against the
pinned UV_SHA256, fail the build if the checksum mismatches, and only then run
the verified installer (or install the pinned release artifact directly).
Reference the RUN that calls curl -LsSf https://astral.sh/uv/install.sh | sh and
the uv installer script when making the change.
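
The download-then-verify step described for the uv installer could be sketched in Python, since the image already ships a Python runtime. This is a minimal sketch under stated assumptions, not the PR's implementation: `verify_sha256` and its parameters are hypothetical names, and the expected digest is assumed to come from a pinned build argument such as the `UV_SHA256` mentioned above.

```python
import hashlib
import hmac


def verify_sha256(path, expected_hex, chunk_size=65536):
    """Return True if the file at `path` hashes to `expected_hex` (SHA-256)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())
```

A build step would call this on the downloaded file and exit non-zero when it returns False, so a changed installer fails the build instead of being executed.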

In `@Dockerfile.langflow`:
- Around line 35-37: The RUN line currently swallows failures by using "|| true"
for groupadd and useradd; change it to explicitly ensure the group and user
exist or fail the build: replace the chained "groupadd ... || true && useradd
... || true && chown -R 1000:1000 /app" with a sequence that checks and creates
the group/user (e.g., use getent group 1000 and getent passwd 1000 or checks by
name), call groupadd and useradd without "|| true" so errors surface, and then
run chown -R 1000:1000 /app; ensure the commands reference the same identifiers
(groupadd, useradd, chown -R 1000:1000 /app) so the build fails fast on real
UID/GID collisions instead of masking them.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5a70624a-3986-4b17-a19d-631e748e8860

📥 Commits

Reviewing files that changed from the base of the PR and between 1b25fd7 and a5e6a22.

📒 Files selected for processing (4)
  • Dockerfile.backend
  • Dockerfile.frontend
  • Dockerfile.langflow
  • scripts/backend-entrypoint.sh

Comment thread Dockerfile.backend Outdated
Comment thread Dockerfile.backend
Comment thread Dockerfile.langflow
Comment thread Dockerfile.langflow Outdated
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
Dockerfile.backend (2)

9-9: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Pin registry.access.redhat.com/ubi9/python-311 instead of using the floating :latest tag

Both the base and runtime stages use registry.access.redhat.com/ubi9/python-311:latest, which makes rebuilds non-deterministic. Red Hat errata/certification updates provide updated immutable digests for ubi9/python-311 (e.g., RHBA-2026:7284 references sha256:27c37bbd798f65415a8e1493e3e664af94cf5dee111d71833f80254ead2633b9). Pin that digest (or at least a fixed tag) in both stages to meet reproducibility/supply-chain goals.
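
A check for floating base images could be sketched as a small Python helper, for example in CI. This is an illustrative sketch only: `floating_bases` and `FROM_RE` are hypothetical names, the regular expression is not a full Dockerfile parser, and references built from `ARG` substitutions are not resolved.

```python
import re

# Illustrative pattern for "FROM [--flag=...] <image> [AS <stage>]".
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*(?P<ref>\S+)(?:\s+AS\s+(?P<stage>\S+))?",
    re.IGNORECASE,
)


def floating_bases(dockerfile_text):
    """Return base image references that use no tag or the ':latest' tag."""
    stages = set()
    floating = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        ref = match.group("ref")
        # References to an earlier build stage (or scratch) are not registry images.
        is_local = ref.lower() in stages or ref.lower() == "scratch"
        if match.group("stage"):
            stages.add(match.group("stage").lower())
        if is_local or "@sha256:" in ref:
            continue  # stage reference or digest-pinned image
        # Take the tag from the last path component so registry ports are ignored.
        tag = ref.rsplit("/", 1)[-1].partition(":")[2]
        if tag in ("", "latest"):
            floating.append(ref)
    return floating
```

Note that this sketch accepts any fixed tag as pinned, although tags remain mutable; only a digest reference is immutable.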

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.backend` at line 9, The Dockerfile uses floating images for the
base and runtime stages (the two FROM lines referencing
registry.access.redhat.com/ubi9/python-311:latest); replace the :latest tag with
a pinned immutable reference (either a fixed tag or the specific sha256 digest)
in both FROM statements so builds are deterministic and reproducible, ensuring
both the base stage and the runtime stage reference the same pinned identifier.

99-106: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix OpenShift/arbitrary-UID write permissions for the actual backend mount roots (/app/backend-data), not just /app/*.

Dockerfile.backend creates appuser (1000:1000) and chown -R appuser:appuser /app, but it only chmod +x /entrypoint.sh (no group-write widening). More importantly, the Kubernetes operator mounts backend storage at /app/backend-data and sets OPENRAG_* paths to /app/backend-data/{keys,config,flow-backups,...}—yet scripts/backend-entrypoint.sh root-mode chown only fixes /app/{keys,flows,config,data,openrag-documents} and does not chown /app/backend-data/*. So arbitrary-UID/OpenShift (or any PV default perms) can still fail writing under /app/backend-data.

Action: ensure permissions are applied to the operator/Helm write locations:

  • update backend-entrypoint.sh to also chown/chmod /app/backend-data/* when starting as root, and/or
  • align image/dir group ownership (e.g., group 0 + group-writable modes) for the mount roots actually used (/app/backend-data for operator; the /data and /shared mount paths for Helm), especially when SCC overrides runAsUser/fsGroup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.backend` around lines 99 - 106, The container currently only
chowns/chmods /app/*, leaving the operator/Helm mount root /app/backend-data
unwritable for arbitrary UIDs; update the startup and image ownership so
operator mounts work: modify scripts/backend-entrypoint.sh (the root-mode
startup path) to also recursively chown and set group-write permissions on
/app/backend-data and its expected subpaths referenced by OPENRAG_* (e.g.,
/app/backend-data/keys, /app/backend-data/config,
/app/backend-data/flow-backups), and/or adjust Dockerfile.backend to set group
ownership (e.g., group 0) and group-writable mode for the mount roots
(/app/backend-data, /data, /shared) so SCC/runAsUser/fsGroup scenarios can write
without requiring chown after mount.
Dockerfile.langflow (2)

18-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t uninstall litellm after installing langflow==1.9.0

langflow==1.9.0 declares litellm as a required dependency (litellm>=1.60.2,<2.0.0), so uninstalling it will remove a core dependency: the LiteLLM proxy/routing integration will stop working and Langflow code paths that import/use litellm can fail at runtime (e.g., ModuleNotFoundError). If the intent is to control versions, pin/replace litellm instead of uninstalling it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.langflow` around lines 18 - 21, The Dockerfile currently installs
langflow==1.9.0 then unconditionally uninstalls litellm; remove the pip
uninstall -y litellm step so langflow's required dependency
(litellm>=1.60.2,<2.0.0) remains installed, or if you need a specific litellm
version, replace the uninstall with an explicit pip install/pin of the desired
litellm version after installing langflow (e.g., install langflow==1.9.0 and
then pip install litellm==<desired_version>), ensuring the Dockerfile's RUN
block references langflow==1.9.0 and litellm appropriately rather than
uninstalling litellm.

3-3: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin the base image instead of :latest.

A floating :latest base makes builds non-reproducible and complicates supply-chain/compliance assurance; pin registry.access.redhat.com/ubi9/python-311 to a specific tag or—preferably—a digest (Red Hat errata reference digests; use skopeo list-tags docker://registry.access.redhat.com/ubi9/python-311 if you need the available tags).

📌 Suggested change
-FROM registry.access.redhat.com/ubi9/python-311:latest
+# Pin to a specific UBI9 python-311 tag/digest for reproducible builds
+FROM registry.access.redhat.com/ubi9/python-311:<pinned-tag-or-digest>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.langflow` at line 3, The Dockerfile currently uses a floating
image tag in the FROM instruction ("FROM
registry.access.redhat.com/ubi9/python-311:latest"); replace that with a pinned
tag or, preferably, an immutable digest (e.g., "FROM
registry.access.redhat.com/ubi9/python-311@sha256:...") to make builds
reproducible and supply-chain auditable; obtain the exact tag or digest using
your registry tooling (e.g., skopeo list-tags
docker://registry.access.redhat.com/ubi9/python-311 or the Red Hat errata/digest
list), then update the FROM line in the Dockerfile to reference that specific
tag or the sha256 digest.
🧹 Nitpick comments (2)
Dockerfile.backend (1)

81-82: ⚡ Quick win

Redundant re-copy of securityconfig/cloud_securityconfig.

These directories are already copied into /app in the builder stage (Lines 48-49) and arrive via COPY --from=builder /app /app on Line 80. Re-copying them from the build context here adds extra layers and creates two sources of truth for security config, risking drift if only one path is updated later.

♻️ Proposed cleanup
 COPY --from=builder /app /app
-COPY securityconfig/ ./securityconfig/
-COPY cloud_securityconfig/ ./cloud_securityconfig/
 COPY scripts/backend-entrypoint.sh /entrypoint.sh
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.backend` around lines 81 - 82, Remove the redundant COPY
instructions that re-copy securityconfig and cloud_securityconfig into the final
image: delete the two COPY lines referencing "COPY securityconfig/
./securityconfig/" and "COPY cloud_securityconfig/ ./cloud_securityconfig/"
because those directories are already provided via "COPY --from=builder /app
/app" from the builder stage; leaving only the builder-to-final COPY ensures a
single source of truth and avoids extra image layers and potential drift.
Dockerfile.langflow (1)

35-46: OpenShift arbitrary-UID compatibility: own /app by GID 0 and make it group-writable.

For a Red Hat/UBI enterprise target, containers typically run on OpenShift with an arbitrary (random, non-root) UID that is not 1000. With chown -R 1000:1000 /app and USER langflow, that random UID won't own /app and will be denied writes to langflow-data, flows, etc. The supported pattern is to keep group ownership as root group (GID 0) and grant group-equals-user permissions; the random UID always belongs to GID 0.

Also prefer the numeric USER 1000 over USER langflow: it removes the build-time dependency on the langflow name (which the getent-guarded block won't create if uid 1000 already exists under another name) and is the convention for UBI images.

♻️ Suggested adjustment
     && if ! getent passwd 1000 >/dev/null; then \
         useradd --uid 1000 --gid 1000 --no-create-home langflow; \
     fi \
-    && chown -R 1000:1000 /app
+    && chown -R 1000:0 /app \
+    && chmod -R g=u /app
@@
-USER langflow
+USER 1000
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.langflow` around lines 35 - 46, Change the /app ownership and
runtime user to be OpenShift-friendly: instead of chown -R 1000:1000 /app and
USER langflow, make /app owned by root group and group-writable (use chown -R
root:0 /app and ensure group write/execute with chmod g+rwX /app) so an
arbitrary UID in GID 0 can write; keep the build-time guarded groupadd/useradd
(groupadd, useradd) but switch the runtime USER to the numeric UID (USER 1000)
to avoid relying on the langflow name and to follow UBI/OpenShift conventions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/langflow-entrypoint.py`:
- Around line 39-41: The privilege drop is incomplete because supplementary
groups are not cleared; call os.setgroups([]) while still root before changing
identity, then perform os.setgid(1000) and os.setuid(1000) (i.e., add a call to
os.setgroups([]) prior to os.setgid/os.setuid in langflow-entrypoint.py to clear
supplementary groups while you still have privilege).
- Around line 22-27: The current block that sets permissions on data_dir uses
chmod(0o777) and swallows OSError; replace this with a recursive chown of
data_dir and its contents to the langflow runtime user (lookup uid/gid for
username "langflow" or use uid 1000 if that is the intended user) and remove the
silent except: instead catch exceptions around the chown operation and surface
them (log the error and exit non‑zero or re‑raise) so failures are visible;
update the code that references the pathlib.Path variable data_dir to perform
the recursive ownership change (e.g., iterate with rglob or os.walk) and ensure
errors are handled and reported rather than ignored.
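
The ordering described in the first item above (clear supplementary groups, then change group, then change user) can be sketched as follows. This is a minimal sketch rather than the contents of langflow-entrypoint.py: `drop_privileges` is a hypothetical name, and the `osmod` parameter is an assumption added only so the call order can be exercised without root.

```python
import os


def drop_privileges(uid=1000, gid=1000, osmod=os):
    """Drop from root to uid/gid, clearing supplementary groups first."""
    if osmod.getuid() != 0:
        return False  # already non-root: nothing to drop
    osmod.setgroups([])  # must happen while the process is still root
    osmod.setgid(gid)    # group before user, or setgid would be denied
    osmod.setuid(uid)    # last: after this the process cannot regain root
    return True
```

Returning a boolean lets the caller log which path was taken before it execs the requested command.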

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3c575aa6-3461-43be-a45b-669771142f86

📥 Commits

Reviewing files that changed from the base of the PR and between a5e6a22 and 6fc8420.

📒 Files selected for processing (4)
  • Dockerfile.backend
  • Dockerfile.langflow
  • scripts/backend-entrypoint.sh
  • scripts/langflow-entrypoint.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/backend-entrypoint.sh

Comment thread scripts/langflow-entrypoint.py Outdated
Comment thread scripts/langflow-entrypoint.py
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
Dockerfile.backend (1)

100-107: ⚡ Quick win

Backend OpenShift note: Helm writes go to /data and /shared (PVCs), so the /app/* ownership mismatch does not break startup; aligning it is still worthwhile for robustness

The OpenShift arbitrary-UID write failure described for /app/{keys,config,data,flows/backup,openrag-documents} does not apply to the Helm chart: kubernetes/helm/openrag/templates/backend/backend-dotenv.yaml points the backend to /data/{keys,config,backup} and /shared/{flows,openrag-documents}, and kubernetes/helm/openrag/templates/backend/deployment.yaml mounts PVCs at /data and /shared. So the ownership that Dockerfile.backend bakes into /app does not govern those startup writes in the default deployment.

The backend image and entrypoint still hardcode appuser:appuser ownership under /app, and scripts/backend-entrypoint.sh only fixes /app/* when started as root. For consistency, and for non-PVC or alternate runtimes, matching Dockerfile.langflow’s group-0 + g+rwX pattern would improve robustness under OpenShift arbitrary UIDs.

🔧 Optional alignment with Dockerfile.langflow
 RUN mkdir -p keys data config flows/backup openrag-documents \
-    && chown -R appuser:appuser /app \
+    && chown -R appuser:0 /app \
+    && chmod -R g+rwX /app \
     && chmod +x /entrypoint.sh

Note: if doing this, align scripts/backend-entrypoint.sh’s legacy-root /app/* chown target to the same group-0 ownership.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.backend` around lines 100 - 107, The Dockerfile.backend currently
chowns /app to appuser:appuser and switches to USER appuser which can fail under
OpenShift arbitrary-UID mounts; change the Dockerfile.backend to adopt
Dockerfile.langflow's approach: set group ownership to root (GID 0) and grant
group read/write/execute bits (use group 0 + g+rwX semantics) for directories
created (keys data config flows/backup openrag-documents and /app) instead of
appuser:appuser, and ensure the final USER remains non-root; also update
scripts/backend-entrypoint.sh to mirror this by changing its legacy chown target
from appuser:appuser to group 0 ownership (and preserve g+rwX perms) so the
image is robust both for PVC mounts (/data, /shared) and non-PVC runtimes.
scripts/langflow-entrypoint.py (1)

28-33: Recursive chown on every root start scales with volume size.

In the legacy root path this walks and chowns the entire langflow-data tree on each container start, which adds startup latency for large data volumes (flows, vector stores, DB files) even when ownership is already correct. As the docstring recommends, prefer host UID mapping (Podman :U, Docker --user, or Kubernetes fsGroup/runAsUser) so this fixup path is avoided entirely in production; consider gating or skipping the full walk when the top-level owner already matches 1000:1000.
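
The gating suggested here could look roughly like the sketch below. It is not the PR's code: `should_chown` is a hypothetical helper, `FORCE_CHOWN` is the illustrative override name from this review, and treating the value "1" as "on" is an assumption.

```python
import os


def should_chown(data_dir, uid=1000, gid=1000, environ=os.environ):
    """Decide whether the recursive ownership fixup needs to run at all."""
    if environ.get("FORCE_CHOWN") == "1":
        return True  # explicit opt-in always wins
    st = os.stat(data_dir)
    # Only the top-level owner is checked; nested mismatches are not detected.
    return (st.st_uid, st.st_gid) != (uid, gid)
```

The trade-off is that a volume whose root is correct but whose contents are not will be skipped, which is why an explicit override is kept.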

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/langflow-entrypoint.py` around lines 28 - 33, The recursive chown
loop (os.walk over data_dir with os.chown on root, dirs, files) should be
guarded to avoid walking large volumes on every start; first stat the top-level
data_dir and skip the entire walk if its uid/gid are already 1000/1000, or add
an opt-in flag/ENV (e.g., SKIP_CHOWN or FORCE_CHOWN) to control behavior; update
the code around os.walk(data_dir) to check os.stat(data_dir).st_uid/st_gid and
only perform the recursive os.chown when ownership differs (or when FORCE_CHOWN
is set).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Dockerfile.langflow`:
- Around line 20-22: The RUN pip install line pins langflow but leaves uv
floating; update that installation command to pin uv to a fixed version (for
example change the uv entry to uv==0.11.16) so the Dockerfile.langflow
reproducibly installs the same uv release alongside langflow==1.9.0.
- Around line 37-44: The Dockerfile currently leaves /app owned by root:0 and
only grants group write, so the runtime USER 1000 cannot write to
/app/langflow-data; modify the Dockerfile after the useradd/groupadd block (and
before any USER 1000 switch) to ensure the data directory is created and
owned/writable by uid/gid 1000: e.g. create /app/langflow-data if missing, run
chown -R 1000:1000 /app/langflow-data and chmod -R u+rwX /app/langflow-data (or
chown -R 1000:1000 /app && chmod -R g+rwX /app when intending to transfer
ownership), referencing the existing groupadd/useradd lines and the chown/chmod
commands so the container process (USER 1000) can write to /app/langflow-data at
runtime.

In `@scripts/langflow-entrypoint.py`:
- Around line 24-36: Remove the unused "import shutil", stop separately
iterating dirs, and make all os.chown calls non-following so symlinks inside
data_dir cannot redirect ownership outside the tree: when walking data_dir with
os.walk, call os.chown(root, 1000, 1000, follow_symlinks=False) for each root
and for each file path (os.path.join(root, f)) call os.chown(...,
follow_symlinks=False); drop the separate "for d in dirs" chown loop and keep
the existing OSError handling.
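
A non-following ownership walk along these lines might be sketched as below. This is a sketch under assumptions, not the actual entrypoint: `chown_tree` is a hypothetical name, error handling is omitted for brevity, and `follow_symlinks=False` requires a platform where `os.chown` supports it (Linux does).

```python
import os


def chown_tree(data_dir, uid=1000, gid=1000):
    """Chown every directory and file under data_dir without following symlinks."""
    # os.walk does not descend into symlinked directories by default.
    for root, _dirs, files in os.walk(data_dir):
        os.chown(root, uid, gid, follow_symlinks=False)
        for name in files:
            # Changes the link itself, never the file it points to.
            os.chown(os.path.join(root, name), uid, gid, follow_symlinks=False)
```

Each real directory is visited once as `root`, so no separate loop over the directory names is needed. Symlinks that point at directories are listed among the directory names and are left untouched by this sketch.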

---

Nitpick comments:
In `@Dockerfile.backend`:
- Around line 100-107: The Dockerfile.backend currently chowns /app to
appuser:appuser and switches to USER appuser which can fail under OpenShift
arbitrary-UID mounts; change the Dockerfile.backend to adopt
Dockerfile.langflow's approach: set group ownership to root (GID 0) and grant
group read/write/execute bits (use group 0 + g+rwX semantics) for directories
created (keys data config flows/backup openrag-documents and /app) instead of
appuser:appuser, and ensure the final USER remains non-root; also update
scripts/backend-entrypoint.sh to mirror this by changing its legacy chown target
from appuser:appuser to group 0 ownership (and preserve g+rwX perms) so the
image is robust both for PVC mounts (/data, /shared) and non-PVC runtimes.

In `@scripts/langflow-entrypoint.py`:
- Around line 28-33: The recursive chown loop (os.walk over data_dir with
os.chown on root, dirs, files) should be guarded to avoid walking large volumes
on every start; first stat the top-level data_dir and skip the entire walk if
its uid/gid are already 1000/1000, or add an opt-in flag/ENV (e.g., SKIP_CHOWN
or FORCE_CHOWN) to control behavior; update the code around os.walk(data_dir) to
check os.stat(data_dir).st_uid/st_gid and only perform the recursive os.chown
when ownership differs (or when FORCE_CHOWN is set).
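
A minimal sketch of the guard this nitpick describes, assuming the target uid/gid of 1000 used by the entrypoint. The helper names are hypothetical, and FORCE_CHOWN is only the variable name suggested above, not an existing setting. The uid and gid are parameters so the check can be exercised without root.

```python
import os


def needs_chown(data_dir: str, uid: int = 1000, gid: int = 1000, force: bool = False) -> bool:
    """Return True when the recursive chown should run.

    Only the top-level directory is inspected, so a volume that is already
    owned by uid:gid is skipped without walking its contents.
    """
    if force:
        return True
    st = os.stat(data_dir)
    return st.st_uid != uid or st.st_gid != gid


def chown_tree_if_needed(data_dir: str, uid: int = 1000, gid: int = 1000) -> bool:
    """Run the recursive chown only when needed; return True if the walk ran."""
    # FORCE_CHOWN is a hypothetical opt-in variable, not an existing setting.
    force = os.environ.get("FORCE_CHOWN", "").lower() in ("1", "true", "yes")
    if not needs_chown(data_dir, uid, gid, force):
        return False
    for root, _dirs, files in os.walk(data_dir):
        os.chown(root, uid, gid, follow_symlinks=False)
        for name in files:
            os.chown(os.path.join(root, name), uid, gid, follow_symlinks=False)
    return True
```

The trade-off is that a top-level check can miss files deeper in the tree that have a different owner; the opt-in flag forces a full walk in that case.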

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cbc880ab-4e5c-4f06-bae9-4efad3b419b3

📥 Commits

Reviewing files that changed from the base of the PR and between 6fc8420 and 0ec1bb3.

📒 Files selected for processing (3)
  • Dockerfile.backend
  • Dockerfile.langflow
  • scripts/langflow-entrypoint.py

Comment thread Dockerfile.langflow
Comment on lines +20 to +22
RUN pip install --no-cache-dir \
    langflow==1.9.0 \
    uv
Contributor

@coderabbitai coderabbitai Bot May 29, 2026

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What is the latest stable version of the Python uv package on PyPI?

💡 Result:

As of May 29, 2026, the latest stable version of the Python uv package on PyPI is 0.11.16 [1]. This version was published on May 21, 2026 [1].

Citations:


🏁 Script executed:

#!/bin/bash
sed -n '1,80p' Dockerfile.langflow | nl -ba | sed -n '1,120p'

Repository: langflow-ai/openrag

Length of output: 104


🏁 Script executed:

#!/bin/bash
cat -n Dockerfile.langflow | sed -n '1,120p'

Repository: langflow-ai/openrag

Length of output: 2122


Pin uv to a fixed version in Dockerfile.langflow
The base image digest and langflow==1.9.0 are pinned, but uv floats, weakening reproducibility. Pin uv (e.g., uv==0.11.16, latest stable on PyPI).

♻️ Proposed change
 RUN pip install --no-cache-dir \
     langflow==1.9.0 \
-    uv
+    uv==0.11.16
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-RUN pip install --no-cache-dir \
-    langflow==1.9.0 \
-    uv
+RUN pip install --no-cache-dir \
+    langflow==1.9.0 \
+    uv==0.11.16
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.langflow` around lines 20 - 22, The RUN pip install line pins
langflow but leaves uv floating; update that installation command to pin uv to a
fixed version (for example change the uv entry to uv==0.11.16) so the
Dockerfile.langflow reproducibly installs the same uv release alongside
langflow==1.9.0.
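
As an illustration of the reproducibility concern, here is a small hypothetical checker (not part of this PR) that flags package arguments of a pip install command that lack an exact == pin. It is a sketch under simple assumptions: it treats every argument that does not start with "-" as a package, so options that take a separate value (such as -r requirements.txt) are not handled.

```python
import re

# Matches a requirement pinned to an exact version, e.g. "langflow==1.9.0".
# An optional extras suffix such as "pkg[extra]==1.0" is also accepted.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(pip_args: list[str]) -> list[str]:
    """Return the package arguments that lack an exact '==' pin."""
    # Assumption: anything that is not a flag is a package specifier.
    packages = [arg for arg in pip_args if not arg.startswith("-")]
    return [pkg for pkg in packages if not PINNED.match(pkg)]
```

Applied to the arguments from the command under review, the checker reports uv as the only floating dependency, and nothing once uv carries an exact pin.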

Collaborator

@gnPy This is important for reproducibility

Contributor

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread Dockerfile.langflow
Comment on lines +37 to +44
RUN if ! getent group 1000 >/dev/null; then \
        groupadd --gid 1000 langflow; \
    fi \
    && if ! getent passwd 1000 >/dev/null; then \
        useradd --uid 1000 --gid 1000 --no-create-home langflow; \
    fi \
    && chown -R root:0 /app \
    && chmod -R g+rwX /app
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for how langflow-data is wired as the data/config dir, and any volume mounts.
rg -nP -C3 'langflow-data|LANGFLOW_(CONFIG|SAVE|DATA|HOME|CACHE)_DIR' \
  --glob '!**/node_modules/**'
fd -t f -i 'docker-compose' -x rg -nP -C3 'langflow' {}

Repository: langflow-ai/openrag

Length of output: 26404


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect Dockerfile.langflow around the referenced lines
echo "===== Dockerfile.langflow (around lines 1-120) ====="
nl -ba Dockerfile.langflow | sed -n '1,140p'

# 2) Inspect langflow entrypoint for the actual chown/write target
echo "===== scripts/langflow-entrypoint.py (first ~140 lines) ====="
nl -ba scripts/langflow-entrypoint.py | sed -n '1,200p'

# 3) Inspect docker-compose for mounts/env for langflow-data
echo "===== docker-compose.yml (langflow data-related bits) ====="
rg -n "LANGFLOW_(CONFIG_DIR|DATABASE_URL)|/app/langflow-data|LANGFLOW_DATA_PATH" docker-compose.yml
rg -n "volumes:|/app/langflow-data" docker-compose.yml -n -C2

# 4) Inspect Helm deployment template for langflow-data mount + securityContext
echo "===== kubernetes/helm/openrag/templates/langflow/deployment.yaml (around volume mounts) ====="
nl -ba kubernetes/helm/openrag/templates/langflow/deployment.yaml | sed -n '1,220p'
rg -n "securityContext|runAsUser|fsGroup|initContainers|langflow-data|mountPath" kubernetes/helm/openrag/templates/langflow/deployment.yaml

# 5) Inspect controller code that mounts langflow-data (might mount to /app/data)
echo "===== kubernetes/operator/internal/controller/openrag_controller.go (langflow-data mount) ====="
rg -n "langflow-data|mountPath: \"/app/" kubernetes/operator/internal/controller/openrag_controller.go -n
nl -ba kubernetes/operator/internal/controller/openrag_controller.go | sed -n '760,880p'

# 6) Inspect controller env vars relevant to langflow config/data dir
echo "===== kubernetes/operator/internal/controller/env.go (langflow env) ====="
nl -ba kubernetes/operator/internal/controller/env.go | sed -n '1,140p'
rg -n "LANGFLOW_CONFIG_DIR|LANGFLOW_DATABASE_URL|LANGFLOW_.*DATA" kubernetes/operator/internal/controller/env.go

Repository: langflow-ai/openrag

Length of output: 177


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "===== Dockerfile.langflow (lines 1-140) ====="
cat -n Dockerfile.langflow | sed -n '1,160p'

echo "===== Dockerfile.langflow (lines ~30-80) ====="
cat -n Dockerfile.langflow | sed -n '30,90p'

echo "===== scripts/langflow-entrypoint.py (first 220 lines) ====="
cat -n scripts/langflow-entrypoint.py | sed -n '1,220p'

echo "===== docker-compose.yml (langflow data/config bits) ====="
rg -n "LANGFLOW_(CONFIG_DIR|DATABASE_URL)|/app/langflow-data|LANGFLOW_DATA_PATH" docker-compose.yml
rg -n "volumes:|/app/langflow-data" docker-compose.yml -n -C2

echo "===== kubernetes/helm/openrag/templates/langflow/deployment.yaml (around langflow-data mount) ====="
cat -n kubernetes/helm/openrag/templates/langflow/deployment.yaml | sed -n '1,220p'
rg -n "securityContext|runAsUser|fsGroup|initContainers|langflow-data|mountPath" kubernetes/helm/openrag/templates/langflow/deployment.yaml

echo "===== kubernetes/operator/internal/controller/openrag_controller.go (langflow-data mount + mountPath) ====="
rg -n "langflow-data|mountPath: \"\/app\/" kubernetes/operator/internal/controller/openrag_controller.go
rg -n "LANGFLOW_CONFIG_DIR|LANGFLOW_DATABASE_URL" kubernetes/operator/internal/controller/env.go
cat -n kubernetes/operator/internal/controller/openrag_controller.go | sed -n '780,860p'

echo "===== kubernetes/operator/internal/controller/env.go (langflow env relevant vars) ====="
cat -n kubernetes/operator/internal/controller/env.go | sed -n '1,120p'
rg -n "LANGFLOW_CONFIG_DIR|LANGFLOW_DATABASE_URL|LANGFLOW_.*DATA|LANGFLOW_.*DIR" kubernetes/operator/internal/controller/env.go

Repository: langflow-ai/openrag

Length of output: 13355


Fix Dockerfile.langflow permissions so USER 1000 can write to /app/langflow-data

The image finishes with /app owned by root:0 and grants write access only to the group (chmod -R g+rwX /app). The default runtime user, USER 1000 (uid/gid 1000), is neither the owner nor a member of group 0, so the process cannot write under /app. /app/langflow-data is the container’s data dir: the entrypoint’s root-mode permission fix targets /app/langflow-data, and docker-compose.yml configures LANGFLOW_CONFIG_DIR and LANGFLOW_DATABASE_URL under /app/langflow-data.

🛡️ Proposed fix
     && if ! getent passwd 1000 > /dev/null; then \
         useradd --uid 1000 --gid 1000 --no-create-home langflow; \
     fi \
-    && chown -R root:0 /app \
+    && chown -R 1000:0 /app \
     && chmod -R g+rwX /app
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile.langflow` around lines 37 - 44, The Dockerfile currently leaves
/app owned by root:0 and only grants group write, so the runtime USER 1000
cannot write to /app/langflow-data; modify the Dockerfile after the
useradd/groupadd block (and before any USER 1000 switch) to ensure the data
directory is created and owned/writable by uid/gid 1000: e.g. create
/app/langflow-data if missing, run chown -R 1000:1000 /app/langflow-data and
chmod -R u+rwX /app/langflow-data (or chown -R 1000:1000 /app && chmod -R g+rwX
/app when intending to transfer ownership), referencing the existing
groupadd/useradd lines and the chown/chmod commands so the container process
(USER 1000) can write to /app/langflow-data at runtime.
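
The reasoning behind this finding can be sketched as a simplified model of the classic Unix write-permission check. This is an approximation for illustration only: it ignores root, ACLs, and capabilities, and the 0o775 mode used in the usage note is an assumed result of chmod g+rwX on a directory, not a value read from the image.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: set[int]) -> bool:
    """Approximate the Unix write check for a non-root process.

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)
```

With a directory owned by root:0 and mode 0o775, a process running as uid 1000 with only gid 1000 falls into the "other" class and is denied, while the same process is allowed once the owner becomes uid 1000 or the process runs with group 0.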

Comment thread scripts/langflow-entrypoint.py
Comment on lines +24 to +36
    # Recursively chown data directory to langflow user
    try:
        import shutil
        # Change ownership of data_dir and all contents to uid 1000, gid 1000
        for root, dirs, files in os.walk(data_dir):
            os.chown(root, 1000, 1000)
            for d in dirs:
                os.chown(os.path.join(root, d), 1000, 1000)
            for f in files:
                os.chown(os.path.join(root, f), 1000, 1000)
    except OSError as e:
        print(f"ERROR: Failed to change ownership of {data_dir}: {e}", file=sys.stderr)
        sys.exit(1)
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does Python's os.chown support the follow_symlinks keyword argument on Linux?

💡 Result:

Yes. On Linux, Python’s os.chown supports the keyword-only follow_symlinks argument. The official Python docs give the signature os.chown(path, uid, gid, *, dir_fd=None, follow_symlinks=True) and describe the behavior of follow_symlinks=False (“operate on the symbolic link itself rather than the file pointed to by the link”). Platform support can be checked with os.supports_follow_symlinks; where the argument is unavailable, using it raises NotImplementedError [1].

Citations:


Harden root recursive chown against symlinks

os.chown follows symlinks by default on Linux. The proposed change drops the redundant for d in dirs chown (os.walk already visits each real subdirectory as root and does not descend into symlinked directories by default) and passes follow_symlinks=False, which prevents a symlink in the mounted volume from redirecting the ownership change outside langflow-data. Also remove the unused import shutil.

🛡️ Suggested fix
-    # Recursively chown data directory to langflow user
     try:
-        import shutil
-        # Change ownership of data_dir and all contents to uid 1000, gid 1000
+        # Change ownership of data_dir and all contents to uid 1000, gid 1000.
+        # follow_symlinks=False prevents a symlink planted in the volume from
+        # redirecting chown to files outside langflow-data.
         for root, dirs, files in os.walk(data_dir):
-            os.chown(root, 1000, 1000)
-            for d in dirs:
-                os.chown(os.path.join(root, d), 1000, 1000)
-            for f in files:
-                os.chown(os.path.join(root, f), 1000, 1000)
+            os.chown(root, 1000, 1000, follow_symlinks=False)
+            for name in files:
+                os.chown(os.path.join(root, name), 1000, 1000, follow_symlinks=False)
     except OSError as e:
         print(f"ERROR: Failed to change ownership of {data_dir}: {e}", file=sys.stderr)
         sys.exit(1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/langflow-entrypoint.py` around lines 24 - 36, Remove the unused
"import shutil", stop separately iterating dirs, and make all os.chown calls
non-following so symlinks inside data_dir cannot redirect ownership outside the
tree: when walking data_dir with os.walk, call os.chown(root, 1000, 1000,
follow_symlinks=False) for each root and for each file path (os.path.join(root,
f)) call os.chown(..., follow_symlinks=False); drop the separate "for d in dirs"
chown loop and keep the existing OSError handling.
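
A minimal sketch of the symlink-safe walk described above. The function names are hypothetical; the path enumeration is separated from the chown call so that it can be checked without root privileges.

```python
import os
from typing import Iterator


def iter_chown_targets(data_dir: str) -> Iterator[str]:
    """Yield every path under data_dir whose ownership should change.

    os.walk does not descend into symlinked directories by default
    (followlinks=False), so nothing outside the tree is ever yielded.
    """
    for root, _dirs, files in os.walk(data_dir):
        yield root
        for name in files:
            yield os.path.join(root, name)


def chown_tree(data_dir: str, uid: int = 1000, gid: int = 1000) -> None:
    """Chown the tree without dereferencing symlinks."""
    for path in iter_chown_targets(data_dir):
        # follow_symlinks=False changes the link itself, never its target.
        os.chown(path, uid, gid, follow_symlinks=False)
```

A symlink to a directory shows up only in the dirs list of os.walk, which this sketch never chowns, so such a link is left untouched rather than followed.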

Labels

docker enhancement 🔵 New feature or request

2 participants