feat: Migrate to UBI 9 base images for enterprise compliance and security improvements #1738
lucaseduoli wants to merge 4 commits into
Conversation
…rity improvements (#1714)
* feat: migrate to UBI 9 base images for enterprise compliance
* fix: update scripts/backend-entrypoint.sh for UBI9 compliance
* fix: changed microdnf to dnf for ubi9 compatability
* fix: Added --allowerasing flag to resolve curl-minimal conflict
* fix: updated files
* fix: mitigate backend and frontend CVEs by tightening dependency versions and removing runtime pip
* fix: remediate frontend CVE dependency overrides and harden backend runtime image
* fix: remediate dependency CVEs and reduce runtime image surface
* fix: harden backend and frontend runtime images for CVE remediation
* fix: apply comprehensive CVE fixes for backend and frontend containers
* fix: explicitly copy package-lock.json to resolve npm ci build failure
* fix: allow package.json and package-lock.json in Docker build context
* fix: resolve CVEs across backend and frontend containers
* fix: resolve CVE's
* fix: resolve OSS CVE's
* fix: resolve frontend CVEs
* fix: upgrade system packages and nodemon dependencies to resolve issues
* fix: add pysqlite3-binary to resolve SQLite version incompatibility
* fix: resolved sqlite error
* Update scripts/langflow-entrypoint.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* fix: fixed path issues
* Update Dockerfile.langflow
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* fix: resolve UBI9 mem0 permission crash, preserve entrypoint, and pin base images
* fix: UBI9 mem0 crash
* fix: resolved dotenv issue
* Fixed permission and wheel issue with langflow and backend
* style: ruff autofix (auto)
* changed langflow to 1.9.6.rc0
* fix image version
* Reduce images size
---------
Co-authored-by: Gautham N Pai <gauthamnpai@ibm.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Lucas Oliveira <lucas.edu.oli@hotmail.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
Walkthrough
Migrates the project to Python 3.12 and modernizes backend, frontend, and Langflow containers to UBI9 multi-stage builds; updates entrypoints (privilege handling), adjusts dependency constraints and tooling targets, expands container ignore rules, and updates Helm args/env for Langflow.
Changes
Python 3.12 and UBI9 Container Migration
Sequence Diagram(s)
sequenceDiagram
participant Builder
participant Runtime
participant Entrypoint
Builder->>Runtime: copy built artifacts (with --chown)
Runtime->>Entrypoint: install entrypoint, pre-create & chown dirs, set ENV
Entrypoint->>Runtime: fix perms/chown, create HOME, drop privileges if root, exec app
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Dockerfile.frontend`:
- Around line 14-18: The nodemon CVE remediation is being applied only in the
deps stage, so it does not affect the shipped runtime image. Update the
Dockerfile.frontend logic around the nodemon install/removal block to either
remove it entirely if it is just scanner noise, or move the fix into the runtime
stage where the final image is built. Make sure the remediation targets the
actual nodemon package paths present in the image and avoid masking failures
with || true so broken or no-op patches are visible.
In `@Dockerfile.langflow`:
- Around line 20-26: Replace the direct curl|tar pipeline with a safe
download-and-verify flow: curl -o the tarball (sqlite-autoconf-3460000.tar.gz)
and also fetch the corresponding SHA3-HASH file from the SQLite site, then
verify the tarball's SHA3-256 digest against the official SHA3-HASH (using a
tool that supports SHA3-256, e.g., sha3sum or openssl dgst -sha3-256) before
extracting; only run ./configure/make/install and rm -rf sqlite-autoconf-3460000
after the checksum passes, and fail the build if verification fails.
In `@kubernetes/helm/openrag/templates/langflow/deployment.yaml`:
- Around line 53-61: Replace the hardcoded /app/data values for the environment
variables HOME, XDG_CACHE_HOME, MEM0_DIR, and LANGFLOW_CONFIG_DIR in the
Langflow container spec with the Helm chart value for the data mount path (use
the existing langflow.persistence.mountPath value used elsewhere in the chart);
update the env entries (names HOME, XDG_CACHE_HOME, MEM0_DIR,
LANGFLOW_CONFIG_DIR) to reference that template variable so they follow any
overridden PVC mountPath at deploy time.
In `@scripts/langflow-entrypoint.py`:
- Around line 19-23: The code hardcodes data_dir = pathlib.Path("/app/data") and
unconditionally attempts shutil.chown and overwrites HOME/USER even when not
dropping privileges; change this to respect arbitrary UIDs by (1) using an
environment-provided writable path fallback (e.g.,
os.environ.get("APP_DATA_DIR") or keep existing default but don't force
ownership), (2) only perform shutil.chown(data_dir, ...) and chmod when running
as root (check os.geteuid() == 0) and when the target uid/gid exist, and (3)
only set os.environ["HOME"] and os.environ["USER"] after you actually drop
privileges (i.e., inside the branch where you perform the privilege change),
leaving runtime-provided HOME/USER untouched otherwise; reference the data_dir,
shutil.chown, chmod calls and the HOME/USER environment assignments to locate
changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 25706348-1651-4417-8093-2a02ff32993e
⛔ Files ignored due to path filters (2)
frontend/package-lock.json is excluded by !**/package-lock.json
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
.dockerignore
.python-version
Dockerfile.backend
Dockerfile.frontend
Dockerfile.langflow
frontend/package.json
kubernetes/helm/openrag/templates/langflow/deployment.yaml
pyproject.toml
scripts/backend-entrypoint.sh
scripts/langflow-entrypoint.py
RUN cd /usr/lib/node_modules/nodemon 2>/dev/null && npm install debug@4.3.7 glob-parent@5.1.2 minimatch@10.2.3 semver@7.5.4 --save 2>/dev/null || true \
    && rm -rf /usr/lib/node_modules/nodemon/node_modules/debug-3.0.1 2>/dev/null || true \
    && rm -rf /usr/lib/node_modules/nodemon/node_modules/glob-parent-3.0.1 2>/dev/null || true \
    && rm -rf /usr/lib/node_modules/nodemon/node_modules/minimatch-3.0.1 2>/dev/null || true \
    && rm -rf /usr/lib/node_modules/nodemon/node_modules/semver-3.0.1 2>/dev/null || true
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm whether the minimal runtime base even contains nodemon, and inspect the layout the patch assumes.
fd -t f 'Dockerfile.frontend' --exec rg -n 'nodejs-20|nodemon|node_modules' {}
# Heuristic: npm uses flat dirs (node_modules/debug), not versioned (debug-3.0.1).
rg -n 'node_modules/.*-3\.0\.1' -g 'Dockerfile.frontend'
Repository: langflow-ai/openrag
Length of output: 1481
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect the full Dockerfile.frontend around deps/builder/runtime + nodemon mentions
echo "=== Dockerfile.frontend (relevant excerpts) ==="
nl -ba Dockerfile.frontend | sed -n '1,120p' | cat
# 2) Find every nodemon occurrence in the repo (esp. package.json / lockfiles)
echo "=== nodemon occurrences ==="
rg -n "nodemon" -S . --glob '!**/node_modules/**' --glob '!**/.git/**' --glob '!**/dist/**'
Repository: langflow-ai/openrag
Length of output: 172
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Dockerfile.frontend (with line numbers, first 120 lines) ==="
cat -n Dockerfile.frontend | sed -n '1,120p'
echo "=== nodemon occurrences (excluding node_modules/dist/.git) ==="
rg -n "nodemon" -S . \
  --glob '!**/node_modules/**' \
  --glob '!**/dist/**' \
  --glob '!**/.git/**'
Repository: langflow-ai/openrag
Length of output: 4135
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect frontend deps for nodemon
if [ -f frontend/package.json ]; then
  echo "=== frontend/package.json (nodemon mentions) ==="
  rg -n "nodemon" frontend/package.json || true
fi
# Inspect lockfile for nodemon
for f in frontend/package-lock.json package-lock.json; do
  if [ -f "$f" ]; then
    echo "=== $f (nodemon mentions) ==="
    rg -n "nodemon" "$f" || true
  fi
done
Repository: langflow-ai/openrag
Length of output: 166
🌐 Web query:
Does the Red Hat UBI9 Node.js 20 minimal image (registry.access.redhat.com/ubi9/nodejs-20-minimal) include nodemon by default (as a global npm package under /usr/lib/node_modules)?
💡 Result:
No, the Red Hat UBI9 Node.js 20 minimal image (registry.access.redhat.com/ubi9/nodejs-20-minimal) does not include nodemon by default [1][2][3]. The UBI minimal images are designed to be extremely lightweight, production-ready runtime environments that contain only the necessary dependencies to run a Node.js application, excluding build-time tools, development dependencies, and extra global packages like nodemon [2][4]. While some Red Hat errata advisories mention nodejs-nodemon as a separate package available within the broader RHEL/UBI software ecosystem [5][6][7], it is not pre-installed in the nodejs-20-minimal container image. If you require nodemon for your development workflow, it should be installed as a devDependencies in your project's package.json [8] or installed within your Dockerfile if needed, although it is standard practice to exclude such development tools from final production images [4][9].
Citations:
- 1: https://catalog.redhat.com/en/software/containers/ubi9/nodejs-20-minimal/64770ddd0e699534bb564b3b
- 2: https://github.com/sclorg/s2i-nodejs-container/blob/master/20-minimal/README.md
- 3: https://docs.redhat.com/en/documentation/red_hat_build_of_node.js/20/html-single/release_notes_for_node.js_20/index
- 4: https://dev.to/zyvop/docker-for-developers-stop-it-works-on-my-machine-forever-35pj
- 5: https://access.redhat.com/errata/RHEA-2024:9453
- 6: https://access.redhat.com/errata/RHBA-2025:4482
- 7: https://access.redhat.com/errata/RHSA-2026:8339
- 8: https://dev.to/pawper/on-the-shoulders-of-giants-package-registries-node-npm-46m2
- 9: https://dev.to/raju_dandigam/optimizing-docker-images-for-typescript-ai-agents-with-dive-and-multi-stage-builds-3gho
Fix/relocate the nodemon CVE remediation so it affects the shipped runtime image
- The patch modifies system nodemon in the `deps` stage (`/usr/lib/node_modules/nodemon`), but the final `runtime` image is based on `registry.access.redhat.com/ubi9/nodejs-20-minimal`, which does not include global `nodemon`; the app also doesn’t list `nodemon` in `frontend/package*.json`, so this block won’t change the runtime image CVE posture.
- The `rm -rf /usr/lib/node_modules/nodemon/node_modules/<pkg>-3.0.1 ... || true` removals target version-suffixed paths and mask failures, so even if nodemon were present they can silently do nothing.
Either remove the block if it’s only for scanner appeasement, or move the remediation into the runtime stage (and patch the actual directories/packages present there) without `|| true`.
# Install SQLite >= 3.35.0 (UBI9 has 3.34.1) from source
RUN curl -L https://www.sqlite.org/2024/sqlite-autoconf-3460000.tar.gz | tar -xz \
    && cd sqlite-autoconf-3460000 \
    && ./configure --prefix=/usr --libdir=/usr/lib64 \
    && make -j$(nproc) install \
    && cd .. \
    && rm -rf sqlite-autoconf-3460000
🧩 Analysis chain
🌐 Web query:
sqlite-autoconf-3460000.tar.gz sha256 checksum
💡 Result:
The SHA256 checksum for sqlite-autoconf-3460000.tar.gz is 6f8e6a7b335273748816f9b3b62bbdc372a889de8782d7f048c653a447417a7d [1]. This checksum has been verified in third-party supply chain and package management records [1].
Citations:
🌐 Web query:
sqlite-autoconf-3460000.tar.gz SHA256 site:sqlite.org
💡 Result:
The official SQLite download page publishes SHA3-256 hashes for all distributed files, rather than SHA256 hashes [1][2]. For the file sqlite-autoconf-3460000.tar.gz specifically, an official SHA256 hash is not provided by the SQLite project, as they have standardized on SHA3-256 for integrity verification of their distribution artifacts [1][3]. If you are looking for the integrity verification value for that specific version's autoconf package, you should consult the official SQLite download page (sqlite.org/download.html) and locate the SHA3-256 hash associated with that file [1]. Be aware that SQLite occasionally updates distribution files (such as precompiled binaries) if errors are found, so always verify the hash against the current official source [4]. For automated verification, the SQLite download page contains an embedded HTML comment with a CSV table (containing headers: PRODUCT,VERSION,RELATIVE-URL,SIZE-IN-BYTES,SHA3-HASH) that can be parsed to programmatically retrieve the correct, up-to-date SHA3-256 hash for any current release [5][6].
Citations:
- 1: https://sqlite.org/download.html
- 2: https://www2.sqlite.org/matrix/download.html
- 3: https://www.sqlite.org/see/help/sha3sum
- 4: https://sqlite.org/forum/info/07d5a3587a5f9b9d070ad0e82d6af8be4128f3db43ba210e7dde6ef781a8a402
- 5: http://sqlite.org/draft/download.html
- 6: https://sqlite.org/forum/forumpost/93746e9c63?t=h
🌐 Web query:
sqlite-autoconf-3460000.tar.gz SHA3-256
💡 Result:
The SHA3-256 hash for the sqlite-autoconf-3460000.tar.gz file is 83d2acf79453deb7d6520338b1f4585f12e39b27cd370fb08593afa198f471fc [1][2]. This hash is publicly cited in technical documentation and build logs, such as those maintained by supply chain security services tracking the sqlite3-ruby gem, to verify the integrity of the source archive [1][2].
Citations:
- 1: https://my.diffend.io/gems/sqlite3/prev/2.0.2
- 2: https://my.diffend.io/gems/sqlite3/prev/2.0.2-aarch64-linux-musl
🏁 Script executed:
#!/bin/bash
set -euo pipefail
ls -la
sed -n '1,120p' Dockerfile.langflow
echo '---'
nl -ba Dockerfile.langflow | sed -n '1,120p'
echo '---'
nl -ba Dockerfile.langflow | sed -n '120,240p'
Repository: langflow-ai/openrag
Length of output: 6508
🏁 Script executed:
#!/bin/bash
set -euo pipefail
url="https://sqlite.org/download.html"
tmp="/tmp/sqlite-download.html"
curl -fsSL "$url" -o "$tmp"
# Extract the embedded CSV rows (SQLite hides it in an HTML comment on the page)
# Then find the row for sqlite-autoconf-3460000.tar.gz and print VERSION + SHA3-HASH.
python3 - <<'PY'
import re, csv, sys, pathlib
html = pathlib.Path("/tmp/sqlite-download.html").read_text(errors="ignore")
# Pull the CSV table from an HTML comment if present
# Example markers we’ve seen: "<!--" ... "PRODUCT,VERSION,RELATIVE-URL,SIZE-IN-BYTES,SHA3-HASH" ... "-->"
m = re.search(r'<!--.*?PRODUCT,VERSION,RELATIVE-URL,SIZE-IN-BYTES,SHA3-HASH.*?-->',
              html, flags=re.S)
if not m:
    # Fallback: maybe CSV is not in comment on this variant of the page
    m = re.search(r'PRODUCT,VERSION,RELATIVE-URL,SIZE-IN-BYTES,SHA3-HASH.*', html, flags=re.S)
block = m.group(0)
# Extract the CSV lines starting from header until blank line (best-effort)
header = "PRODUCT,VERSION,RELATIVE-URL,SIZE-IN-BYTES,SHA3-HASH"
start = block.find(header)
if start == -1:
    print("Could not find CSV header in extracted block", file=sys.stderr)
    sys.exit(1)
csv_text = block[start:]
# Trim any trailing non-CSV garbage
csv_text = csv_text.split("-->")[0] if "-->" in csv_text else csv_text
lines = [ln for ln in csv_text.splitlines() if ln.strip()]
# Parse CSV with python's CSV reader
reader = csv.reader(lines)
rows = list(reader)
# Skip until header row
hdr_idx = None
for i,r in enumerate(rows):
    if r and r[0] == "PRODUCT":
        hdr_idx = i
        break
if hdr_idx is None:
    print("Header row not found after CSV extraction", file=sys.stderr)
    sys.exit(1)
rows = rows[hdr_idx+1:]
target = "sqlite-autoconf-3460000.tar.gz"
found = False
for r in rows:
    if len(r) < 5:
        continue
    relative_url = r[2].strip()
    sha3 = r[4].strip()
    if relative_url.endswith("/"+target) or relative_url.endswith(target) or target in relative_url:
        # VERSION is column 1
        version = r[1].strip()
        print(f"{target} VERSION={version} SHA3-256={sha3}")
        found = True
        break
if not found:
    print(f"Did not find {target} in extracted SQLite download CSV", file=sys.stderr)
    sys.exit(2)
PY
Repository: langflow-ai/openrag
Length of output: 140
🏁 Script executed:
#!/bin/bash
set -euo pipefail
target="sqlite-autoconf-3460000.tar.gz"
curl -fsSL "https://sqlite.org/download.html" -o /tmp/sqlite-download.html
echo "== grep occurrences =="
rg -n --fixed-string "$target" /tmp/sqlite-download.html || true
echo "== show surrounding context (first hit) =="
line=$(rg -n --fixed-string "$target" /tmp/sqlite-download.html | head -n1 | cut -d: -f1 || true)
if [ -n "${line:-}" ]; then
  start=$((line-20))
  end=$((line+20))
  sed -n "${start},${end}p" /tmp/sqlite-download.html
fi
echo "== attempt SHA3 value nearby (first 200 chars after match) =="
if [ -n "${line:-}" ]; then
  # Print the line containing the match and the next 2 lines; hash may be in same row/cell
  sed -n "${line},$((line+2))p" /tmp/sqlite-download.html
fi
Repository: langflow-ai/openrag
Length of output: 1105
🏁 Script executed:
#!/bin/bash
set -euo pipefail
target="sqlite-autoconf-3460000.tar.gz"
curl -fsSL "https://sqlite.org/download.html" -o /tmp/sqlite-download.html
echo "== grep occurrences =="
rg -n -F "$target" /tmp/sqlite-download.html || true
echo "== show surrounding context (first 3 hits) =="
mapfile -t lines < <(rg -n -F "$target" /tmp/sqlite-download.html | head -n 3 | cut -d: -f1)
for line in "${lines[@]}"; do
  start=$((line-30))
  end=$((line+30))
  echo "--- around line $line ---"
  sed -n "${start},${end}p" /tmp/sqlite-download.html | cat
done
echo "== attempt extract SHA3-ish values near matches =="
if [ "${#lines[@]}" -gt 0 ]; then
  python3 - <<'PY'
import re, pathlib
html = pathlib.Path("/tmp/sqlite-download.html").read_text(errors="ignore")
target = "sqlite-autoconf-3460000.tar.gz"
for m in re.finditer(re.escape(target), html):
    i = m.start()
    snippet = html[i:i+800]
    # SHA3 hashes in the page are 64 hex chars (sha3-256)
    hashes = re.findall(r'\b[0-9a-f]{64}\b', snippet, flags=re.I)
    print("match at", i, "first_hashes", hashes[:5])
    break
PY
fi
Repository: langflow-ai/openrag
Length of output: 208
🏁 Script executed:
#!/bin/bash
set -euo pipefail
curl -fsSL "https://sqlite.org/download.html" -o /tmp/sqlite-download.html
echo "bytes: $(wc -c </tmp/sqlite-download.html)"
echo "head:"
sed -n '1,60p' /tmp/sqlite-download.html | cat
echo "---- string searches (HTML) ----"
rg -n "SHA3-HASH|PRODUCT,VERSION|sqlite-autoconf|3460000|autoconf" /tmp/sqlite-download.html || true
echo "---- show small context around 3460000 occurrences ----"
python3 - <<'PY'
import pathlib, re
p = pathlib.Path("/tmp/sqlite-download.html")
s = p.read_text(errors="ignore")
for m in re.finditer(r'3460000', s):
    i = m.start()
    print("\n--- occurrence @", i, "---")
    print(s[i-200:i+400])
    break
else:
    print("\nNo 3460000 literal found in HTML text.")
PY
Repository: langflow-ai/openrag
Length of output: 2564
🏁 Script executed:
#!/bin/bash
set -euo pipefail
for url in \
  "https://www.sqlite.org/matrix/download.html" \
  "https://sqlite.org/matrix/download.html" \
  "https://www.sqlite.org/sha3sum.html" \
  "https://sqlite.org/sha3sum.html" \
  "https://www.sqlite.org/2024/sha3sum.html" \
  "https://www.sqlite.org/2024/sha3sum" \
  "https://www.sqlite.org/2024/sha3sum.txt" \
  "https://www.sqlite.org/2024/sha3sum.csv"
do
  echo "==== $url ===="
  f="/tmp/$(echo "$url" | sed 's#https\?://##; s#[/]+#_#g').html"
  if curl -fsSL "$url" -o "$f" 2>/dev/null; then
    echo "downloaded: $(wc -c < "$f") bytes"
    rg -n "sqlite-autoconf-3460000\.tar\.gz|3460000|sqlite-autoconf-3460000" "$f" || true
    rg -n "SHA3-HASH|SHA3" "$f" | head -n 5 || true
  else
    echo "failed"
  fi
done
Repository: langflow-ai/openrag
Length of output: 508
🌐 Web query:
sqlite-autoconf-3460000.tar.gz SHA3-256 SHA3-HASH sqlite.org download
💡 Result:
I could not find an authoritative, up-to-date source on sqlite.org that lists the SHA3-256 hash for the specific file sqlite-autoconf-3460000.tar.gz. 1) The canonical file download URL for a given autoconf tarball is typically https://sqlite.org//sqlite-autoconf-.tar.gz, and mirrors are reachable elsewhere, e.g. a directory index for sqlite3-3460000 shows sqlite-autoconf-3460000.tar.gz but does not expose a SHA3-256 in the snippet I could retrieve [1]. 2) The sqlite.org download pages I found in the search results only clearly show SHA3-256 values for other autoconf tarballs (e.g., sqlite-autoconf-3530100.tar.gz, sqlite-autoconf-3410000.tar.gz) rather than sqlite-autoconf-3460000.tar.gz [2]. If you want, paste the SHA3-256 line you see on your sqlite.org download page (or the specific sqlite.org download.html/matrix page content around sqlite-autoconf-3460000), and I’ll verify it matches the referenced file name/size and show you how to check it locally with sha3sum.
Citations:
- 1: https://repository.timesys.com/buildsources/s/sqlite3/sqlite3-3460000/
- 2: https://sqlite.org/download.html
Add integrity verification for the SQLite source tarball download (SQLite publishes SHA3-256/SHA3-HASH)
Current Dockerfile pipes curl .../sqlite-autoconf-3460000.tar.gz straight into tar with no integrity check, so a tampered/corrupt archive would be compiled silently. Download the tarball to a file, verify its SHA3-256 against the official SHA3-HASH value for sqlite-autoconf-3460000.tar.gz from the SQLite download page, then extract and clean up.
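The verify-before-extract step can be sketched with Python's standard `hashlib`, which provides `sha3_256`. This is a minimal sketch under assumptions: `sha3_256_of` and `verify_tarball` are hypothetical helper names, and the expected digest is deliberately left as a parameter because the analysis above did not recover an authoritative value from sqlite.org for this tarball.

```python
import hashlib
import hmac


def sha3_256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA3-256 and return the hex digest."""
    digest = hashlib.sha3_256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, expected_hex):
    """Abort with a non-zero exit status unless the digest matches."""
    actual = sha3_256_of(path)
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise SystemExit(f"SHA3-256 mismatch for {path}: got {actual}")
    return actual
```

In a build, the tarball would be downloaded to a file first, `verify_tarball` would run against a digest pinned in the Dockerfile, and extraction and `make install` would proceed only if it returns.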
env:
  - name: HOME
    value: /app/data
  - name: XDG_CACHE_HOME
    value: /app/data/.cache
  - name: MEM0_DIR
    value: /app/data/.mem0
  - name: LANGFLOW_CONFIG_DIR
    value: /app/data
Template env paths from langflow.persistence.mountPath instead of hardcoding /app/data.
Line 73 already makes the data mount path configurable, but Line 55-Line 61 hardcode /app/data. If mountPath is overridden, HOME/cache/config paths won’t follow the PVC mount.
Suggested fix
env:
+ - name: LANGFLOW_DATA_DIR
+ value: {{ .Values.langflow.persistence.mountPath | quote }}
- name: HOME
- value: /app/data
+ value: {{ .Values.langflow.persistence.mountPath | quote }}
- name: XDG_CACHE_HOME
- value: /app/data/.cache
+ value: {{ printf "%s/.cache" .Values.langflow.persistence.mountPath | quote }}
- name: MEM0_DIR
- value: /app/data/.mem0
+ value: {{ printf "%s/.mem0" .Values.langflow.persistence.mountPath | quote }}
- name: LANGFLOW_CONFIG_DIR
- value: /app/data
+ value: {{ .Values.langflow.persistence.mountPath | quote }}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
env:
  - name: HOME
    value: /app/data
  - name: XDG_CACHE_HOME
    value: /app/data/.cache
  - name: MEM0_DIR
    value: /app/data/.mem0
  - name: LANGFLOW_CONFIG_DIR
    value: /app/data
env:
  - name: LANGFLOW_DATA_DIR
    value: {{ .Values.langflow.persistence.mountPath | quote }}
  - name: HOME
    value: {{ .Values.langflow.persistence.mountPath | quote }}
  - name: XDG_CACHE_HOME
    value: {{ printf "%s/.cache" .Values.langflow.persistence.mountPath | quote }}
  - name: MEM0_DIR
    value: {{ printf "%s/.mem0" .Values.langflow.persistence.mountPath | quote }}
  - name: LANGFLOW_CONFIG_DIR
    value: {{ .Values.langflow.persistence.mountPath | quote }}
data_dir = pathlib.Path("/app/data")
try:
    data_dir.chmod(0o777)
except OSError:
    shutil.chown(data_dir, user=1000, group=1000)
except (OSError, PermissionError):
Avoid forcing uid-1000 home/path settings when running as arbitrary non-root UID.
Line 19 hardcodes /app/data, and Line 56/Line 57 always overwrite HOME/USER even when Line 51 skips privilege dropping. That can break OpenShift-style arbitrary UID runs and override runtime-provided writable paths.
Suggested fix
-# Ensure data directory is writable by the langflow user
-data_dir = pathlib.Path("/app/data")
+# Keep data path aligned with runtime/chart configuration.
+data_dir = pathlib.Path(os.environ.get("LANGFLOW_DATA_DIR", "/app/data"))
@@
 if os.getuid() == 0:
     os.setgid(1000)
     os.setuid(1000)
-
-# Restore environment variables to reflect the unprivileged user.
-os.environ["HOME"] = home
-os.environ["USER"] = user
+    # Restore env only when we actually switched identities.
+    os.environ["HOME"] = home
+    os.environ["USER"] = user
+else:
+    os.environ.setdefault("HOME", str(data_dir))
+    os.environ.setdefault("USER", str(os.getuid()))
Also applies to: 51-57
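Taken together with guarding the ownership change behind a root check, the behaviour this comment asks for can be sketched as a pure planning function plus a thin applier. This is a sketch under assumptions, not the project's entrypoint: `plan_startup` and `apply_plan` are hypothetical names, the `langflow` user name and the choice of `HOME` equal to the data directory are guesses based on the excerpt and the Helm env values, and `LANGFLOW_DATA_DIR` is the variable introduced by the suggested fix.

```python
import os
import shutil


def plan_startup(env, euid, uid=1000, gid=1000, user="langflow"):
    """Decide what the entrypoint should do, without touching the system."""
    # LANGFLOW_DATA_DIR comes from the suggested fix; /app/data is the current default.
    data_dir = env.get("LANGFLOW_DATA_DIR", "/app/data")
    if euid == 0:
        # Root: fix ownership, drop privileges, then describe the new identity.
        return {
            "data_dir": data_dir,
            "chown_to": (uid, gid),
            "drop_to": (uid, gid),
            "env": {"HOME": data_dir, "USER": user},
        }
    # Arbitrary non-root UID: keep whatever HOME/USER the runtime provided.
    return {
        "data_dir": data_dir,
        "chown_to": None,
        "drop_to": None,
        "env": {
            "HOME": env.get("HOME", data_dir),
            "USER": env.get("USER", str(euid)),
        },
    }


def apply_plan(plan):
    """Carry out a plan produced by plan_startup against the live process."""
    if plan["chown_to"] is not None:
        uid, gid = plan["chown_to"]
        try:
            shutil.chown(plan["data_dir"], user=uid, group=gid)
        except OSError:
            pass  # best effort, e.g. a read-only volume
    if plan["drop_to"] is not None:
        uid, gid = plan["drop_to"]
        os.setgid(gid)  # group first, while the process may still change it
        os.setuid(uid)
    os.environ.update(plan["env"])
```

Separating the decision from the system calls keeps the non-root branch easy to test: `plan_startup(dict(os.environ), os.geteuid())` can be checked for an OpenShift-style UID without needing root.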
edwinjosechittilappilly left a comment
LGTM awaiting integration test
This pull request introduces major updates to the Dockerfiles, deployment scripts, and dependency management across the backend, frontend, and Langflow components to improve enterprise compatibility, security, and OpenShift readiness. The changes focus on migrating to Red Hat UBI 9 minimal images, enforcing non-root user execution, upgrading dependencies for security, and improving container build and runtime practices.
Containerization and Security Enhancements
- … Dockerfile.backend, Dockerfile.frontend, Dockerfile.langflow) from Debian/Ubuntu-based images to Red Hat UBI 9 minimal images for better enterprise and OpenShift compatibility, and updated build steps to use microdnf instead of apt-get. [1] [2] [3]
- … COPY, and pre-creating writable directories with OpenShift-safe permissions. [1] [2]
- … debug, glob-parent, minimatch, semver in Node.js, and system libraries like glibc, openssl). [1] [2]
Dependency and Compatibility Updates
- … .python-version, pyproject.toml, Dockerfiles). [1] [2] [3] [4] [5]
- … agentd, python-multipart, and additional packages like authlib, protobuf, idna, and urllib3. [1] [2] [3] [4]
Langflow and SQLite Improvements
Build and Ignore File Adjustments
- … .dockerignore to allow package.json and package-lock.json into the build context, and improved exclusion patterns for build artifacts. [1] [2]
Entrypoint and Script Refinements
- … gosu with runuser in backend entrypoint for privilege dropping, and improved environment variable handling for non-root execution.
These changes collectively modernize the build and runtime environments for OpenRAG, increase security, and ensure compatibility with enterprise and OpenShift platforms.
Summary by CodeRabbit