Summary
Add Hadolint to the build-docker CI job to enforce Dockerfile best practices as a permanent lint gate. Trivy (image vulnerability scanning) is already fully implemented and well-covered — this issue is Hadolint only.
Background
The project's CI pipeline has comprehensive Docker security coverage via Trivy:
- PR builds: CRITICAL/HIGH vulnerability gate + secret scan
- Scheduled: full CRITICAL/HIGH/MEDIUM + misconfiguration + secret scan on production image
- Release: Trivy gates the push to GHCR
What's missing is static Dockerfile linting — catching best-practice violations before the image is even built.
What Hadolint provides
| Property | Detail |
|---|---|
| Maintained by | hadolint/hadolint (active, widely adopted) |
| Rule set | Dockerfile best practices + ShellCheck for RUN commands |
| CI action | hadolint/hadolint-action@v3.1.0 |
| Config file | .hadolint.yaml at repo root — global thresholds only |
| Exit code | Non-zero on any violation at or above the configured threshold |
| Cost | Zero — open source, runs in ~5 seconds |
Current Dockerfile — known Hadolint findings
| Rule | Severity | Stage | Finding |
|---|---|---|---|
| DL3008 | Warning | builder | apt-get install gcc unpinned (line 17) |
| DL3008 | Warning | production | apt-get install poppler-utils unpinned (line 37) |
All other common rules (DL3007, DL3015, DL3042, DL3006) already pass.
DL3008 — Split Treatment Decision
gcc and poppler-utils are not equivalent risks and must be treated differently.
gcc — builder stage (lines 15–19)
gcc is discarded at the multi-stage boundary. It never ships in the production image, so Trivy's scan of that image never sees it. The risk to the shipped image is effectively zero. Suppress via inline comment.
    # gcc is builder-stage only and never ships to production; exact version pinning not required
    # hadolint ignore=DL3008
    RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc \
        && rm -rf /var/lib/apt/lists/*

poppler-utils: production stage (lines 35–39)
poppler-utils ships in every released image and is runtime-critical (pdfplumber depends on it for PDF rendering). Suppressing DL3008 here is a maintainability-over-reproducibility decision, not a security control. It is only acceptable if the following compensating controls are in place:
- Base image pinned by digest: python:3.12-slim@sha256:... ensures the OS layer is reproducible even if poppler-utils drifts between builds.
- Trivy enforces fail thresholds in CI: already in place via security-scan.yml (CRITICAL/HIGH gate on PR builds; CRITICAL/HIGH/MEDIUM on schedule).
- Decision documented inline: the Dockerfile comment must make the trade-off explicit.
    # poppler-utils: apt version pinning intentionally omitted (maintainability over reproducibility).
    # Compensated by: (1) base image pinned by digest, (2) Trivy CI gate on production image.
    # hadolint ignore=DL3008
    RUN apt-get update && apt-get install -y --no-install-recommends \
        poppler-utils \
        && rm -rf /var/lib/apt/lists/*

Suppress via an inline comment (not a blanket .hadolint.yaml rule): this keeps the split rationale visible in the Dockerfile itself rather than hiding it in config.
DL3009 — Remove from pre-commit ignore list
The pre-commit Hadolint hook (currently disabled) suppresses DL3009 (apt cache not cleaned). This suppression is incorrect and should be removed.
Verification (confirmed):
- Builder stage (lines 17–19): apt-get update && apt-get install gcc && rm -rf /var/lib/apt/lists/* (cleanup is in the same RUN layer as the install). ✅
- Production stage (lines 37–39): apt-get update && apt-get install poppler-utils && rm -rf /var/lib/apt/lists/* (cleanup is in the same RUN layer as the install). ✅
The same-layer requirement matters: if cleanup were in a separate RUN step, the package lists would be committed to an intermediate layer and the image-size benefit would be lost even though the files are eventually deleted. The current Dockerfile does this correctly.
DL3009 already passes. Suppressing it hides that compliance and removes the regression guard: if a future commit splits the cleanup into a separate RUN layer, a suppressed rule cannot flag it. Since .hadolint.yaml sets failure-threshold: error and DL3009 is an info-severity rule, re-enabling it produces info-level findings only and never blocks CI. Zero noise cost, one regression safeguard gained.
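The same-layer requirement can be illustrated with a minimal Python sketch. This is not how Hadolint implements DL3009; it is a simplified substring check written for illustration, and the function names and sample Dockerfiles are hypothetical.

```python
def logical_instructions(dockerfile_text: str) -> list[str]:
    """Join backslash-continued lines so each Dockerfile instruction is one string."""
    instructions = []
    current = ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments never contribute to an instruction
        if line.endswith("\\"):
            current += line[:-1] + " "
        else:
            instructions.append(current + line)
            current = ""
    if current:
        instructions.append(current.strip())
    return instructions


def runs_missing_same_layer_cleanup(dockerfile_text: str) -> list[str]:
    """Return RUN instructions that install via apt-get without removing the apt lists in the same layer."""
    offenders = []
    for instruction in logical_instructions(dockerfile_text):
        if not instruction.upper().startswith("RUN "):
            continue
        installs = "apt-get install" in instruction
        cleans = "rm -rf /var/lib/apt/lists" in instruction
        if installs and not cleans:
            offenders.append(instruction)
    return offenders


# Cleanup in the same RUN layer as the install (the pattern the issue describes).
SAME_LAYER = """\
FROM python:3.12-slim
RUN apt-get update && apt-get install -y --no-install-recommends \\
    poppler-utils \\
    && rm -rf /var/lib/apt/lists/*
"""

# Cleanup split into a later RUN layer (the regression the rule guards against).
SPLIT_LAYER = """\
FROM python:3.12-slim
RUN apt-get update && apt-get install -y --no-install-recommends poppler-utils
RUN rm -rf /var/lib/apt/lists/*
"""

if __name__ == "__main__":
    print(runs_missing_same_layer_cleanup(SAME_LAYER))
    print(runs_missing_same_layer_cleanup(SPLIT_LAYER))
```

In the split variant the install instruction is reported even though a later RUN deletes the lists, which mirrors the regression that a suppressed DL3009 would miss.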
Proposed Changes
1. Add .hadolint.yaml at repo root
    failure-threshold: error
    # DL3008 (apt version pinning) is handled via inline per-RUN suppression in the Dockerfile
    # with documented rationale. See Dockerfile lines 17 and 37.

No blanket DL3008 ignore: suppression is applied inline per-stage with justification.
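As a rough model of how failure-threshold interacts with rule severities, the Python sketch below ranks the four severity names Hadolint documents and fails only when a finding is at or above the threshold. The function name and the exit value 1 are illustrative assumptions, not Hadolint internals.

```python
# Severity names from least to most severe, as listed in Hadolint's documentation.
SEVERITY_RANK = {"style": 0, "info": 1, "warning": 2, "error": 3}


def lint_exit_code(finding_severities: list[str], failure_threshold: str) -> int:
    """Return 1 if any finding is at or above the threshold, else 0 (simplified model)."""
    limit = SEVERITY_RANK[failure_threshold]
    failed = any(SEVERITY_RANK[severity] >= limit for severity in finding_severities)
    return 1 if failed else 0


if __name__ == "__main__":
    # An info-level finding (such as DL3009) under failure-threshold: error.
    print(lint_exit_code(["info"], "error"))
    # An error-level finding under the same threshold.
    print(lint_exit_code(["error"], "error"))
```

Under this model, info- and warning-level findings are reported but leave the exit code at zero when the threshold is error.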
2. Update Dockerfile with inline suppressions
Add # hadolint ignore=DL3008 comments above each apt-get install block with a rationale comment (see above).
3. Add Hadolint step to build-docker job in .github/workflows/ci.yml
Insert before the Build image (no push) step:
    - name: Lint Dockerfile
      uses: hadolint/hadolint-action@v3.1.0
      with:
        dockerfile: Dockerfile
        config: .hadolint.yaml

4. Enable Hadolint in .pre-commit-config.yaml
Uncomment the Hadolint hook and remove DL3009 from the args (it already passes):
    - repo: https://github.com/hadolint/hadolint
      rev: v2.12.0
      hooks:
        - id: hadolint-docker
          args: [--ignore, DL3008]  # DL3008 handled inline in Dockerfile with justification

5. Pin trivy-action to full commit SHA
security-scan.yml currently uses aquasecurity/trivy-action@master (floating ref — 7 occurrences). Semver tag pinning is not sufficient here.
Reason: In March 2026, Aquasec's ecosystem was compromised with malicious force-pushes of trivy-action tags. Semver tag pinning (@0.30.0) would have been vulnerable to that attack. Semver tags are mutable Git refs — they can be force-pushed silently. Full commit SHAs cannot be moved.
Replace all 7 occurrences with the full-length commit SHA and an inline version comment:
    uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1 # v0.35.0

SHA verification (2026-03-27): v0.35.0 (released 2026-03-20) resolves to commit 57a97c7e7821a5776cebc9bb87c984fa69cba8f1. Re-verify at time of PR in case a newer stable release has been cut.
Dependabot note: GitHub Dependabot alerts for GitHub Actions are only generated for semver refs, not SHA refs. Once pinned to SHAs, Dependabot version updates (not alerts) must be relied on to keep the pin current. Verify .github/dependabot.yml includes a github-actions ecosystem entry, or add one as part of this PR.
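One way to confirm that no floating or tag refs remain after the replacement is a small script that scans a workflow file for uses: lines whose ref is not a 40-character commit SHA. The sketch below is a line-based regex check, not a YAML parser; the function name and the sample workflow text are hypothetical.

```python
import re

# Matches "uses: owner/repo[/path]@ref", with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def find_unpinned_actions(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for each `uses:` ref that is not a full commit SHA."""
    unpinned = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match is None:
            continue
        action, ref = match.groups()
        if FULL_SHA_RE.fullmatch(ref) is None:
            unpinned.append((lineno, action, ref))
    return unpinned


# Hypothetical workflow excerpt: one floating ref, one SHA-pinned ref.
SAMPLE_WORKFLOW = """\
jobs:
  scan:
    steps:
      - uses: aquasecurity/trivy-action@master
      - name: Pinned scan
        uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1 # v0.35.0
"""

if __name__ == "__main__":
    for lineno, action, ref in find_unpinned_actions(SAMPLE_WORKFLOW):
        print(f"line {lineno}: {action}@{ref} is not pinned to a full commit SHA")
```

Run against security-scan.yml, an empty result would indicate that every occurrence has been pinned; the check says nothing about whether the SHA is the intended release.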
6. Track base image digest pinning
Base image python:3.12-slim should be pinned by digest as a compensating control for the poppler-utils DL3008 suppression. If not done in this PR, raise a follow-up issue with a blocker relationship to this one.
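Whether the base image is actually pinned can be checked mechanically. The Python sketch below lists FROM images that lack an @sha256 digest, skipping scratch and references to earlier build stages. The function name is hypothetical, and the digest in the sample is an all-zero placeholder, not a real python:3.12-slim digest.

```python
import re

# Captures the image reference and an optional "AS <stage>" alias from a FROM line.
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?", re.IGNORECASE
)
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base images that are not pinned by digest."""
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match is None:
            continue
        image, alias = match.groups()
        is_stage_ref = image in stage_names or image.lower() == "scratch"
        if not is_stage_ref and DIGEST_RE.search(image) is None:
            unpinned.append(image)
        if alias:
            stage_names.add(alias)
    return unpinned


# Placeholder digest for illustration only; substitute the real digest when pinning.
PLACEHOLDER_DIGEST = "sha256:" + "0" * 64

PINNED = f"FROM python:3.12-slim@{PLACEHOLDER_DIGEST} AS builder\nFROM builder AS production\n"
FLOATING = "FROM python:3.12-slim AS builder\nFROM python:3.12-slim AS production\n"

if __name__ == "__main__":
    print(unpinned_base_images(PINNED))
    print(unpinned_base_images(FLOATING))
```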
Acceptance Criteria
- .hadolint.yaml added at repo root with failure-threshold: error (no blanket DL3008 ignore)
- Dockerfile has inline # hadolint ignore=DL3008 suppression above builder gcc block with justification comment
- Dockerfile has inline # hadolint ignore=DL3008 suppression above production poppler-utils block with maintainability-over-reproducibility justification
- build-docker CI job runs Hadolint before building the image
- CI passes with current Dockerfile (no violations at or above the error threshold)
- Hadolint hook enabled in .pre-commit-config.yaml with DL3009 removed from ignore args
- All 7 aquasecurity/trivy-action@master refs in security-scan.yml replaced with full commit SHA 57a97c7e7821a5776cebc9bb87c984fa69cba8f1 and inline # v0.35.0 comment (re-verify SHA at time of PR)
- .github/dependabot.yml includes a github-actions ecosystem entry to automate future SHA updates
- Base image digest pinning either completed in this PR or tracked as a follow-up issue with a blocker note added here