SEC-02: add Hadolint to Dockerfile linting in CI #87

@longieirl

Summary

Add Hadolint to the build-docker CI job to enforce Dockerfile best practices as a permanent lint gate. Trivy (image vulnerability scanning) is already fully implemented and well-covered — this issue is Hadolint only.

Background

The project's CI pipeline has comprehensive Docker security coverage via Trivy:

  • PR builds: CRITICAL/HIGH vulnerability gate + secret scan
  • Scheduled: full CRITICAL/HIGH/MEDIUM + misconfiguration + secret scan on production image
  • Release: Trivy gates the push to GHCR

What's missing is static Dockerfile linting — catching best-practice violations before the image is even built.

What Hadolint provides

| Property | Detail |
| --- | --- |
| Maintained by | hadolint/hadolint (active, widely adopted) |
| Rule set | Dockerfile best practices + ShellCheck for RUN commands |
| CI action | hadolint/hadolint-action@v3.1.0 |
| Config file | .hadolint.yaml at repo root (global thresholds only) |
| Exit code | Non-zero on any violation at or above the configured threshold |
| Cost | Zero: open source, runs in ~5 seconds |

Current Dockerfile — known Hadolint findings

| Rule | Severity | Stage | Finding |
| --- | --- | --- | --- |
| DL3008 | Warning | builder | apt-get install gcc unpinned (line 17) |
| DL3008 | Warning | production | apt-get install poppler-utils unpinned (line 37) |

All other common rules (DL3007, DL3015, DL3042, DL3006) already pass.

DL3008 — Split Treatment Decision

gcc and poppler-utils are not equivalent risks and must be treated differently.

gcc — builder stage (lines 15–19)

gcc is discarded at the multi-stage boundary: it never ships in the production image, so it never appears in the image that Trivy scans. The runtime risk from leaving its version unpinned is effectively zero. Suppress via inline comment.

# gcc is builder-stage only and never ships to production; exact version pinning not required
# hadolint ignore=DL3008
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    && rm -rf /var/lib/apt/lists/*

poppler-utils — production stage (lines 35–39)

poppler-utils ships in every released image and is runtime-critical (pdfplumber depends on it for PDF rendering). Suppressing DL3008 here is a maintainability-over-reproducibility decision, not a security control. It is only acceptable if the following compensating controls are in place:

  1. Base image pinned by digest: python:3.12-slim@sha256:... keeps the OS layer reproducible even if poppler-utils drifts between builds.
  2. Trivy enforces fail thresholds in CI — already in place via security-scan.yml (CRITICAL/HIGH gate on PR builds; CRITICAL/HIGH/MEDIUM on schedule).
  3. Decision documented inline: the Dockerfile comment must make the trade-off explicit.

# poppler-utils: apt version pinning intentionally omitted (maintainability over reproducibility).
# Compensated by: (1) base image pinned by digest, (2) Trivy CI gate on production image.
# hadolint ignore=DL3008
RUN apt-get update && apt-get install -y --no-install-recommends \
    poppler-utils \
    && rm -rf /var/lib/apt/lists/*

Suppress via inline comment (not blanket .hadolint.yaml rule) — this makes the split rationale visible in the Dockerfile itself rather than hiding it in config.

DL3009 — Remove from pre-commit ignore list

The pre-commit Hadolint hook (currently disabled) suppresses DL3009 (apt cache not cleaned). This suppression is incorrect and should be removed.

Verification (confirmed):

  • Builder stage (lines 17–19): apt-get update && apt-get install gcc && rm -rf /var/lib/apt/lists/* — cleanup is in the same RUN layer as the install. ✅
  • Production stage (lines 37–39): apt-get update && apt-get install poppler-utils && rm -rf /var/lib/apt/lists/* — cleanup is in the same RUN layer as the install. ✅

The same-layer requirement matters: if cleanup were in a separate RUN step, the package lists would be committed to an intermediate layer and the image-size benefit would be lost even though the files are eventually deleted. The current Dockerfile does this correctly.

DL3009 already passes. Suppressing it hides that compliance and removes the regression guard: if a future commit moves the cleanup into a separate RUN layer, nothing will flag it while the rule is ignored. Because .hadolint.yaml sets failure-threshold: error and DL3009 is an info-severity rule, re-enabling it can only ever produce an informational finding and never blocks CI. Zero noise cost, one regression safeguard gained.

Proposed Changes

1. Add .hadolint.yaml at repo root

failure-threshold: error
# DL3008 (apt version pinning) is handled via inline per-RUN suppression in the Dockerfile
# with documented rationale. See Dockerfile lines 17 and 37.

No blanket DL3008 ignore — suppression is applied inline per-stage with justification.

2. Update Dockerfile with inline suppressions

Add # hadolint ignore=DL3008 comments above each apt-get install block with a rationale comment (see above).

3. Add Hadolint step to build-docker job in .github/workflows/ci.yml

Insert before the Build image (no push) step:

- name: Lint Dockerfile
  uses: hadolint/hadolint-action@v3.1.0
  with:
    dockerfile: Dockerfile
    config: .hadolint.yaml

4. Enable Hadolint in .pre-commit-config.yaml

Uncomment the Hadolint hook and remove DL3009 from the args (it already passes):

- repo: https://github.com/hadolint/hadolint
  rev: v2.12.0
  hooks:
    - id: hadolint-docker
      args: [--ignore, DL3008]  # DL3008 handled inline in Dockerfile with justification

5. Pin trivy-action to full commit SHA

security-scan.yml currently uses aquasecurity/trivy-action@master (floating ref — 7 occurrences). Semver tag pinning is not sufficient here.

Reason: In March 2026, Aquasec's ecosystem was compromised with malicious force-pushes of trivy-action tags. Semver tag pinning (@0.30.0) would have been vulnerable to that attack. Semver tags are mutable Git refs — they can be force-pushed silently. Full commit SHAs cannot be moved.

Replace all 7 occurrences with the full-length commit SHA and an inline version comment:

uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1  # v0.35.0

SHA verification (2026-03-27): v0.35.0 (released 2026-03-20) resolves to commit 57a97c7e7821a5776cebc9bb87c984fa69cba8f1. Re-verify at time of PR in case a newer stable release has been cut.

Dependabot note: GitHub Dependabot alerts for GitHub Actions are only generated for semver refs, not SHA refs. Once pinned to SHAs, Dependabot version updates (not alerts) must be relied on to keep the pin current. Verify .github/dependabot.yml includes a github-actions ecosystem entry, or add one as part of this PR.
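
If .github/dependabot.yml has no github-actions entry yet, a minimal sketch of one could look like the following. The weekly interval is an assumption chosen for illustration, not something the project has decided; if the file already exists, only the list item under updates needs to be merged in.

```yaml
# .github/dependabot.yml (sketch; schedule interval is an assumption)
version: 2
updates:
  # Opens version-update PRs for actions referenced in .github/workflows/,
  # including actions pinned to a full commit SHA.
  - package-ecosystem: "github-actions"
    directory: "/"          # "/" makes Dependabot look in .github/workflows
    schedule:
      interval: "weekly"
```

Dependabot should bump the pinned SHA in its update PRs; check on the first such PR that the trailing # v0.35.0 comment is updated alongside the SHA, and correct it by hand if it is not.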

6. Track base image digest pinning

Base image python:3.12-slim should be pinned by digest as a compensating control for the poppler-utils DL3008 suppression. If not done in this PR, raise a follow-up issue with a blocker relationship to this one.

Acceptance Criteria

  • .hadolint.yaml added at repo root with failure-threshold: error (no blanket DL3008 ignore)
  • Dockerfile has inline # hadolint ignore=DL3008 suppression above builder gcc block with justification comment
  • Dockerfile has inline # hadolint ignore=DL3008 suppression above production poppler-utils block with maintainability-over-reproducibility justification
  • build-docker CI job runs Hadolint before building the image
  • CI passes with current Dockerfile (no violations above error threshold)
  • Hadolint hook enabled in .pre-commit-config.yaml with DL3009 removed from ignore args
  • All 7 aquasecurity/trivy-action@master refs in security-scan.yml replaced with full commit SHA 57a97c7e7821a5776cebc9bb87c984fa69cba8f1 and inline # v0.35.0 comment (re-verify SHA at time of PR)
  • .github/dependabot.yml includes a github-actions ecosystem entry to automate future SHA updates
  • Base image digest pinning either completed in this PR or tracked as a follow-up issue with a blocker note added here
