ktg-plugin-marketplace/plugins/llm-security-copilot/knowledge/secrets-patterns.md
Kjell Tore Guttormsen f418a8fe08 feat(llm-security-copilot): port llm-security v5.1.0 to GitHub Copilot CLI
Full port of llm-security plugin for internal use on Windows with GitHub
Copilot CLI. Protocol translation layer (copilot-hook-runner.mjs)
normalizes Copilot camelCase I/O to Claude Code snake_case format — all
original hook scripts run unmodified.

- 8 hooks with protocol translation (stdin/stdout/exit code)
- 18 SKILL.md skills (Agent Skills Open Standard)
- 6 .agent.md agent definitions
- 20 scanners + 14 scanner lib modules (unchanged)
- 14 knowledge files (unchanged)
- 39 test files including copilot-port-verify.mjs (17 tests)
- Windows-ready: node:path, os.tmpdir(), process.execPath, no bash

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 21:56:10 +02:00


Secrets Detection Patterns

Usage

These patterns are used by:

  • pre-edit-secrets.mjs hook — blocks Write/Edit operations containing secrets before they reach disk
  • skill-scanner-agent — flags skills and commands that hardcode or expose secrets

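Patterns are regex strings intended for JavaScript. A leading (?i) marks a pattern as case-insensitive; the JavaScript RegExp constructor rejects that inline flag, so strip it and pass the i flag instead. Apply every pattern with the g (global) flag. Patterns without (?i), such as the fixed-prefix token formats, are case-sensitive.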
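The loading step can be sketched roughly as follows. The helper name compilePattern is hypothetical and not part of the plugin; it maps a leading (?i) to the i flag, because new RegExp rejects that prefix, and always applies g.

```javascript
// Hypothetical helper: turn a catalog regex string into a RegExp.
// A leading "(?i)" is read as "case-insensitive" and mapped to the i flag.
function compilePattern(source) {
  const INLINE_I = '(?i)';
  const caseInsensitive = source.startsWith(INLINE_I);
  const body = caseInsensitive ? source.slice(INLINE_I.length) : source;
  return new RegExp(body, caseInsensitive ? 'gi' : 'g');
}

// The aws-secret-access-key pattern from this catalog, as a JS string.
const awsSecret = compilePattern(
  '(?i)aws[_\\-\\s.]*secret[_\\-\\s.]*(?:access[_\\-\\s.]*)?key["\'\\s]*[:=]["\'\\s]*([A-Za-z0-9/+]{40})'
);

// Sample value assembled at runtime; it is not a real key.
const sample = 'AWS_SECRET_ACCESS_KEY = "' + 'a'.repeat(40) + '"';
console.log(awsSecret.flags);        // "gi"
console.log(awsSecret.test(sample)); // true
```

Note that a regex compiled with g keeps its lastIndex between calls, so reset it (or use matchAll) when reusing one compiled pattern across several inputs.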


Pattern Format

Each pattern includes:

  • id: Unique identifier for logging and suppression
  • regex: JavaScript-compatible regex (string form, apply with new RegExp(...))
  • description: What it detects
  • severity: critical / high / medium / low
  • false_positive_notes: When this pattern might false-match
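As an illustration of this format, one catalog entry could be held as a plain object and applied with a small scan loop. The object fields follow the list above; the scanText function and the shape of a finding are assumptions made for this sketch, not the plugin's actual code.

```javascript
// One catalog entry expressed as an object (fields as listed above).
const awsAccessKeyId = {
  id: 'aws-access-key-id',
  regex: '\\bAKIA[0-9A-Z]{16}\\b',
  description: 'AWS Access Key ID',
  severity: 'critical',
  false_positive_notes: 'None known',
};

// Hypothetical scanner: returns one finding per match, with a 1-based line number.
function scanText(text, patterns) {
  const findings = [];
  for (const p of patterns) {
    const re = new RegExp(p.regex, 'g');
    for (const m of text.matchAll(re)) {
      const line = text.slice(0, m.index).split('\n').length;
      findings.push({ id: p.id, severity: p.severity, line });
    }
  }
  return findings;
}

// The sample key is assembled at runtime so this file holds no key-shaped literal.
const fakeKey = 'AKIA' + 'ABCDEFGH12345678';
const text = 'region = eu-north-1\naccess_key = ' + fakeKey + '\n';
console.log(scanText(text, [awsAccessKeyId])); // one finding, on line 2
```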

Patterns

1. AWS

AWS Access Key ID

  • ID: aws-access-key-id
  • Regex: \bAKIA[0-9A-Z]{16}\b
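  • Description: AWS Access Key ID for a long-term credential. Starts with AKIA followed by 16 uppercase alphanumeric characters. Temporary (STS) key IDs use the ASIA prefix and are not matched by this pattern.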
  • Severity: critical
  • False Positive Notes: None — this prefix+length combination is highly specific to AWS. No known false positives in practice.

AWS Secret Access Key

  • ID: aws-secret-access-key
  • Regex: (?i)aws[_\-\s.]*secret[_\-\s.]*(?:access[_\-\s.]*)?key["'\s]*[:=]["'\s]*([A-Za-z0-9/+]{40})
  • Description: AWS Secret Access Key — 40-character base64 string following a label like aws_secret_key, AWS_SECRET_ACCESS_KEY, etc.
  • Severity: critical
  • False Positive Notes: Generic 40-char base64 strings can appear in other contexts. Require the aws + secret label context.

AWS Session Token

  • ID: aws-session-token
  • Regex: (?i)aws[_\-\s.]*session[_\-\s.]*token["'\s]*[:=]["'\s]*([A-Za-z0-9/+=]{100,})
  • Description: Temporary AWS session token (STS). Much longer than access keys — typically 200-400 characters.
  • Severity: critical
  • False Positive Notes: Long base64 blobs in unrelated contexts (e.g., test fixtures, encoded images). Require the session_token label.

2. Azure

Azure Storage Account Key

  • ID: azure-storage-key
  • Regex: (?i)AccountKey=([A-Za-z0-9+/]{86}==)
  • Description: Azure Storage Account key embedded in a connection string. Always exactly 88 characters ending in ==.
  • Severity: critical
  • False Positive Notes: None — the AccountKey= prefix plus exact length is highly specific.

Azure Storage Connection String

  • ID: azure-storage-connstr
  • Regex: DefaultEndpointsProtocol=https?;AccountName=[^;]+;AccountKey=[A-Za-z0-9+/]{86}==
  • Description: Full Azure Storage connection string including account name and key.
  • Severity: critical
  • False Positive Notes: None.

Azure SAS Token

  • ID: azure-sas-token
  • Regex: (?i)(?:sv|sig|se|sp|spr|srt)=[A-Za-z0-9%+/=&]{10,}(?:&(?:sv|sig|se|sp|spr|srt)=[A-Za-z0-9%+/=&]{1,}){3,}
  • Description: Azure Shared Access Signature (SAS) token — URL query string containing multiple SAS parameters.
  • Severity: high
  • False Positive Notes: URL-encoded query strings with similar parameter names. The regex requires four SAS-style parameters but does not check that they are distinct; confirm that sv, sig, se, and sp are all present before reporting.
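A post-match check for the four core SAS parameters might look like the sketch below. The helper name hasCoreSasParams is hypothetical; it assumes the matched query string is passed in as-is.

```javascript
// Hypothetical post-filter: require the four core SAS parameters to all be present.
const CORE_SAS_PARAMS = ['sv', 'sig', 'se', 'sp'];

function hasCoreSasParams(query) {
  // URLSearchParams ignores a leading "?" and decodes percent-escapes.
  const params = new URLSearchParams(query);
  return CORE_SAS_PARAMS.every((name) => params.has(name));
}

console.log(hasCoreSasParams('sv=2022-11-02&se=2030-01-01&sp=r&sig=abc%2Fdef%3D')); // true
console.log(hasCoreSasParams('sp=r&sp=w&sp=d&sp=l')); // false
```

The second call shows the case the regex alone cannot reject: four parameters, but only one distinct name.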

Azure Client Secret

  • ID: azure-client-secret
  • Regex: (?i)client[_\-]?secret["'\s]*[:=]["'\s]*([A-Za-z0-9~._\-]{34,40})
  • Description: Azure AD / Entra ID application client secret: a 34-40 character string of letters, digits, and the symbols ~ . _ -.
  • Severity: critical
  • False Positive Notes: Generic password fields with similar length. Always flag and require human review.

Azure Service Bus Connection String

  • ID: azure-servicebus-connstr
  • Regex: Endpoint=sb://[^;]+;SharedAccessKeyName=[^;]+;SharedAccessKey=[A-Za-z0-9+/=]{43}=
  • Description: Azure Service Bus connection string with shared access key.
  • Severity: critical
  • False Positive Notes: None — format is highly specific.

3. Google Cloud Platform

GCP API Key

  • ID: gcp-api-key
  • Regex: \bAIza[0-9A-Za-z_\-]{35}\b
  • Description: Google Cloud / Firebase API key. Always starts with AIza followed by 35 characters (letters, digits, underscore, or hyphen).
  • Severity: high
  • False Positive Notes: None — prefix is specific. Note: GCP API keys have varying scopes; some are safe to expose (browser-restricted keys), but flag all for review.

GCP Service Account JSON Marker

  • ID: gcp-service-account-json
  • Regex: "type"\s*:\s*"service_account"
  • Description: Google Cloud service account JSON credential file marker. The presence of this key indicates a full service account credential object.
  • Severity: critical
  • False Positive Notes: Only matches within JSON credential blobs. If found alongside private_key, treat as confirmed credential leak.

4. GitHub

GitHub Personal Access Token (Classic)

  • ID: github-pat-classic
  • Regex: \bghp_[A-Za-z0-9]{36}\b
  • Description: GitHub classic personal access token (PAT). Prefix ghp_ followed by exactly 36 alphanumeric characters.
  • Severity: critical
  • False Positive Notes: None — prefix is specific to GitHub.

GitHub Fine-Grained Personal Access Token

  • ID: github-pat-fine-grained
  • Regex: \bgithub_pat_[A-Za-z0-9_]{82}\b
  • Description: GitHub fine-grained PAT introduced in 2022. Longer and more structured than classic PATs.
  • Severity: critical
  • False Positive Notes: None.

GitHub OAuth Token

  • ID: github-oauth-token
  • Regex: \bgho_[A-Za-z0-9]{36}\b
  • Description: GitHub OAuth access token issued via OAuth app flow.
  • Severity: critical
  • False Positive Notes: None.

GitHub Actions / Server Token

  • ID: github-server-token
  • Regex: \bghs_[A-Za-z0-9]{36}\b
  • Description: GitHub Apps installation token or Actions runner token.
  • Severity: high
  • False Positive Notes: None.

5. npm

npm Automation / Publish Token

  • ID: npm-token
  • Regex: \bnpm_[A-Za-z0-9]{36}\b
  • Description: npm registry automation or publish token. Prefix npm_ followed by 36 alphanumeric characters.
  • Severity: critical
  • False Positive Notes: None — prefix is specific to npm tokens issued after 2021. Older tokens in .npmrc are caught by the legacy pattern below.

npm Legacy Auth Token (.npmrc)

  • ID: npm-legacy-auth
  • Regex: //registry\.npmjs\.org/:_authToken\s*=\s*([a-f0-9\-]{36,})
  • Description: Legacy npm authentication token in .npmrc format.
  • Severity: critical
  • False Positive Notes: None.

6. Generic API Keys and Authorization Headers

Bearer Token in Authorization Header

  • ID: bearer-token
  • Regex: (?i)Authorization\s*[:=]\s*["']?Bearer\s+([A-Za-z0-9\-._~+/]+=*)\b
  • Description: HTTP Authorization header with Bearer scheme. Common in hardcoded fetch/axios calls.
  • Severity: high
  • False Positive Notes: High false positive rate when the value is a variable reference like Bearer ${token} or Bearer <your-token>. Skip matches containing $, <, >, or {.

Generic api_key / api-key Assignment

  • ID: generic-api-key
  • Regex: (?i)\bapi[_\-]?key\s*[:=]\s*["']([A-Za-z0-9\-._]{16,64})["']
  • Description: Generic API key assignment in config files, source code, or environment exports.
  • Severity: high
  • False Positive Notes: Placeholder values like your-api-key-here, <API_KEY>, REPLACE_ME, xxx.... Skip matches where the value is all-same-character or contains angle brackets.

OpenAI API Key (Legacy Format)

  • ID: openai-api-key-legacy
  • Regex: \bsk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}\b
  • Description: OpenAI API key in the legacy format. The substring T3BlbkFJ is base64 for OpenAI.
  • Severity: critical
  • False Positive Notes: None for the legacy format.

OpenAI Project-Scoped Key

  • ID: openai-project-key
  • Regex: \bsk-proj-[A-Za-z0-9\-_]{40,}\b
  • Description: OpenAI project-scoped API key introduced in 2024.
  • Severity: critical
  • False Positive Notes: None.

Anthropic API Key

  • ID: anthropic-api-key
  • Regex: \bsk-ant-api03-[A-Za-z0-9\-_]{93}\b
  • Description: Anthropic Claude API key.
  • Severity: critical
  • False Positive Notes: None — prefix plus exact length is highly specific.

7. Private Keys (PEM Format)

PEM header patterns detect private key material. The regex patterns below use the quantifier -{5} in place of five literal hyphens, so the pattern source never contains a literal PEM marker but still matches one in files at scan time.

RSA Private Key Header

  • ID: rsa-private-key
  • Regex: -{5}BEGIN RSA PRIVATE KEY-{5}
  • Description: PEM-encoded RSA private key. The header alone is sufficient to flag — do not require the full key body.
  • Severity: critical
  • False Positive Notes: Test fixtures and documentation examples sometimes include truncated PEM blocks. Flag regardless — a truncated key in committed code still indicates a process failure.

EC / DSA / OpenSSH Private Key Header

  • ID: ec-private-key
  • Regex: -{5}BEGIN (?:EC|DSA|OPENSSH|ENCRYPTED) PRIVATE KEY-{5}
  • Description: PEM-encoded elliptic curve, DSA, OpenSSH, or encrypted PKCS#8 private key.
  • Severity: critical
  • False Positive Notes: Same as RSA — flag all occurrences.

PKCS#8 Private Key Header

  • ID: pkcs8-private-key
  • Regex: -{5}BEGIN PRIVATE KEY-{5}
  • Description: PKCS#8 encoded private key (format-agnostic, covers RSA, EC, etc.).
  • Severity: critical
  • False Positive Notes: None.

Implementation note for pre-edit-secrets.mjs: Build these regexes at runtime using new RegExp('-{5}BEGIN RSA PRIVATE KEY-{5}') rather than as regex literals, so the hook script itself is not flagged by secret scanners.


8. Database Connection Strings

PostgreSQL Connection String

  • ID: postgres-connstr
  • Regex: postgres(?:ql)?://[^:]+:[^@]+@[^\s'"]+
  • Description: PostgreSQL connection URL with embedded credentials in the format postgresql://user:password@host/db.
  • Severity: critical
  • False Positive Notes: Matches any non-empty password portion. Skip if the password segment is ${...}, <password>, or ***.
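The password exclusions can be applied by capturing the password segment on its own. The sketch below uses a variant of the regex with one added capture group; the helper names isPlaceholderPassword and findPostgresCredential are hypothetical, and the sample URLs are made up.

```javascript
// Variant of postgres-connstr with the password segment captured (group 1).
const POSTGRES_URL = /postgres(?:ql)?:\/\/[^:]+:([^@]+)@[^\s'"]+/;

// Hypothetical check for the exclusions listed above.
function isPlaceholderPassword(password) {
  return (
    password.startsWith('${') ||   // template or shell variable reference
    /^<[^>]*>$/.test(password) ||  // <password>-style placeholder
    /^\*+$/.test(password)         // masked value made only of asterisks
  );
}

// Returns the matched connection string, or null if nothing reportable was found.
function findPostgresCredential(text) {
  const m = POSTGRES_URL.exec(text);
  if (!m || isPlaceholderPassword(m[1])) return null;
  return m[0];
}

console.log(findPostgresCredential('DATABASE_URL=postgresql://app:${DB_PASS}@db.internal/app')); // null
console.log(findPostgresCredential('DATABASE_URL=postgres://app:s3cr3tpw@db.internal:5432/app') !== null); // true
```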

MongoDB Connection String

  • ID: mongodb-connstr
  • Regex: mongodb(?:\+srv)?://[^:]+:[^@]+@[^\s'"]+
  • Description: MongoDB Atlas or local connection string with embedded username and password.
  • Severity: critical
  • False Positive Notes: Same exclusions as PostgreSQL.

MySQL / MariaDB Connection String

  • ID: mysql-connstr
  • Regex: mysql(?:2)?://[^:]+:[^@]+@[^\s'"]+
  • Description: MySQL or MariaDB connection URL with credentials.
  • Severity: critical
  • False Positive Notes: Same exclusions as PostgreSQL.

Redis Connection String with Password

  • ID: redis-connstr
  • Regex: redis://:[^@]+@[^\s'"]+
  • Description: Redis connection URL with password in the format redis://:password@host.
  • Severity: high
  • False Positive Notes: Passwordless Redis (redis://host:6379) does not match this pattern.

Generic JDBC Connection String with Password

  • ID: jdbc-connstr
  • Regex: (?i)jdbc:[a-z]+://[^\s"']+;[Pp]assword=[^;\s"']+
  • Description: Java JDBC connection string with a Password= parameter.
  • Severity: critical
  • False Positive Notes: None if Password= is present with a non-empty value.

9. Passwords in Configuration

password Assignment

  • ID: config-password
  • Regex: (?i)(?:^|[\s,;{(])\bpass(?:word|wd)?\s*[:=]\s*["']([^"'$<>{}\s]{6,})["']
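  • Description: Password assignment in config files (YAML, TOML, JSON, .env, INI). Matches quoted values such as password: "secret" or passwd='hunter2'. Unquoted values are not matched, because the regex requires a quote on both sides of the value.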
  • Severity: high
  • False Positive Notes: High false positive rate in documentation and test fixtures. Skip if value matches common placeholders: your-password, changeme, example, test, placeholder, <...>, ***, xxx.

secret Key Assignment

  • ID: config-secret
  • Regex: (?i)(?:^|[\s,;{(])\bsecret\b\s*[:=]\s*["']([^"'$<>{}\s]{8,})["']
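  • Description: Generic secret key assignment in config or environment files. The regex matches a bare secret key only; because of the trailing \b, compound names such as Django's SECRET_KEY are not matched here, although a real value assigned to one is still a valid finding.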
  • Severity: high
  • False Positive Notes: Same exclusions as config-password.

Sensitive Environment Variable Assignment

  • ID: dotenv-secret
  • Regex: (?i)^(?:export\s+)?[A-Z][A-Z0-9_]*(?:SECRET|KEY|TOKEN|PASSWORD|PASSWD|CREDENTIAL|AUTH)[A-Z0-9_]*\s*=\s*([A-Za-z0-9+/=\-_.@!#%^&*]{8,})
  • Description: Environment variable with a security-sensitive name (contains SECRET, KEY, TOKEN, PASSWORD, etc.) assigned an unquoted literal value of at least 8 characters. Matches .env file lines; apply with the m (multiline) flag so that ^ matches at the start of each line.
  • Severity: high
  • False Positive Notes: Variables pointing to file paths (e.g., KEY_FILE=/etc/ssl/key.pem) or URLs without credentials. Skip values that are all-uppercase (likely the name of another variable, as in ${DATABASE_URL}).
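To show how these false-positive notes might be applied to .env content, here is a sketch that splits each line into name and value and then filters the value. It deliberately does not use the catalog regex: the name test reuses only its keyword list, and the function names and the path heuristic are assumptions for illustration.

```javascript
const SENSITIVE_NAME = /(?:SECRET|KEY|TOKEN|PASSWORD|PASSWD|CREDENTIAL|AUTH)/i;
const ENV_LINE = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/;

// Heuristic (assumption): paths and variable references are not secrets themselves.
function looksLikeReference(value) {
  return value.startsWith('/') || value.startsWith('./') || value.includes('${');
}

// Returns the name and 1-based line number of each suspicious assignment.
function findDotenvSecrets(content) {
  const hits = [];
  content.split(/\r?\n/).forEach((line, i) => {
    const m = ENV_LINE.exec(line.trim());
    if (!m) return; // comments and blank lines do not match
    const [, name, rawValue] = m;
    const value = rawValue.replace(/^["']|["']$/g, ''); // drop surrounding quotes
    if (SENSITIVE_NAME.test(name) && value.length >= 8 && !looksLikeReference(value)) {
      hits.push({ name, line: i + 1 });
    }
  });
  return hits;
}

const env = [
  'KEY_FILE=/etc/ssl/key.pem',
  'API_TOKEN=abcd1234efgh5678',
  'DB_PASSWORD=${DATABASE_PASSWORD}',
  'PORT=8080',
].join('\n');
console.log(findDotenvSecrets(env)); // only API_TOKEN, on line 2
```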

10. JWT Tokens

JWT Pattern

  • ID: jwt-token
  • Regex: \beyJ[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\b
  • Description: JSON Web Token in its three-part base64url format (header.payload.signature). The header always starts with eyJ (base64url encoding of {").
  • Severity: medium
  • False Positive Notes: High false positive rate. JWTs are frequently used in tests, documentation, and mock data. Many JWTs are intentionally short-lived or scope-limited. Flag for human review rather than hard-blocking. Skip matches in files under tests/, fixtures/, __mocks__/, *.test.*, *.spec.*. Escalate to critical only if the payload segment decodes to contain an exp claim more than one year in the future.
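The escalation rule for long-lived tokens could be checked along these lines. This is a sketch for Node.js (it relies on Buffer and its base64url encoding); the function names and the default severity of medium are assumptions taken from the entry above, and the signature is not verified.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// Decode the payload (second segment) of a JWT without verifying the signature.
function decodeJwtPayload(token) {
  const segments = token.split('.');
  if (segments.length !== 3) return null;
  try {
    return JSON.parse(Buffer.from(segments[1], 'base64url').toString('utf8'));
  } catch {
    return null; // payload is not base64url-encoded JSON
  }
}

// Hypothetical severity rule: critical only when exp is more than a year away.
function jwtSeverity(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const payload = decodeJwtPayload(token);
  if (payload && typeof payload.exp === 'number' && payload.exp - nowSeconds > ONE_YEAR_SECONDS) {
    return 'critical';
  }
  return 'medium';
}

// Build sample tokens at runtime; the signature segment is a dummy value.
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const now = 1700000000;
const longLived = [encode({ alg: 'HS256' }), encode({ exp: now + 2 * ONE_YEAR_SECONDS }), 'sig'].join('.');
const shortLived = [encode({ alg: 'HS256' }), encode({ exp: now + 3600 }), 'sig'].join('.');
console.log(jwtSeverity(longLived, now));  // critical
console.log(jwtSeverity(shortLived, now)); // medium
```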

False Positive Suppression Rules

Apply these globally before reporting any match:

  1. Placeholder values — Skip if the matched value contains: your-, <, >, example, placeholder, replace, changeme, xxx, ***, TODO, FIXME
  2. Variable references — Skip if the matched value contains: ${, $(, %{, ENV[, os.environ
  3. Test files — Lower severity by one level for matches in: *.test.ts, *.spec.js, fixtures/, __mocks__/, testdata/
  4. Documentation — Lower severity for matches in: *.md, *.txt, docs/, README* — but never suppress critical patterns (PEM key headers, real AWS Access Key IDs)
  5. All-same-character values — Skip if the value is a repetition of a single character (e.g., xxxxxxxx, 00000000)
  6. Short values — Skip generic patterns if the matched secret value is fewer than 8 characters
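The six rules above can be sketched as two small functions, one deciding whether to skip a matched value and one adjusting severity by file path. The names shouldSkipValue and adjustSeverity are hypothetical, and reading rule 4 as "critical findings in documentation keep their severity" is an interpretation rather than something the rules state outright.

```javascript
const PLACEHOLDER_MARKERS = ['your-', '<', '>', 'example', 'placeholder', 'replace', 'changeme', 'xxx', '***', 'todo', 'fixme'];
const VARIABLE_MARKERS = ['${', '$(', '%{', 'ENV[', 'os.environ'];
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];

// Rules 1, 2, 5, 6: should this matched value be skipped entirely?
function shouldSkipValue(value, { generic = false } = {}) {
  const lower = value.toLowerCase();
  if (PLACEHOLDER_MARKERS.some((m) => lower.includes(m))) return true; // rule 1
  if (VARIABLE_MARKERS.some((m) => value.includes(m))) return true;    // rule 2
  if (/^(.)\1+$/.test(value)) return true;                             // rule 5
  if (generic && value.length < 8) return true;                        // rule 6
  return false;
}

// Rules 3 and 4: lower severity by one level in test and documentation paths.
function adjustSeverity(severity, filePath) {
  const isTest = /\.(test|spec)\.[^\/]+$|(^|\/)(fixtures|__mocks__|testdata)\//.test(filePath);
  const isDocs = /\.(md|txt)$|(^|\/)docs\/|(^|\/)README[^\/]*$/.test(filePath);
  if (isDocs && severity === 'critical') return 'critical'; // rule 4 (interpretation)
  if (isTest || isDocs) {
    const idx = SEVERITY_ORDER.indexOf(severity);
    return SEVERITY_ORDER[Math.max(0, idx - 1)];
  }
  return severity;
}

console.log(shouldSkipValue('your-api-key-here'));        // true
console.log(shouldSkipValue('hunter2hunter2'));           // false
console.log(adjustSeverity('high', 'src/auth.test.ts'));  // medium
console.log(adjustSeverity('critical', 'docs/setup.md')); // critical
```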

Implementation Notes for pre-edit-secrets.mjs

// Build PEM patterns at runtime to avoid triggering hook self-detection:
const PEM_RSA = new RegExp('-{5}BEGIN RSA PRIVATE KEY-{5}');
const PEM_GENERIC = new RegExp('-{5}BEGIN (?:EC|DSA|OPENSSH|ENCRYPTED) PRIVATE KEY-{5}');
const PEM_PKCS8 = new RegExp('-{5}BEGIN PRIVATE KEY-{5}');

// These patterns are only used with .test(), so they omit the g flag:
// a global regex keeps lastIndex between calls and can miss a later match.
const CRITICAL_PATTERNS = [
  { id: 'aws-access-key-id',       regex: /\bAKIA[0-9A-Z]{16}\b/ },
  { id: 'github-pat-classic',      regex: /\bghp_[A-Za-z0-9]{36}\b/ },
  { id: 'github-pat-fine-grained', regex: /\bgithub_pat_[A-Za-z0-9_]{82}\b/ },
  { id: 'npm-token',               regex: /\bnpm_[A-Za-z0-9]{36}\b/ },
  { id: 'openai-project-key',      regex: /\bsk-proj-[A-Za-z0-9\-_]{40,}\b/ },
  { id: 'anthropic-api-key',       regex: /\bsk-ant-api03-[A-Za-z0-9\-_]{93}\b/ },
  { id: 'rsa-private-key',         regex: PEM_RSA },
  { id: 'ec-private-key',          regex: PEM_GENERIC },
  { id: 'pkcs8-private-key',       regex: PEM_PKCS8 },
];

// Hard-block on any critical match.
// fileContent is the text that the Write/Edit operation is about to write.
for (const { id, regex } of CRITICAL_PATTERNS) {
  if (regex.test(fileContent)) {
    console.error(`BLOCKED: ${id} detected. Remove secret before editing.`);
    process.exit(2); // Exit code 2 blocks the Write/Edit tool use
  }
}

For high/medium severity patterns, emit a warning via console.error but exit with 0 (allow the operation to proceed with a visible warning).
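One way to combine the blocking and warning paths is to compute the outcome first and exit afterwards. The helper decideOutcome, the finding shape, and the wording of the warning message are assumptions for this sketch.

```javascript
// Hypothetical decision helper: map findings to a hook exit code and messages.
function decideOutcome(findings) {
  const blocking = findings.filter((f) => f.severity === 'critical');
  const warnings = findings.filter((f) => f.severity === 'high' || f.severity === 'medium');
  const messages = [
    ...blocking.map((f) => `BLOCKED: ${f.id} detected. Remove secret before editing.`),
    ...warnings.map((f) => `WARNING: ${f.id} (${f.severity}) detected. Review before committing.`),
  ];
  // Exit code 2 blocks the operation; 0 lets it proceed with the warnings visible.
  return { exitCode: blocking.length > 0 ? 2 : 0, messages };
}

const outcome = decideOutcome([{ id: 'jwt-token', severity: 'medium' }]);
outcome.messages.forEach((m) => console.error(m));
console.log(outcome.exitCode); // 0
// In the hook itself this would be followed by process.exit(outcome.exitCode).
```

Keeping the decision separate from process.exit makes the logic testable without terminating the test process.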


References