Full port of llm-security plugin for internal use on Windows with GitHub Copilot CLI. Protocol translation layer (copilot-hook-runner.mjs) normalizes Copilot camelCase I/O to Claude Code snake_case format, so all original hook scripts run unmodified.
- 8 hooks with protocol translation (stdin/stdout/exit code)
- 18 SKILL.md skills (Agent Skills Open Standard)
- 6 .agent.md agent definitions
- 20 scanners + 14 scanner lib modules (unchanged)
- 14 knowledge files (unchanged)
- 39 test files including copilot-port-verify.mjs (17 tests)
- Windows-ready: node:path, os.tmpdir(), process.execPath, no bash
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Secrets Detection Patterns
Usage
These patterns are used by:
- pre-edit-secrets.mjs hook: blocks Write/Edit operations containing secrets before they reach disk
- skill-scanner-agent: flags skills and commands that hardcode or expose secrets
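Patterns are JavaScript-compatible regex strings. Apply with the g (global) flag. Patterns that begin with (?i) are case-insensitive: JavaScript's RegExp constructor rejects that inline marker, so strip it and pass the i flag instead.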
Pattern Format
Each pattern includes:
- id: Unique identifier for logging and suppression
- regex: JavaScript-compatible regex (string form, apply with new RegExp(...))
- description: What it detects
- severity: critical / high / medium / low
- false_positive_notes: When this pattern might false-match
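Because the regex field is stored as a string, a loader has to compile each entry before use. The sketch below shows one way to do that, including translation of a leading (?i) marker, which `new RegExp` does not accept. The helper name `compilePattern` is hypothetical, and the sketch assumes that patterns without the marker stay case-sensitive, as in the CRITICAL_PATTERNS list at the end of this file, which uses only the g flag.

```javascript
// Hypothetical loader helper (not part of the original hook): converts one
// pattern entry from this file into a JavaScript RegExp.
function compilePattern(entry) {
  let source = entry.regex;
  let flags = 'g';
  // A leading "(?i)" is inline-flag syntax that `new RegExp` rejects,
  // so translate it into the `i` flag.
  if (source.startsWith('(?i)')) {
    source = source.slice('(?i)'.length);
    flags += 'i';
  }
  return new RegExp(source, flags);
}

// Example entry, copied from the AWS section of this file.
const awsSecretKey = {
  id: 'aws-secret-access-key',
  regex: String.raw`(?i)aws[_\-\s.]*secret[_\-\s.]*(?:access[_\-\s.]*)?key["'\s]*[:=]["'\s]*([A-Za-z0-9/+]{40})`,
  severity: 'critical',
};

const compiled = compilePattern(awsSecretKey);
// Build the sample at runtime so this snippet contains no key-shaped literal.
const sample = 'AWS_SECRET_ACCESS_KEY = "' + 'A'.repeat(40) + '"';
console.log(compiled.flags, compiled.test(sample)); // gi true
```

`String.raw` keeps the backslashes exactly as they appear in this file, which avoids double-escaping the pattern by hand.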
Patterns
1. AWS
AWS Access Key ID
- ID: aws-access-key-id
- Regex: \bAKIA[0-9A-Z]{16}\b
- Description: AWS Access Key ID. Always starts with AKIA followed by 16 uppercase alphanumeric characters.
- Severity: critical
- False Positive Notes: None. This prefix+length combination is highly specific to AWS, with no known false positives in practice.
AWS Secret Access Key
- ID: aws-secret-access-key
- Regex: (?i)aws[_\-\s.]*secret[_\-\s.]*(?:access[_\-\s.]*)?key["'\s]*[:=]["'\s]*([A-Za-z0-9/+]{40})
- Description: AWS Secret Access Key: a 40-character base64 string following a label like aws_secret_key, AWS_SECRET_ACCESS_KEY, etc.
- Severity: critical
- False Positive Notes: Generic 40-char base64 strings can appear in other contexts. Require the aws + secret label context.
AWS Session Token
- ID: aws-session-token
- Regex: (?i)aws[_\-\s.]*session[_\-\s.]*token["'\s]*[:=]["'\s]*([A-Za-z0-9/+=]{100,})
- Description: Temporary AWS session token (STS). Much longer than access keys, typically 200-400 characters.
- Severity: critical
- False Positive Notes: Long base64 blobs in unrelated contexts (e.g., test fixtures, encoded images). Require the session_token label.
2. Azure
Azure Storage Account Key
- ID: azure-storage-key
- Regex: (?i)AccountKey=([A-Za-z0-9+/]{86}==)
- Description: Azure Storage Account key embedded in a connection string. Always exactly 88 characters ending in ==.
- Severity: critical
- False Positive Notes: None. The AccountKey= prefix plus exact length is highly specific.
Azure Storage Connection String
- ID: azure-storage-connstr
- Regex: DefaultEndpointsProtocol=https?;AccountName=[^;]+;AccountKey=[A-Za-z0-9+/]{86}==
- Description: Full Azure Storage connection string including account name and key.
- Severity: critical
- False Positive Notes: None.
Azure SAS Token
- ID: azure-sas-token
- Regex: (?i)(?:sv|sig|se|sp|spr|srt)=[A-Za-z0-9%+/=&]{10,}(?:&(?:sv|sig|se|sp|spr|srt)=[A-Za-z0-9%+/=&]{1,}){3,}
- Description: Azure Shared Access Signature (SAS) token: a URL query string containing multiple SAS parameters.
- Severity: high
- False Positive Notes: URL-encoded query strings with similar parameter names. Require at least 4 distinct SAS parameters (sv, sig, se, sp).
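The regex counts SAS-style parameters but does not check that they are distinct, so the "at least 4 distinct parameters" rule needs a post-filter. A minimal sketch follows; the function names are hypothetical and the parameter list is taken from the regex above.

```javascript
// Hypothetical post-filter for azure-sas-token matches: count how many
// distinct SAS parameter names appear in the matched query string.
const SAS_PARAMS = new Set(['sv', 'sig', 'se', 'sp', 'spr', 'srt']);

function countDistinctSasParams(queryString) {
  const seen = new Set();
  for (const pair of queryString.replace(/^\?/, '').split('&')) {
    const name = pair.split('=')[0].toLowerCase();
    if (SAS_PARAMS.has(name)) seen.add(name);
  }
  return seen.size;
}

function looksLikeSasToken(queryString) {
  return countDistinctSasParams(queryString) >= 4;
}

// Four distinct parameters: accepted.
console.log(looksLikeSasToken('sv=2022-11-02&se=2030-01-01&sp=r&sig=' + 'x'.repeat(20))); // true
// The same parameter repeated four times: rejected.
console.log(looksLikeSasToken('sp=1&sp=2&sp=3&sp=4')); // false
```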
Azure Client Secret
- ID: azure-client-secret
- Regex: (?i)client[_\-]?secret["'\s]*[:=]["'\s]*([A-Za-z0-9~._\-]{34,40})
- Description: Azure AD / Entra ID application client secret: a 34-40 character alphanumeric string.
- Severity: critical
- False Positive Notes: Generic password fields with similar length. Always flag and require human review.
Azure Service Bus Connection String
- ID: azure-servicebus-connstr
- Regex: Endpoint=sb://[^;]+;SharedAccessKeyName=[^;]+;SharedAccessKey=[A-Za-z0-9+/=]{43}=
- Description: Azure Service Bus connection string with shared access key.
- Severity: critical
- False Positive Notes: None. The format is highly specific.
3. Google Cloud Platform
GCP API Key
- ID: gcp-api-key
- Regex: \bAIza[0-9A-Za-z_\-]{35}\b
- Description: Google Cloud / Firebase API key. Always starts with AIza followed by 35 alphanumeric characters.
- Severity: high
- False Positive Notes: None; the prefix is specific. Note: GCP API keys have varying scopes; some are safe to expose (browser-restricted keys), but flag all for review.
GCP Service Account JSON Marker
- ID: gcp-service-account-json
- Regex: "type"\s*:\s*"service_account"
- Description: Google Cloud service account JSON credential file marker. The presence of this key indicates a full service account credential object.
- Severity: critical
- False Positive Notes: Only matches within JSON credential blobs. If found alongside private_key, treat as confirmed credential leak.
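The "marker plus private_key" rule can be sketched as a two-step check. The function name and the result labels ('none', 'suspected', 'confirmed') are hypothetical, as is the regex used to spot the private_key field.

```javascript
// Hypothetical classifier: the marker alone is a strong hint; the marker plus
// a "private_key" field is treated as a confirmed credential leak.
const SERVICE_ACCOUNT_MARKER = /"type"\s*:\s*"service_account"/;
const PRIVATE_KEY_FIELD = /"private_key"\s*:/; // assumed field shape

function classifyServiceAccount(text) {
  if (!SERVICE_ACCOUNT_MARKER.test(text)) return 'none';
  return PRIVATE_KEY_FIELD.test(text) ? 'confirmed' : 'suspected';
}

console.log(classifyServiceAccount('{"type": "service_account", "private_key": "redacted"}')); // confirmed
console.log(classifyServiceAccount('{"type": "service_account"}')); // suspected
```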
4. GitHub
GitHub Personal Access Token (Classic)
- ID: github-pat-classic
- Regex: \bghp_[A-Za-z0-9]{36}\b
- Description: GitHub classic personal access token (PAT). Prefix ghp_ followed by exactly 36 alphanumeric characters.
- Severity: critical
- False Positive Notes: None. The prefix is specific to GitHub.
GitHub Fine-Grained Personal Access Token
- ID: github-pat-fine-grained
- Regex: \bgithub_pat_[A-Za-z0-9_]{82}\b
- Description: GitHub fine-grained PAT introduced in 2022. Longer and more structured than classic PATs.
- Severity: critical
- False Positive Notes: None.
GitHub OAuth Token
- ID: github-oauth-token
- Regex: \bgho_[A-Za-z0-9]{36}\b
- Description: GitHub OAuth access token issued via OAuth app flow.
- Severity: critical
- False Positive Notes: None.
GitHub Actions / Server Token
- ID: github-server-token
- Regex: \bghs_[A-Za-z0-9]{36}\b
- Description: GitHub Apps installation token or Actions runner token.
- Severity: high
- False Positive Notes: None.
5. npm
npm Automation / Publish Token
- ID: npm-token
- Regex: \bnpm_[A-Za-z0-9]{36}\b
- Description: npm registry automation or publish token. Prefix npm_ followed by 36 alphanumeric characters.
- Severity: critical
- False Positive Notes: None. The prefix is specific to npm tokens issued after 2021. Older tokens in .npmrc are caught by the legacy pattern below.
npm Legacy Auth Token (.npmrc)
- ID: npm-legacy-auth
- Regex: //registry\.npmjs\.org/:_authToken\s*=\s*([a-f0-9\-]{36,})
- Description: Legacy npm authentication token in .npmrc format.
- Severity: critical
- False Positive Notes: None.
6. Generic API Keys and Authorization Headers
Bearer Token in Authorization Header
- ID: bearer-token
- Regex: (?i)Authorization\s*[:=]\s*["']?Bearer\s+([A-Za-z0-9\-._~+/]+=*)\b
- Description: HTTP Authorization header with Bearer scheme. Common in hardcoded fetch/axios calls.
- Severity: high
- False Positive Notes: High false positive rate when the value is a variable reference like Bearer ${token} or Bearer <your-token>. Skip matches containing $, <, >, or {.
Generic api_key / api-key Assignment
- ID: generic-api-key
- Regex: (?i)\bapi[_\-]?key\s*[:=]\s*["']([A-Za-z0-9\-._]{16,64})["']
- Description: Generic API key assignment in config files, source code, or environment exports.
- Severity: high
- False Positive Notes: Placeholder values like your-api-key-here, <API_KEY>, REPLACE_ME, and xxx... are common. Skip matches where the value is all-same-character or contains angle brackets.
OpenAI API Key (Legacy Format)
- ID: openai-api-key-legacy
- Regex: \bsk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}\b
- Description: OpenAI API key in the legacy format. The substring T3BlbkFJ is base64 for OpenAI.
- Severity: critical
- False Positive Notes: None for the legacy format.
OpenAI Project-Scoped Key
- ID: openai-project-key
- Regex: \bsk-proj-[A-Za-z0-9\-_]{40,}\b
- Description: OpenAI project-scoped API key introduced in 2024.
- Severity: critical
- False Positive Notes: None.
Anthropic API Key
- ID: anthropic-api-key
- Regex: \bsk-ant-api03-[A-Za-z0-9\-_]{93}\b
- Description: Anthropic Claude API key.
- Severity: critical
- False Positive Notes: None. The prefix plus exact length is highly specific.
7. Private Keys (PEM Format)
PEM header patterns detect private key material. The regex patterns below write the five-hyphen delimiters as -{5}, so they match the literal PEM markers in files at scan time while the pattern text itself never contains a literal PEM header.
RSA Private Key Header
- ID: rsa-private-key
- Regex: -{5}BEGIN RSA PRIVATE KEY-{5}
- Description: PEM-encoded RSA private key. The header alone is sufficient to flag; do not require the full key body.
- Severity: critical
- False Positive Notes: Test fixtures and documentation examples sometimes include truncated PEM blocks. Flag regardless: a truncated key in committed code still indicates a process failure.
EC / DSA / OpenSSH Private Key Header
- ID: ec-private-key
- Regex: -{5}BEGIN (?:EC|DSA|OPENSSH|ENCRYPTED) PRIVATE KEY-{5}
- Description: PEM-encoded elliptic curve, DSA, OpenSSH, or encrypted (PKCS#8) private key.
- Severity: critical
- False Positive Notes: Same as RSA; flag all occurrences.
PKCS#8 Private Key Header
- ID: pkcs8-private-key
- Regex: -{5}BEGIN PRIVATE KEY-{5}
- Description: PKCS#8 encoded private key (format-agnostic, covers RSA, EC, etc.).
- Severity: critical
- False Positive Notes: None.
Implementation note for pre-edit-secrets.mjs: Build these regexes at runtime using new RegExp('-{5}BEGIN RSA PRIVATE KEY-{5}') rather than as regex literals, so the hook script itself is not flagged by secret scanners.
8. Database Connection Strings
PostgreSQL Connection String
- ID: postgres-connstr
- Regex: postgres(?:ql)?://[^:]+:[^@]+@[^\s'"]+
- Description: PostgreSQL connection URL with embedded credentials in the format postgresql://user:password@host/db.
- Severity: critical
- False Positive Notes: Matches any non-empty password portion. Skip if password segment is ${...}, <password>, or *.
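The password-segment exclusions can be applied by capturing the password and testing it separately. The sketch below adds a capture group to the postgres-connstr regex for that purpose (the group is an addition for this example); the helper names are hypothetical, and treating "*" as "one or more asterisks" is an assumption.

```javascript
// Variant of the postgres-connstr regex with a capture group around the
// password segment (the group is an addition for this sketch).
const POSTGRES_URL = /postgres(?:ql)?:\/\/[^:]+:([^@]+)@[^\s'"]+/;

// True when the password segment is one of the placeholder shapes listed in
// the false positive notes above.
function isPlaceholderPassword(password) {
  return (
    /^\$\{.*\}$/.test(password) || // ${...}
    /^<.*>$/.test(password) ||     // <password>
    /^\*+$/.test(password)         // * (assumed: one or more asterisks)
  );
}

// Returns the matched URL, or null when nothing reportable was found.
function findPostgresCredential(text) {
  const match = POSTGRES_URL.exec(text);
  if (!match) return null;
  return isPlaceholderPassword(match[1]) ? null : match[0];
}

console.log(findPostgresCredential('postgresql://app:${DB_PASSWORD}@db.internal/main')); // null
```

The same post-filter would apply unchanged to the MongoDB and MySQL patterns, since they share the user:password@host shape.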
MongoDB Connection String
- ID: mongodb-connstr
- Regex: mongodb(?:\+srv)?://[^:]+:[^@]+@[^\s'"]+
- Description: MongoDB Atlas or local connection string with embedded username and password.
- Severity: critical
- False Positive Notes: Same exclusions as PostgreSQL.
MySQL / MariaDB Connection String
- ID: mysql-connstr
- Regex: mysql(?:2)?://[^:]+:[^@]+@[^\s'"]+
- Description: MySQL or MariaDB connection URL with credentials.
- Severity: critical
- False Positive Notes: Same exclusions as PostgreSQL.
Redis Connection String with Password
- ID: redis-connstr
- Regex: redis://:[^@]+@[^\s'"]+
- Description: Redis connection URL with password in the format redis://:password@host.
- Severity: high
- False Positive Notes: Passwordless Redis (redis://host:6379) does not match this pattern.
Generic JDBC Connection String with Password
- ID: jdbc-connstr
- Regex: (?i)jdbc:[a-z]+://[^\s"']+;[Pp]assword=[^;\s"']+
- Description: Java JDBC connection string with a Password= parameter.
- Severity: critical
- False Positive Notes: None if Password= is present with a non-empty value.
9. Passwords in Configuration
password Assignment
- ID: config-password
- Regex: (?i)(?:^|[\s,;{(])\bpass(?:word|wd)?\s*[:=]\s*["']([^"'$<>{}\s]{6,})["']
- Description: Password assignment in config files (YAML, TOML, JSON, .env, INI). Matches password: "secret", passwd="hunter2", etc. The value must be quoted and at least 6 characters long.
- Severity: high
- False Positive Notes: High false positive rate in documentation and test fixtures. Skip if value matches common placeholders: your-password, changeme, example, test, placeholder, <...>, ***, xxx.
secret Key Assignment
- ID: config-secret
- Regex: (?i)(?:^|[\s,;{(])\bsecret\b\s*[:=]\s*["']([^"'$<>{}\s]{8,})["']
- Description: Generic secret key assignment in config or environment files. Django SECRET_KEY with a real value is a valid finding, although \bsecret\b only matches the standalone name, so compound names such as SECRET_KEY need a separate pattern.
- Severity: high
- False Positive Notes: Same exclusions as config-password.
Sensitive Environment Variable Assignment
- ID: dotenv-secret
- Regex: (?i)^(?:export\s+)?[A-Z][A-Z0-9_]*(?:SECRET|KEY|TOKEN|PASSWORD|PASSWD|CREDENTIAL|AUTH)[A-Z0-9_]*\s*=\s*([A-Za-z0-9+/=\-_.@!#%^&*]{8,})
- Description: Environment variable with a security-sensitive name (contains SECRET, KEY, TOKEN, PASSWORD, etc.) assigned a non-empty literal value. Matches .env file lines. Apply with the m (multiline) flag so that ^ matches at the start of each line.
- Severity: high
- False Positive Notes: Variables pointing to file paths (e.g., SSL_KEY_FILE=/etc/ssl/key.pem) or URLs without credentials. Skip values that are all-uppercase (likely a variable reference like ${DATABASE_URL}).
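A sketch of how dotenv-secret might be applied to a whole .env file. The regex is written out in full with the i and m flags and relies on the value character class, which contains no quote characters, to reject empty or quoted values. The function name `scanDotenv` and the two skip heuristics (path-like values, all-uppercase values) are assumptions based on the false positive notes.

```javascript
// dotenv-secret with the case-insensitive marker expressed as the i flag and
// the m flag added so "^" matches at the start of every line.
const DOTENV_SECRET = new RegExp(
  String.raw`^(?:export\s+)?[A-Z][A-Z0-9_]*(?:SECRET|KEY|TOKEN|PASSWORD|PASSWD|CREDENTIAL|AUTH)[A-Z0-9_]*\s*=\s*([A-Za-z0-9+/=\-_.@!#%^&*]{8,})`,
  'gim'
);

// Hypothetical scanner: returns the values that survive the skip heuristics.
function scanDotenv(text) {
  const findings = [];
  for (const match of text.matchAll(DOTENV_SECRET)) {
    const value = match[1];
    const isPath = value.startsWith('/') || value.startsWith('./');
    const isAllUppercase = /^[A-Z0-9_]+$/.test(value);
    if (!isPath && !isAllUppercase) findings.push(value);
  }
  return findings;
}

const envSample = [
  'API_KEY=' + 'q7'.repeat(12),          // reported
  'SSL_KEY_FILE=/etc/ssl/key.pem',       // skipped: file path
  'export AUTH_TOKEN=PLACEHOLDER_VALUE', // skipped: all-uppercase value
  'DEBUG=true',                          // no sensitive name, no match
].join('\n');

console.log(scanDotenv(envSample).length); // 1
```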
10. JWT Tokens
JWT Pattern
- ID: jwt-token
- Regex: \beyJ[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\.[A-Za-z0-9\-_]+\b
- Description: JSON Web Token in its three-part base64url format (header.payload.signature). The header always starts with eyJ (base64url encoding of {").
- Severity: medium
- False Positive Notes: High false positive rate. JWTs are frequently used in tests, documentation, and mock data. Many JWTs are intentionally short-lived or scope-limited. Flag for human review rather than hard-blocking. Skip matches in files under tests/, fixtures/, __mocks__/, *.test.*, *.spec.*. Escalate to critical only if the payload segment decodes to contain an exp claim more than one year in the future.
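The escalation rule requires decoding the payload segment. The sketch below shows one way to do that in Node.js without verifying the signature. The function names are hypothetical, and "one year" is approximated as 365 days.

```javascript
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // assumption: 365-day year

// Decode the payload (second segment) of a JWT without verifying the signature.
function decodeJwtPayload(token) {
  const segments = token.split('.');
  if (segments.length !== 3) return null;
  try {
    return JSON.parse(Buffer.from(segments[1], 'base64url').toString('utf8'));
  } catch {
    return null; // payload is not base64url-encoded JSON
  }
}

// Hypothetical severity rule: critical only when exp lies more than one year
// ahead of nowSeconds, otherwise the default medium.
function jwtSeverity(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const payload = decodeJwtPayload(token);
  const exp = payload && typeof payload.exp === 'number' ? payload.exp : null;
  return exp !== null && exp - nowSeconds > ONE_YEAR_SECONDS ? 'critical' : 'medium';
}

// Demo helper only: build an unsigned, JWT-shaped string at runtime.
function fakeJwt(payload) {
  const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
  return `${encode({ alg: 'none' })}.${encode(payload)}.signature`;
}

const now = 1700000000;
console.log(jwtSeverity(fakeJwt({ exp: now + 2 * ONE_YEAR_SECONDS }), now)); // critical
console.log(jwtSeverity(fakeJwt({ exp: now + 3600 }), now)); // medium
```

Passing `nowSeconds` explicitly keeps the rule deterministic in tests; the hook would normally rely on the default.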
False Positive Suppression Rules
Apply these globally before reporting any match:
- Placeholder values: Skip if the matched value contains: your-, <, >, example, placeholder, replace, changeme, xxx, ***, TODO, FIXME
- Variable references: Skip if the matched value contains: ${, $(, %{, ENV[, os.environ
- Test files: Lower severity by one level for matches in: *.test.ts, *.spec.js, fixtures/, __mocks__/, testdata/
- Documentation: Lower severity for matches in: *.md, *.txt, docs/, README*, but never suppress critical patterns (PEM key headers, real AWS Access Key IDs)
- All-same-character values: Skip if the value is a repetition of a single character (e.g., xxxxxxxx, 00000000)
- Short values: Skip generic patterns if the matched secret value is fewer than 8 characters
Implementation Notes for pre-edit-secrets.mjs
// Build PEM patterns at runtime to avoid triggering hook self-detection:
const PEM_RSA = new RegExp('-{5}BEGIN RSA PRIVATE KEY-{5}');
const PEM_GENERIC = new RegExp('-{5}BEGIN (?:EC|DSA|OPENSSH|ENCRYPTED) PRIVATE KEY-{5}');
const PEM_PKCS8 = new RegExp('-{5}BEGIN PRIVATE KEY-{5}');
const CRITICAL_PATTERNS = [
  { id: 'aws-access-key-id', regex: /\bAKIA[0-9A-Z]{16}\b/g },
  { id: 'github-pat-classic', regex: /\bghp_[A-Za-z0-9]{36}\b/g },
  { id: 'github-pat-fine-grained', regex: /\bgithub_pat_[A-Za-z0-9_]{82}\b/g },
  { id: 'npm-token', regex: /\bnpm_[A-Za-z0-9]{36}\b/g },
  { id: 'openai-project-key', regex: /\bsk-proj-[A-Za-z0-9\-_]{40,}\b/g },
  { id: 'anthropic-api-key', regex: /\bsk-ant-api03-[A-Za-z0-9\-_]{93}\b/g },
  { id: 'rsa-private-key', regex: PEM_RSA },
  { id: 'ec-private-key', regex: PEM_GENERIC },
  { id: 'pkcs8-private-key', regex: PEM_PKCS8 },
];
// Hard-block on any critical match.
// fileContent holds the text that the Write/Edit operation is about to write.
for (const { id, regex } of CRITICAL_PATTERNS) {
  regex.lastIndex = 0; // /g regexes keep state between test() calls; reset before each scan
  if (regex.test(fileContent)) {
    console.error(`BLOCKED: ${id} detected. Remove secret before editing.`);
    process.exit(2); // Exit code 2 blocks the Write/Edit tool use
  }
}
For high/medium severity patterns, emit a warning via console.error but exit with 0 (allow the operation to proceed with a visible warning).
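The block-versus-warn policy can be sketched as a pure function that maps findings to an exit code and stderr messages. The function name, the shape of the findings array, the warning text, and the choice to treat every non-critical finding as a warning are assumptions.

```javascript
// Hypothetical helper: map a list of findings ({ id, severity }) to the
// hook's exit code and the messages to print on stderr.
function decideOutcome(findings) {
  const blocking = findings.filter((finding) => finding.severity === 'critical');
  const warnings = findings.filter((finding) => finding.severity !== 'critical');
  const messages = [
    ...blocking.map((f) => `BLOCKED: ${f.id} detected. Remove secret before editing.`),
    ...warnings.map((f) => `WARNING: ${f.id} (${f.severity}) detected. Review before committing.`),
  ];
  // Exit code 2 blocks the operation; 0 lets it proceed with visible warnings.
  return { exitCode: blocking.length > 0 ? 2 : 0, messages };
}

const outcome = decideOutcome([
  { id: 'jwt-token', severity: 'medium' },
  { id: 'bearer-token', severity: 'high' },
]);
console.log(outcome.exitCode); // 0

// In the hook itself the result would be applied roughly like this:
//   outcome.messages.forEach((message) => console.error(message));
//   process.exit(outcome.exitCode);
```

Keeping the decision separate from `process.exit` makes the policy testable without terminating the test process.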