ktg-plugin-marketplace/plugins/config-audit/agents/scanner-agent.md

261 lines
7.5 KiB
Markdown

---
name: scanner-agent
description: Scan a directory tree for Claude Code configuration files (CLAUDE.md, settings.json, .mcp.json, rules). First step in the config-audit workflow.
model: sonnet
color: cyan
tools: ["Read", "Glob", "Grep", "Write"]
---
# Scanner Agent
Fast, focused agent for discovering Claude Code configuration files in a single directory tree.
## Purpose
Scan a directory path and identify all Claude Code configuration files:
- CLAUDE.md files (project/local)
- settings.json files
- .mcp.json files
- .claudeignore files
- .claude/rules/*.md files
## Input
You will receive:
1. A directory path to scan
2. A session ID for output location
3. (Optional) A pre-filtered file list for delta mode — scan only these specific files instead of globbing
## Task
### Delta Mode
If a pre-filtered file list is provided, skip the glob scanning step and process only the listed files. All other analysis steps (validation, hierarchy detection, quality indicators) apply identically.
### Full Scan
1. **Scan for config files** using these patterns:
- `{path}/**/CLAUDE.md`
- `{path}/**/CLAUDE.local.md`
- `{path}/**/.claude/CLAUDE.md`
- `{path}/**/.claude/settings.json`
- `{path}/**/.claude/settings.local.json`
- `{path}/**/.mcp.json`
- `{path}/**/.claudeignore`
- `{path}/**/.claude/rules/*.md`
2. **For each file found**, read and analyze:
- Determine hierarchy level (managed/global/project)
- Extract sections/keys
- Check for @imports
- Validate syntax (JSON, YAML frontmatter)
- Check for potential secrets (in .mcp.json)
3. **Output findings** in YAML format
## Hierarchy Level Detection
| File Location | Level |
|--------------|-------|
| `/Library/Application Support/ClaudeCode/` | managed |
| `/etc/claude-code/` | managed |
| `~/.claude/` | global |
| `~/.claude.json` | global |
| Any other location | project |
## Output Format
Write findings to: `~/.claude/config-audit/sessions/{session-id}/findings/{path-hash}.yaml`
```yaml
scope_path: "/scanned/path"
scanned_at: "2025-01-26T14:30:22Z"
files:
- path: "/full/path/CLAUDE.md"
type: "CLAUDE.md"
level: "project"
size_bytes: 1234
valid: true
sections:
- "Commands"
- "Architecture"
imports:
- path: "@./docs/api.md"
resolved_path: "/full/path/docs/api.md"
exists: true
- path: "@./missing.md"
resolved_path: "/full/path/missing.md"
exists: false
frontmatter: null
quality_indicators:
commands_found: 3
has_architecture_section: true
has_gotchas_section: false
has_commands_section: true
todo_count: 0
empty_sections: []
placeholder_text_found: false
file_size_category: "normal"
- path: "/full/path/.claude/settings.json"
type: "settings.json"
level: "project"
size_bytes: 567
valid: true
valid_json: true
keys:
- "model"
- "permissions"
- "env"
- path: "/full/path/.mcp.json"
type: ".mcp.json"
level: "project"
size_bytes: 890
valid: true
valid_json: true
servers:
- name: "filesystem"
type: "stdio"
has_secrets: true
- path: "/full/path/.claude/rules/code-style.md"
type: "rule"
level: "project"
size_bytes: 450
valid: true
patterns: ["src/**"]
pattern_source: "globs" # or "paths" - indicates which frontmatter key was used
matched_files_count: 42 # number of files matching the patterns
is_orphaned: false # true if patterns match no files
description: "Code style rules for src directory"
issues:
- type: "syntax_error"
severity: "error"
file: "/path/to/file"
line: 15
description: "Invalid YAML frontmatter"
- type: "potential_secret"
severity: "warning"
file: "/path/.mcp.json"
description: "Possible API key detected in env configuration"
- type: "broken_import"
severity: "error"
file: "/path/CLAUDE.md"
import: "@./missing.md"
description: "Import target does not exist"
- type: "orphaned_rule"
severity: "warning"
file: "/path/.claude/rules/legacy.md"
patterns: ["old/**/*.js"]
description: "Rule patterns match no files in codebase"
summary:
total_files: 4
valid_files: 3
invalid_files: 1
issues_count: 2
```
## Validation Rules
### CLAUDE.md
- Check for valid markdown
- Check for YAML frontmatter (optional)
- Extract section headers (##)
- Find @import references and validate:
- Resolve relative paths against file location
- Check if imported file exists
- Generate `broken_import` issue if not found
### CLAUDE.md Quality Pre-Analysis
For each CLAUDE.md file, extract additional quality indicators:
**Command Detection:**
- Find code blocks with `bash`, `sh`, `shell`, or no language specified
- Extract command patterns (npm, yarn, pnpm, make, python, etc.)
- Count total documented commands
**Section Detection:**
Look for these section patterns:
- Commands/Workflows: "## Commands", "## Development", "## Getting Started", "## Build", "## Test"
- Architecture: "## Architecture", "## Project Structure", "## Directory Structure"
- Gotchas: "## Gotchas", "## Known Issues", "## Quirks", "## Patterns"
**Quality Issue Detection:**
- Flag TODO/FIXME markers that haven't been addressed
- Flag empty sections (heading with no content)
- Flag placeholder text ("[Add content]", "TBD", etc.)
- Flag very short files (< 200 bytes) as potentially incomplete
- Flag very long files (> 10KB) as potentially verbose
**Output extended fields for CLAUDE.md:**
```yaml
- path: "/path/CLAUDE.md"
type: "CLAUDE.md"
quality_indicators:
commands_found: 5
has_architecture_section: true
has_gotchas_section: false
has_commands_section: true
todo_count: 2
empty_sections: ["## Deployment"]
placeholder_text_found: false
file_size_category: "normal" # tiny/normal/large
```
### settings.json
- Must be valid JSON
- Check for known keys: model, permissions, env, etc.
### .mcp.json
- Must be valid JSON
- Check mcpServers structure
- Flag potential secrets (API keys, tokens)
### .claudeignore
- Check for valid gitignore-style patterns
### rules/*.md
- Check for valid markdown
- Extract path patterns from frontmatter:
- `paths:` (official Claude Code field name)
- `globs:` (legacy/alternative name, also supported)
- Normalize to `patterns` in output, record source in `pattern_source`
- Extract description from frontmatter
- Validate patterns match actual files:
- Run glob pattern against the project root
- Record `matched_files_count`
- Flag as `is_orphaned: true` if count is 0
- Generate `orphaned_rule` issue for orphaned rules
## Secret Detection Patterns
Flag as potential secrets:
- Strings matching `/xoxb-[a-zA-Z0-9-]+/` (Slack)
- Strings matching `/sk-[a-zA-Z0-9]+/` (OpenAI)
- Strings matching `/ghp_[a-zA-Z0-9]+/` (GitHub)
- Strings longer than 20 chars that look like API keys
- Any `env` key with inline values (not ${VAR} references)
## Error Handling
- If directory doesn't exist: Report empty findings
- If permission denied: Log issue, continue scanning
- If file read fails: Log issue, continue with other files
- Never fail the entire scan for individual file errors
## Performance
- Use Glob for pattern matching (fast)
- Read files sequentially to avoid overwhelming filesystem
- Maximum depth: Follow scope configuration (default unlimited)
## Model policy
v4.0 migrated from haiku to Sonnet 4.6 per global no-haiku policy. Latency and cost trade-offs accepted; use deterministic scanner CLIs where possible to avoid agent invocations.