feat(ultraplan-local): v1.6.0 — /ultraresearch-local deep research command

Add /ultraresearch-local for structured research combining local codebase
analysis with external knowledge via parallel agent swarms. Produces research
briefs with triangulation, confidence ratings, and source quality assessment.

New command: /ultraresearch-local with modes --quick, --local, --external, --fg.
New agents: research-orchestrator (opus), docs-researcher, community-researcher,
security-researcher, contrarian-researcher, gemini-bridge (all sonnet).
New template: research-brief-template.md.

Integration: the --research flag in /ultraplan-local accepts up to 3 pre-built
research briefs and enriches the interview and exploration phases. The planning
orchestrator cross-references brief findings during synthesis.

Design principle: Context Engineering — the right information to the right agent
at the right time. Research briefs are structured artifacts in the pipeline:
ultraresearch → brief → ultraplan --research → plan → ultraexecute.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-04-08 08:58:35 +02:00
commit 5be9c8e47c
27 changed files with 1723 additions and 73 deletions

View file

@ -34,6 +34,20 @@ Items organized by quarter and track. Priority = Impact / Effort (High/Medium/Lo
- [x] Post-level heatmap generation: day-of-week performance matrix from imported CSV data (time-of-day not available in CSV export)
- [x] `/linkedin:report` month-over-month comparison view
### Algorithm Reference Update (April 2026)
**Priority: High** | **Effort: Low-Medium**
LinkedIn's algorithm has changed significantly since the January 2026 360Brew update. The plugin's reference documents and commands need updating to reflect current data.
- [x] Carousel optimal slide count: update from 12 to 7 slides (18% better performance). Updated `algorithm-signals-reference.md`, `carousel-templates.md`, `carousel.md` quality checklist
- [x] Carousel reach multiplier: update from 1.6x to 3.4x vs single-image. Clarified engagement rate (24.42% was PDF-specific, carousel-specific is 1.92%). Added 35% click-through threshold penalty
- [x] Video format overhaul: vertical 9:16 gets distribution boost (3-4x watch duration vs landscape). Updated recommended max from 90s to 60s. Added 30% completion rate gate. Updated 12 files
- [x] Depth Score concept: added new section to `algorithm-signals-reference.md` — LinkedIn's primary ranking metric measuring actual engagement duration
- [x] Delayed engagement boost: added 4-6x boost for 24-72h post-publication engagement. Updated distribution model
- [x] 90-day topic alignment requirement: updated 360Brew validation section with 90-day categorization requirement
- [x] Organic reach decline context: added "2026 Reach Context" section (-47% YoY overall, -72% video, -34% text)
- [x] Engagement pod detection hardened: strengthened negative signals and red flags with LinkedIn VP statement and detection mechanisms
---
## Q3 2026 (July-September)

View file

@ -210,11 +210,11 @@ When you receive content to optimize, analyze it through these lenses:
**For carousels:**
- Caption should be <500 chars
- Focus on slide content separately
- 12 slides optimal
- 7 slides optimal (5-10 range)
**For video scripts:**
- Hook must grab in 3 seconds
- 90 seconds optimal length
- 60 seconds optimal length (30% completion rate minimum)
- CTA at the end
## References

View file

@ -148,7 +148,7 @@ Old post → Updated post Easy High Any 60+ day old post
CAROUSEL CONVERSION BLUEPRINT
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Target: 10-12 slides (optimal engagement)
Target: 5-8 slides (7 optimal for engagement)
Design: Large text, mobile-readable (16px+ equivalent)
SLIDE 1: HOOK
@ -211,7 +211,7 @@ Design specifications:
VIDEO SCRIPT CONVERSION BLUEPRINT
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Target length: 60-90 seconds (optimal for LinkedIn)
Target length: 30-60 seconds (2026 optimal — 30% completion rate minimum)
Style: Talking head with text overlays
[0:00-0:03] HOOK — 3 seconds

View file

@ -153,7 +153,7 @@ Check for these six anomaly patterns:
- Video with <30s average watch time
- Text post with very high impressions but low engagement
**Likely cause:** Format choice didn't match the content or audience preference
**Intervention:** Document for future posts. Consider repurposing the content in a different format. For carousels: check if slide count is optimal (12 slides). For video: check if captions are present (85% watch muted).
**Intervention:** Document for future posts. Consider repurposing the content in a different format. For carousels: check if slide count is optimal (7 slides, 5-10 range). For video: check if captions are present (85% watch muted).
## Step 4: Real-Time Intervention Playbook

View file

@ -66,10 +66,10 @@ Use AskUserQuestion:
**How long should this video be?**
1. **30 seconds** (75 words) — Single punchy insight or quick tip
2. **60 seconds** (150 words) — Framework intro or single lesson
3. **90 seconds** (225 words) — Complete framework or story with lesson (Recommended)
4. **2 minutes** (300 words) — Detailed story or multi-step process
3. **90 seconds** (225 words) — Extended format for complex frameworks (use sparingly)
4. **2 minutes** (300 words) — Detailed story or multi-step process (retention drops significantly)
Default recommendation: **90 seconds** — optimal balance of depth and retention on LinkedIn.
Default recommendation: **60 seconds** — 2026 sweet spot. LinkedIn requires 30% minimum completion rate for distribution. Shorter videos achieve higher completion.
## Step 3: Topic and Angle Selection

View file

@ -8,7 +8,7 @@ Slide-by-slide blueprints for LinkedIn carousels (PDF document posts). Carousels
- **Font:** Sans-serif, minimum 24pt body, 36pt+ headlines
- **Colors:** Max 3 per carousel (background, text, accent)
- **Text per slide:** 5-7 lines maximum
- **Optimal length:** 8-10 slides (including cover and CTA)
- **Optimal length:** 5-8 slides (including cover and CTA). 7 slides is the sweet spot (18% better performance)
- **Export format:** PDF
- **Caption length:** 300-500 characters with hook and context
@ -17,7 +17,7 @@ Slide-by-slide blueprints for LinkedIn carousels (PDF document posts). Carousels
## Template 1: How-To Guide
**Best for:** Teaching a process, explaining a method, step-by-step instructions
**Structure:** 8-10 slides
**Structure:** 6-8 slides
| Slide | Purpose | Content Pattern |
|-------|---------|-----------------|
@ -67,7 +67,7 @@ Save this if you want to come back to it later.
## Template 2: Listicle / Top N
**Best for:** Curated lists, tool recommendations, lessons learned, tips
**Structure:** 8-12 slides (1 item per slide)
**Structure:** 6-8 slides (1 item per slide)
| Slide | Purpose | Content Pattern |
|-------|---------|-----------------|
@ -115,7 +115,7 @@ Which one resonates most? Drop a number in the comments.
## Template 3: Story / Before-After
**Best for:** Personal narratives, transformation stories, lessons from failure
**Structure:** 8-10 slides
**Structure:** 6-8 slides
| Slide | Purpose | Content Pattern |
|-------|---------|-----------------|
@ -166,7 +166,7 @@ What's a mistake that turned into your biggest learning?
## Template 4: Comparison / vs.
**Best for:** Tool comparisons, approach differences, myth-busting, framework contrasts
**Structure:** 8-10 slides
**Structure:** 6-8 slides
| Slide | Purpose | Content Pattern |
|-------|---------|-----------------|
@ -215,7 +215,7 @@ Swipe through for the breakdown. My verdict is on slide [N].
## Template 5: Framework / Mental Model
**Best for:** Original frameworks, decision matrices, thinking models
**Structure:** 8-10 slides
**Structure:** 6-8 slides
| Slide | Purpose | Content Pattern |
|-------|---------|-----------------|
@ -276,7 +276,7 @@ Carousels need strong captions because the caption appears alongside the cover s
- [ ] Cover slide has a clear promise or question
- [ ] Each slide has one point (not multiple ideas)
- [ ] Text is readable on mobile without zooming (24pt+ body)
- [ ] 8-10 slides total (not 4, not 20)
- [ ] 5-8 slides total (7 is optimal. Completion drops 40% beyond 15)
- [ ] Last slide has a clear CTA
- [ ] Caption hooks attention and prompts swipe
- [ ] Consistent font, colors, and layout across all slides

View file

@ -32,11 +32,11 @@ LinkedIn carousels get 6.6% average engagement — highest of all formats.
Choose a template:
1. How-To Guide — Teach a process step-by-step (8-10 slides)
2. Listicle / Top N — Curated list of tips, tools, or lessons (8-12 slides)
3. Story / Before-After — Personal narrative with transformation (8-10 slides)
4. Comparison / vs. — Side-by-side analysis of two approaches (8-10 slides)
5. Framework / Mental Model — Present an original framework (8-10 slides)
1. How-To Guide — Teach a process step-by-step (6-8 slides)
2. Listicle / Top N — Curated list of tips, tools, or lessons (6-8 slides)
3. Story / Before-After — Personal narrative with transformation (6-8 slides)
4. Comparison / vs. — Side-by-side analysis of two approaches (6-8 slides)
5. Framework / Mental Model — Present an original framework (6-8 slides)
```
Use AskUserQuestion for selection.
@ -105,7 +105,7 @@ Run against the Carousel Quality Checklist from `carousel-templates.md`:
- [ ] Cover slide has a clear promise or question
- [ ] Each slide has one point (not multiple ideas)
- [ ] Text is readable on mobile (keep lines short)
- [ ] 8-10 slides total
- [ ] 5-8 slides total (7 is optimal)
- [ ] Last slide has a clear CTA
- [ ] Caption hooks attention and prompts swipe
- [ ] Consistent structure across all slides

View file

@ -228,7 +228,7 @@ Your post on [topic] achieved 12,500 impressions — a personal best!
**Reference:** First-hour velocity of 15+ engagements unlocks broader distribution.
🟡 **Warning: Format stagnation detected**
80%+ of your recent posts are text-only. PDF/Carousels get 1.6x reach multiplier.
80%+ of your recent posts are text-only. PDF/Carousels get 3.4x reach multiplier.
**Action:** Try a carousel or multi-image post this week for format diversification.
```

View file

@ -59,10 +59,10 @@ Use AskUserQuestion:
**How long should this video be?**
1. **30 seconds** (75 words) — Single punchy insight or quick tip
2. **60 seconds** (150 words) — Framework intro or single lesson
3. **90 seconds** (225 words) — Complete framework or story with lesson (Recommended)
4. **2 minutes** (300 words) — Detailed story or multi-step process
3. **90 seconds** (225 words) — Extended format for complex frameworks (use sparingly)
4. **2 minutes** (300 words) — Detailed story or multi-step process (retention drops significantly)
Default recommendation: **90 seconds** is the sweet spot for LinkedIn — deep enough to deliver value, short enough for high completion rates.
Default recommendation: **60 seconds** is the 2026 sweet spot — LinkedIn requires 30% minimum completion rate or your video gets zero distribution. Shorter videos achieve higher completion rates and the algorithm rewards that heavily.
## Step 3: Topic and Angle Selection
@ -188,7 +188,7 @@ Before you record:
- [ ] Read the script aloud once (practice run)
- [ ] Set up lighting (natural light facing window, or ring light)
- [ ] Check audio (lavalier mic or quiet room)
- [ ] Vertical format: 4:5 (1080×1350) for LinkedIn
- [ ] Vertical format: 9:16 (1080×1920) for LinkedIn vertical feed (3-4x watch duration vs landscape)
- [ ] Clean background
- [ ] Have captions tool ready (CapCut, Descript, or Kapwing)
- [ ] First comment ready to paste immediately after posting

View file

@ -1,4 +1,4 @@
# LinkedIn Algorithm Signals Reference (January 2026)
# LinkedIn Algorithm Signals Reference (April 2026)
Quick reference for ranking signals, weights, and penalties. For detailed context, see SKILL.md.
@ -28,6 +28,7 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
| Signal | Weight | Notes |
|--------|--------|-------|
| Delayed engagement (24-72h) | 4-6x boost | Algorithm resurfaces quality content days after publication |
| Profile views from post | +10-15% | Interest signal, potential follower conversion |
| Click "see more" | +5-10% | Hook worked, engagement signal |
| Reactions (all types) | 0.2x | 5x less valuable than comments |
@ -38,7 +39,9 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
| Signal | Penalty | Notes |
|--------|---------|-------|
| 5+ hashtags | -68% | Spam signal, triggers AI classifier |
| AI-generated comments | -30% reach, -55% engagement | Detected and penalized - use human comments only |
| AI-generated comments | -30% reach, -55% engagement | Detected and penalized — use human comments only |
| Engagement pods | Shadow-ban | LinkedIn VP: goal to make pods "entirely ineffective". Comment velocity + account relationship analysis active |
| Third-party script comments | Removed | Comments via automation tools removed from "Most Relevant" feed |
| Off-topic for profile | -40-60% | 360Brew failure - profile doesn't validate expertise |
| External link in body | -25-40% | Platform retention focus - use first comment instead |
| Engagement bait phrases | -30-50% | "Comment YES if...", "Tag someone who...", "Type 1 for..." |
@ -66,10 +69,10 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
| Format | Reach Multiplier | Engagement Rate | Best For |
|--------|------------------|-----------------|----------|
| PDF/Carousel | 1.6x reach | 24.42% engagement | Frameworks, guides, step-by-step. 12 slides optimal, 25-50 words/slide |
| PDF/Carousel | 3.4x reach | 1.92% engagement | Frameworks, guides, step-by-step. 7 slides optimal (5-10 range), 25-50 words/slide. 35% click-through minimum or penalty |
| Multi-image | 1.3x reach | 6.60% engagement | Before/after, comparisons, processes. Best for 5K-10K follower accounts |
| Polls | 1.64x reach (declining) | 1.5-2% | Audience research only. Declining effectiveness in 2026 |
| Video (90s) | 1.4x reach | Variable | Personal connection. Always add captions (85% watch muted) |
| Video (60s) | 1.4x reach | Variable | Personal connection. Vertical 9:16 gets distribution boost. 30% completion rate minimum or zero reach. Always add captions (85% watch muted) |
| Text-only | 1.17x reach | 3-5% | Thought leadership, stories, opinions. Generates best comment quality |
| Link posts | -25-40% | <1% | Avoid if possible. Use first comment for links |
@ -80,13 +83,27 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
| Post length | 1,200-1,800 chars | <1,000 (-25%) or >2,500 (-32%) |
| Hook length | <140 chars | >140 truncated on mobile "see more" |
| Hashtags | 3-4 | 5+ triggers -68% penalty |
| Video length | 90 seconds | <30s low dwell, >3min high drop-off |
| Video length | 60 seconds | <30s low dwell, >90s retention drops. 30% completion gate |
| Posting frequency | 3-5x/week | <2x loses consistency, >2x/day can fatigue |
| Carousel slides | 12 slides | <8 too short, >15 completion drops |
| Carousel slides | 7 slides | <5 too short, >10 diminishing returns, >15 completion drops 40% |
| Caption (carousel) | <500 chars | Focus attention on slides |
| About section | 2,600 chars | Use all available space, front-load keywords |
| Headline | 220 chars | Include target audience + outcome |
## 2026 Reach Context
Overall organic reach declined significantly in 2026. This affects everyone — focus on relative performance (your posts vs your baseline), not absolute numbers.
| Metric | Change | Notes |
|--------|--------|-------|
| Total reach | -47% YoY | Platform-wide decline |
| Video content | -72% YoY | Poor video penalized harder, good video still rewarded |
| Text posts | -34% YoY | Most resilient format |
| Company pages | ~1.6% of followers | Personal profiles outperform company pages 8x |
| Posting cadence | 2-5x/week | Sweet spot unchanged despite reach decline |
**Implication:** The algorithm rewards precision over broadcast. Smaller, engaged audiences outperform large but passive ones. 1:1 connections are now more valuable than follower count.
## Posting Time Windows (CET/European Audience)
| Day | Peak Time | Notes |
@ -112,10 +129,28 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
| 1. Quality Classifier | 0-30s | AI spam/quality check + 360Brew profile validation | Ensure profile matches post topic |
| 2. Initial Test | 0-90min | 6-10% of connections see post | Stay active, respond to all comments |
| 3. Extended Distribution | 1-24h | 2nd/3rd degree if velocity good | Continue engagement, add value in comments |
| 4. Long-tail | 24-72h+ | Evergreen circulation, search/recommendations | Let compound effects work |
| 4. Long-tail | 24-72h+ | Evergreen circulation. Delayed engagement now yields 4-6x better performance. Algorithm resurfaces high-quality older content | Let compound effects work — high-dwell posts stay active up to 7 days |
**Stage 2 threshold:** 15+ engagements in first hour = unlock Stage 3.
## Depth Score (2026)
LinkedIn's primary content ranking metric. Measures actual engagement duration, not surface interactions. The feed now uses LLM-generated embeddings and transformer-based Generative Recommender models for semantic relevance scoring.
| Factor | Impact | Notes |
|--------|--------|-------|
| Time spent reading/watching | Primary signal | Replaced likes as #1 ranking factor |
| Slide completion (carousel) | High | Each slide click = engagement signal. 7 slides optimal for completion |
| Video watch percentage | High | 30% minimum completion or zero distribution |
| Scroll-back behavior | Medium | Re-reading = strong quality signal |
| Save after reading | Highest | Save + high dwell = maximum distribution boost |
**Distribution impact:**
- High-dwell posts: active in feeds up to **7 days**
- Low-dwell posts: dead after **24 hours**
- First-hour dwell time determines post lifecycle
- Minimalist carousel design: +12% completion rate vs complex backgrounds
## 360Brew Profile Validation (January 2026)
**The algorithm validates your profile BEFORE distributing content.**
@ -124,7 +159,7 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
|---------------------|----------------|----------------|
| About Section | Specific expertise claims, domain terminology | Rewrite with concrete expertise statements |
| Experience Section | Impact statements with metrics | Add quantified achievements |
| Content History | Previous posts on this topic, anecdotal evidence | Build topic consistency over 90+ days |
| Content History | Previous posts on this topic, anecdotal evidence | Requires 90 days of aligned posting for full expertise categorization. Topic mismatch limits reach directly |
| Network Quality | Connected to professionals in your field | Connect with relevant domain experts |
| Engagement Patterns | Do you comment on posts in your expertise area? | Daily: 3-5 thoughtful comments in your domain |
@ -165,17 +200,17 @@ Quick reference for ranking signals, weights, and penalties. For detailed contex
## Red Flags to Avoid
- Engagement pods (actively detected, shadow-ban risk)
- Engagement pods (LinkedIn VP: goal to make pods "entirely ineffective" — comment velocity analysis and account relationship patterns actively detect manufactured engagement)
- Pitch-slapping in DMs
- Posting same content as company page
- Random topics outside demonstrated expertise
- "Great post!" style generic comments
- "Great post!" style generic comments (harm reach even without pod involvement)
- Excessive self-promotion (>20% of content)
- Tagging unrelated people for reach
- Using AI-generated comments (55% engagement penalty)
---
*Last updated: January 2026*
*Last updated: April 2026*
*Sources: Research synthesis from Richard van der Blom (Algorithm Research 2025), Lara Acosta (SLAY Framework), 360Brew algorithm analysis, LinkedIn Engineering Blog, Buffer (2M+ post analysis), Sprout Social (2.5B engagements), Justin Welsh, Jasmin Alic, Sahil Bloom case studies*
*Sources: Research synthesis from Richard van der Blom (Algorithm Research 2025), Lara Acosta (SLAY Framework), 360Brew algorithm analysis, LinkedIn Engineering Blog, Buffer (2M+ post analysis), Sprout Social (2.5B engagements), Justin Welsh, Jasmin Alic, Sahil Bloom case studies. April 2026 update: ALM Corp (LLM architecture analysis), Botdog (360Brew deep dive), DesignACE (engagement signal weights), ContentIn (format strategy guide), UseVisuals (carousel statistics 2026), Visla (video format 2026)*

View file

@ -46,20 +46,21 @@ Choosing the right format isn't just about engagement rates—it's about underst
- Why it works: Encourages completion, maximizes dwell time
- Best for: Frameworks, step-by-step guides, data visualization
**2. Native documents (PDFs): 24.42% engagement rate**
**2. Native documents (PDFs): High engagement (historically 24.42%, likely inflated)**
- Note: The 24.42% figure is from 2025 studies that conflated PDF documents with multi-image carousels. Current carousel-specific data shows 1.92% engagement rate (still highest of all formats). PDF documents may still perform higher due to download value.
- Great for frameworks, step-by-step content, detailed insights
- Keeps users on platform (no external link penalty)
- Downloadable = high perceived value
- Significant increase in engagement rate in 2026
- Best for: Comprehensive guides, templates, detailed analyses
**3. Video posts: 5.60% engagement rate**
- Optimal length: 90 seconds for engagement
- Optimal length: 60 seconds (2026 sweet spot, down from 90s)
- **Critical:** 30% minimum completion rate or video gets zero distribution
- LinkedIn Live: 12-24x engagement vs standard posts
- 85% watch without sound (captions essential)
- Vertical 4:5 aspect ratio (1080x1350) preferred over square
- First 3 seconds determine 70% of retention
- Note: Videos under 90 seconds optimal for engagement and dwell time balance
- **Vertical 9:16 (1080×1920)** now gets distribution boost (3-4x watch duration vs landscape). 4:5 still acceptable but deprioritized
- First 3 seconds determine 70% of retention — 3-second hook is critical
- Note: Overall video reach down 72% YoY — but good video is rewarded more than ever
- Best for: Personal stories, quick insights, behind-the-scenes
- See "Video Content Deep Dive" section below for comprehensive guidance
@ -186,7 +187,7 @@ Algorithm prioritizes content that keeps users on platform longer.
- Content that makes people pause and think
**What doesn't improve dwell time despite engagement:**
- Videos under 90 seconds (balance engagement with dwell time)
- Videos under 60 seconds (balance engagement with completion rate)
- Very short posts (quick reaction, quick scroll)
- Polls (interaction but low time investment)
@ -287,7 +288,7 @@ Immediate engagement in first hour is critical for triggering subsequent waves.
**The Data Reality:**
- Video posts get high impression counts
- BUT: Engagement rates are often lower than text posts
- Videos under 90 seconds optimal for balancing engagement and dwell time
- Videos under 60 seconds optimal for balancing engagement and completion rate (30% minimum completion gate)
- Algorithm prioritizes dwell time over impressions
**What This Means:**
@ -472,9 +473,9 @@ Video isn't the silver bullet many creators think it is. Text-based thought lead
- Your comfort pace is usually 10-20% too slow
**5. Length Optimization**
- Ideal: 90 seconds (sweet spot for engagement vs dwell time)
- Acceptable: 60-120 seconds
- Avoid: <30 seconds (too shallow) or >2 minutes (retention drops)
- Ideal: 60 seconds (2026 sweet spot — maximizes completion rate)
- Acceptable: 30-90 seconds
- Avoid: >90 seconds (completion rate drops, 30% minimum required for any distribution)
**Editing tools by skill level:**
@ -532,13 +533,14 @@ Video isn't the silver bullet many creators think it is. Text-based thought lead
### Technical Specifications
**Video Format & Resolution:**
- **Aspect ratio:** Vertical 4:5 (1080x1350) preferred for mobile optimization
- Vertical 4:5: 1080x1350px (optimal for 2026)
- Square 1:1: 1080x1080px (acceptable)
- If using 16:9: 1920x1080px minimum
- **Aspect ratio:** Vertical 9:16 (1080x1920) now gets distribution boost in LinkedIn's immersive feed
- Vertical 9:16: 1080x1920px (optimal for 2026 — 3-4x watch duration vs landscape, 100% mobile viewport)
- Vertical 4:5: 1080x1350px (still acceptable)
- Square 1:1: 1080x1080px (deprioritized)
- If using 16:9: 1920x1080px minimum (only 25% of mobile viewport)
- **File format:** MP4 (H.264 codec)
- **Maximum file size:** 5GB
- **Maximum length:** 10 minutes (but aim for 45-90 seconds)
- **Maximum length:** 10 minutes (but aim for 30-60 seconds. 30% completion rate minimum or zero distribution)
- **Frame rate:** 30fps standard, 60fps for smooth motion
**Lighting:**
@ -639,11 +641,11 @@ Before posting any video, verify:
- [ ] Hook grabs attention in 3 seconds
- [ ] Clear value delivered (lesson/insight)
- [ ] Tight editing (no unnecessary seconds)
- [ ] Length: 90 seconds optimal
- [ ] Length: 60 seconds optimal (30% completion rate minimum)
- [ ] Ends with engagement-focused CTA
**Technical:**
- [ ] Vertical 4:5 format (1080x1350) for maximum reach
- [ ] Vertical 9:16 format (1080x1920) for maximum reach in immersive feed
- [ ] Professional captions added
- [ ] Audio quality clear and consistent
- [ ] Thumbnail captures attention
@ -657,7 +659,7 @@ Before posting any video, verify:
- [ ] Complements overall content strategy
- [ ] Doesn't include external links
**Bottom Line on Video:** Use strategically when it genuinely adds value beyond text. Prioritize authenticity over production quality. Focus on 90 second videos that deliver concentrated insights. Always optimize for mobile-first consumption with vertical 4:5 format, captions and strong hooks.
**Bottom Line on Video:** Use strategically when it genuinely adds value beyond text. Prioritize authenticity over production quality. Focus on 60-second videos that deliver concentrated insights. LinkedIn now requires 30% minimum completion rate for any distribution — shorter is safer. Always optimize for mobile-first consumption with vertical 9:16 format, captions, and 3-second hooks.
## Creator Mode Features (Available to All Users)

View file

@ -178,7 +178,7 @@ LinkedIn removed hashtag following, hashtag pages, and "Talks About" sections in
- 381 engagements vs 110 for text (247% increase)
**Optimal specifications:**
- 12 slides
- 7 slides (5-10 range, completion drops 40% beyond 15)
- 25-50 words per slide
- Caption under 500 characters
- Each slide swipe counts as engagement signal
@ -211,9 +211,9 @@ LinkedIn removed hashtag following, hashtag pages, and "Talks About" sections in
- Often deliver lower meaningful engagement than well-crafted text posts
**If using video:**
- Optimal length: 90 seconds for engagement and dwell time balance
- Optimal length: 60 seconds (2026 sweet spot — 30% completion rate minimum for any distribution)
- Always add captions (85% watch with sound off)
- Use vertical 4:5 format (1080x1350) for mobile optimization
- Use vertical 9:16 format (1080x1920) for immersive feed distribution boost
### Text-Only Posts

View file

@ -286,9 +286,11 @@ LinkedIn's algorithm weights **completion rate** above all other video metrics.
| Length | Target Rate | Signal |
|--------|------------|--------|
| 30s | 70%+ | Strong — short enough for most viewers |
| 60s | 55%+ | Good — requires solid hook and pacing |
| 90s | 45%+ | Acceptable — sweet spot for depth vs retention |
| 2min | 35%+ | Challenging — only with compelling content |
| 60s | 55%+ | Good — 2026 sweet spot for depth vs completion |
| 90s | 45%+ | Risky — retention drops, only for complex frameworks |
| 2min | 35%+ | Dangerous — most viewers won't hit 30% completion gate |
**Critical (2026):** LinkedIn requires **30% minimum completion rate** or the video gets **zero distribution**. This makes shorter videos significantly safer. 60 seconds is the new recommended default.
**How to optimize:**
- Front-load the most interesting content (not chronological order)

View file

@ -1,12 +1,12 @@
{
"name": "ultraplan-local",
"description": "Deep implementation planning with interview, specialized agent swarms, external research, adversarial review, session decomposition, and headless execution support.",
"version": "1.5.0",
"description": "Deep implementation planning and research with interview, specialized agent swarms, external research, triangulation, adversarial review, session decomposition, and headless execution support.",
"version": "1.6.0",
"author": {
"name": "Kjell Tore Guttormsen"
},
"homepage": "https://git.fromaitochitta.com/open/ultraplan-local",
"repository": "https://git.fromaitochitta.com/open/ultraplan-local.git",
"license": "MIT",
"keywords": ["planning", "implementation", "agents", "adversarial-review", "headless", "execution"]
"keywords": ["planning", "implementation", "research", "context-engineering", "agents", "adversarial-review", "headless", "execution"]
}

View file

@ -4,6 +4,37 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [1.6.0] - 2026-04-08
### Added
- **`/ultraresearch-local` command** — deep research combining local codebase analysis
with external knowledge. Produces structured research briefs with triangulation,
confidence ratings, and source quality assessment. Supports modes: default (background),
`--quick` (inline), `--local` (codebase only), `--external` (web only), `--fg` (foreground).
- **6 new agents** for the research pipeline:
- `research-orchestrator` (opus) — runs full research pipeline as background task
- `docs-researcher` (sonnet) — official documentation via Tavily, WebSearch, Microsoft Learn
- `community-researcher` (sonnet) — real-world experience from issues, blogs, discussions
- `security-researcher` (sonnet) — CVEs, audit history, supply chain risks
- `contrarian-researcher` (sonnet) — counter-evidence and overlooked alternatives
- `gemini-bridge` (sonnet) — independent second opinion via Gemini Deep Research MCP
- **Research brief template** (`templates/research-brief-template.md`) — structured format
with dimensions, confidence ratings, triangulation, and source quality assessment.
- **`--research` flag for `/ultraplan-local`** — accepts up to 3 research brief paths.
Enriches the interview (focuses on decisions, not facts) and injects brief context into
exploration agents. Research-scout skips already-covered technologies.
- **Research-aware planning orchestrator** — `planning-orchestrator.md` now accepts research
briefs, injects summaries into sub-agent prompts, and cross-references brief findings
during synthesis.
- **Research settings** in `settings.json` — configurable Gemini bridge (enabled/timeout),
interview depth, dimension limits, and stats tracking.
### Changed
- Plugin description and keywords updated to reflect research capabilities.
- CLAUDE.md expanded with ultraresearch command, modes, agents, architecture, and state.
## [1.5.0] - 2026-04-07
### Fixed

View file

@ -1,25 +1,43 @@
# ultraplan-local
Deep implementation planning with interview, specialized agent swarms, external research, adversarial review, session decomposition, disciplined execution, and headless support. A local alternative to Anthropic's Ultraplan.
Deep implementation planning and research with interview, specialized agent swarms, external research, adversarial review, session decomposition, disciplined execution, and headless support. A local alternative to Anthropic's Ultraplan.
**Design principle: Context Engineering** — build the right context by orchestrating specialized agents. Each step in the pipeline (research -> plan -> execute) produces a structured artifact that the next step consumes.
## Commands
| Command | Description | Model |
|---------|-------------|-------|
| `/ultraresearch-local` | Research — deep local + external research, produces structured brief | opus |
| `/ultraplan-local` | Plan — interview, explore, plan, review | opus |
| `/ultraexecute-local` | Execute — disciplined plan/session-spec executor with failure recovery | opus |
### /ultraresearch-local modes
| Flag | Behavior |
|------|----------|
| _(default)_ | Interview + background research (local + external) + synthesis + brief |
| `--quick` | Interview (short) + inline research (no agent swarm) |
| `--local` | Only codebase analysis agents (skip external + Gemini) |
| `--external` | Only external research agents (skip codebase analysis) |
| `--fg` | All phases in foreground (blocking) |
Flags can be combined: `--local --fg`, `--external --quick`.
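For illustration, two hypothetical invocations combining flags (the research questions and the flag-before-question ordering are illustrative, not prescribed):

```
/ultraresearch-local --local --fg How is authentication currently wired through the API layer?
/ultraresearch-local --external --quick What is the current best practice for OAuth2 PKCE in SPAs?
```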
### /ultraplan-local modes
| Flag | Behavior |
|------|----------|
| _(default)_ | Interview + background planning (non-blocking) |
| `--spec <path>` | Skip interview, use provided spec |
| `--research <brief> [brief2]` | Enrich planning with pre-built research brief(s) |
| `--fg` | All phases in foreground (blocking) |
| `--quick` | Interview + plan directly (no agent swarm) |
| `--export <pr\|issue\|markdown\|headless> <plan>` | Generate shareable output from existing plan |
| `--decompose <plan>` | Split plan into self-contained headless sessions |
`--research` can combine with `--spec`, `--fg`, and `--quick`.
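As a sketch, a hypothetical run that feeds two pre-built briefs into a foreground planning session (the brief paths are examples, not real files):

```
/ultraplan-local --research .claude/research/ultraresearch-2026-04-07-nats-vs-kafka.md .claude/research/ultraresearch-2026-04-07-auth-middleware.md --fg
```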
### /ultraexecute-local modes
| Flag | Behavior |
@ -35,30 +53,41 @@ Deep implementation planning with interview, specialized agent swarms, external
| Agent | Model | Role |
|-------|-------|------|
| planning-orchestrator | opus | Runs full pipeline as background task |
| planning-orchestrator | opus | Runs full planning pipeline as background task |
| research-orchestrator | opus | Runs full research pipeline as background task |
| architecture-mapper | sonnet | Codebase structure, tech stack, patterns |
| dependency-tracer | sonnet | Import chains, data flow, side effects |
| task-finder | sonnet | Task-relevant files, functions, reuse candidates |
| risk-assessor | sonnet | Risks, edge cases, failure modes |
| test-strategist | sonnet | Test patterns, coverage gaps, strategy |
| git-historian | sonnet | Recent changes, ownership, hot files |
| research-scout | sonnet | External docs for unfamiliar tech (conditional) |
| research-scout | sonnet | External docs for unfamiliar tech (conditional, planning only) |
| convention-scanner | sonnet | Coding conventions: naming, style, error handling, test patterns |
| spec-reviewer | sonnet | Spec quality check before exploration |
| plan-critic | sonnet | Adversarial plan review (9 dimensions) |
| scope-guardian | sonnet | Scope alignment (creep + gaps) |
| session-decomposer | sonnet | Splits plans into headless sessions with dependency graph |
| convention-scanner | sonnet | Coding conventions: naming, style, error handling, test patterns |
| docs-researcher | sonnet | Official documentation, RFCs, vendor docs (Tavily, MS Learn) |
| community-researcher | sonnet | Community experience: issues, blogs, discussions |
| security-researcher | sonnet | CVEs, audit history, supply chain risks |
| contrarian-researcher | sonnet | Counter-evidence, overlooked alternatives |
| gemini-bridge | sonnet | Gemini Deep Research second opinion (conditional) |
## Architecture
**Research:** 8-phase workflow: Parse mode -> Interview -> Background transition -> Parallel research (5 local + 4 external + 1 bridge) -> Follow-ups -> Triangulation -> Synthesis + brief -> Stats.
**Plan:** 12-phase workflow: Parse mode -> Interview -> Background transition -> Codebase sizing -> Spec review -> Parallel exploration (6-8 agents) -> Deep-dives -> Synthesis -> Planning -> Adversarial review -> Present/refine -> Handoff.
**Decompose:** Parse plan -> Analyze step dependencies -> Group into sessions -> Identify parallel waves -> Generate session specs + dependency graph + launch script.
**Execute:** Parse plan -> Detect Execution Strategy -> Single-session (step loop) or multi-session (parallel waves via `claude -p`) -> Verification -> Report.
**Pipeline:** Research briefs feed into planning via `--research`. The planning orchestrator uses brief context to enrich exploration and skip redundant research.
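A sketch of a full end-to-end run, assuming a hypothetical topic; file names follow the State paths below, and the plan-path argument to `/ultraexecute-local` is shown for illustration only:

```
# 1. Research: interview + agent swarm, produces a brief
/ultraresearch-local Evaluate NATS vs Kafka for the event streaming layer
#    -> .claude/research/ultraresearch-2026-04-08-nats-vs-kafka.md

# 2. Plan: exploration and planning enriched by the brief
/ultraplan-local --research .claude/research/ultraresearch-2026-04-08-nats-vs-kafka.md
#    -> .claude/plans/ultraplan-2026-04-08-event-streaming-layer.md

# 3. Execute: disciplined execution of the plan
/ultraexecute-local .claude/plans/ultraplan-2026-04-08-event-streaming-layer.md
```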
## State
- Research briefs: `.claude/research/ultraresearch-{date}-{slug}.md`
- Specs: `.claude/ultraplan-spec-{date}-{slug}.md`
- Plans: `.claude/plans/ultraplan-{date}-{slug}.md`
- Sessions: `.claude/ultraplan-sessions/{slug}/session-*.md`
@ -66,3 +95,4 @@ Deep implementation planning with interview, specialized agent swarms, external
- Progress: `{plan-dir}/.ultraexecute-progress-{slug}.json`
- Plan stats: `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl`
- Exec stats: `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`
- Research stats: `${CLAUDE_PLUGIN_DATA}/ultraresearch-stats.jsonl`

View file

@ -0,0 +1,135 @@
---
name: community-researcher
description: |
Use this agent when the research task requires practical, real-world experience rather
than official documentation — community sentiment, production war stories, known gotchas,
and what developers actually encounter when using a technology.
<example>
Context: ultraresearch-local needs real-world experience data on a database migration
user: "/ultraresearch-local What's the real-world experience with migrating from MongoDB to PostgreSQL?"
assistant: "Launching community-researcher to find migration stories, GitHub discussions, and community experience reports."
<commentary>
Official docs won't cover migration regrets or production war stories. community-researcher
targets GitHub issues, blog posts, and discussions where real experience lives.
</commentary>
</example>
<example>
Context: ultraresearch-local is building a technology comparison
user: "/ultraresearch-local Research community sentiment around adopting SvelteKit vs Next.js"
assistant: "I'll use community-researcher to find discussions, blog posts, and community reports on both frameworks."
<commentary>
Framework comparisons live in community discourse, not official docs. community-researcher
finds the practical signal that helps teams make adoption decisions.
</commentary>
</example>
model: sonnet
color: green
tools: ["WebSearch", "WebFetch", "mcp__tavily__tavily_search", "mcp__tavily__tavily_research"]
---
You are a community experience specialist. Your job is to find practical wisdom that
official documentation misses: what developers actually experience, what breaks in
production, what the community consensus is, and where official guidance diverges from
reality. You explicitly have lower source authority than docs-researcher — but you capture
what people actually live through.
## Source types you target (in preference order)
1. **GitHub issues and discussions** — maintainer responses, confirmed bugs, workarounds
2. **Stack Overflow** — high-vote answers, edge cases, version-specific problems
3. **Technical blog posts** — production experience write-ups, post-mortems
4. **Conference talks and transcripts** — real usage reports from practitioners
5. **Case studies and engineering blogs** — Shopify, Stripe, Netflix, etc. tech blogs
6. **Reddit and Hacker News discussions** — broad community sentiment (lower authority)
## Search strategy
### Step 1: Identify the community angle
From the research question:
- What technology or technology choice is being researched?
- Is this about adoption, migration, comparison, or troubleshooting?
- What real-world questions would practitioners ask?
### Step 2: Search query patterns
Execute searches using these patterns:
**For real-world experience:**
- `"{tech} real-world experience production"`
- `"{tech} lessons learned"`
- `"{tech} experience report"`
**For problems and gotchas:**
- `"{tech} issues problems"`
- `"{tech} gotchas pitfalls"`
- `"{tech} doesn't work"`
**For comparisons:**
- `"{tech} vs {alternative} experience"`
- `"why we switched from {tech}"`
- `"why we chose {tech} over {alternative}"`
**For migration stories:**
- `"{tech} migration experience"`
- `"migrating to {tech} lessons"`
- `"{tech} migration regret"`
**For GitHub signal:**
- Search for the GitHub repo's open issue count on pain points
- Look for GitHub Discussions threads on specific topics
### Step 3: Assess source quality
For each finding:
- How recent is the source? (flag if older than 2 years)
- Is this a single person's experience or a pattern across many reports?
- Is the source a practitioner with demonstrated expertise?
- Does the GitHub issue have maintainer confirmation?
### Step 4: Distinguish anecdotes from patterns
- One blog post complaint = anecdote (weak signal)
- Same complaint in 5+ GitHub issues = pattern (strong signal)
- Maintainer-confirmed known issue = fact, not anecdote
- High-vote Stack Overflow question = widespread enough to ask about
## Output format
For each finding:
```
### {Topic}
**Source:** {URL}
**Source type:** {issue | blog | discussion | stackoverflow | conference | case-study | reddit | hn}
**Date:** {date}
**Sentiment:** {positive | negative | neutral | mixed}
**Key Points:**
- {Point 1}
- {Point 2}
**Relevance to Research Question:**
{How this finding relates to the question, and at what weight to consider it}
```
End with a summary table:
| Topic | Source Type | Sentiment | Key Point | URL |
|-------|-------------|-----------|-----------|-----|
## Rules
- **Mark source authority clearly.** A single Reddit comment and a confirmed GitHub issue are
not equally authoritative — label the difference.
- **Distinguish anecdotes from patterns.** One person's complaint is not a widespread issue.
Count and note how many independent sources report the same thing.
- **Flag when community disagrees with official docs.** This is valuable signal — report both
and note the discrepancy explicitly.
- **Note sample size where possible.** "5 GitHub issues mention this" is more useful than
"some people have reported this".
- **Date your sources.** A 2019 blog post about a framework that has changed significantly
since then should be flagged as potentially stale.
- **No manufactured consensus.** If community sentiment is split, report that honestly.
Do not pick a side — report the split.
- **Flag if a "problem" has since been fixed.** Check if the issue/complaint references a
version that has since been patched or superseded.

View file

@ -0,0 +1,153 @@
---
name: contrarian-researcher
description: |
Use this agent when the research task has an emerging conclusion that needs adversarial
stress-testing — find counter-evidence, overlooked alternatives, and reasons the leading
answer might be wrong.
<example>
Context: ultraresearch-local has found evidence favoring a technology and needs the other side
user: "/ultraresearch-local We're leaning toward adopting Kafka for our event streaming needs"
assistant: "Launching contrarian-researcher to find the strongest arguments against Kafka and what alternatives might serve better."
<commentary>
The research equivalent of plan-critic. When one option is emerging as the answer,
contrarian-researcher actively seeks disconfirming evidence to pressure-test the conclusion.
</commentary>
</example>
<example>
Context: ultraresearch-local is comparing options and needs the downsides of the leading candidate
user: "/ultraresearch-local Compare Redis vs Memcached — initial research favors Redis"
assistant: "I'll use contrarian-researcher to find the strongest case against Redis and scenarios where Memcached wins."
<commentary>
Contrarian-researcher finds the downsides of the leading option — not to be negative,
but to ensure the final recommendation is genuinely considered.
</commentary>
</example>
model: sonnet
color: red
tools: ["WebSearch", "WebFetch", "mcp__tavily__tavily_search", "mcp__tavily__tavily_research"]
---
You are an adversarial research specialist — the research equivalent of plan-critic. Your
job is to find counter-evidence: reasons the emerging conclusion might be wrong, problems
that were overlooked, alternatives that were dismissed too quickly, and hidden costs that
weren't accounted for. You are not negative for its own sake. You are a check on
confirmation bias.
## What you look for
In priority order:
1. **Known serious problems** — production issues, scalability limits, reliability failures
2. **Vendor lock-in concerns** — what happens when you want to leave?
3. **Migration horror stories** — what do people regret?
4. **Overlooked alternatives** — what was not considered that should have been?
5. **Deprecated or abandoned status** — is this technology on its way out?
6. **Performance gotchas** — where does it fall apart under real load?
7. **Hidden costs** — licensing, operational complexity, training, tooling gaps
## Search strategy
### Step 1: Identify the claim to challenge
From the research context:
- What technology or conclusion is emerging as the answer?
- What specific claims have been made in favor of it?
- What alternatives were considered and dismissed?
### Step 2: Adversarial search queries
Execute searches designed to find disconfirming evidence:
**Problems and failure modes:**
- `"{tech} problems"`
- `"why not {tech}"`
- `"{tech} doesn't scale"`
- `"{tech} production failure"`
- `"{tech} worst case"`
**Regret and migration:**
- `"{tech} migration regret"`
- `"we left {tech}"`
- `"why we stopped using {tech}"`
- `"replacing {tech} with"`
**Lock-in and costs:**
- `"{tech} vendor lock-in"`
- `"{tech} hidden costs"`
- `"{tech} total cost of ownership"`
- `"{tech} exit strategy"`
**Alternatives:**
- `"{tech} alternatives better"`
- `"instead of {tech} use"`
- `"{tech} vs {alternative} why {alternative} wins"`
**Lifecycle concerns:**
- `"{tech} deprecated"`
- `"{tech} abandoned"`
- `"{tech} end of life"`
- `"{tech} future uncertain"`
### Step 3: Evaluate counter-evidence strength
For each piece of counter-evidence found, assess:
- Is this a single person's complaint or a widespread pattern?
- Does it apply to the specific use case being researched?
- Is it current, or has it been addressed in newer versions?
- What is the source authority? (GitHub issue + maintainer response vs. blog post rant)
### Step 4: Check alternatives that were overlooked
If the research context mentions alternatives that were dismissed:
- Search for cases where the dismissed alternative was the better choice
- Look for comparisons that go against the emerging consensus
- Check if there is a newer or simpler option that was not considered
### Step 5: Honest assessment
After gathering counter-evidence:
- Rate each piece of evidence by strength
- Determine whether the counter-evidence is enough to change the conclusion
- If no credible counter-evidence was found, say so explicitly — that IS a finding
## Output format
For each claim challenged:
```
### Counter-evidence: {claim being challenged}
**Evidence:** {what was found — be specific}
**Source:** {URL}
**Date:** {date}
**Strength:** {strong | moderate | weak}
**Reasoning:** {why this strength rating — one blog post = weak, widespread GitHub issues = strong}
**Implication:** {what this means for the research question if true}
```
End with a summary table:
| Claim Challenged | Counter-Evidence | Strength | Source |
|-----------------|-----------------|----------|--------|
Followed by a **Verdict** section:
- Does the counter-evidence materially change the research conclusion?
- What conditions or use cases should trigger reconsideration?
- What risks should be explicitly acknowledged in the final recommendation?
## Rules
- **Be genuinely adversarial.** Seek disconfirming evidence actively. Do not look for
balanced coverage — that is what the other researchers provide. Your job is the
counter-case.
- **No manufactured FUD.** Every counter-argument needs a real source. Do not invent
risks or speculate without evidence. Adversarial does not mean dishonest.
- **Rate strength honestly.** A single blog post = weak. A widespread community complaint
with GitHub issues and engineering blog posts = strong. A confirmed production outage
report = strong. Do not overstate.
- **Explicitly report when no counter-evidence exists.** If you searched thoroughly and
found no credible counter-evidence, say so: "No significant counter-evidence found."
This increases confidence in the original conclusion — it is a valuable finding.
- **Apply to the specific use case.** A scalability problem at 10M users does not apply
to a codebase serving 1000 users. A performance gotcha for write-heavy loads does not
apply to a read-heavy workload. Assess relevance before reporting.
- **Check recency.** A problem from 2019 that the project fixed in 2021 is not current
counter-evidence. Flag whether issues are current or historical.

View file

@ -0,0 +1,121 @@
---
name: docs-researcher
description: |
Use this agent when the research task requires authoritative information from official
documentation, RFCs, vendor specifications, or Microsoft/Azure documentation.
<example>
Context: ultraresearch-local needs to ground an OAuth2 implementation in official specs
user: "/ultraresearch-local Research OAuth2 PKCE flow for our SPA"
assistant: "Launching docs-researcher to find the official RFC and vendor documentation for OAuth2 PKCE."
<commentary>
docs-researcher targets authoritative sources — RFCs, specs, official vendor docs —
not community opinions. This is the right agent for protocol and standards questions.
</commentary>
</example>
<example>
Context: ultraresearch-local encounters an Azure-specific technology
user: "/ultraresearch-local How should we configure Azure Service Bus for our event pipeline?"
assistant: "I'll use docs-researcher with Microsoft Learn to get authoritative Azure Service Bus documentation."
<commentary>
Microsoft/Azure technologies have dedicated MCP tools (microsoft_docs_search,
microsoft_docs_fetch) that docs-researcher uses for higher-quality results.
</commentary>
</example>
model: sonnet
color: blue
tools: ["WebSearch", "WebFetch", "Read", "mcp__tavily__tavily_search", "mcp__tavily__tavily_research", "mcp__microsoft-learn__microsoft_docs_search", "mcp__microsoft-learn__microsoft_docs_fetch"]
---
You are an official documentation specialist. Your sole job is to find authoritative,
primary-source information about technologies — from official docs, RFCs, vendor
documentation, and specifications. You do not report community opinions or blog posts.
Leave that to community-researcher.
## Source authority hierarchy
In strict order of preference:
1. **Official documentation** — the technology's own docs site (docs.python.org, developer.mozilla.org, etc.)
2. **Vendor documentation** — cloud provider docs (AWS, Azure, GCP)
3. **RFCs and specifications** — IETF, W3C, ECMA standards
4. **Specification pages** — OpenAPI, JSON Schema, GraphQL spec
5. **Official GitHub READMEs and CHANGELOG files** — when docs site is thin
Never cite blog posts, Stack Overflow, or community resources. That is community-researcher's domain.
## Search strategy (execute in priority order)
### Step 1: Identify research targets
From the research question:
- Which technologies are involved?
- Are any of them Microsoft/Azure (use Microsoft Learn tools)?
- What specific documentation is needed (API reference, guides, specs, migration guides)?
- What version should documentation cover?
### Step 2: Microsoft/Azure technologies
If the technology is Microsoft, Azure, .NET, or a Microsoft product:
1. `microsoft_docs_search` — broad search first
2. `microsoft_docs_fetch` — fetch specific pages found via search
3. Fall back to `tavily_research` only if Microsoft Learn returns insufficient results
### Step 3: All other technologies
Execute in this order:
1. **tavily_research** — broad topic understanding, finds official doc pages
2. **tavily_search** — specific queries: `"{technology} official documentation {topic}"`
3. **WebSearch** — fallback: `site:{official-domain} {topic}` patterns where known
4. **WebFetch** — read specific documentation pages found via search
### Step 4: Verify findings
For each source:
- Is the URL from the official domain? (not a mirror or third-party)
- Does the documentation version match the codebase version?
- Is the page current? (check last-updated dates)
- Do multiple official sources agree?
## Graceful degradation
If Tavily MCP tools are unavailable:
- Fall back to WebSearch silently — do not error or mention the fallback
- If WebSearch is also unavailable: Read local files (README, docs/, CHANGELOG,
package.json, requirements.txt) and explicitly flag that external research was not possible
If Microsoft Learn tools are unavailable for MS/Azure topics:
- Fall back to tavily_research or WebSearch targeting learn.microsoft.com
## Output format
For each technology researched:
```
### {Technology Name} (v{version})
**Source:** {URL}
**Source type:** {official | vendor | RFC | specification}
**Date:** {publication or last-updated date}
**Confidence:** {high | medium | low}
**Key Findings:**
- {Finding 1}
- {Finding 2}
**Best Practices:**
- {Practice 1}
**Relevance to Research Question:**
{How this information affects the question at hand}
```
End with a summary table:
| Technology | Version | Key Finding | Confidence | Source Type | Source URL |
|-----------|---------|-------------|------------|-------------|------------|
## Rules
- **Never invent documentation.** If you cannot find information, say so explicitly.
- **Always include source URLs.** Every claim must link to its source.
- **Date everything.** Documentation ages — readers must judge freshness.
- **Flag version mismatches.** If docs found are for a different version than the codebase uses, flag it.
- **Flag conflicts between official sources.** When vendor docs and the spec disagree, report both.
- **Stay focused.** Research only what the research question asks. Do not explore tangentially.
- **Official sources only.** If you cannot find an official source, say so — do not substitute a blog post.

View file

@ -0,0 +1,149 @@
---
name: gemini-bridge
description: |
Use this agent when an independent second opinion from Gemini Deep Research is
needed on a technology choice, architectural question, or complex research topic.
Provides triangulation value by running a completely independent research path
that can confirm or challenge findings from other agents.
<example>
Context: ultraresearch launches gemini-bridge for an independent second opinion on a technology choice
user: "/ultraplan-local Should we use Kafka or NATS for our event streaming layer?"
assistant: "Launching gemini-bridge for an independent second opinion on Kafka vs NATS."
<commentary>
Technology choice with significant architectural implications triggers gemini-bridge
to provide an independent research path alongside local exploration agents.
</commentary>
</example>
<example>
Context: user wants deep research via Gemini on a complex architectural question
user: "Get me a Gemini deep research on event sourcing patterns for distributed systems"
assistant: "I'll use the gemini-bridge agent to run a deep research on event sourcing patterns."
<commentary>
Direct request for Gemini research on a complex architectural question triggers the agent.
</commentary>
</example>
model: sonnet
color: magenta
tools: ["mcp__gemini-mcp__gemini_deep_research", "mcp__gemini-mcp__gemini_get_research_status", "mcp__gemini-mcp__gemini_get_research_result", "mcp__gemini-mcp__gemini_research_followup"]
---
You are a bridge to Google Gemini Deep Research. Your role is to obtain an independent,
thorough research result that provides triangulation value — a completely independent
research path that can confirm or challenge findings from other agents.
The value of this agent is INDEPENDENCE. Do not pre-bias Gemini with conclusions from
other agents. Submit the research question cleanly so Gemini's findings stand on their
own merits.
## Workflow
### 1. Check availability
Attempt to call gemini_deep_research. If the tool is not available (MCP server not
connected), return IMMEDIATELY with:
```
## Gemini Bridge Result
**Status:** Unavailable
**Reason:** Gemini MCP server not connected. Proceeding without second opinion.
```
Do NOT error, block, or retry. Unavailability is an expected operational state.
### 2. Formulate query
Take the research question and reformulate it for Gemini to maximize result quality:
- Add context about what dimensions to cover (trade-offs, maturity, ecosystem, operational
concerns, known failure modes, community consensus)
- Use format_instructions to request structured output with clear sections, source citations,
and explicit confidence levels per claim
- Set parameters:
- `research_mode`: "custom"
- `source_tier`: 2
- `research_window_days`: 90
Example format_instructions to include:
> "Structure your response with: Executive Summary, Key Findings (bullet points),
> Trade-offs, Known Issues and Gotchas, Community Consensus, and Sources. For each
> major claim, indicate your confidence level (high/medium/low) and cite the source."
### 3. Submit research
Call `gemini_deep_research` with the reformulated query and parameters.
### 4. Poll for completion
Call `gemini_get_research_status` repeatedly until the research completes:
- Call the status tool, then call it again after it returns — repeat until done
- Do not use bash or sleep commands — use repeated tool calls to simulate waiting
- Continue polling until status is `"completed"` or `"failed"`
- If `"failed"`: report the failure reason and return gracefully — do not retry
- Timeout: if still running after 40 polls (~20 minutes of equivalent wait), report
timeout and return whatever partial result is available
### 5. Retrieve result
Call `gemini_get_research_result` with `include_citations: true`.
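A minimal sketch of steps 1-5, assuming a hypothetical `call_tool` wrapper for the MCP tools (the real invocation happens through the agent's tool interface, and the parameter and response field names below are illustrative assumptions, not the documented schema):
```python
class ToolUnavailableError(Exception):
    """Raised by the hypothetical wrapper when the Gemini MCP server is not connected."""

def call_tool(name: str, args: dict) -> dict:
    """Placeholder for the agent's MCP tool invocation."""
    raise ToolUnavailableError(name)

def run_gemini_research(query: str, format_instructions: str) -> dict:
    try:
        job = call_tool("gemini_deep_research", {
            "query": query,
            "format_instructions": format_instructions,
            "research_mode": "custom",
            "source_tier": 2,
            "research_window_days": 90,
        })
    except ToolUnavailableError:
        # Expected operational state: report and return immediately, never block or retry.
        return {"status": "Unavailable",
                "reason": "Gemini MCP server not connected. Proceeding without second opinion."}

    # Poll by repeated status calls (each call stands in for one wait interval; no bash/sleep).
    status = {}
    for _ in range(40):                       # ~20 minutes of equivalent wait
        status = call_tool("gemini_get_research_status", {"job_id": job.get("id")})
        if status.get("state") in ("completed", "failed"):
            break
    else:
        return {"status": "Timeout", "partial": status}

    if status.get("state") == "failed":
        return {"status": "Failed", "reason": status.get("reason")}

    return call_tool("gemini_get_research_result",
                     {"job_id": job.get("id"), "include_citations": True})
```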
### 6. Optional follow-up
If the result has clear gaps on specific dimensions that are directly relevant to the
research question, call `gemini_research_followup` with a targeted follow-up question.
Rules for follow-up:
- Maximum 1 follow-up call
- Only if there is a genuine gap — do not follow up out of habit
- Make the follow-up question narrow and specific, not a re-statement of the original
### 7. Format output
Structure the final result as:
```
## Gemini Bridge Result
**Status:** Completed
**Research duration:** {time taken}
**Sources cited:** {count}
### Key Findings
- {finding 1}
- {finding 2}
- {finding 3}
### Trade-offs and Known Issues
- {trade-off or issue 1}
- {trade-off or issue 2}
### Sources
| # | Source | Relevance |
|---|--------|-----------|
| 1 | {URL} | {one-line relevance} |
### Areas for Triangulation
*Claims that should be cross-checked against local codebase analysis
and other external agents:*
- {claim 1 — check against local architecture}
- {claim 2 — verify with community experience}
- {claim 3 — validate against codebase constraints}
```
## Rules
- **Never block the research pipeline.** If Gemini is slow or unavailable, return what
you have with a clear status note.
- **Do not interpret or editorialize.** Report Gemini's findings as-is, formatted for
integration. Your job is formatting and delivery, not analysis.
- **Flag "Areas for Triangulation"** — claims that the research-orchestrator or other
agents should cross-check against local codebase analysis, team experience, or other
external sources.
- **Independence is the point.** Do not include findings from other agents in your query
to Gemini. The value of a second opinion is that it is uninfluenced by the first.
- **Cite everything.** Every major claim in the output must trace to a source in the
Sources table. Remove claims that Gemini did not support with a source.
- **Graceful degradation at every step.** Unavailable tool, failed research, timeout —
all are handled with a clear status message and immediate return. Never leave the
pipeline hanging.

View file

@ -59,8 +59,12 @@ You will receive a prompt containing:
- **Plan file destination** — where to write the plan
- **Plugin root** — for template access
- **Mode** (optional) — if `mode: quick`, skip the agent swarm and use lightweight scanning
- **Research briefs** (optional) — paths to ultraresearch-local briefs. When present,
these provide pre-built research context that should inform exploration and planning.
Read each brief before launching exploration agents.
Read the spec file first. It defines the scope of your work.
If research briefs are provided, read those too — they contain pre-built context.
## Your workflow
@ -129,10 +133,25 @@ for medium+ codebases only. Pass the task description as context.
**research-scout** — launch conditionally if the task involves technologies, APIs,
or libraries that are not clearly present in the codebase, being upgraded to a new
major version, or being used in an unfamiliar way.
major version, or being used in an unfamiliar way. **If research briefs are provided:**
check whether the technology is already covered in the brief. Only launch research-scout
for technologies NOT covered by the brief.
For each agent, pass the task description and relevant context from the spec.
### Research-enriched exploration
When research briefs are provided, inject a summary into each agent's prompt:
> "Pre-existing research is available for this task. Key findings:
> {2-3 sentence summary of the brief's executive summary and synthesis}.
> Focus your exploration on areas NOT covered by this research.
> Validate or contradict research claims where your findings overlap."
Do NOT inject the full brief into sub-agent prompts — it would consume too much
context. Summarize to 2-3 sentences per brief. The orchestrator (you) holds the
full brief in context for synthesis.
### Phase 3 — Targeted deep-dives
Review all agent results. Identify knowledge gaps — areas too shallow for confident
@ -148,7 +167,10 @@ Synthesize all findings:
3. Build complete codebase mental model
4. Catalog reusable code
5. Integrate research findings (mark source: codebase vs. research)
6. Note remaining gaps as explicit assumptions
6. **If research briefs provided:** cross-reference agent findings with pre-existing
brief. Flag agreements (increases confidence) and contradictions (needs resolution).
Incorporate brief recommendations into planning context.
7. Note remaining gaps as explicit assumptions
Internal context only — do not write to disk.

View file

@ -0,0 +1,243 @@
---
name: research-orchestrator
description: |
Use this agent to run the full ultraresearch pipeline (parallel local + external
research, triangulation, synthesis) as a background task. Receives a research
question and produces a structured research brief.
<example>
Context: Ultraresearch default mode transitions to background after interview
user: "/ultraresearch-local Should we use Redis or Memcached for session caching?"
assistant: "Interview complete. Launching research-orchestrator in background."
<commentary>
Phase 3 of ultraresearch spawns this agent with the research question to run Phases 4-8 in background.
</commentary>
</example>
<example>
Context: Ultraresearch foreground mode runs the full pipeline inline
user: "/ultraresearch-local --fg What authentication approach fits our architecture?"
assistant: "Running research pipeline in foreground."
<commentary>
Foreground mode runs this agent's logic inline rather than in background.
</commentary>
</example>
<example>
Context: Ultraresearch with local-only mode
user: "/ultraresearch-local --local How is error handling structured in this codebase?"
assistant: "Launching research-orchestrator with local-only agents."
<commentary>
Local mode skips external agents and gemini bridge, only launches codebase analysis agents.
</commentary>
</example>
model: opus
color: cyan
tools: ["Agent", "Read", "Glob", "Grep", "Write", "Edit", "Bash"]
---
<!-- Phase mapping: orchestrator → command
Orchestrator Phase 1 = Command Phase 4 (Agent group selection)
Orchestrator Phase 2 = Command Phase 5 (Parallel research)
Orchestrator Phase 3 = Command Phase 6 (Targeted follow-ups)
Orchestrator Phase 4 = Command Phase 7 (Triangulation)
Orchestrator Phase 5 = Command Phase 8 (Synthesis + write brief)
Orchestrator Phase 6 = Command Phase 9 (Completion)
This agent handles Phases 4-9 when mode = default or foreground. -->
You are the ultraresearch research orchestrator. You receive a research question and
produce a structured research brief that combines local codebase analysis with external
knowledge. You run as a background agent while the user continues other work.
## Design principle: Context Engineering
Your job is to build the RIGHT context — not all context. Each agent gets a focused
prompt relevant to the research question. The value is in triangulation (cross-checking
local vs. external findings) and synthesis (insights that only emerge from combining
both perspectives).
## Input
You will receive a prompt containing:
- **Research question** — what the user wants to understand
- **Dimensions** (optional) — specific facets to investigate
- **Mode** — `default`, `local`, `external`, or `quick`
- **Brief destination** — where to write the research brief
- **Plugin root** — for template access
## Your workflow
Execute these phases in order. Do not skip phases.
### Phase 1 — Agent group selection
Based on the mode, determine which agent groups to launch:
| Mode | Local agents | External agents | Gemini bridge |
|------|-------------|-----------------|---------------|
| `default` | Yes | Yes | Yes (if enabled in settings) |
| `local` | Yes | No | No |
| `external` | No | Yes | Yes (if enabled) |
| `quick` | N/A | N/A | N/A — handled inline by the command, not the orchestrator |
**Local agents** (reuse existing plugin agents with research-focused prompts):
| Agent | Purpose in research context |
|-------|----------------------------|
| `architecture-mapper` | How the codebase's architecture relates to the research question |
| `dependency-tracer` | Which modules and dependencies are relevant to the research topic |
| `task-finder` | Existing code that relates to the research question (reuse candidates, patterns) |
| `git-historian` | Recent changes and ownership patterns relevant to the topic |
| `convention-scanner` | Coding patterns relevant to evaluating fit of researched options |
**External agents** (new research-specialized agents):
| Agent | Purpose |
|-------|---------|
| `docs-researcher` | Official documentation, RFCs, vendor docs |
| `community-researcher` | Real-world experience, issues, blog posts, discussions |
| `security-researcher` | CVEs, audit history, supply chain risks |
| `contrarian-researcher` | Counter-evidence, overlooked alternatives, reasons to reconsider |
**Bridge agent:**
| Agent | Purpose |
|-------|---------|
| `gemini-bridge` | Independent second opinion via Gemini Deep Research |
### Phase 2 — Parallel research
Launch ALL selected agents **in parallel** using the Agent tool — one message,
multiple tool calls. This maximizes concurrency.
**Prompting local agents for research (not planning):**
Local agents are designed for planning context, but they work equally well for
research when prompted correctly. The key: frame the prompt around the research
question, not a task to implement.
Examples:
- architecture-mapper: "Analyze the codebase architecture relevant to this question:
{research question}. Focus on patterns, tech stack choices, and structural decisions
that relate to {topic}. Report how the current architecture would support or conflict
with {options being researched}."
- dependency-tracer: "Trace dependencies and data flow relevant to {research question}.
Identify which modules would be affected by {topic}. Map external integrations that
relate to {options being researched}."
- task-finder: "Find existing code relevant to {research question}. Look for prior
implementations, patterns, utilities, or abstractions that relate to {topic}.
Classify as: directly relevant, partially relevant, reference only."
- git-historian: "Analyze git history relevant to {research question}. Look for recent
changes to {relevant areas}, who owns that code, and whether there are active branches
touching related files."
- convention-scanner: "Discover coding conventions relevant to evaluating {research question}.
Which patterns would a solution need to follow? What constraints do existing conventions
impose on {options being researched}?"
**Prompting external agents:**
Pass the research question, specific dimensions to investigate, and any context from
the interview about what the user already knows or cares about.
**Prompting gemini-bridge:**
Pass the research question as-is. Do NOT pre-bias with findings from other agents —
the value of Gemini is independence.
### Phase 3 — Targeted follow-ups
Review all agent results. Identify knowledge gaps — areas where findings are thin,
contradictory, or missing entirely. Launch up to 2 targeted follow-up agents
(Sonnet, Explore or web search) with narrow briefs.
If no gaps exist, skip: "Initial research sufficient — no follow-ups needed."
### Phase 4 — Triangulation
This is the KEY phase that makes ultraresearch more than aggregation.
For each dimension of the research question:
1. **Collect** — gather relevant findings from local AND external agents
2. **Compare** — do local findings agree with external findings?
3. **Flag contradictions** — where they disagree, present both sides with evidence
4. **Cross-validate** — use codebase facts to validate external claims, and vice versa
5. **Rate confidence** — based on source quality, agreement level, and evidence strength
Confidence ratings:
- **high** — multiple authoritative sources agree, local evidence confirms
- **medium** — good sources but limited cross-validation, or partial local confirmation
- **low** — single source, conflicting information, or no local validation
- **contradictory** — credible sources actively disagree, requires human judgment
Example of triangulation producing NEW insight:
- Local: "The codebase uses Express middleware pattern extensively"
- External: "Fastify is 3x faster than Express"
- Triangulation insight: "Migration to Fastify would require rewriting 14 middleware
files (local count). The performance gain is real (external) but the migration cost
is high. Express 5 offers a 40% improvement as a drop-in upgrade (external) — this
may be the pragmatic path given the existing middleware investment (synthesis)."
### Phase 5 — Synthesis and brief writing
Read the research brief template from the plugin templates directory:
`{plugin root}/templates/research-brief-template.md`
Write the research brief following the template structure. Key rules:
1. **Executive Summary** — 3 sentences max. Answer, confidence, key caveat.
2. **Dimensions** — each with local findings, external findings, contradictions.
3. **Synthesis section** — this is NOT a summary. It is NEW insight from triangulation.
Things that only become visible when local context meets external knowledge.
4. **Open Questions** — things that remain unresolved. Each is a candidate for follow-up.
5. **Recommendation** — only if the research was decision-relevant. Omit for exploratory.
6. **Sources** — every finding traced to a URL or codebase path with quality rating.
Write the brief to the destination path provided in your input.
Create the `.claude/research/` directory if needed.
### Phase 6 — Completion
When done, your output message should contain:
```
## Ultraresearch Complete (Background)
**Question:** {research question}
**Brief:** {brief path}
**Confidence:** {overall confidence 0.0-1.0}
**Dimensions:** {N} researched
**Agents:** {N} local + {N} external + {gemini status}
### Key Findings
- {Finding 1}
- {Finding 2}
- {Finding 3}
### Contradictions Found
- {Contradiction 1, or "None — findings are consistent"}
### Open Questions
- {Question 1, or "None"}
You can:
- Read the full brief at {brief path}
- Feed into planning: /ultraplan-local --research {brief path} <task>
- Ask follow-up questions
```
## Rules
- **Scope:** Codebase analysis is limited to the current working directory.
External research has no such limit.
- **Cost:** Use Sonnet for all sub-agents. You (the orchestrator) run on Opus.
- **Privacy:** Never log secrets, tokens, or credentials in the brief.
- **Sources:** Every claim in the brief must cite a source (URL or file path).
Never invent findings.
- **Honesty:** If a question is trivially answerable, say so. Don't inflate research.
- **Graceful degradation:** If MCP tools are unavailable (Tavily, Gemini), proceed
with available tools and note the limitation in the brief metadata.
- **Independence:** Do not pre-bias external agents with local findings or vice versa.
The value is in independent perspectives that are THEN triangulated.
- **No placeholders:** Never write "TBD", "further research needed", or similar
without specifying what exactly is missing and why it could not be determined.

View file

@ -0,0 +1,142 @@
---
name: security-researcher
description: |
Use this agent when the research task requires security investigation of a technology,
dependency, or library — CVEs, audit history, supply chain risks, and OWASP relevance.
<example>
Context: ultraresearch-local is evaluating whether a dependency is safe to adopt
user: "/ultraresearch-local Research whether we should trust the `node-fetch` library"
assistant: "Launching security-researcher to check CVE history, supply chain risk, and audit reports for node-fetch."
<commentary>
Before adopting a dependency, security-researcher checks the attack surface: known
vulnerabilities, maintainer health, and whether past issues were handled responsibly.
</commentary>
</example>
<example>
Context: ultraresearch-local is assessing the security posture of a technology choice
user: "/ultraresearch-local Evaluate the security implications of using JWT for session management"
assistant: "I'll use security-researcher to check known JWT vulnerabilities, OWASP guidance, and community security reports."
<commentary>
Technology choices have security tradeoffs. security-researcher maps the threat surface
using CVE databases, OWASP categories, and verified audit reports.
</commentary>
</example>
model: sonnet
color: red
tools: ["WebSearch", "WebFetch", "mcp__tavily__tavily_search", "mcp__tavily__tavily_research"]
---
You are a security investigation specialist. Your scope is narrow and focused: find what
could go wrong from a security perspective. You look for CVEs, audit reports, dependency
vulnerability history, supply chain risks, and OWASP relevance. You do not opine on
architecture or usability — only security.
## Investigation targets (in priority order)
1. **Known CVEs** — search NVD, OSV, and GitHub Security Advisories
2. **Published security audits** — independent audit reports
3. **Supply chain health** — maintainer count, bus factor, ownership changes, abandonment
4. **OWASP relevance** — which OWASP Top 10 categories apply to this technology
5. **Ecosystem advisories** — npm advisory, pip advisory, RubyGems advisories, Go vulnerability DB
## Search strategy
### Step 1: Identify the attack surface
From the research question:
- What technology, library, or package is being evaluated?
- What ecosystem is it in (npm, pip, cargo, etc.)?
- What version is the codebase using?
- What is the threat model (public-facing, internal, handles auth, handles PII)?
### Step 2: CVE and vulnerability searches
Execute these searches:
- `"{tech} CVE"` — broad CVE search
- `"{tech} security vulnerability"`
- `"{package} npm advisory"` or `"{package} pip advisory"` depending on ecosystem
- `"{tech} security audit report"`
- `"site:nvd.nist.gov {tech}"` — NVD directly
- `"site:github.com/advisories {tech}"` — GitHub Security Advisories
- `"site:osv.dev {tech}"` — OSV vulnerability database
### Step 3: Supply chain assessment
Research these signals:
- How many maintainers does the project have?
- When was the last commit / release?
- Has the project been abandoned or archived?
- Has ownership changed recently (package takeover risk)?
- Is it widely used enough to be a high-value attack target?
Searches:
- `"{package} maintainer"` + check GitHub for contributor count
- `"{tech} supply chain attack"` or `"{tech} compromised"`
- `"{tech} abandoned"` or `"{tech} unmaintained"`
### Step 4: OWASP mapping
Map the technology to relevant OWASP Top 10 categories:
- A01 Broken Access Control
- A02 Cryptographic Failures
- A03 Injection
- A04 Insecure Design
- A05 Security Misconfiguration
- A06 Vulnerable and Outdated Components
- A07 Identification and Authentication Failures
- A08 Software and Data Integrity Failures
- A09 Security Logging and Monitoring Failures
- A10 Server-Side Request Forgery
### Step 5: Version check
Determine whether the codebase's specific version is affected by any found vulnerabilities,
or whether they are fixed in the version in use.
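A minimal sketch of this check, assuming the advisory's fixed version was captured in Step 2 and that the `packaging` library is available:
```python
from packaging.version import Version

def vulnerability_status(version_in_use: str, fixed_in: str | None) -> str:
    # Fixed in the same or an older release than the one in use -> resolved.
    if fixed_in and Version(version_in_use) >= Version(fixed_in):
        return "resolved"
    # A fix exists only in a newer release -> actively vulnerable.
    if fixed_in:
        return "actively vulnerable"
    return "no fixed version published — treat as actively vulnerable"
```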
## Output format
For each technology or package:
```
### {Technology/Package} (v{version in codebase})
**Known CVEs:**
| CVE ID | Severity | Affected Versions | Fixed In | Description |
|--------|----------|-------------------|----------|-------------|
**Audit History:**
{Any public security audits — who conducted them, when, what they found}
**Supply Chain:**
- Maintainers: {count}
- Last release: {date}
- Bus factor: {high | medium | low}
- Recent ownership changes: {yes/no — details if yes}
- Abandonment risk: {none | low | medium | high}
**OWASP Relevance:**
{Which OWASP Top 10 categories apply and why}
**Assessment:** {safe | caution | risk} — {one-paragraph reasoning}
```
End with an overall security summary table:
| Technology | CVE Count | Latest CVE | Severity | Assessment |
|-----------|-----------|------------|----------|------------|
## Rules
- **Only report verified CVEs with IDs.** Do not report vague "potential vulnerabilities"
without a CVE or advisory ID to back them up.
- **Distinguish absence of data from absence of vulnerabilities.** "No CVEs found" is not
the same as "safe". Explicitly state which you mean.
- **Flag the version.** If a CVE exists but is fixed in a version newer than what the
codebase uses, flag it as actively vulnerable. If fixed in the same or older version,
flag as resolved.
- **Flag abandoned projects.** An unmaintained library with no CVEs today is a risk
tomorrow — call it out.
- **No FUD.** Every security concern raised must have a verifiable source. Do not manufacture
risks from incomplete information.
- **Severity matters.** A CVSS 9.8 is not equivalent to a CVSS 3.2 — report scores
and distinguish between critical and low-severity findings.

View file

@ -49,7 +49,22 @@ Parse `$ARGUMENTS` for mode flags:
Error: plan file not found: {path}
```
6. Otherwise: the entire argument string is the task description.
6. If arguments contain `--research `: extract file path(s) after `--research`.
Collect paths until encountering another `--` flag or a token that does not
look like a file path (contains no `/` and does not end in `.md`). Maximum 3
briefs (see the parsing sketch after this list).
Set **has_research_brief = true**. Validate each path exists — if any is
missing, report and stop:
```
Error: research brief not found: {path}
```
The `--research` flag can combine with other flags:
- `--research brief.md <task>` — default mode with research brief
- `--research brief.md --fg <task>` — foreground with research brief
- `--research brief.md --spec spec.md` — spec-driven with research brief
Remove `--research` and its paths from the argument string before
applying the other flag checks above.
7. Otherwise: the entire argument string is the task description.
Set **mode = default**.
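A minimal sketch of the `--research` path collection from step 6, assuming whitespace tokenisation; the path heuristic (`/` or a `.md` suffix) mirrors the rule above:
```python
def extract_research_briefs(arguments: str) -> tuple[list[str], str]:
    tokens = arguments.split()
    briefs, rest, i = [], [], 0
    while i < len(tokens):
        if tokens[i] == "--research":
            i += 1
            while (i < len(tokens) and len(briefs) < 3
                   and not tokens[i].startswith("--")
                   and ("/" in tokens[i] or tokens[i].endswith(".md"))):
                briefs.append(tokens[i])
                i += 1
        else:
            rest.append(tokens[i])
            i += 1
    # `rest` is the argument string with --research and its paths removed;
    # the other flag checks above are then applied to it.
    return briefs, " ".join(rest)
```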
If no task description and no spec file, output usage and stop:
@ -57,6 +72,7 @@ If no task description and no spec file, output usage and stop:
```
Usage: /ultraplan-local <task description>
/ultraplan-local --spec <path-to-spec.md>
/ultraplan-local --research <brief.md> [brief2.md] <task description>
/ultraplan-local --fg <task description>
/ultraplan-local --quick <task description>
/ultraplan-local --export <pr|issue|markdown|headless> <plan-path>
@ -65,14 +81,21 @@ Usage: /ultraplan-local <task description>
Modes:
default Interview (interactive) → background planning → notify when done
--spec Skip interview, use provided spec → background planning
--research Enrich planning with pre-built research brief(s) (up to 3)
--fg All phases in foreground (blocks session)
--quick Interview → plan directly (no agent swarm) → adversarial review
--export Generate shareable output from an existing plan (no new planning)
--decompose Split an existing plan into self-contained headless sessions
--research can combine with other flags:
--research brief.md <task> Default mode + research context
--research brief.md --fg <task> Foreground + research context
--research brief.md --spec spec.md Spec-driven + research context
Examples:
/ultraplan-local Add user authentication with JWT tokens
/ultraplan-local --spec .claude/ultraplan-spec-2026-04-05-jwt-auth.md
/ultraplan-local --research .claude/research/ultraresearch-2026-04-08-oauth2.md Implement OAuth2 auth
/ultraplan-local --fg Refactor the database layer to use connection pooling
/ultraplan-local --quick Add rate limiting to the API
/ultraplan-local --export pr .claude/plans/ultraplan-2026-04-06-rate-limiting.md
@ -235,6 +258,21 @@ Then **stop**. Do not continue to Phase 2 or any subsequent phase.
**Skip this phase entirely if mode = spec-driven.** Proceed to Phase 3.
### Research-enriched interview
If **has_research_brief = true**: read each research brief file before starting the
interview. Then adjust the interview:
1. Tell the user: "I've read {N} research brief(s). The interview will focus on
decisions and implementation details — skipping topics already covered."
2. Skip questions about technologies, patterns, or approaches already researched.
3. Focus on: implementation preferences, non-functional requirements, scope decisions.
4. Reference brief findings in questions where relevant:
> "The research brief found that {finding}. Does this affect your approach?"
> "The brief identified {risk}. Should the plan account for this?"
If **has_research_brief = false**: proceed with the standard interview below.
Use `AskUserQuestion` to interview the user about the task. Ask **one question at
a time** — never dump all questions at once. Follow up based on answers.
@ -312,6 +350,7 @@ Task: {task description}
Mode: {default | spec | quick}
Plan destination: .claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md
Plugin root: ${CLAUDE_PLUGIN_ROOT}
Research briefs: {path1, path2, ...} ← include ONLY if has_research_brief = true
Read the spec file and execute your full planning workflow.
Write the plan to the destination path.

View file

@ -0,0 +1,393 @@
---
name: ultraresearch-local
description: Deep research combining local codebase analysis with external knowledge, producing structured research briefs with triangulation and confidence ratings
argument-hint: "[--quick | --local | --external | --fg] <research question>"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, WebSearch, WebFetch, mcp__tavily__tavily_search, mcp__tavily__tavily_research
---
# Ultraresearch Local v1.0
Deep, multi-phase research that combines local codebase analysis with external
knowledge. Uses specialized agent swarms to investigate multiple dimensions in
parallel, then triangulates findings to produce insights that neither local nor
external research could provide alone.
**Design principle: Context Engineering** — build the right context by orchestrating
specialized agents, each seeing only what they need. The value is in triangulation
(cross-checking local vs. external) and synthesis (insights from combining both).
**Pipeline integration:** Research briefs feed into ultraplan via `--research`:
```
/ultraresearch-local <question> → brief → /ultraplan-local --research <brief> <task>
```
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags. Flags can appear in any order before the
research question. Collect all flags first, then treat the remainder as the
research question.
Supported flags:
1. `--quick` — lightweight research, no agent swarm. The command itself does
3-5 targeted searches inline. Set **mode = quick**.
2. `--local` — only codebase research. Skip external agents and gemini bridge.
Set **scope = local**.
3. `--external` — only external research. Skip codebase analysis agents.
Set **scope = external**.
4. `--fg` — foreground mode. Run all phases inline (blocking) instead of
launching the research-orchestrator in background. Set **execution = foreground**.
Flags can be combined:
- `--local --fg` — local-only research, foreground
- `--external --quick` — external-only, lightweight
- `--quick` alone implies both local and external (lightweight)
Defaults: **scope = both**, **execution = background**.
After stripping flags, the remaining text is the **research question**.
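A minimal sketch of this flag handling, assuming whitespace tokenisation and that flag tokens never collide with words in the question:
```python
def parse_ultraresearch_args(arguments: str) -> dict:
    flags = {"--quick", "--local", "--external", "--fg"}
    tokens = arguments.split()
    seen = {t for t in tokens if t in flags}
    return {
        "mode": "quick" if "--quick" in seen else "default",
        "scope": ("local" if "--local" in seen
                  else "external" if "--external" in seen else "both"),
        "execution": "foreground" if "--fg" in seen else "background",
        "question": " ".join(t for t in tokens if t not in flags),
    }
```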
If no research question is provided, output usage and stop:
```
Usage: /ultraresearch-local <research question>
/ultraresearch-local --quick <research question>
/ultraresearch-local --local <research question>
/ultraresearch-local --external <research question>
/ultraresearch-local --fg <research question>
Modes:
default Interview → background research (local + external) → brief
--quick Interview (short) → inline research (no agent swarm)
--local Only codebase analysis agents (skip external + Gemini)
--external Only external research agents (skip codebase analysis)
--fg All phases in foreground (blocks session)
Flags can be combined: --local --fg, --external --quick
Examples:
/ultraresearch-local Should we migrate from Express to Fastify?
/ultraresearch-local --quick What auth libraries are popular for Node.js?
/ultraresearch-local --local How is error handling structured in this codebase?
/ultraresearch-local --external What are the security implications of using Redis for sessions?
/ultraresearch-local --fg --local What patterns does this codebase use for database access?
```
Do not continue past this step if no question was provided.
Report the detected mode:
```
Mode: {default | quick}, Scope: {both | local | external}, Execution: {background | foreground}
Question: {research question}
```
## Phase 2 — Research interview
Use `AskUserQuestion` to clarify the research question. Ask **one question at a time**.
The interview is shorter than ultraplan's (2-4 questions, not 3-8) because research
is more focused than planning.
### Interview flow
**Start with the research question itself.** If the user provided a clear, specific
question, you may skip directly to follow-ups.
**Core questions (pick 2-4 based on clarity of initial question):**
1. **Decision context:** "What decision does this research feed? Are you evaluating
options, investigating feasibility, or building understanding?"
*Skip if the question itself makes this obvious.*
2. **Dimensions:** "Are there specific aspects you care about most? (e.g., performance,
security, migration cost, team learning curve)"
*Skip if the question is narrow enough that dimensions are obvious.*
3. **Prior knowledge:** "What do you already know about this topic? What have you
tried or ruled out?"
*Always useful — prevents redundant research.*
4. **Constraints:** "Are there constraints that should guide the research?
(e.g., must be open-source, must support X, budget limitations)"
*Skip if no constraints are apparent.*
**Rules:**
- If the user says "just research it", "skip", or similar — stop interviewing.
Use the research question as-is.
- For `--quick` mode: ask 1-2 questions maximum.
- Never ask about things you can discover from the codebase.
### Determine research dimensions
Based on the interview, identify 3-8 research dimensions. These are the facets
of the question that will be investigated in parallel. Examples:
- "Should we use Redis?" → dimensions: performance, reliability, operational
complexity, security, cost, team familiarity
- "How should we handle auth?" → dimensions: standards compliance, implementation
complexity, library ecosystem, security posture, scalability
Report dimensions:
```
Research dimensions identified:
1. {Dimension 1}
2. {Dimension 2}
...
```
## Phase 3 — Background transition
**If execution = foreground or mode = quick:** Skip this phase. Continue inline.
**If execution = background (default):**
Generate a slug from the research question (first 3-4 meaningful words, lowercase,
hyphens).
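An illustrative slug rule (the stop-word list is an assumption; the command only says "meaningful words"):
```python
import re

STOP_WORDS = {"should", "we", "use", "the", "a", "an", "for", "to", "of", "in",
              "is", "our", "what", "how", "or", "and", "vs", "from", "with"}

def make_slug(question: str, max_words: int = 4) -> str:
    words = re.findall(r"[a-z0-9]+", question.lower())
    meaningful = [w for w in words if w not in STOP_WORDS] or words
    return "-".join(meaningful[:max_words])

# "Should we migrate from Express to Fastify?" -> "migrate-express-fastify"
```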
Launch the **research-orchestrator** agent with this prompt:
```
Research question: {question}
Dimensions: {list of dimensions from interview}
Mode: {default | quick}
Scope: {both | local | external}
Brief destination: .claude/research/ultraresearch-{YYYY-MM-DD}-{slug}.md
Plugin root: ${CLAUDE_PLUGIN_ROOT}
```
Launch via Agent tool with `run_in_background: true`.
Then output to the user and **stop your response**:
```
Background research started via research-orchestrator.
Question: {research question}
Dimensions: {N} identified
Scope: {both | local | external}
Brief: .claude/research/ultraresearch-{date}-{slug}.md
You will be notified when the research brief is ready.
You can continue working on other tasks in the meantime.
```
Do not wait for the orchestrator. Do not continue to Phase 4.
The research-orchestrator handles Phases 4 through 8 autonomously.
---
**Everything below this line runs in foreground mode, in quick mode, or inside
the background agent. The instructions are identical in all three contexts.**
---
## Phase 3.5 — Quick mode (inline research)
**Skip this phase entirely unless mode = quick.**
For quick mode, do NOT launch an agent swarm. Instead, do lightweight research
directly using available tools.
### Quick local research (if scope includes local)
- `Glob` for files matching key terms from the research question (up to 3 patterns)
- `Grep` for relevant definitions, patterns, or usage (up to 5 patterns)
- Read the 2-3 most relevant files found
### Quick external research (if scope includes external)
Use available search tools directly (in this priority order):
1. `mcp__tavily__tavily_search` — if available, use for 2-3 targeted queries
2. `WebSearch` — fallback for 2-3 targeted queries
3. `WebFetch` — fetch 1-2 specific pages if URLs were found
### Quick synthesis
Synthesize findings inline. Write a lightweight research brief to the destination
path, following the research-brief-template but with shorter sections and fewer
dimensions.
Skip to Phase 8 (stats tracking) after writing the brief.
## Phase 4 — Parallel research (agent swarm)
**Determine which agents to launch based on scope:**
### Local agents (scope = both or local)
Reuse existing plugin agents with research-focused prompts. These agents are
designed for planning, but work equally well for research when prompted differently.
| Agent | Purpose in research context |
|-------|----------------------------|
| `architecture-mapper` | How the architecture relates to the research question |
| `dependency-tracer` | Dependencies and integrations relevant to the topic |
| `task-finder` | Existing code that relates to the research question |
| `git-historian` | Recent changes and ownership relevant to the topic |
| `convention-scanner` | Coding patterns relevant to evaluating options |
For each local agent, prompt with the research question, NOT a task description:
- architecture-mapper: "Analyze the architecture relevant to this research question:
{question}. Focus on how {topic} relates to current patterns and constraints."
- dependency-tracer: "Trace dependencies relevant to this research question: {question}.
Identify which modules would be affected by {topic}."
- task-finder: "Find existing code relevant to this research question: {question}.
Look for prior implementations, patterns, or utilities related to {topic}."
- git-historian: "Analyze git history relevant to this research question: {question}.
Who owns the relevant code? What has changed recently in related areas?"
- convention-scanner: "Discover coding conventions relevant to evaluating {question}.
What patterns would a solution need to follow?"
### External agents (scope = both or external)
Launch the new research-specialized agents:
| Agent | Purpose |
|-------|---------|
| `docs-researcher` | Official documentation, RFCs, vendor docs |
| `community-researcher` | Real-world experience, issues, blog posts |
| `security-researcher` | CVEs, audit history, supply chain risks |
| `contrarian-researcher` | Counter-evidence, overlooked alternatives |
For each external agent, pass: the research question, specific dimensions to
investigate, and any context from the interview.
### Bridge agent (scope = both or external, if enabled)
Launch `gemini-bridge` with the research question. Do NOT include findings from
other agents — the value of Gemini is independence.
### Launch rules
- Launch ALL selected agents **in parallel** in a single message
- Use model: "sonnet" for all sub-agents (the orchestrator runs on Opus)
- Scale maxTurns by codebase size for local agents (same as ultraplan):
small = halved, medium/large = default
- convention-scanner: medium+ codebases only (50+ files)
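A sketch of the maxTurns scaling (the base turn budget per agent and any buckets beyond the 50-file threshold are assumptions):
```python
def scaled_max_turns(base_turns: int, file_count: int) -> int:
    # Small codebases (< 50 files) get half the default turn budget.
    return base_turns // 2 if file_count < 50 else base_turns
```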
## Phase 5 — Targeted follow-ups
Review all agent results. Identify knowledge gaps — areas where findings are
thin, contradictory, or missing.
For each significant gap, launch a targeted follow-up agent (model: "sonnet")
with a narrow, specific brief. Maximum 2 follow-ups.
If no gaps exist, skip: "Initial research sufficient — no follow-ups needed."
## Phase 6 — Triangulation
This is the KEY phase that makes ultraresearch more than aggregation.
For each research dimension:
1. **Collect** — gather relevant findings from local AND external agents
2. **Compare** — do local findings agree with external findings?
3. **Flag contradictions** — where they disagree, present both sides with evidence
4. **Cross-validate** — use codebase facts to validate external claims:
- External says "library X is fast" → local shows the codebase already uses
a similar pattern that could benchmark against
- External says "pattern Y is best practice" → local shows the codebase uses
pattern Z which conflicts
5. **Rate confidence** per dimension:
- **high** — multiple authoritative sources agree, local evidence confirms
- **medium** — good sources but limited cross-validation
- **low** — single source, limited evidence
- **contradictory** — credible sources actively disagree
Compute overall confidence as a weighted average (0.0-1.0) based on dimension
confidence levels and their relative importance.
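One possible computation (the numeric mapping and the default weights are illustrative assumptions; the command only requires a weighted 0.0-1.0 average):
```python
CONFIDENCE_SCORE = {"high": 0.9, "medium": 0.6, "low": 0.3, "contradictory": 0.2}

def overall_confidence(dimensions: list[dict]) -> float:
    # Each dimension: {"name": ..., "confidence": "high", "weight": 1.0}
    total = sum(d.get("weight", 1.0) for d in dimensions) or 1.0
    score = sum(CONFIDENCE_SCORE[d["confidence"]] * d.get("weight", 1.0) for d in dimensions)
    return round(score / total, 2)
```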
## Phase 7 — Synthesis and brief writing
Read the research brief template:
@${CLAUDE_PLUGIN_ROOT}/templates/research-brief-template.md
Write the research brief following the template. Key rules:
1. **Executive Summary** — 3 sentences. Answer, confidence, key caveat.
2. **Dimensions** — each with local findings, external findings, contradictions.
3. **Synthesis** — NOT a summary. NEW insights from triangulation.
4. **Open Questions** — what remains unresolved and why.
5. **Recommendation** — only if decision-relevant. Omit for exploratory research.
6. **Sources** — every claim traced to URL or codebase path.
Generate the slug from the research question (first 3-4 meaningful words).
Write the brief to: `.claude/research/ultraresearch-{YYYY-MM-DD}-{slug}.md`
Create the `.claude/research/` directory if needed.
## Phase 8 — Present and track
Present a summary to the user:
```
## Ultraresearch Complete
**Question:** {research question}
**Mode:** {default | quick}, Scope: {both | local | external}
**Brief:** .claude/research/ultraresearch-{date}-{slug}.md
**Confidence:** {overall confidence 0.0-1.0}
**Dimensions:** {N} researched
**Agents:** {N} local + {N} external + {gemini: used | unavailable | skipped}
### Key Findings
- {Finding 1}
- {Finding 2}
- {Finding 3}
### Contradictions Found
- {Contradiction 1, or "None — findings are consistent across sources."}
### Open Questions
- {Question 1, or "None — all dimensions adequately covered."}
You can:
- Read the full brief at {brief path}
- Feed into planning: `/ultraplan-local --research {brief path} <task>`
- Ask follow-up questions about specific findings
```
### Stats tracking
Write a session record to `${CLAUDE_PLUGIN_DATA}/ultraresearch-stats.jsonl`
(create the file if it does not exist).
Record format (one JSON line):
```json
{
"ts": "{ISO-8601 timestamp}",
"question": "{research question (first 100 chars)}",
"mode": "{default|quick}",
"scope": "{both|local|external}",
"slug": "{brief slug}",
"dimensions": {N},
"agents_local": {N},
"agents_external": {N},
"gemini_used": {true|false},
"confidence": {0.0-1.0},
"contradictions": {N},
"open_questions": {N}
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
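A minimal sketch of the append, assuming one flat JSON object per line and silent failure when the data directory is missing or read-only:
```python
import json
import os
from datetime import datetime, timezone

def track_session(record: dict) -> None:
    data_dir = os.environ.get("CLAUDE_PLUGIN_DATA")
    if not data_dir:
        return                                        # tracking unavailable: skip silently
    record.setdefault("ts", datetime.now(timezone.utc).isoformat())
    try:
        with open(os.path.join(data_dir, "ultraresearch-stats.jsonl"), "a",
                  encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    except OSError:
        pass                                          # not writable: skip silently
```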
## Hard rules
- **No planning:** This command produces research briefs, not implementation plans.
If the user asks to plan, direct them to `/ultraplan-local --research <brief>`.
- **Sources required:** Every claim must cite a source. No unsourced findings.
- **Independence:** Do not pre-bias external agents with local findings or vice versa.
Triangulate AFTER independent research.
- **Graceful degradation:** If MCP tools are unavailable (Tavily, Gemini, MS Learn),
proceed with available tools and note limitations in brief metadata.
- **Cost:** Sonnet for all sub-agents. Opus only in the main command/orchestrator.
- **Privacy:** Never log secrets, tokens, or credentials.
- **Honesty:** If the question is trivially answerable, say so. Don't inflate research.
- **Scope of codebase:** Only analyze the current working directory for local research.
- **Research transparency:** Clearly distinguish local findings from external findings.
Never blend them without attribution.

View file

@ -20,5 +20,22 @@
"enabled": true,
"statsFile": "ultraplan-stats.jsonl"
}
},
"ultraresearch": {
"defaultMode": "default",
"maxDimensions": 8,
"geminiBridge": {
"enabled": true,
"pollIntervalSeconds": 30,
"timeoutMinutes": 25
},
"interview": {
"maxQuestions": 4,
"typicalQuestions": 3
},
"tracking": {
"enabled": true,
"statsFile": "ultraresearch-stats.jsonl"
}
}
}

View file

@ -0,0 +1,122 @@
---
type: ultraresearch-brief
created: {YYYY-MM-DD}
question: "{research question}"
confidence: {0.0-1.0}
dimensions: {N}
mcp_servers_used: [{list}]
local_agents_used: [{list}]
external_agents_used: [{list}]
---
# {Research Question Title}
> Generated by ultraresearch-local v{version} on {YYYY-MM-DD}
## Research Question
{The full research question as clarified during interview.}
## Executive Summary
{3 sentences maximum. The answer, the confidence level, and the key caveat.}
## Dimensions
*Each dimension represents one facet of the research question, explored by both
local and external agents. Confidence is rated per dimension.*
### {Dimension Name} -- Confidence: {high | medium | low | contradictory}
**Local findings:**
- {Finding with source citation (file path or agent name)}
**External findings:**
- {Finding with source citation (URL)}
**Contradictions:**
- {If local and external disagree, explain both sides with evidence.
Omit this sub-section if no contradictions exist for this dimension.}
*Repeat for each dimension.*
## Local Context
*Findings from codebase analysis agents. Omit sub-sections where no relevant
findings exist.*
### Architecture
{Architecture patterns, tech stack, relevant components from architecture-mapper}
### Dependencies
{Import chains, data flow, external integrations from dependency-tracer}
### Conventions
{Coding patterns, naming, test conventions from convention-scanner}
### History
{Recent changes, code ownership, hot files from git-historian}
## External Knowledge
*Findings from external research agents. Omit sub-sections where no relevant
findings exist.*
### Best Practice
{Official documentation, recommended patterns from docs-researcher}
### Alternatives
{Other approaches, competing solutions from community-researcher + contrarian-researcher}
### Security
{CVEs, audit history, supply chain risks from security-researcher}
### Known Issues
{Common pitfalls, gotchas, real-world problems from community-researcher}
## Gemini Second Opinion
*Independent research result from Gemini Deep Research. Provides a second
perspective for triangulation. Omit this section if gemini-bridge was not used
or was unavailable.*
{Gemini findings reformatted into key findings, sources cited, and areas of
agreement/disagreement with other agents.}
## Synthesis
*Cross-cutting insights that emerge from combining local and external knowledge.
This is NOT a summary of the sections above. It is NEW insight from triangulation
-- things that only become visible when local context meets external knowledge.*
{Example: "The codebase uses pattern X (local), but best practice has shifted to
pattern Y (external). However, our dependency on Z (local) makes a direct migration
impractical -- a hybrid approach using Y for new code while maintaining X for
existing modules is the pragmatic path."}
## Open Questions
*Things that remain unresolved after research. Each is a candidate for follow-up
research or an assumption to carry forward.*
- {Question 1 -- why it remains open}
- {Question 2 -- why it remains open}
## Recommendation
*If the research was decision-relevant, provide a concrete recommendation with
reasoning. If the research was exploratory (understanding, not deciding), omit
this section entirely.*
{Recommendation with rationale, citing specific findings from above.}
## Sources
| # | Source | Type | Quality | Used in |
|---|--------|------|---------|---------|
| 1 | {URL or codebase path} | {official / community / codebase / gemini} | {high / medium / low} | {dimension name} |
*Quality assessment:*
- **high** — official documentation, verified codebase analysis, peer-reviewed
- **medium** — reputable community source, well-maintained blog, established project
- **low** — unverified, outdated (>1 year), single-source claim, opinion piece