Kjell Tore Guttormsen edcd1721df Initial commit: multimodal RAG guide with Claude Code

Prompt-driven guide for building multimodal search using
Gemini Embedding 2 + Pinecone + Claude Code. Includes example
data (NASA public domain), step-by-step prompts, concepts
explainer, cost breakdown, and troubleshooting guide.

2026-03-12 16:36:22 +01:00


Prompt 3: Build the Search Interface

Copy this into Claude Code after ingestion is complete.

Now build a simple web interface for searching my content.
I want a local web app (localhost) with:

1. A search box where I type a question in plain English.

2. When I search, the app should:
   a. Convert my question to an embedding using Gemini Embedding 2
      with task_type "RETRIEVAL_QUERY"
   b. Search Pinecone for the 5 most similar chunks
   c. Show me the matching results with:
      - The source file name
      - The relevant text snippet
      - A similarity score
      - If the result is from an image, show the image too

3. Below the search results, add an "Ask Claude" button that:
   a. Takes the search results as context
   b. Sends them to Claude (use the claude CLI command, since I
      have Claude Code installed) with my question
   c. Shows Claude's answer, which should reference the specific
      files it used

Keep the design simple and clean. Dark background, readable text.
No frameworks needed, just HTML + CSS + a small server.

Start the app after building it.

What Claude Code will do

- Build a small web server (Express or similar)
- Create an HTML page with a search box
- Wire up the search to Gemini embeddings + Pinecone
- Add Claude integration for AI-powered answers
- Start the server
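
The steps above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not what Claude Code will necessarily generate; it may well pick Node/Express or another stack. The `embed` and `vector_search` callables are hypothetical stand-ins for the Gemini embedding call and the Pinecone query, and the metadata field names (`source`, `text`, `image_path`) are assumptions about how ingestion stored each chunk. The only external command used is `claude -p`, Claude Code's non-interactive print mode.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical signatures: a match is a dict with "score" and "metadata" keys.
Embed = Callable[[str], Sequence[float]]
VectorSearch = Callable[[Sequence[float], int], list[dict]]


def search(question: str, embed: Embed, vector_search: VectorSearch,
           top_k: int = 5) -> list[dict]:
    """Embed the question, fetch the top_k nearest chunks, flatten for display."""
    query_vector = embed(question)  # real app: Gemini embedding, RETRIEVAL_QUERY
    matches = vector_search(query_vector, top_k)  # real app: Pinecone query
    results = []
    for match in matches:
        meta = match.get("metadata", {})
        results.append({
            "source": meta.get("source", "unknown"),   # assumed field name
            "snippet": meta.get("text", ""),           # assumed field name
            "score": round(float(match["score"]), 3),
            "image": meta.get("image_path"),           # None for text-only chunks
        })
    return results


def build_prompt(question: str, results: list[dict]) -> str:
    """Pack the search results into a prompt that asks for cited answers."""
    context = "\n\n".join(f"[{r['source']}]\n{r['snippet']}" for r in results)
    return (
        "Answer the question using only the context below. "
        "Name the source files you relied on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def ask_claude(question: str, results: list[dict]) -> str:
    """Send the prompt to the claude CLI in print mode and return its reply."""
    completed = subprocess.run(
        ["claude", "-p", build_prompt(question, results)],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.strip()
```

Passing the embedding and vector-search functions in as arguments keeps the flow easy to test with fakes before any API keys are involved. Labelling each snippet with its source file in the prompt is what lets Claude's answer name the files it used.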

What to try

Once the app is running, try these searches:

- "What is the largest planet in our solar system?" (Should find the Jupiter fact sheet AND the Jupiter image)
- "Tell me about the first Moon landing" (Should find the Aldrin image description)
- "Which moon has active volcanoes?" (Should find the moons PDF mentioning Io)
- "How far is Jupiter from Earth?" (Should find specific numbers from the Jupiter fact sheet)
- "What can astronauts see from the space station?" (Should find the ISS image description)

These searches show what multimodal search adds: a question about "the largest planet" returns both the text fact sheet AND a photograph, because both are embedded close to the meaning of the query.
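
To make "semantically related" concrete: results are ranked by how close their vectors sit to the query vector. The toy sketch below assumes cosine similarity as the metric (a common choice for vector indexes, though the index in this guide may be configured differently) and uses made-up three-dimensional vectors and labels in place of real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration; real embeddings have far more dimensions.
query = [0.9, 0.1, 0.0]  # stands in for "What is the largest planet?"
chunks = {
    "jupiter fact sheet (text)": [0.8, 0.2, 0.1],
    "jupiter photo (image)": [0.7, 0.3, 0.0],
    "moon landing photo (image)": [0.1, 0.2, 0.9],
}

# Rank chunk labels from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query, chunks[name]),
    reverse=True,
)
for name in ranked:
    print(f"{cosine_similarity(query, chunks[name]):.3f}  {name}")
```

In this toy data both Jupiter items, one text and one image, score well above the Moon landing photo, which is the behaviour the first search above should reproduce with real embeddings.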
