Initial commit: multimodal RAG guide with Claude Code

Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.
2026-03-12 16:36:22 +01:00 · commit edcd1721df
19 changed files with 4446 additions and 0 deletions
--- a/prompts/01-setup.md
+++ b/prompts/01-setup.md
@@ -0,0 +1,48 @@
+# Prompt 1: Set Up the Project
+
+Copy this into Claude Code after you have your API keys ready.
+
+---
+
+```
+I want to build a multimodal search app. I have a folder of files
+(PDFs, images with text descriptions) that I want to make searchable
+using natural language.
+
+Here is the tech stack I want:
+- Google Gemini Embedding 2 for converting content to embeddings
+  (I have a Google AI Studio API key)
+- Pinecone for storing the embeddings
+  (I have a Pinecone API key and an index called "space-search")
+- A simple local web interface where I can type questions and
+  get results from my files
+- Use Claude for answering questions based on the search results
+  (use my Claude Code subscription, not a separate API key)
+
+My example files are in the folder: example-data/
+That folder contains:
+- 3 PDF files about the solar system, Jupiter, and planetary moons
+- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
+- A file called descriptions.md with detailed text descriptions
+  of each image
+
+Please set up the project structure, install dependencies, and
+create a .env.template file for the API keys. Use Node.js with
+TypeScript. Do not start building the search logic yet, just
+the project skeleton.
+```
+
+---
+
+## What Claude Code will do
+
+1. Create a new project folder with `package.json`
+2. Install libraries for Gemini embeddings, Pinecone, and a web server
+3. Create a `.env.template` with placeholders for your API keys
+4. Set up TypeScript configuration
+
+## What you do next
+
+1. Copy `.env.template` to `.env`
+2. Fill in your actual API keys
+3. Move to Prompt 2
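The `.env` step in `prompts/01-setup.md` can be sketched as a small startup check that fails early when a key is missing. This is only an illustration: the variable names `GEMINI_API_KEY`, `PINECONE_API_KEY`, and `PINECONE_INDEX` are assumptions (the guide does not say what Claude Code will put in `.env.template`), and a generated project would more likely use a library such as `dotenv` than a hand-written parser.

```typescript
// Hypothetical variable names; the guide does not specify them.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AppConfig = Record<RequiredKey, string>;

// Parse the text of a .env file into key/value pairs.
// Skips blank lines and "#" comments, and strips one pair of surrounding quotes.
function parseEnv(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = /^(["'])(.*)\1$/.exec(value);
    if (quoted) value = quoted[2];
    result[key] = value;
  }
  return result;
}

// Return the validated config, or throw an error naming every missing key.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value === "";
  });
  if (missing.length > 0) {
    throw new Error(`Missing values in .env: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  REQUIRED_KEYS.forEach((key) => {
    config[key] = env[key] as string;
  });
  return config;
}
```

Calling `loadConfig(parseEnv(fileText))` once at startup means a forgotten key surfaces as one clear message rather than as an "API key invalid" error during ingestion.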
--- a/prompts/02-ingest.md
+++ b/prompts/02-ingest.md
@@ -0,0 +1,67 @@
+# Prompt 2: Ingest Your Files
+
+Copy this into Claude Code after the project is set up and your
+.env file has your API keys.
+
+---
+
+```
+Now build the ingestion pipeline. I need a script that:
+
+1. Reads each PDF in example-data/ and extracts the text content.
+   Split long documents into chunks of roughly 500 words each.
+   Keep track of which file and which section each chunk came from.
+
+2. Reads the image descriptions from example-data/descriptions.md.
+   Use the "Good description" for each image (ignore the "Bad" ones).
+   Each image description becomes one chunk, linked to its image file.
+
+3. For each chunk, generate an embedding using Google Gemini
+   Embedding 2 (model: gemini-embedding-exp-03-07 or the latest
+   available). Use task_type "RETRIEVAL_DOCUMENT" for all chunks.
+
+4. Store each embedding in Pinecone along with metadata:
+   - source_file: the original filename
+   - content_type: "pdf" or "image"
+   - text: the actual text content of the chunk
+   - chunk_index: which chunk number within the file
+
+5. After ingestion, print a summary: how many chunks were created,
+   how many embeddings stored, and any errors.
+
+Run the ingestion script after building it. Show me the output.
+```
+
+---
+
+## What Claude Code will do
+
+1. Build a script that reads PDFs and extracts text
+2. Parse the descriptions.md file for image descriptions
+3. Send each chunk to Google Gemini for embedding
+4. Store everything in Pinecone with metadata
+5. Run the script and show results
+
+## What to expect
+
+You should see output like:
+
+```
+Processing solar-system-overview.pdf... 3 chunks
+Processing jupiter-fact-sheet.pdf... 4 chunks
+Processing solar-system-moons.pdf... 3 chunks
+Processing earthrise.jpg (from descriptions)... 1 chunk
+Processing aldrin-moon.jpg (from descriptions)... 1 chunk
+Processing jupiter-great-red-spot.jpg (from descriptions)... 1 chunk
+Processing iss-over-earth.jpg (from descriptions)... 1 chunk
+
+Total: 14 chunks ingested, 14 embeddings stored in Pinecone.
+```
+
+The exact numbers may vary depending on how Claude Code splits the PDFs.
+
+## If something goes wrong
+
+- "API key invalid": check your .env file
+- "Index not found": make sure the index name in your project matches your Pinecone index ("space-search")
+- "Rate limit": wait a minute and run the script again
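Steps 1 and 4 of the ingestion prompt in `prompts/02-ingest.md` (split into roughly 500-word chunks, attach metadata) can be sketched as below. The metadata field names come from the prompt; the `Chunk` type, the `chunkByWords` helper, and the `file#index` id format are assumptions made for illustration. The script Claude Code generates may split on sentence or section boundaries instead of a fixed word count.

```typescript
// Metadata fields follow the prompt; `id` is an assumed record key.
interface Chunk {
  id: string;
  source_file: string;
  content_type: "pdf" | "image";
  text: string;
  chunk_index: number;
}

// Split extracted PDF text into chunks of at most `maxWords` words,
// recording which file and which position each chunk came from.
function chunkByWords(text: string, sourceFile: string, maxWords = 500): Chunk[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: Chunk[] = [];
  for (let start = 0; start < words.length; start += maxWords) {
    const index = chunks.length;
    chunks.push({
      id: `${sourceFile}#${index}`,
      source_file: sourceFile,
      content_type: "pdf",
      text: words.slice(start, start + maxWords).join(" "),
      chunk_index: index,
    });
  }
  return chunks;
}
```

An image description would skip the splitting and become a single record with `content_type: "image"` and `chunk_index: 0`, which is why each JPG in the sample output reports exactly one chunk.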
--- a/prompts/03-search.md
+++ b/prompts/03-search.md
@@ -0,0 +1,67 @@
+# Prompt 3: Build the Search Interface
+
+Copy this into Claude Code after ingestion is complete.
+
+---
+
+```
+Now build a simple web interface for searching my content.
+I want a local web app (localhost) with:
+
+1. A search box where I type a question in plain English.
+
+2. When I search, the app should:
+   a. Convert my question to an embedding using Gemini Embedding 2
+      with task_type "RETRIEVAL_QUERY"
+   b. Search Pinecone for the 5 most similar chunks
+   c. Show me the matching results with:
+      - The source file name
+      - The relevant text snippet
+      - A similarity score
+      - If the result is from an image, show the image too
+
+3. Below the search results, add an "Ask Claude" button that:
+   a. Takes the search results as context
+   b. Sends them to Claude (use the claude CLI command, since I
+      have Claude Code installed) with my question
+   c. Shows Claude's answer, which should reference the specific
+      files it used
+
+Keep the design simple and clean. Dark background, readable text.
+No frameworks needed, just HTML + CSS + a small server.
+
+Start the app after building it.
+```
+
+---
+
+## What Claude Code will do
+
+1. Build a small web server (Express or similar)
+2. Create an HTML page with a search box
+3. Wire up the search to Gemini embeddings + Pinecone
+4. Add Claude integration for AI-powered answers
+5. Start the server
+
+## What to try
+
+Once the app is running, try these searches:
+
+- "What is the largest planet in our solar system?"
+  (Should find the Jupiter fact sheet AND the Jupiter image)
+
+- "Tell me about the first Moon landing"
+  (Should find the Aldrin image description)
+
+- "Which moon has active volcanoes?"
+  (Should find the moons PDF mentioning Io)
+
+- "How far is Jupiter from Earth?"
+  (Should find specific numbers from the Jupiter fact sheet)
+
+- "What can astronauts see from the space station?"
+  (Should find the ISS image description)
+
+These examples show the point of multimodal search: a question about
+"the largest planet" finds both the fact sheet AND a photograph, because
+the photo's text description is embedded alongside the PDF chunks.
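The "5 most similar chunks" and "similarity score" in `prompts/03-search.md` can be illustrated with a small in-memory version of the lookup. In the real app Pinecone performs this ranking on the server; the sketch assumes the index uses the cosine metric, and `StoredChunk`, `Match`, `cosineSimilarity`, and `topK` are hypothetical names used only here.

```typescript
interface StoredChunk {
  source_file: string;
  text: string;
  vector: number[];
}

interface Match {
  source_file: string;
  text: string;
  score: number;
}

// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query embedding and keep the best k.
function topK(query: number[], chunks: StoredChunk[], k = 5): Match[] {
  return chunks
    .map((c) => ({
      source_file: c.source_file,
      text: c.text,
      score: cosineSimilarity(query, c.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Because PDF chunks and image descriptions live in the same list of vectors, a single query can return both kinds of result, ordered only by score.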
--- a/prompts/04-improve.md
+++ b/prompts/04-improve.md
@@ -0,0 +1,69 @@
+# Prompt 4: Improve and Iterate
+
+These are prompts for when you want to make the search better.
+Use whichever ones are relevant to you.
+
+---
+
+## If search results are not relevant enough
+
+```
+The search results are not matching my questions well.
+Can you add a re-ranking step? After getting the top 10 results
+from Pinecone, use Claude to re-rank them by relevance to my
+actual question, then show the top 5.
+```
+
+## If you want to add your own files
+
+```
+I want to add more files to the search index. I have put new
+files in example-data/. Please:
+1. Check which files are new (not already in Pinecone)
+2. Process only the new files
+3. Add their embeddings to the existing index
+4. Show me what was added
+```
+
+## If you want better image descriptions
+
+```
+Look at the images in example-data/ and suggest improved
+descriptions for any that could be more detailed. Show me
+the current description and your suggested improvement
+side by side. Do not update descriptions.md until I approve.
+```
+
+## If you want to add a video clip
+
+```
+I have added a short video file (MP4, under 2 minutes) to
+example-data/. Please:
+1. Extract key frames from the video
+2. Generate descriptions for each key frame
+3. Add these descriptions as searchable chunks in Pinecone
+4. Link them back to the video file with timestamps
+```
+
+## If you want to export or share
+
+```
+I want to share this search app with someone else. Can you:
+1. Create a README with setup instructions
+2. Make sure the .env.template has all required variables
+3. Add a "first run" script that handles ingestion automatically
+4. Package it so someone can clone the repo and get started
+   with just their own API keys
+```
+
+## If something is broken
+
+```
+[Paste the exact error message here]
+
+This happened when I tried to [describe what you did].
+Can you diagnose and fix the issue?
+```
+
+The key to good results with Claude Code: be specific about
+what you see and what you expected to see instead.
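The "check which files are new" step in `prompts/04-improve.md` amounts to comparing the folder listing against the `source_file` values already ingested. A minimal sketch, assuming the ingested names are available as a plain list (for example from a local manifest written during ingestion; how they are obtained is not covered here) and using the hypothetical helper name `findNewFiles`:

```typescript
// Return the files on disk whose names do not appear among the
// `source_file` values that were already ingested.
// A linear scan per file is fine for a folder of this size.
function findNewFiles(filesOnDisk: string[], ingestedSources: string[]): string[] {
  return filesOnDisk.filter((name) => ingestedSources.indexOf(name) === -1);
}
```

This compares by filename only, so a file that was edited but kept its name would not be picked up; detecting changed content would need something extra, such as storing a content hash in the metadata.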