Initial commit: multimodal RAG guide with Claude Code
Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.
commit edcd1721df
19 changed files with 4446 additions and 0 deletions
48
prompts/01-setup.md
Normal file
@@ -0,0 +1,48 @@
# Prompt 1: Set Up the Project

Copy this into Claude Code after you have your API keys ready.

---

```
I want to build a multimodal search app. I have a folder of files
(PDFs, images with text descriptions) that I want to make searchable
using natural language.

Here is the tech stack I want:
- Google Gemini Embedding 2 for converting content to embeddings
  (I have a Google AI Studio API key)
- Pinecone for storing the embeddings
  (I have a Pinecone API key and an index called "space-search")
- A simple local web interface where I can type questions and
  get results from my files
- Use Claude for answering questions based on the search results
  (use my Claude Code subscription, not a separate API key)

My example files are in the folder: example-data/
That folder contains:
- 3 PDF files about the solar system, Jupiter, and planetary moons
- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
- A file called descriptions.md with detailed text descriptions
  of each image

Please set up the project structure, install dependencies, and
create a .env.template file for the API keys. Use Node.js with
TypeScript. Do not start building the search logic yet, just
the project skeleton.
```

---

## What Claude Code will do

1. Create a new project folder with `package.json`
2. Install libraries for Gemini embeddings, Pinecone, and a web server
3. Create a `.env.template` with placeholders for your API keys
4. Set up TypeScript configuration

## What you do next

1. Copy `.env.template` to `.env`
2. Fill in your actual API keys
3. Move to Prompt 2
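
Before moving on, it can help to confirm that every key in `.env` is actually filled in. The sketch below is one possible check, not what Claude Code will necessarily generate. The variable names (`GEMINI_API_KEY`, `PINECONE_API_KEY`, `PINECONE_INDEX`) are assumptions; use whatever names your `.env.template` defines.

```typescript
// Hypothetical variable names; match them to your own .env.template.
const REQUIRED_KEYS = ["GEMINI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX"];

type Env = Record<string, string | undefined>;

// Return the names of required variables that are missing or still blank.
function findMissingKeys(env: Env, required: string[] = REQUIRED_KEYS): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an in-memory object; a real script would pass process.env.
const missing = findMissingKeys({
  GEMINI_API_KEY: "abc123",
  PINECONE_API_KEY: "",
  PINECONE_INDEX: "space-search",
});

console.log(missing.length === 0 ? "All keys set" : `Missing: ${missing.join(", ")}`);
```

Running a check like this at startup surfaces a blank key immediately, rather than as an "API key invalid" error halfway through ingestion.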
67
prompts/02-ingest.md
Normal file
@@ -0,0 +1,67 @@
# Prompt 2: Ingest Your Files

Copy this into Claude Code after the project is set up and your
.env file has your API keys.

---

```
Now build the ingestion pipeline. I need a script that:

1. Reads each PDF in example-data/ and extracts the text content.
   Split long documents into chunks of roughly 500 words each.
   Keep track of which file and which section each chunk came from.

2. Reads the image descriptions from example-data/descriptions.md.
   Use the "Good description" for each image (ignore the "Bad" ones).
   Each image description becomes one chunk, linked to its image file.

3. For each chunk, generate an embedding using Google Gemini
   Embedding 2 (model: gemini-embedding-exp-03-07 or the latest
   available). Use task_type "RETRIEVAL_DOCUMENT" for all chunks.

4. Store each embedding in Pinecone along with metadata:
   - source_file: the original filename
   - content_type: "pdf" or "image"
   - text: the actual text content of the chunk
   - chunk_index: which chunk number within the file

5. After ingestion, print a summary: how many chunks were created,
   how many embeddings stored, and any errors.

Run the ingestion script after building it. Show me the output.
```

---

## What Claude Code will do

1. Build a script that reads PDFs and extracts text
2. Parse the descriptions.md file for image descriptions
3. Send each chunk to Google Gemini for embedding
4. Store everything in Pinecone with metadata
5. Run the script and show results
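
To make the chunking step concrete, here is a minimal sketch of splitting extracted PDF text into pieces of roughly 500 words, each carrying the metadata fields named in the prompt. It assumes the text has already been extracted, and it cuts at fixed word counts without regard to sentence or section boundaries. The script Claude Code writes may well split differently.

```typescript
// Metadata fields follow the names used in the prompt above.
interface Chunk {
  source_file: string;
  content_type: "pdf" | "image";
  text: string;
  chunk_index: number;
}

// Split text into consecutive chunks of at most `maxWords` words.
function chunkByWords(text: string, sourceFile: string, maxWords = 500): Chunk[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: Chunk[] = [];
  for (let start = 0; start < words.length; start += maxWords) {
    chunks.push({
      source_file: sourceFile,
      content_type: "pdf",
      text: words.slice(start, start + maxWords).join(" "),
      chunk_index: chunks.length,
    });
  }
  return chunks;
}

// Synthetic 1200-word input standing in for extracted PDF text.
const sample = Array.from({ length: 1200 }, (_, i) => `word${i}`).join(" ");
const chunks = chunkByWords(sample, "jupiter-fact-sheet.pdf");
console.log(chunks.map((c) => c.text.split(" ").length)); // [ 500, 500, 200 ]
```

A fixed word count is the simplest option. Splitting on headings or paragraphs usually keeps related sentences together, at the cost of uneven chunk sizes.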

## What to expect

You should see output like:

```
Processing solar-system-overview.pdf... 3 chunks
Processing jupiter-fact-sheet.pdf... 4 chunks
Processing solar-system-moons.pdf... 3 chunks
Processing earthrise.jpg (from descriptions)... 1 chunk
Processing aldrin-moon.jpg (from descriptions)... 1 chunk
Processing jupiter-great-red-spot.jpg (from descriptions)... 1 chunk
Processing iss-over-earth.jpg (from descriptions)... 1 chunk

Total: 14 chunks ingested, 14 embeddings stored in Pinecone.
```

The exact numbers may vary depending on how Claude Code splits the PDFs.

## If something goes wrong

- "API key invalid": check that the keys in your .env file are correct
- "Index not found": make sure the index name in your project matches
  the name of your Pinecone index (for example, "space-search")
- "Rate limit": wait a minute and run the script again
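
Instead of re-running by hand after a rate limit, the script could wait and retry on its own. The following is a minimal sketch of retrying with exponential backoff. The helper names and the default delays are assumptions for illustration. For simplicity the wrapper retries on any error, where a real script would first check that the error really is a rate limit.

```typescript
// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ... up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Run `task`, retrying up to `maxAttempts` times with growing pauses in between.
// NOTE: retries on every error; narrow this to rate-limit errors in real code.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Pause before the next attempt, but not after the final failure.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}

// The pause schedule for the first four retries, in milliseconds.
console.log([0, 1, 2, 3].map((n) => backoffDelayMs(n))); // [ 1000, 2000, 4000, 8000 ]
```

An embedding call could then be wrapped as `withRetry(() => embedChunk(chunk))`, where `embedChunk` is a hypothetical function standing in for whatever your ingestion script uses.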
67
prompts/03-search.md
Normal file
@@ -0,0 +1,67 @@
# Prompt 3: Build the Search Interface

Copy this into Claude Code after ingestion is complete.

---

```
Now build a simple web interface for searching my content.
I want a local web app (localhost) with:

1. A search box where I type a question in plain English.

2. When I search, the app should:
   a. Convert my question to an embedding using Gemini Embedding 2
      with task_type "RETRIEVAL_QUERY"
   b. Search Pinecone for the 5 most similar chunks
   c. Show me the matching results with:
      - The source file name
      - The relevant text snippet
      - A similarity score
      - If the result is from an image, show the image too

3. Below the search results, add an "Ask Claude" button that:
   a. Takes the search results as context
   b. Sends them to Claude (use the claude CLI command, since I
      have Claude Code installed) with my question
   c. Shows Claude's answer, which should reference the specific
      files it used

Keep the design simple and clean. Dark background, readable text.
No frameworks needed, just HTML + CSS + a small server.

Start the app after building it.
```

---

## What Claude Code will do

1. Build a small web server (Express or similar)
2. Create an HTML page with a search box
3. Wire up the search to Gemini embeddings + Pinecone
4. Add Claude integration for AI-powered answers
5. Start the server
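
The "Ask Claude" step comes down to packing the search results and the question into one prompt. Here is a minimal sketch of that assembly, assuming each result carries the `source_file` and `text` metadata stored during ingestion plus the similarity score. The wording of the instructions and the `buildPrompt` name are illustrative assumptions, and how the string is handed to the claude CLI is left out.

```typescript
interface SearchResult {
  source_file: string;
  text: string;
  score: number;
}

// Combine numbered sources and the user's question into a single prompt string.
function buildPrompt(question: string, results: SearchResult[]): string {
  const context = results
    .map((r, i) => `[${i + 1}] ${r.source_file} (score ${r.score.toFixed(2)})\n${r.text}`)
    .join("\n\n");
  return [
    "Answer the question using only the sources below.",
    "Cite the file name of every source you rely on.",
    "",
    "Sources:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Made-up sample results, standing in for what Pinecone would return.
const prompt = buildPrompt("Which moon has active volcanoes?", [
  { source_file: "solar-system-moons.pdf", text: "Io is volcanically active.", score: 0.82 },
  { source_file: "jupiter-fact-sheet.pdf", text: "Jupiter has many moons.", score: 0.74 },
]);
console.log(prompt);
```

Numbering the sources and naming each file gives Claude something concrete to cite, which is what lets the answer "reference the specific files it used".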

## What to try

Once the app is running, try these searches:

- "What is the largest planet in our solar system?"
  (Should find the Jupiter fact sheet AND the Jupiter image)

- "Tell me about the first Moon landing"
  (Should find the Aldrin image description)

- "Which moon has active volcanoes?"
  (Should find the moons PDF mentioning Io)

- "How far is Jupiter from Earth?"
  (Should find specific numbers from the Jupiter fact sheet)

- "What can astronauts see from the space station?"
  (Should find the ISS image description)

These examples demonstrate the power of multimodal search:
a question about "the largest planet" finds both text data
AND a photograph, because the fact sheet and the photograph's
description are both semantically close to the question.
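
The similarity score shown next to each result can be understood with a small worked example. The sketch below computes cosine similarity between a query vector and three stored vectors, then ranks them. It assumes the Pinecone index uses the cosine metric, and the three-dimensional vectors are invented for illustration; real embeddings have far more dimensions.

```typescript
// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Invented toy vectors: the two Jupiter items point roughly the same way as the query.
const query = [0.9, 0.1, 0.0];
const stored = [
  { source_file: "iss-over-earth.jpg", vector: [0.0, 0.2, 0.9] },
  { source_file: "jupiter-great-red-spot.jpg", vector: [0.7, 0.3, 0.0] },
  { source_file: "jupiter-fact-sheet.pdf", vector: [0.8, 0.2, 0.1] },
];

const ranked = stored
  .map((item) => ({ source_file: item.source_file, score: cosineSimilarity(query, item.vector) }))
  .sort((a, b) => b.score - a.score);

for (const r of ranked) console.log(`${r.score.toFixed(3)}  ${r.source_file}`);
```

With these toy numbers, the fact sheet and the Jupiter image both score above 0.95 while the ISS image scores near zero. This mirrors how a PDF chunk and an image description can both surface for one question.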
69
prompts/04-improve.md
Normal file
@@ -0,0 +1,69 @@
# Prompt 4: Improve and Iterate

These are prompts for when you want to make the search better.
Use whichever ones are relevant to you.

---

## If search results are not relevant enough

```
The search results are not matching my questions well.
Can you add a re-ranking step? After getting the top 10 results
from Pinecone, use Claude to re-rank them by relevance to my
actual question, then show the top 5.
```
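
Re-ranking is a two-stage pattern: fetch a wider candidate set, score each candidate again with a more careful judge, and keep the best few. The sketch below shows only that shape. The `keywordOverlap` scorer is a deliberately crude stand-in, an assumption for illustration, where the real app would ask Claude for a relevance score. The sample texts are made up.

```typescript
interface Candidate {
  source_file: string;
  text: string;
  score: number; // original similarity score from the vector search
}

type Scorer = (question: string, candidate: Candidate) => number;

// Re-score every candidate with `scorer` and keep the `keep` most relevant.
function rerank(question: string, candidates: Candidate[], scorer: Scorer, keep = 5): Candidate[] {
  return candidates
    .map((candidate) => ({ candidate, relevance: scorer(question, candidate) }))
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, keep)
    .map((entry) => entry.candidate);
}

// Stand-in scorer (NOT Claude): counts question words of 4+ letters found in the text.
const keywordOverlap: Scorer = (question, candidate) => {
  const terms = question.toLowerCase().split(/\W+/).filter((t) => t.length > 3);
  const text = candidate.text.toLowerCase();
  return terms.filter((t) => text.includes(t)).length;
};

// Made-up candidates in their original vector-search order.
const candidates: Candidate[] = [
  { source_file: "jupiter-fact-sheet.pdf", text: "Jupiter has dozens of known moons.", score: 0.81 },
  { source_file: "solar-system-moons.pdf", text: "Io is the most volcanically active moon, with hundreds of volcanoes.", score: 0.78 },
  { source_file: "earthrise.jpg", text: "Earth rising above the lunar horizon.", score: 0.7 },
];

const top = rerank("Which moon has active volcanoes?", candidates, keywordOverlap, 2);
console.log(top.map((c) => c.source_file));
```

In this example the moons PDF moves ahead of the Jupiter fact sheet even though its vector-search score was lower. Correcting that kind of ordering is what a re-ranking step is for.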

## If you want to add your own files

```
I want to add more files to the search index. I have put new
files in example-data/. Please:
1. Check which files are new (not already in Pinecone)
2. Process only the new files
3. Add their embeddings to the existing index
4. Show me what was added
```
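
One way to picture the "which files are new" check: compare the file names on disk with the `source_file` values already indexed. The sketch below assumes you can get that second list from somewhere, for example a local manifest written at ingestion time; how it is obtained is left open. The `chunkId` format and the file `mars-rovers.pdf` are hypothetical.

```typescript
// Files present on disk that have no entry among the already-indexed names.
function findNewFiles(filesOnDisk: string[], alreadyIndexed: string[]): string[] {
  const indexed = new Set(alreadyIndexed);
  return filesOnDisk.filter((name) => !indexed.has(name));
}

// Hypothetical ID scheme: the same file and chunk always map to the same ID,
// so re-ingesting a file overwrites its vectors instead of duplicating them.
function chunkId(sourceFile: string, chunkIndex: number): string {
  return `${sourceFile}#${chunkIndex}`;
}

const onDisk = ["jupiter-fact-sheet.pdf", "earthrise.jpg", "mars-rovers.pdf"];
const indexedFiles = ["jupiter-fact-sheet.pdf", "earthrise.jpg"];

console.log(findNewFiles(onDisk, indexedFiles)); // [ 'mars-rovers.pdf' ]
console.log(chunkId("mars-rovers.pdf", 0)); // mars-rovers.pdf#0
```

Deterministic IDs make the ingestion script safe to run more than once, since a Pinecone upsert with an existing ID replaces the stored record.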

## If you want better image descriptions

```
Look at the images in example-data/ and suggest improved
descriptions for any that could be more detailed. Show me
the current description and your suggested improvement
side by side. Do not update descriptions.md until I approve.
```

## If you want to add a video clip

```
I have added a short video file (MP4, under 2 minutes) to
example-data/. Please:
1. Extract key frames from the video
2. Generate descriptions for each key frame
3. Add these descriptions as searchable chunks in Pinecone
4. Link them back to the video file with timestamps
```
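
Linking frame descriptions back to the video mostly means storing a timestamp with each chunk. Here is a rough sketch of that bookkeeping. It assumes frames are sampled at a fixed interval, where a real pipeline might detect scene changes instead, and it leaves frame extraction itself out. The `"video"` content type, the `timestamp` field, and the file name `launch-clip.mp4` are assumptions that extend the metadata from Prompt 2.

```typescript
// Timestamps (in seconds) at which to grab a frame, one every `everySeconds`.
function sampleTimestamps(durationSeconds: number, everySeconds = 10): number[] {
  if (everySeconds <= 0) throw new Error("everySeconds must be positive");
  const stamps: number[] = [];
  for (let t = 0; t < durationSeconds; t += everySeconds) stamps.push(t);
  return stamps;
}

// Format seconds as m:ss for display next to a search result.
function formatTimestamp(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = Math.floor(totalSeconds % 60);
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}

// Metadata for each frame of a hypothetical 95-second clip, sampled every 30 s.
const frames = sampleTimestamps(95, 30).map((t, i) => ({
  source_file: "launch-clip.mp4",
  content_type: "video",
  chunk_index: i,
  timestamp: formatTimestamp(t),
}));

console.log(frames.map((f) => f.timestamp)); // [ '0:00', '0:30', '1:00', '1:30' ]
```

Each frame's description would then be embedded like any other chunk, with the timestamp telling the interface where in the video to jump.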

## If you want to export or share

```
I want to share this search app with someone else. Can you:
1. Create a README with setup instructions
2. Make sure the .env.template has all required variables
3. Add a "first run" script that handles ingestion automatically
4. Package it so someone can clone the repo and get started
   with just their own API keys
```

## If something is broken

```
[Paste the exact error message here]

This happened when I tried to [describe what you did].
Can you diagnose and fix the issue?
```

The key to good results with Claude Code: be specific about
what you see and what you expected to see instead.