
Initial commit: multimodal RAG guide with Claude Code

Prompt-driven guide for building multimodal search using
Gemini Embedding 2 + Pinecone + Claude Code. Includes example
data (NASA public domain), step-by-step prompts, concepts
explainer, cost breakdown, and troubleshooting guide.
Author: Kjell Tore Guttormsen, 2026-03-12 16:36:22 +01:00
commit edcd1721df
19 changed files with 4446 additions and 0 deletions

.gitignore (new file)

@@ -0,0 +1,7 @@
node_modules/
.env
dist/
src/
package.json
package-lock.json
tsconfig.json

LICENSE (new file)

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 The Dharma Lab
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md (new file)

@@ -0,0 +1,241 @@
# Build Multimodal Search with Claude Code
Search across your PDFs, images, and documents using plain English.
No coding required. Claude Code builds everything for you.
## What you will build
A local search app that lets you ask questions like:
- "What is the largest planet in our solar system?"
- "Show me photos from the first Moon landing"
- "Which moon has active volcanoes?"
The app searches through your PDFs and images simultaneously and
gives you answers with sources. You talk to it in plain English.
## How is this different from a Google search?
Google searches the internet. This searches YOUR files.
Imagine you have 500 PDFs, research papers, photos, and notes
scattered across folders. Normal file search only matches exact
words. This system understands meaning. You ask "what do we know
about storms on other planets?" and it finds the Jupiter fact sheet
mentioning wind speeds, the Jupiter photograph showing cloud bands,
and the solar system overview describing atmospheric composition.
It connects information across files and formats. That is what
makes it powerful.
## What you need
1. **Claude Code** (comes with Claude Pro at $20/month or Claude Max)
2. **A Google AI Studio account** (free) for Gemini embeddings
3. **A Pinecone account** (free tier) for the vector database
4. **30-45 minutes** for your first time
No programming knowledge required. You will copy prompts into
Claude Code, and it will build everything.
## How it works (the simple version)
```
Your files ──> Embeddings (Gemini) ──> Vector database (Pinecone)
Your question ──> Embedding (Gemini) ──> Search ──> Claude answers
```
1. Your files get converted into "embeddings" (numerical fingerprints
that capture meaning)
2. When you ask a question, it gets the same treatment
3. The system finds fingerprints that match
4. Claude reads the matching content and answers your question
For a deeper explanation, see [concepts.md](concepts.md).
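If you are curious what "fingerprints that match" means in practice: the `cosine` metric you will select for Pinecone in Step 0 measures how closely two embedding vectors point in the same direction. A toy TypeScript sketch (real Gemini embeddings have 3,072 numbers, not 3; the vectors below are made up for illustration):

```typescript
// Cosine similarity: close to 1.0 means similar meaning,
// close to 0 means unrelated content.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up toy vectors standing in for real embeddings:
const jupiterFact  = [0.9, 0.1, 0.2]; // "Jupiter is the largest planet"
const jupiterPhoto = [0.8, 0.2, 0.1]; // "Full-disk portrait of Jupiter..."
const moonLanding  = [0.1, 0.9, 0.3]; // "Buzz Aldrin on the Moon..."

console.log(cosineSim(jupiterFact, jupiterPhoto)); // high: similar meaning
console.log(cosineSim(jupiterFact, moonLanding));  // lower: different topic
```

You never write this code yourself; Gemini and Pinecone handle it. It is shown only to make the "fingerprint" metaphor concrete.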
## Step 0: Get your accounts (10 minutes)
### Google AI Studio (for embeddings)
Embeddings convert your content into searchable vectors. We use
Google's Gemini Embedding 2 for this because it handles text,
images, and video.
1. Go to [aistudio.google.com](https://aistudio.google.com/)
2. Sign in with a Google account
3. Click "Get API key" in the left sidebar
4. Click "Create API key"
5. Copy the key somewhere safe
**What is an API key?** It is like a password that lets your app
talk to Google's embedding service. You will paste it into a
configuration file later. It stays in a local file on your computer
and is only sent to Google when your app makes requests.
### Pinecone (for storing embeddings)
A vector database stores embeddings so you can search through
them. Think of it as a smart filing cabinet.
1. Go to [pinecone.io](https://www.pinecone.io/) and create a free account
2. Once in the dashboard, click "Create Index"
3. Name it `space-search` (or whatever you like)
4. Set dimensions to `3072` (this matches Gemini Embedding 2)
5. Choose the `cosine` metric
6. Select the free "Starter" plan
7. Copy your API key from the "API Keys" section
### Verify you have Claude Code
Open your terminal and type `claude`. If Claude Code starts,
you are ready. If not, install it:
```
npm install -g @anthropic-ai/claude-code
```
You need a Claude Pro or Max subscription for this to work.
## Step 1: Get the example files
Clone or download this repository. The `example-data/` folder
contains everything you need to get started:
**PDFs:**
- `solar-system-overview.pdf` - Overview of our solar system (NASA)
- `jupiter-fact-sheet.pdf` - Detailed data about Jupiter (NASA)
- `solar-system-moons.pdf` - Guide to planetary moons (NASA)
**Images:**
- `earthrise.jpg` - Earth seen from lunar orbit, Apollo 8 (1968)
- `aldrin-moon.jpg` - Buzz Aldrin on the Moon, Apollo 11 (1969)
- `jupiter-great-red-spot.jpg` - Jupiter photographed by Voyager 1 (1979)
- `iss-over-earth.jpg` - The Moon seen from the ISS
**Descriptions:**
- `descriptions.md` - Detailed text descriptions of each image.
This is the most important file for image search quality.
See the section below on why descriptions matter.
All files are NASA public domain. No copyright restrictions.
## Step 2: Start Claude Code (5 minutes)
Open your terminal, navigate to this folder, and start Claude Code:
```
claude
```
Then copy the prompt from [prompts/01-setup.md](prompts/01-setup.md)
and paste it into Claude Code.
Claude Code will create the project structure and install
dependencies. When it is done, copy `.env.template` to `.env`
and fill in your API keys.
## Step 3: Ingest your files (10 minutes)
Copy the prompt from [prompts/02-ingest.md](prompts/02-ingest.md)
into Claude Code.
Claude Code will read each file, split it into chunks, generate
embeddings, and store everything in Pinecone. You will see a
summary of what was processed.
This is the step where your files become searchable.
## Step 4: Search (5 minutes)
Copy the prompt from [prompts/03-search.md](prompts/03-search.md)
into Claude Code.
Claude Code will build a web interface and start it. Open the URL
it gives you (usually `http://localhost:3333`) in your browser.
Try these searches:
| Search query | What should come back |
|---|---|
| "What is the largest planet?" | Jupiter fact sheet + Jupiter image |
| "First Moon landing" | Aldrin image + solar system overview |
| "Which moon has volcanoes?" | Moons PDF (mentioning Io) |
| "How far is Jupiter from Earth?" | Jupiter fact sheet (588.5 to 968.1 million km) |
| "What do astronauts see from orbit?" | ISS image description |
Notice how a single question can pull results from both PDFs and
images. That is multimodal search.
## Step 5: Make it your own
Now that you have seen it work with NASA files, try it with
your own content:
1. Add your own PDFs, images, or documents to the `example-data/` folder
2. Write descriptions for any images (see the tips in `descriptions.md`)
3. Use [prompts/04-improve.md](prompts/04-improve.md) to re-index
Ideas for what to search:
- Your company's internal documents
- Research papers for a project
- Travel photos with descriptions
- Recipe collections
- Course notes and textbook screenshots
## Why image descriptions matter
The search system cannot "see" your images directly. It finds
images through their text descriptions. This means:
**Bad description:** "Photo of a planet" will only match
searches containing "photo" or "planet."
**Good description:** "Full-disk portrait of Jupiter captured by
Voyager 1 in 1979, showing horizontal cloud bands and the Great
Red Spot, a massive storm larger than Earth" will match searches
about Jupiter, Voyager missions, storms, cloud patterns, and more.
The `descriptions.md` file in `example-data/` shows side-by-side
examples of bad versus good descriptions. Spending five minutes
on better descriptions will dramatically improve your search
results.
## What this costs
$0 extra if you already have a Claude subscription.
Both Gemini embeddings and Pinecone have generous free tiers.
See [costs.md](costs.md) for details.
## If you get stuck
See [troubleshooting.md](troubleshooting.md) for the 10 most
common problems and their solutions.
The most effective fix for almost anything: copy the exact error
message and paste it into Claude Code. It is very good at
diagnosing its own work.
## How it works (the deeper version)
Read [concepts.md](concepts.md) for plain-English explanations of:
- What are embeddings?
- What is a vector database?
- What is RAG?
- What is chunking?
- What does "multimodal" mean?
## Credits
Example data: All PDFs and images are from NASA and are in the
public domain (U.S. Government works, no copyright restrictions).
Built with:
- [Claude Code](https://claude.ai) by Anthropic (app building + AI answers)
- [Gemini Embedding 2](https://ai.google.dev/) by Google (multimodal embeddings)
- [Pinecone](https://www.pinecone.io/) (vector database)
---
*Part of [The Dharma Lab](https://thedharmalab.com). Read the
[full article](https://thedharmalab.com/) for the story behind this project.*

concepts.md (new file)

@@ -0,0 +1,94 @@
# Concepts: What You Need to Know (and Nothing More)
This page explains the key ideas behind multimodal search.
You do not need to understand these concepts to follow the guide.
But if you are curious about what is happening behind the scenes,
this is for you.
## What is an embedding?
Think of it as a fingerprint for meaning.
When you read the sentence "Jupiter is the largest planet," your brain
understands what it means. An embedding is a way for a computer to do
something similar. It converts text (or an image) into a long list of
numbers that captures the meaning of that content.
The key insight: things that mean similar things get similar numbers.
So "Jupiter is massive" and "Jupiter is the biggest planet" would have
very similar embeddings, even though the words are different.
You never see these numbers. They work behind the scenes.
## What is a vector database?
A place to store embeddings so you can search through them quickly.
Imagine a library where books are not organized by author or title,
but by what they are about. You walk in and say "I want something
about storms on other planets" and the librarian immediately hands
you the right book. That is what a vector database does, but with
your files.
We use Pinecone in this guide because it has a free tier and works
well. There are other options (Chroma, Weaviate, Qdrant), but
Pinecone requires the least setup.
## What is RAG?
RAG stands for Retrieval-Augmented Generation. Big name, simple idea.
Normally, when you ask an AI a question, it answers from its training
data. It might know general facts, but it does not know about YOUR
files. RAG changes that.
With RAG, the AI first searches through your documents to find
relevant information, then uses what it found to answer your question.
It is like giving the AI a cheat sheet of your own content before
it answers.
Without RAG: "What do we know about Jupiter's atmosphere?"
The AI answers from general knowledge.
With RAG: "What do we know about Jupiter's atmosphere?"
The AI searches your PDFs and images, finds the Jupiter fact sheet
and the Voyager photo, and answers based on YOUR specific collection.
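The "cheat sheet" step is literally string assembly: retrieved chunks get pasted in front of your question. A minimal TypeScript sketch (the instruction wording is made up; Claude Code will write its own):

```typescript
type Chunk = { sourceFile: string; text: string };

// Paste retrieved chunks in front of the user's question so the
// AI answers from YOUR files instead of general knowledge.
function buildRagPrompt(question: string, chunks: Chunk[]): string {
  const sources = chunks
    .map((c, i) => `[${i + 1}] (${c.sourceFile}) ${c.text}`)
    .join("\n");
  return (
    "Answer using ONLY the sources below, and cite them by number.\n\n" +
    `Sources:\n${sources}\n\nQuestion: ${question}`
  );
}
```
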
## What is chunking?
Your documents might be long. A 50-page PDF cannot be processed
as one piece. Chunking means splitting it into smaller sections
that the AI can work with.
Think of it like cutting a book into chapters. Each chapter gets
its own embedding. When you search, the system finds the right
chapter, not the whole book.
Claude Code handles chunking automatically. You do not need to
do anything.
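For the curious, chunking by word count takes only a few lines. A sketch (Claude Code's actual splitter may differ, for example by adding overlap between chunks so a sentence that straddles a boundary is still findable):

```typescript
// Split text into chunks of roughly `size` words each.
function chunkByWords(text: string, size = 500): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}
```

Each element of the returned array then gets its own embedding.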
## What does "multimodal" mean?
"Multi" means many. "Modal" means types.
Regular search works with text only. Multimodal search works with
text AND images AND PDFs AND videos. You can search across all
of them at once.
This is what makes this project interesting. You ask a question
in plain English, and the system searches through your PDFs,
images, and their descriptions to find the best answer, regardless
of what format the information is in.
## How does it all fit together?
1. You put files in a folder (PDFs, images with descriptions)
2. Claude Code builds a system that reads each file
3. Each piece of content gets converted to an embedding (a fingerprint)
4. The embeddings are stored in Pinecone (the vector database)
5. When you search, your question also gets converted to an embedding
6. Pinecone finds the stored embeddings most similar to your question
7. The matching content is shown to you (or fed to an AI for a detailed answer)
That is it. The rest is implementation details, and Claude Code
handles those for you.

costs.md (new file)

@@ -0,0 +1,67 @@
# What Does This Cost?
Short answer: you can do this entire guide for free or near-free.
## Claude Code
You need a Claude subscription that includes Claude Code.
- **Claude Pro ($20/month):** Includes Claude Code with usage limits.
- **Claude Max ($100/month or $200/month):** Higher limits.
If you already have a Claude subscription, there is no extra cost.
Claude Code handles building the app AND answering questions
based on your search results.
## Google Gemini Embedding 2 (for embeddings only)
We use Google's Gemini Embedding 2 to convert your content into
searchable embeddings. We do NOT use Gemini as a language model.
Claude does the thinking. Gemini just creates the fingerprints.
- **Free tier:** Available through Google AI Studio with rate limits
(1,500 requests per day)
- **Paid:** $0.20 per million tokens
For this guide with 7 example files, you will use roughly 10-20
requests total. The free tier is more than enough.
Why Gemini Embedding 2 and not Voyage (the embedding provider Anthropic recommends)?
Because Gemini Embedding 2 supports text, images, AND video natively.
Voyage supports text and images but not video. For a multimodal guide,
the broadest format support wins.
## Pinecone Vector Database (free tier)
Pinecone's free tier includes:
- **Free:** 1 project, 5 indexes, 100,000 vectors
- **No credit card required**
For this guide, we store about 15-30 vectors (one per chunk of
content). You could store thousands of documents and still stay
within the free tier.
## Total cost for this guide
| Service | What it does | Cost |
|---------|-------------|------|
| Claude Code | Builds the app, answers questions | Part of your subscription |
| Gemini Embedding 2 | Converts content to searchable vectors | Free (well within free tier) |
| Pinecone | Stores and searches vectors | Free (well within free tier) |
| **Total** | | **$0 extra** |
## What if you want to scale up?
If you move beyond the example and want to index thousands of
documents, here is what the costs look like:
| Scale | Gemini embeddings | Pinecone | Monthly total |
|-------|-------------------|----------|---------------|
| 100 documents | Free | Free | $0 |
| 1,000 documents | ~$0.50 | Free | ~$0.50 |
| 10,000 documents | ~$5 | Free | ~$5 |
| 100,000 documents | ~$50 | $70+ (paid plan) | ~$120 |
For most personal and small business use cases, you will stay
comfortably in the free tier.
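The Gemini column in the table above is plain token arithmetic. Assuming roughly 2,500 tokens per document (an assumption; a few pages of text each) at the $0.20-per-million-token rate:

```typescript
// Estimated embedding cost at $0.20 per million tokens.
// tokensPerDoc is an assumption; adjust for your own documents.
function embeddingCostUSD(docs: number, tokensPerDoc = 2500): number {
  const totalTokens = docs * tokensPerDoc;
  return (totalTokens / 1_000_000) * 0.2;
}

console.log(embeddingCostUSD(1_000));  // ~$0.50
console.log(embeddingCostUSD(10_000)); // ~$5
```
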

env.template (new file)

@@ -0,0 +1,10 @@
# Gemini Embedding 2 (from Google AI Studio)
# Get your key at: https://aistudio.google.com/ > "Get API key"
GOOGLE_API_KEY=
# Pinecone Vector Database
# Get your key at: https://app.pinecone.io/ > "API Keys"
PINECONE_API_KEY=
# Pinecone index name (the one you created in the dashboard)
PINECONE_INDEX=space-search

Binary file not shown (image, 228 KiB)

example-data/descriptions.md (new file)

@@ -0,0 +1,100 @@
# Image Descriptions
Good descriptions are the most important part of making images searchable.
The AI uses these descriptions to understand what each image shows.
Better descriptions lead to better search results.
Below are descriptions for each image in this folder. We include both
a "bad" and a "good" version so you can see the difference.
---
## earthrise.jpg
**Bad description:** "Photo of Earth from space."
**Good description:** "Earthrise, photographed by astronaut William Anders
during the Apollo 8 mission on December 24, 1968. Shows planet Earth
rising above the lunar horizon, with the grey, cratered surface of the
Moon in the foreground and the blackness of space behind. Earth appears
as a blue and white marble, partly in shadow, with visible cloud patterns
and ocean. This was the first photograph of Earth taken by a human from
lunar orbit. It became one of the most influential environmental
photographs ever taken."
**Why the good version works:** It includes the mission name (Apollo 8),
the date, the photographer, what is visible in the image, and why the
photo matters historically. A search for "first photo of Earth from the
Moon" or "Apollo 8" or "William Anders" will all find this image.
---
## aldrin-moon.jpg
**Bad description:** "Astronaut on the Moon."
**Good description:** "Buzz Aldrin working beside the Apollo 11 Lunar Module
Eagle on the surface of the Moon, July 20, 1969. Aldrin is wearing a white
spacesuit (A7L Extravehicular Mobility Unit) and is positioned near
scientific equipment deployed on the lunar surface. The Lunar Module is
visible behind him with its gold and silver thermal protection layers. The
grey lunar soil shows footprints and equipment tracks. Photographed by
mission commander Neil Armstrong. This was the first crewed Moon landing
in history."
**Why the good version works:** It names both astronauts, the mission,
the spacecraft, and the equipment visible. A search for "first Moon
landing equipment" or "Apollo 11 Lunar Module" or "Buzz Aldrin" will
find this image.
---
## jupiter-great-red-spot.jpg
**Bad description:** "Planet Jupiter."
**Good description:** "Full-disk color portrait of Jupiter captured by the
Voyager 1 spacecraft in 1979 during its flyby of the planet. Shows
Jupiter's distinctive horizontal cloud bands in shades of orange, brown,
and cream. The Great Red Spot, a massive storm larger than Earth that has
been raging for hundreds of years, is visible in the southern hemisphere.
Jupiter is the largest planet in our solar system, with a mass 318 times
that of Earth. It is a gas giant composed primarily of hydrogen and helium."
**Why the good version works:** It mentions the spacecraft (Voyager 1),
the Great Red Spot, the cloud bands, and key facts about Jupiter.
A search for "largest storm in the solar system" or "gas giant cloud
bands" or "Voyager Jupiter photos" will all find this image.
---
## iss-over-earth.jpg
**Bad description:** "Moon and Earth from space."
**Good description:** "The Moon photographed from the International Space
Station (ISS), showing a gibbous Moon suspended above Earth's atmosphere.
Earth's surface is visible in the lower portion of the image, covered with
clouds and showing the thin blue line of the atmosphere at the horizon.
The photo demonstrates the perspective astronauts have from the ISS,
orbiting approximately 400 kilometers (250 miles) above Earth's surface.
The ISS has been continuously occupied since November 2000 and serves
as a microgravity research laboratory."
**Why the good version works:** It describes what is actually in the frame
(the Moon seen from ISS, not the ISS itself), includes the orbital
altitude, and mentions the ISS as a research laboratory. A search for
"view from the space station" or "Moon from orbit" or "Earth's atmosphere
from space" will find this image.
---
## Tips for writing your own descriptions
1. **Name what you see.** People, places, objects, colors, positions.
2. **Add context.** When was it taken? By whom? Why does it matter?
3. **Include facts.** Numbers, dates, names. These make searches precise.
4. **Think about how someone would search.** What question would lead
to this image? Make sure your description contains those words.
5. **Be specific.** "A 12-mile-high cliff on Miranda, a moon of Uranus"
beats "a cliff on a moon" every time.

example-data/earthrise.jpg (binary file, 66 KiB; not shown)

Binary file not shown (63 KiB)

Binary file not shown

Binary file not shown (46 KiB)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

prompts/01-setup.md (new file)

@@ -0,0 +1,48 @@
# Prompt 1: Set Up the Project
Copy this into Claude Code after you have your API keys ready.
---
```
I want to build a multimodal search app. I have a folder of files
(PDFs, images with text descriptions) that I want to make searchable
using natural language.
Here is the tech stack I want:
- Google Gemini Embedding 2 for converting content to embeddings
(I have a Google AI Studio API key)
- Pinecone for storing the embeddings
(I have a Pinecone API key and an index called "space-search")
- A simple local web interface where I can type questions and
get results from my files
- Use Claude for answering questions based on the search results
(use my Claude Code subscription, not a separate API key)
My example files are in the folder: example-data/
That folder contains:
- 3 PDF files about the solar system, Jupiter, and planetary moons
- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
- A file called descriptions.md with detailed text descriptions
of each image
Please set up the project structure, install dependencies, and
create a .env.template file for the API keys. Use Node.js with
TypeScript. Do not start building the search logic yet, just
the project skeleton.
```
---
## What Claude Code will do
1. Create a new project folder with `package.json`
2. Install libraries for Gemini embeddings, Pinecone, and a web server
3. Create a `.env.template` with placeholders for your API keys
4. Set up TypeScript configuration
## What you do next
1. Copy `.env.template` to `.env`
2. Fill in your actual API keys
3. Move to Prompt 2

prompts/02-ingest.md (new file)

@@ -0,0 +1,67 @@
# Prompt 2: Ingest Your Files
Copy this into Claude Code after the project is set up and your
.env file has your API keys.
---
```
Now build the ingestion pipeline. I need a script that:
1. Reads each PDF in example-data/ and extracts the text content.
Split long documents into chunks of roughly 500 words each.
Keep track of which file and which section each chunk came from.
2. Reads the image descriptions from example-data/descriptions.md.
Use the "Good description" for each image (ignore the "Bad" ones).
Each image description becomes one chunk, linked to its image file.
3. For each chunk, generate an embedding using Google Gemini
Embedding 2 (model: gemini-embedding-exp-03-07 or the latest
available). Use task_type "RETRIEVAL_DOCUMENT" for all chunks.
4. Store each embedding in Pinecone along with metadata:
- source_file: the original filename
- content_type: "pdf" or "image"
- text: the actual text content of the chunk
- chunk_index: which chunk number within the file
5. After ingestion, print a summary: how many chunks were created,
how many embeddings stored, and any errors.
Run the ingestion script after building it. Show me the output.
```
---
## What Claude Code will do
1. Build a script that reads PDFs and extracts text
2. Parse the descriptions.md file for image descriptions
3. Send each chunk to Google Gemini for embedding
4. Store everything in Pinecone with metadata
5. Run the script and show results
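Step 2 (pulling only the "Good description" out of descriptions.md) is the least obvious part of the pipeline. One way the generated parser might look, as an illustrative sketch that assumes the `## filename` / `**Good description:**` layout used in this repo's descriptions.md:

```typescript
type ImageDescription = { image: string; description: string };

// Extract the "Good description" from each "## <image>" section,
// skipping the "Bad" examples and the tips section at the end.
function parseGoodDescriptions(md: string): ImageDescription[] {
  const out: ImageDescription[] = [];
  for (const section of md.split(/^## /m).slice(1)) {
    const image = section.split("\n", 1)[0].trim();
    if (!/\.(jpe?g|png)$/i.test(image)) continue; // not an image heading
    const m = section.match(/\*\*Good description:\*\*\s*"?([\s\S]*?)"?\s*\n\*\*Why/);
    if (m) out.push({ image, description: m[1].replace(/\s+/g, " ").trim() });
  }
  return out;
}
```

Each returned entry becomes one chunk, linked to its image file.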
## What to expect
You should see output like:
```
Processing solar-system-overview.pdf... 3 chunks
Processing jupiter-fact-sheet.pdf... 4 chunks
Processing solar-system-moons.pdf... 3 chunks
Processing earthrise.jpg (from descriptions)... 1 chunk
Processing aldrin-moon.jpg (from descriptions)... 1 chunk
Processing jupiter-great-red-spot.jpg (from descriptions)... 1 chunk
Processing iss-over-earth.jpg (from descriptions)... 1 chunk
Total: 14 chunks ingested, 14 embeddings stored in Pinecone.
```
The exact numbers may vary depending on how Claude Code splits the PDFs.
## If something goes wrong
- "API key invalid": check your .env file
- "Index not found": make sure your Pinecone index name matches
- "Rate limit": wait a minute and run the script again

prompts/03-search.md (new file)

@@ -0,0 +1,67 @@
# Prompt 3: Build the Search Interface
Copy this into Claude Code after ingestion is complete.
---
```
Now build a simple web interface for searching my content.
I want a local web app (localhost) with:
1. A search box where I type a question in plain English.
2. When I search, the app should:
a. Convert my question to an embedding using Gemini Embedding 2
with task_type "RETRIEVAL_QUERY"
b. Search Pinecone for the 5 most similar chunks
c. Show me the matching results with:
- The source file name
- The relevant text snippet
- A similarity score
- If the result is from an image, show the image too
3. Below the search results, add a "Ask Claude" button that:
a. Takes the search results as context
b. Sends them to Claude (use the claude CLI command, since I
have Claude Code installed) with my question
c. Shows Claude's answer, which should reference the specific
files it used
Keep the design simple and clean. Dark background, readable text.
No frameworks needed, just HTML + CSS + a small server.
Start the app after building it.
```
---
## What Claude Code will do
1. Build a small web server (Express or similar)
2. Create an HTML page with a search box
3. Wire up the search to Gemini embeddings + Pinecone
4. Add Claude integration for AI-powered answers
5. Start the server
## What to try
Once the app is running, try these searches:
- "What is the largest planet in our solar system?"
(Should find the Jupiter fact sheet AND the Jupiter image)
- "Tell me about the first Moon landing"
(Should find the Aldrin image description)
- "Which moon has active volcanoes?"
(Should find the moons PDF mentioning Io)
- "How far is Jupiter from Earth?"
(Should find specific numbers from the Jupiter fact sheet)
- "What can astronauts see from the space station?"
(Should find the ISS image description)
These examples demonstrate the power of multimodal search:
a question about "the largest planet" finds both text data
AND a photograph, because both are semantically related.
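Step 2b ("the 5 most similar chunks") is a rank-and-slice that Pinecone performs server-side. The result handling in the generated app will look something like this sketch (the `Match` shape is hypothetical):

```typescript
type Match = { sourceFile: string; text: string; score: number };

// Keep the k highest-scoring matches, best first,
// without mutating the input array.
function topK(matches: Match[], k = 5): Match[] {
  return [...matches].sort((a, b) => b.score - a.score).slice(0, k);
}
```
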

prompts/04-improve.md (new file)

@@ -0,0 +1,69 @@
# Prompt 4: Improve and Iterate
These are prompts for when you want to make the search better.
Use whichever ones are relevant to you.
---
## If search results are not relevant enough
```
The search results are not matching my questions well.
Can you add a re-ranking step? After getting the top 10 results
from Pinecone, use Claude to re-rank them by relevance to my
actual question, then show the top 5.
```
## If you want to add your own files
```
I want to add more files to the search index. I have put new
files in example-data/. Please:
1. Check which files are new (not already in Pinecone)
2. Process only the new files
3. Add their embeddings to the existing index
4. Show me what was added
```
## If you want better image descriptions
```
Look at the images in example-data/ and suggest improved
descriptions for any that could be more detailed. Show me
the current description and your suggested improvement
side by side. Do not update descriptions.md until I approve.
```
## If you want to add a video clip
```
I have added a short video file (MP4, under 2 minutes) to
example-data/. Please:
1. Extract key frames from the video
2. Generate descriptions for each key frame
3. Add these descriptions as searchable chunks in Pinecone
4. Link them back to the video file with timestamps
```
## If you want to export or share
```
I want to share this search app with someone else. Can you:
1. Create a README with setup instructions
2. Make sure the .env.template has all required variables
3. Add a "first run" script that handles ingestion automatically
4. Package it so someone can clone the repo and get started
with just their own API keys
```
## If something is broken
```
[Paste the exact error message here]
This happened when I tried to [describe what you did].
Can you diagnose and fix the issue?
```
The key to good results with Claude Code: be specific about
what you see and what you expected to see instead.

troubleshooting.md (new file)

@@ -0,0 +1,96 @@
# Troubleshooting
Common problems and how to fix them.
## 1. "API key not found" or "Invalid API key"
**What happened:** The .env file is missing, in the wrong place,
or the key was copied incorrectly.
**Fix:** Open the .env file in your project folder. Check that:
- The file is named exactly `.env` (not `env.txt` or `.env.txt`)
- There are no spaces around the `=` sign
- The key is complete (no missing characters from copy-paste)
Tell Claude Code: "Check if my .env file is set up correctly."
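Those three checks can also be done mechanically. If you are curious what "check my .env" amounts to, here is an illustrative sketch (Claude Code writes its own version):

```typescript
// Report the most common problems in the raw text of a .env file.
function checkEnvText(envText: string): string[] {
  const problems: string[] = [];
  envText.split("\n").forEach((line, i) => {
    const t = line.trim();
    if (t === "" || t.startsWith("#")) return; // skip blanks and comments
    const eq = t.indexOf("=");
    if (eq === -1) {
      problems.push(`line ${i + 1}: no '=' found`);
      return;
    }
    if (/\s=|=\s/.test(t)) problems.push(`line ${i + 1}: spaces around '='`);
    if (t.slice(eq + 1).trim() === "")
      problems.push(`line ${i + 1}: '${t.slice(0, eq).trim()}' has no value`);
  });
  return problems;
}
```
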
## 2. "Module not found" or dependency errors
**What happened:** The project dependencies were not installed.
**Fix:** Tell Claude Code: "Install the project dependencies."
It will run the right install command for you.
## 3. Image search returns bad results
**What happened:** The image descriptions are too vague.
**Fix:** Open `descriptions.md` and improve the descriptions.
Compare your descriptions to the "good" examples in that file.
Then tell Claude Code: "Re-index the images with the updated
descriptions."
## 4. PDF content is not showing up in search
**What happened:** The PDF might be scanned (image-based)
rather than text-based, or it might be too large.
**Fix:** Tell Claude Code: "Check if the PDFs contain extractable
text." If a PDF is image-based, Claude Code can use OCR to
extract the text.
## 5. Search is slow
**What happened:** This is normal for the first search after
starting the app. The system needs to load.
**Fix:** Wait a few seconds. Subsequent searches will be faster.
If it stays slow, tell Claude Code: "The search is slow. Can you
optimize the query?"
## 6. "Rate limit exceeded" from Google
**What happened:** You sent too many requests too quickly.
**Fix:** Wait one minute and try again. The free tier allows
1,500 requests per day but has a per-minute limit too. For the
example files in this guide, you should never hit this limit.
## 7. Pinecone index not found
**What happened:** The index name in your code does not match
what you created in Pinecone.
**Fix:** Log into pinecone.io, check your index name, and tell
Claude Code: "My Pinecone index is named [your-index-name].
Update the configuration."
## 8. Claude Code asks me something I do not understand
**What happened:** Claude Code sometimes asks technical questions
about implementation choices.
**Fix:** You can safely answer with: "Use the simplest option"
or "You decide, I trust your judgment." Claude Code will pick
sensible defaults.
## 9. The app works but results are not great
**What happened:** Search quality depends on the descriptions
and the chunking strategy.
**Fix:** Three things to try:
1. Improve your image descriptions (this makes the biggest difference)
2. Tell Claude Code: "The search results are not relevant enough.
Can you adjust the chunking or add re-ranking?"
3. Try rephrasing your search query in different ways
## 10. Something else is broken
Tell Claude Code exactly what you see. Copy-paste the error message.
Claude Code is very good at diagnosing and fixing problems when
you give it the exact error text.
If you are stuck, you can also start fresh: tell Claude Code
"Let's start over. Delete the current search index and re-build
everything from scratch."