Initial commit: multimodal RAG guide with Claude Code
Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.
commit edcd1721df
19 changed files with 4446 additions and 0 deletions
67
prompts/03-search.md
Normal file
@@ -0,0 +1,67 @@
# Prompt 3: Build the Search Interface

Copy this into Claude Code after ingestion is complete.

---

```
Now build a simple web interface for searching my content.
I want a local web app (localhost) with:

1. A search box where I type a question in plain English.

2. When I search, the app should:
   a. Convert my question to an embedding using Gemini Embedding 2
      with task_type "RETRIEVAL_QUERY"
   b. Search Pinecone for the 5 most similar chunks
   c. Show me the matching results with:
      - The source file name
      - The relevant text snippet
      - A similarity score
      - If the result is from an image, show the image too

3. Below the search results, add an "Ask Claude" button that:
   a. Takes the search results as context
   b. Sends them to Claude (use the claude CLI command, since I
      have Claude Code installed) with my question
   c. Shows Claude's answer, which should reference the specific
      files it used

Keep the design simple and clean. Dark background, readable text.
No frameworks needed, just HTML + CSS + a small server.

Start the app after building it.
```

---
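
The search flow requested in step 2 of the prompt can be pictured with a small sketch. This is not the code Claude Code will generate; it is a minimal illustration in Python, with the embedding call and the Pinecone query passed in as plain functions so it runs without API keys. The function names, the metadata keys (`source`, `text`, `type`, `path`), and the file names are all assumptions made up for this example.

```python
from typing import Callable, Sequence

# Hypothetical aliases for this sketch; real SDK objects will look different.
Vector = Sequence[float]
Match = dict  # assumed keys: "score" and "metadata"


def search(
    question: str,
    embed_query: Callable[[str], Vector],
    query_index: Callable[[Vector, int], list],
    top_k: int = 5,
) -> list:
    """Embed the question, fetch the top_k nearest chunks, shape them for display."""
    vector = embed_query(question)        # step 2a: question -> embedding
    matches = query_index(vector, top_k)  # step 2b: nearest-neighbour lookup
    results = []
    for match in matches:                 # step 2c: fields the page will render
        meta = match.get("metadata", {})
        result = {
            "source": meta.get("source", "unknown"),
            "snippet": meta.get("text", ""),
            "score": round(float(match["score"]), 3),
        }
        # Only image chunks carry a path the page can show as an image.
        if meta.get("type") == "image":
            result["image"] = meta.get("path")
        results.append(result)
    return results


# Stand-ins for the real embedding and vector-store calls (assumptions).
def fake_embed(text: str) -> Vector:
    return [float(len(text)), 1.0]


def fake_index(vector: Vector, top_k: int) -> list:
    stored = [
        {"score": 0.91, "metadata": {"source": "jupiter.pdf", "type": "text",
                                     "text": "Jupiter is the largest planet."}},
        {"score": 0.87, "metadata": {"source": "jupiter.jpg", "type": "image",
                                     "text": "Photo of Jupiter.",
                                     "path": "data/jupiter.jpg"}},
    ]
    return stored[:top_k]


results = search("What is the largest planet?", fake_embed, fake_index)
for r in results:
    print(r)
```

In a real app, `fake_embed` would be replaced by a Gemini embedding call using the `RETRIEVAL_QUERY` task type, and `fake_index` by a Pinecone query with `top_k=5`.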

## What Claude Code will do

1. Build a small web server (Express or similar)
2. Create an HTML page with a search box
3. Wire up the search to Gemini embeddings + Pinecone
4. Add Claude integration for AI-powered answers
5. Start the server
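
To make step 4 more concrete, here is a rough sketch of how search results might be turned into a prompt and handed to the `claude` CLI. It is written in Python for illustration only (the generated server may well be Node), and it assumes that `claude -p <prompt>` runs once in non-interactive print mode and writes the answer to stdout. The function names, result keys, and sample data are hypothetical.

```python
import subprocess


def build_prompt(question: str, results: list) -> str:
    """Turn search results into a numbered context block plus the question."""
    parts = []
    for i, r in enumerate(results, start=1):
        parts.append(f"[{i}] {r['source']}\n{r['snippet']}")
    context = "\n\n".join(parts)
    return (
        "Answer the question using only the sources below. "
        "Name the source files you relied on.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


def ask_claude(question: str, results: list, timeout: int = 120) -> str:
    """Call the claude CLI once and return its text output."""
    prompt = build_prompt(question, results)
    # Assumption: `claude -p` prints a single response and exits.
    completed = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout.strip()


# Sample data invented for this sketch; only the prompt is built here,
# so nothing is sent to Claude when the example runs.
example_results = [
    {"source": "moons.pdf", "snippet": "Io is the most volcanically active body."},
    {"source": "jupiter.pdf", "snippet": "Jupiter has dozens of known moons."},
]
prompt = build_prompt("Which moon has active volcanoes?", example_results)
print(prompt)
```

Passing the command as a list rather than a shell string keeps the user's question from being interpreted by the shell, which matters once the text comes from a web form.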

## What to try

Once the app is running, try these searches:

- "What is the largest planet in our solar system?"
  (Should find the Jupiter fact sheet AND the Jupiter image)

- "Tell me about the first Moon landing"
  (Should find the Aldrin image description)

- "Which moon has active volcanoes?"
  (Should find the moons PDF mentioning Io)

- "How far is Jupiter from Earth?"
  (Should find specific numbers from the Jupiter fact sheet)

- "What can astronauts see from the space station?"
  (Should find the ISS image description)

These examples demonstrate the power of multimodal search:
a question about "the largest planet" finds both text data
AND a photograph, because both are semantically related.
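
One way to picture "semantically related" is as closeness between vectors in a shared embedding space. The toy example below assumes cosine similarity as the metric and uses invented 3-dimensional vectors; real embeddings have far more dimensions, and the numbers here are not taken from any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration: text and image chunks share one space.
chunks = {
    "Jupiter fact sheet (text)": [0.9, 0.1, 0.0],
    "Jupiter photo (image)": [0.8, 0.2, 0.1],
    "ISS photo (image)": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]  # stands in for "the largest planet"

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query, chunks[name]),
    reverse=True,
)
for name in ranked:
    print(f"{cosine_similarity(query, chunks[name]):.3f}  {name}")
```

Both Jupiter entries rank above the ISS photo even though one is text and the other is an image, which is the behaviour the example searches are meant to show.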