`multimodal-rag-guide/prompts/01-setup.md`
Kjell Tore Guttormsen · edcd1721df · Initial commit: multimodal RAG guide with Claude Code (2026-03-12 16:36:22 +01:00)

Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.


# Prompt 1: Set Up the Project

Copy this into Claude Code after you have your API keys ready.

---
```
I want to build a multimodal search app. I have a folder of files
(PDFs, and images with text descriptions) that I want to make
searchable using natural language.

Here is the tech stack I want:

- Google Gemini Embedding 2 for converting content to embeddings
  (I have a Google AI Studio API key)
- Pinecone for storing the embeddings
  (I have a Pinecone API key and an index called "space-search")
- A simple local web interface where I can type questions and
  get results from my files
- Claude for answering questions based on the search results
  (use my Claude Code subscription, not a separate API key)

My example files are in the folder: example-data/
That folder contains:

- 3 PDF files about the solar system, Jupiter, and planetary moons
- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
- A file called descriptions.md with detailed text descriptions
  of each image

Please set up the project structure, install dependencies, and
create a .env.template file for the API keys. Use Node.js with
TypeScript. Do not start building the search logic yet, just
the project skeleton.
```
---

## What Claude Code will do

1. Create a new project folder with `package.json`
2. Install libraries for Gemini embeddings, Pinecone, and a web server
3. Create a `.env.template` with placeholders for your API keys
4. Set up TypeScript configuration
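
The generated `.env.template` might look something like this. The exact variable names are up to Claude Code; `GOOGLE_API_KEY` and `PINECONE_API_KEY` here are assumptions, and `PINECONE_INDEX` simply records the index name from the prompt:

```
# Google AI Studio key for Gemini Embedding requests
GOOGLE_API_KEY=your-google-ai-studio-key-here

# Pinecone key and the index created for this guide
PINECONE_API_KEY=your-pinecone-key-here
PINECONE_INDEX=space-search
```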

## What you do next

1. Copy `.env.template` to `.env`
2. Fill in your actual API keys
3. Move to Prompt 2
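
As a quick sanity check after filling in `.env`, a small helper like the sketch below can confirm the keys are actually set before you continue. This is hypothetical, not part of the generated project, and it assumes the template uses the variable names `GOOGLE_API_KEY` and `PINECONE_API_KEY`:

```typescript
// Hypothetical sanity check: list any required API keys that are
// missing or blank. Assumes the .env.template defines GOOGLE_API_KEY
// and PINECONE_API_KEY (adjust to whatever names Claude Code chose).
const REQUIRED_KEYS = ["GOOGLE_API_KEY", "PINECONE_API_KEY"];

function missingKeys(env: Record<string, string | undefined>): string[] {
  // A key counts as missing if it is absent, empty, or whitespace-only.
  return REQUIRED_KEYS.filter((key) => !env[key] || env[key]!.trim() === "");
}

// Run directly (e.g. with ts-node after loading .env) to report gaps.
const missing = missingKeys(process.env);
if (missing.length > 0) {
  console.error(`Missing API keys in .env: ${missing.join(", ")}`);
} else {
  console.log("All API keys are set.");
}
```

If any key is reported missing, recheck your `.env` before moving on, since the embedding and upsert steps in later prompts will fail without valid credentials.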