multimodal-rag-guide/prompts/01-setup.md
Kjell Tore Guttormsen, edcd1721df, initial commit (2026-03-12 16:36:22 +01:00):
Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.


# Prompt 1: Set Up the Project

Once you have your API keys ready, copy the prompt below into Claude Code.


```
I want to build a multimodal search app. I have a folder of files
(PDFs, images with text descriptions) that I want to make searchable
using natural language.

Here is the tech stack I want:
- Google Gemini Embedding 2 for converting content to embeddings
  (I have a Google AI Studio API key)
- Pinecone for storing the embeddings
  (I have a Pinecone API key and an index called "space-search")
- A simple local web interface where I can type questions and
  get results from my files
- Use Claude for answering questions based on the search results
  (use my Claude Code subscription, not a separate API key)

My example files are in the folder: example-data/
That folder contains:
- 3 PDF files about the solar system, Jupiter, and planetary moons
- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
- A file called descriptions.md with detailed text descriptions
  of each image

Please set up the project structure, install dependencies, and
create a .env.template file for the API keys. Use Node.js with
TypeScript. Do not start building the search logic yet, just
the project skeleton.
```

## What Claude Code will do

  1. Create a new project folder with package.json
  2. Install libraries for Gemini embeddings, Pinecone, and a web server
  3. Create a .env.template with placeholders for your API keys
  4. Set up TypeScript configuration
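
The generated `.env.template` might look something like this. The exact variable names are an assumption, not something this guide dictates; use whatever names Claude Code actually generates:

```bash
# .env.template -- copy to .env and replace the placeholder values
GEMINI_API_KEY=your-gemini-key-here      # from Google AI Studio
PINECONE_API_KEY=your-pinecone-key-here  # from the Pinecone console
PINECONE_INDEX=space-search              # the index named in the prompt
PORT=3000                                # port for the local web interface
```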

## What you do next

  1. Copy .env.template to .env
  2. Fill in your actual API keys
  3. Move to Prompt 2
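
Before moving on, it can help to sanity-check that the keys in `.env` were actually filled in, so later prompts don't fail with confusing authentication errors. A minimal sketch in TypeScript; the function and key names here are assumptions for illustration, so match them to your actual `.env.template`:

```typescript
// Sanity-check .env values before making any API calls.
type Env = Record<string, string | undefined>;

const REQUIRED_KEYS = ["GEMINI_API_KEY", "PINECONE_API_KEY"];

// Returns the keys that are absent or still contain a placeholder value.
function missingKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return !value || value.startsWith("your-");
  });
}

// In the real app you would call missingKeys(process.env) at startup
// and exit with a clear error message if anything is still unset.
const example: Env = { GEMINI_API_KEY: "AIza-example", PINECONE_API_KEY: undefined };
console.log(missingKeys(example)); // the Pinecone key still needs a value
```

Failing fast at startup like this is cheaper than debugging a half-built search pipeline in Prompt 2.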