Prompt-driven guide for building multimodal search using Gemini Embedding 2 + Pinecone + Claude Code. Includes example data (NASA public domain), step-by-step prompts, concepts explainer, cost breakdown, and troubleshooting guide.
Prompt 1: Set Up the Project
Copy this into Claude Code after you have your API keys ready. A sketch of the project skeleton you can expect follows the prompt.
I want to build a multimodal search app. I have a folder of files
(PDFs, images with text descriptions) that I want to make searchable
using natural language.
Here is the tech stack I want:
- Google Gemini Embedding 2 for converting content to embeddings
(I have a Google AI Studio API key)
- Pinecone for storing the embeddings
(I have a Pinecone API key and an index called "space-search")
- A simple local web interface where I can type questions and
get results from my files
- Use Claude for answering questions based on the search results
(use my Claude Code subscription, not a separate API key)
My example files are in the folder: example-data/
That folder contains:
- 3 PDF files about the solar system, Jupiter, and planetary moons
- 4 JPG images (Earthrise, Moon landing, Jupiter, ISS)
- A file called descriptions.md with detailed text descriptions
of each image
Please set up the project structure, install dependencies, and
create a .env.template file for the API keys. Use Node.js with
TypeScript. Do not start building the search logic yet, just
the project skeleton.
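If you are curious what to expect, the skeleton will look roughly like the tree below. This layout is an assumption (Claude Code chooses the actual structure), and the PDF and JPG file names under example-data/ are illustrative; only descriptions.md is named in the prompt:

```
multimodal-search/
├── package.json
├── tsconfig.json
├── .env.template
├── src/
│   └── server.ts
└── example-data/
    ├── solar-system.pdf
    ├── jupiter.pdf
    ├── planetary-moons.pdf
    ├── earthrise.jpg
    ├── moon-landing.jpg
    ├── jupiter.jpg
    ├── iss.jpg
    └── descriptions.md
```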
What Claude Code will do
- Create a new project folder with package.json
- Install libraries for Gemini embeddings, Pinecone, and a web server
- Create a .env.template with placeholders for your API keys (an example is sketched below)
- Set up TypeScript configuration
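For reference, a plausible .env.template might look like this. The variable names here are assumptions; use whatever names Claude Code actually writes into the file:

```
# Google AI Studio key, used for Gemini Embedding 2 requests
GOOGLE_API_KEY=your-google-ai-studio-key

# Pinecone key plus the index you created earlier
PINECONE_API_KEY=your-pinecone-key
PINECONE_INDEX=space-search
```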
What you do next
- Copy .env.template to .env
- Fill in your actual API keys (the snippet below shows a quick way to confirm they load)
- Move to Prompt 2
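Before moving on, it is worth confirming the keys actually load. Here is a minimal sketch, assuming the project uses dotenv and the variable names from the .env.template example above (both assumptions):

```typescript
// check-env.ts: hypothetical helper; variable names are assumptions
import "dotenv/config";

const required = ["GOOGLE_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX"];
const missing = required.filter((name) => !process.env[name]);

if (missing.length > 0) {
  console.error(`Missing values in .env: ${missing.join(", ")}`);
  process.exit(1);
}

console.log("All keys present. Ready for Prompt 2.");
```

Run it with npx tsx check-env.ts (or compile with tsc first). If anything is reported missing, fix .env before continuing.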