Build a Vector Store in n8n (Embeddings for RAG)
The cheapest part of RAG is the index. Here is exactly how to build it.
AI-drafted, reviewed by Muhammad Qasim Hammad on June 16, 2026. See our AI disclosure.
Table of contents
- What is a vector store, and why does RAG need one?
- Simple, Supabase, or another backend: which should you use?
- How much do embeddings and storage cost?
- How do you load documents into the store?
- How do you build it in n8n, step by step?
- Where can this go wrong?
- Which vector store should you use? (Decision guide)
- What should you set up this weekend?
Your help docs, past support tickets, and PDF guides are sitting in folders where no AI can find them, and keyword search misses anything a customer phrased differently. An n8n vector store converts each document into an embedding and retrieves by meaning, and building the entire index costs about 1.3 cents per thousand documents.
The symptom is a support bot that answers from its training data instead of your actual product. Or you copy-paste context into every Claude prompt by hand. Or you told yourself "vector databases are complicated." None of that holds once you see the node wiring.
This post covers how a vector store works inside n8n, which backend to pick, what the index actually costs (almost nothing), and the exact node wiring to load your documents into it.
What is a vector store, and why does RAG need one?
A vector store is the retrieval half of RAG. It holds each document chunk as a list of numbers (the embedding) that encodes its meaning. At query time it embeds your question the same way and returns the chunks whose vectors sit closest. That is how it matches "cancellation policy" to "how do I stop my subscription."
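To make "closest vectors" concrete, here is a minimal sketch of retrieval by similarity. It assumes cosine similarity as the distance measure (a common choice, but the metric your backend uses may differ), and the three-number vectors are invented for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings", made up for this example only.
chunks = {
    "how do I stop my subscription": [0.9, 0.1, 0.0],
    "how do I reset my password": [0.1, 0.9, 0.1],
    "shipping times for Europe": [0.0, 0.2, 0.9],
}

def top_k(query_vector, store, k=1):
    """Return the k chunk texts whose vectors sit closest to the query."""
    ranked = sorted(
        store,
        key=lambda text: cosine_similarity(query_vector, store[text]),
        reverse=True,
    )
    return ranked[:k]

# Pretend this is the embedding of "cancellation policy".
query = [0.8, 0.2, 0.1]
best = top_k(query, chunks)
print(best)  # ['how do I stop my subscription']
```

The query shares no keywords with the winning chunk. It wins only because the two vectors point in nearly the same direction.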
Keyword search misses synonyms, paraphrases, and anything phrased unexpectedly. A vector store trades exact-match precision for semantic breadth. For question-answering over your own documents, that trade is almost always worth it.
n8n ships with a Simple Vector Store node (in-memory, four modes) and dedicated nodes for persistent backends including Supabase Vector Store, Pinecone, Qdrant, and PGVector, according to the n8n vector store node docs. Every one of them requires an Embeddings sub-node to convert text into vectors before storing or querying.
Simple, Supabase, or another backend: which should you use?
The Simple Vector Store is zero-setup and useful for a five-minute test. The Supabase Vector Store is the right choice for anything real: it writes to a Postgres database with pgvector, survives instance restarts, and is inspectable with plain SQL.
The Simple Vector Store node docs are explicit: the store is not persistent, and all instance users can read it. That makes it wrong for production or for anything containing private data. The Supabase Vector Store adds an Update Documents mode on top of the standard four modes (Get Many, Insert Documents, Retrieve Documents As Vector Store for Chain, Retrieve Documents As Tool for AI Agent), giving you five modes total.
Other backends follow the same pattern with different storage: Pinecone is a managed cloud index, Qdrant is self-hostable, PGVector is Postgres without the Supabase wrapper. Pick based on where you already run infrastructure.
How much do embeddings and storage cost?
Embedding 1,000 documents costs about 1.3 cents with OpenAI text-embedding-3-small, and storing them on Supabase free tier costs $0. The index is nearly free. The real, recurring cost of RAG is the generation model (Claude, GPT-4o, etc.) that reads the retrieved chunks at query time.
The math is checkable by hand. A "document" here means one chunk of roughly 500 words, about 650 tokens. Embeddings charge input tokens only, no output. Embedding 1,000 such chunks uses 0.65 million tokens.
| Embedding model | Price per 1M tokens | Cost per 1,000 docs | Best for |
|---|---|---|---|
| OpenAI text-embedding-3-small | $0.02 | $0.013 (~1.3 cents) | Default: cheap, accurate, 1536-dim |
| OpenAI text-embedding-3-large | $0.13 | $0.085 (~8.5 cents) | Higher retrieval accuracy, larger corpora |
| Self-hosted (Ollama / Hugging Face) | $0 in tokens | $0 in tokens | You pay compute, not per call |
Prices are from OpenAI's pricing page as of June 2026. Anthropic has no embeddings API, which is why the hosted models listed here are OpenAI's.
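If you want to rerun the arithmetic for your own corpus, a short sketch like this one does it. The prices and the 650-tokens-per-chunk figure come straight from this post; the function name and structure are just one way to lay it out.

```python
TOKENS_PER_CHUNK = 650  # roughly 500 words, the assumption used in this post

# USD per 1M input tokens, copied from the table above.
PRICE_PER_MILLION = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
}

def embedding_cost(num_chunks, model, tokens_per_chunk=TOKENS_PER_CHUNK):
    """Cost in USD to embed num_chunks chunks once (input tokens only)."""
    total_tokens = num_chunks * tokens_per_chunk
    return total_tokens / 1_000_000 * PRICE_PER_MILLION[model]

small = embedding_cost(1000, "text-embedding-3-small")
large = embedding_cost(1000, "text-embedding-3-large")
print(f"small: ${small:.4f}, large: ${large:.4f}")
```

Swap in your real chunk count and average chunk length to get a number for your own index.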
Supabase free tier stores vectors at $0 within the 500 MB database limit, per supabase.com/pricing. Beyond that limit, you pay for database size, not for the embeddings themselves. Self-hosted n8n (Community edition, under the sustainable-use license) is $0; n8n Cloud Starter is €20/month per n8n.io/pricing.
How do you load documents into the store?
The ingestion flow in n8n has three stages: load and split the source documents, embed each chunk, then insert the vectors. A Document Loader node reads the source (file, URL, or plain text). A Text Splitter node breaks it into chunks. The Vector Store node in Insert Documents mode calls its Embeddings sub-node per chunk and writes the result.
Chunk size matters. Chunks that are too large dilute the match because a single vector has to represent too many ideas. Chunks that are too small lose context across sentence boundaries. Start at roughly 500 words and test retrieval on real questions. You can pre-condense very long documents with n8n's AI summarization workflow before feeding them here.
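To see what chunk size and overlap do to a document, here is a deliberately simplified splitter. It slices fixed character windows and is not the algorithm behind n8n's Recursive Character Text Splitter, which also tries to break on paragraph and sentence boundaries. The defaults of 2,000 characters and 200 characters of overlap mirror the settings used in the build steps in this post.

```python
def split_with_overlap(text, chunk_size=2000, overlap=200):
    """Slice text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "x" * 5000  # stand-in for a 5,000-character document
pieces = split_with_overlap(sample)
print([len(p) for p in pieces])  # [2000, 2000, 1400]
```

Each chunk repeats the last 200 characters of the one before it, so a sentence that straddles a boundary still appears whole in at least one chunk.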
If your source is a structured file (invoice, form, ticket export), document extraction pulls specific fields rather than semantic chunks. That is a different pattern. Vector stores shine on free-form text: help articles, email threads, PDFs, transcripts.
How do you build it in n8n, step by step?
This is the full node wiring, from an empty canvas to a working index you can query. The build has six steps: an insert-mode Vector Store, an OpenAI Embeddings sub-node, a loader and splitter to chunk your documents, the insert run, a second Vector Store node in retrieval mode, and a test query. Every label below is exactly what you will see in n8n.
Step 1: Add a Supabase Vector Store node in Insert Documents mode. Open a blank workflow. Add the Supabase Vector Store node. Set Operation Mode to Insert Documents. Connect your Supabase credentials (URL + service role key). Enter the target table in the Table Name field; if the table does not exist in your Supabase project yet, create it there first.
Step 2: Attach an OpenAI Embeddings sub-node. Inside the Vector Store node, open the Embeddings sub-node slot and add Embeddings OpenAI. Set the model to text-embedding-3-small. Connect your OpenAI API key.
Step 3: Connect a Document Loader and Text Splitter. Attach a Default Data Loader (or Binary Input Loader for files) to the Vector Store's Document input, then attach a Recursive Character Text Splitter to the loader. Set Chunk Size to around 2,000 characters (roughly 350 to 400 English words; go closer to 3,000 characters if you want the ~500-word chunks assumed in the cost math) and Chunk Overlap to 200 characters to preserve boundary context.
Step 4: Run the insert to build the index. Execute the workflow. Watch the Supabase table populate with rows, each containing the chunk text, its embedding vector, and any metadata. This is your index.
Step 5: Add a second Vector Store node in retrieval mode. Add a second Supabase Vector Store node. Set its mode to Retrieve Documents (As Vector Store for Chain) to use it inside an AI chain, or Get Many for a standalone query. Attach the same Embeddings OpenAI sub-node with text-embedding-3-small.
Step 6: Test retrieval with a real question. Wire a trigger (manual or webhook) that passes a question string to the retrieval node. Inspect the returned chunks and their similarity scores. If the top result is not relevant, check your chunk size and confirm the embedding model matches the insert model.
The full retrieval-and-generation layer (where Claude reads the chunks and produces an answer) is the n8n RAG support bot. That post picks up exactly where this one ends. For controlling the per-query token cost of the generation model, see the Claude API cost control guide. To compare Claude, GPT-4o, and Gemini as generation models, the AI agents cost and speed test has the numbers.
Where can this go wrong?
The most common mistake is using the Simple Vector Store and assuming the data persists. It does not. Everything in memory vanishes the moment n8n restarts, and anyone else on the same instance can read what you stored. Move to Supabase before you show this to a client or store anything sensitive.
The second mistake is an embedding-model mismatch. If you embed documents with text-embedding-3-small and then query with text-embedding-3-large, you are comparing 1536-dimensional vectors to 3072-dimensional ones. Depending on the backend, that either fails on the dimension mismatch or, worse, when two different models happen to share a dimension, returns irrelevant chunks with no error message. Lock the model name in a credential note or an n8n sticky note and check it before adding any new workflow step that touches the store.
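One way to catch a mismatch before it reaches the store is an explicit check on the model name and the vector length. The sketch below is a standalone illustration, not something n8n runs for you; the function name is hypothetical, and the two dimension values are the ones quoted in this post.

```python
# Default output sizes for the two OpenAI models discussed in this post.
EXPECTED_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def check_embedding_compat(index_model, query_model, query_vector):
    """Raise before querying if the query embedding cannot match the index."""
    if index_model != query_model:
        raise ValueError(
            f"index built with {index_model!r}, query uses {query_model!r}"
        )
    expected = EXPECTED_DIMENSIONS[index_model]
    if len(query_vector) != expected:
        raise ValueError(
            f"expected {expected} dimensions, got {len(query_vector)}"
        )

# Passes silently: same model, correct vector length.
check_embedding_compat(
    "text-embedding-3-small", "text-embedding-3-small", [0.0] * 1536
)
```

Comparing model names first matters because two different models can emit vectors of the same length, and a length check alone would let that through.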
The third mistake is treating the vector store as a source of facts. The store returns the closest chunks, not the correct answer. Garbage source text, duplicate articles, and outdated content all get embedded faithfully and retrieved faithfully. Curate what goes in. Run a handful of real questions against the store before building the generation layer on top of it.
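Part of that curation can be automated. As a rough sketch, assuming exact duplicates (ignoring case and whitespace) are the problem, you could hash each chunk and drop repeats before anything is embedded. The sample strings are invented, and near-duplicates with different wording would need a different technique.

```python
import hashlib

def normalize(text):
    """Lowercase and collapse whitespace so trivial variants hash the same."""
    return " ".join(text.lower().split())

def drop_duplicates(chunks):
    """Keep the first occurrence of each chunk, compared after normalizing."""
    seen = set()
    unique = []
    for chunk in chunks:
        digest = hashlib.sha256(normalize(chunk).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique

# Made-up example chunks; the second is a whitespace/case variant of the first.
docs = [
    "Refunds take 5 days.",
    "refunds  take 5 days.",
    "Shipping is free over $50.",
]
cleaned = drop_duplicates(docs)
print(cleaned)  # ['Refunds take 5 days.', 'Shipping is free over $50.']
```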
Do not confuse the near-free indexing cost with the ongoing query cost. The 1.3-cent figure covers building the entire index. Every query at production time sends retrieved chunks plus the user's question to a generation model, and that is where the bill accumulates. Plan the generation budget separately.
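A quick back-of-the-envelope comparison shows the gap. In the sketch below, the index side uses this post's numbers (650 tokens per chunk, $0.02 per 1M tokens). Everything on the query side is a placeholder assumption: the query volume, chunks per query, token counts, and especially the two generation prices are not any vendor's actual rates, so substitute the figures from your provider's pricing page.

```python
def index_cost(num_chunks, tokens_per_chunk=650, embed_price_per_million=0.02):
    """One-time cost in USD to embed the whole corpus."""
    return num_chunks * tokens_per_chunk / 1_000_000 * embed_price_per_million

def query_cost(queries, chunks_per_query, tokens_per_chunk, question_tokens,
               answer_tokens, input_price_per_million, output_price_per_million):
    """Generation-model cost in USD for a batch of queries."""
    input_tokens = queries * (chunks_per_query * tokens_per_chunk + question_tokens)
    output_tokens = queries * answer_tokens
    return (input_tokens / 1_000_000 * input_price_per_million
            + output_tokens / 1_000_000 * output_price_per_million)

one_time = index_cost(1000)
recurring = query_cost(
    queries=1000,                   # hypothetical monthly volume
    chunks_per_query=4,             # hypothetical retrieval depth
    tokens_per_chunk=650,
    question_tokens=50,             # hypothetical
    answer_tokens=300,              # hypothetical
    input_price_per_million=1.00,   # placeholder price, not a real rate
    output_price_per_million=5.00,  # placeholder price, not a real rate
)
print(f"index: ${one_time:.3f} once, queries: ${recurring:.2f} per 1,000")
```

Even with made-up prices, the shape of the result is the point: the index is paid once, while the generation cost scales with every question asked.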
Which vector store should you use? (Decision guide)
The answer comes down to two questions: are you testing or shipping, and where does your infrastructure already live? Use the Simple Vector Store only for a quick local test. For anything real, use Supabase if you already run Postgres, or reach for Qdrant or Pinecone otherwise. The flowchart below walks through it.
flowchart TD
classDef startEnd fill:#22c55e,color:#fff,stroke:none
classDef decision fill:#f97316,color:#fff,stroke:none
classDef action fill:#3b82f6,color:#fff,stroke:none
A([Need a vector store]):::startEnd --> B{Just testing?}:::decision
B -- Yes --> C[Use Simple Vector Store]:::action --> Z([Done]):::startEnd
B -- No --> D{Run Postgres or Supabase?}:::decision
D -- Yes --> E[Use Supabase Vector Store]:::action --> Z
D -- No --> F[Use Qdrant or Pinecone]:::action --> Z
What should you set up this weekend?
Pick one set of documents you wish an AI could answer questions about: your help articles, a product FAQ, a folder of past client emails. Create a free Supabase project, enable the pgvector extension, and wire the six steps above. The index will cost a cent or two in tokens and nothing in storage.
Once the rows are in the table, the RAG support bot post shows you how to query them with Claude.
The foundation takes an afternoon. The generation layer takes another one. If you run n8n on a VPS rather than cloud, the self-host n8n setup guide covers the full installation so you pay $0 for the platform itself.