Why Classic RAG Fails
Before diving into Vedana’s architecture, it’s important to understand exactly where classic RAG breaks down.
What classic RAG is
Classic RAG connects an LLM to documents with a simple flow:
Chunks → Embeddings → Top-K → LLM → Answer
flowchart LR
subgraph "Classic RAG: «All products in category X»"
Q1[Question] --> E1[embedding]
E1 --> K1[top-K chunks]
K1 --> L1[LLM produces<br/>a sample]
L1 --> R1[Some products<br/>missed ❌]
end
subgraph "Vedana: «All products in category X»"
Q2[Question] --> A[Cypher: MATCH ...<br/>WHERE category=X<br/>RETURN p]
A --> G[(Memgraph)]
G --> R2[All products ✅<br/>with sources]
end
This works well when the answer fits in a few text fragments.
Where it works
Classic RAG is effective for:
- summarization;
- simple factual questions;
- searching a small document set;
- approximate answers.
If the answer lives in one or two paragraphs, RAG is usually enough.
Where it breaks
The trouble starts when a query requires completeness, structure, or logic.
1. Aggregations (“how many?”)
RAG can’t count over a dataset — it guesses based on the top-K results. An answer to “How many of our contracts expire this quarter?” computed from top-K is close to the truth but not the actual number.
2. Exhaustive queries (“show me all”)
RAG returns a sample, not the full set. Missed items are invisible. “All documents that regulate category X” — top-K never guarantees this is all of them.
3. Relationship queries
Questions that require joins, graph traversal, or compatibility checks can’t be answered reliably through vector similarity. “Which documents regulate products in category X” — that needs traversing the edges Product → belongs_to → Category → regulated_by → Document.
4. Domain logic
RAG doesn’t execute rules. It predicts answer-shaped text. Any business validation (“can product A be sold together with product B to client C?”) falls apart.
The root cause
The LLM doesn’t build a structured model of the domain. It works with text patterns, not with data or logic.
As a result:
- no consistency guarantees;
- no way to enforce rules;
- no reliable system reasoning.
Classic RAG amplifies these limitations by asking the model to reason on top of incomplete text fragments.
What this means in practice
These failures are structural. They can’t be fixed by:
- better embeddings;
- larger context windows;
- raising top-K.
Reliable answers need:
- access to full data (not a sample);
- structured queries (not similarity);
- explicit relationships;
- executable logic.
Fluency is not correctness.
Why Vedana exists
Reliable AI requires more than text retrieval. It requires a system that can:
- query structured data;
- execute logic;
- guarantee completeness;
- attach a source to every answer.
That’s the problem Vedana is built to solve.