When people say “RAG gives bad answers,” the issue is usually not generation but retrieval.
A typical scenario:
- the question is correct
- the answer sounds reasonable
- but the user feels: this isn’t what I asked for
The reason is a failure at the knowledge retrieval stage. The model simply never received the right information because:
- documents were chunked randomly
- metadata and tagging are missing
- search runs across the entire knowledge base at once
- the relevant document never made it into context
As a result, the model reasons over almost-relevant information and produces a nearly correct answer: close to the mark, but not what was asked.
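To make the first two failure points concrete, here is a minimal sketch of structure-aware chunking: split on headings instead of arbitrary character counts, and attach metadata to every chunk so retrieval can filter on it later. The `source` and `product` fields are illustrative assumptions, not required names; use whatever attributes describe your knowledge base.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def chunk_by_section(document: str, source: str, product: str) -> list[Chunk]:
    """Split on markdown-style headings so each chunk stays within one topic,
    and tag it with metadata that retrieval can filter on later.
    `source` and `product` are example tags, not required names."""
    chunks: list[Chunk] = []
    current_title, current_lines = "intro", []

    def flush():
        if current_lines:
            chunks.append(Chunk(
                text="\n".join(current_lines).strip(),
                metadata={"section": current_title, "source": source, "product": product},
            ))

    for line in document.splitlines():
        if line.startswith("#"):          # a new section starts here
            flush()
            current_title, current_lines = line.lstrip("# ").strip(), []
        else:
            current_lines.append(line)
    flush()                               # don't lose the last section
    return chunks
```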
According to the Stanford AI Index, retrieval quality affects answer accuracy more than the choice of LLM itself: the gap can reach 30–40%, even with the same model.
That’s why good RAG starts not with the model, but with how knowledge is searched and selected.
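Here is what “searched and selected” can look like in practice: a sketch that first narrows the search to chunks whose metadata matches the query’s scope, then ranks only those by similarity, instead of scoring the entire knowledge base at once. It reuses the `Chunk` dataclass from the sketch above; the `scope` filter and the cosine-similarity ranking are illustrative choices, not any specific library’s API.

```python
import numpy as np

def retrieve(query_vec: np.ndarray,
             chunks: list[Chunk],
             chunk_vecs: np.ndarray,   # one embedding row per chunk, same order as `chunks`
             scope: dict,
             top_k: int = 5) -> list[Chunk]:
    """Keep only chunks whose metadata matches `scope` (e.g. {"product": "billing"}),
    then rank that subset by cosine similarity to the query embedding."""
    keep = [i for i, c in enumerate(chunks)
            if all(c.metadata.get(k) == v for k, v in scope.items())]
    if not keep:
        return []
    sub = chunk_vecs[keep]
    sims = sub @ query_vec / (
        np.linalg.norm(sub, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    order = np.argsort(-sims)[:top_k]
    return [chunks[keep[i]] for i in order]
```

With this shape, a question about invoices would be answered with something like `scope={"product": "billing"}` (a hypothetical tag), so a near-duplicate document from another product never crowds the relevant one out of context.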