Completing the RAG Loop: From Vector Search to LLM Recommendations - Part 2

The LLM acts as the final reasoning layer, turning vector-based search results into tailored viewing recommendations.

Where We Left Off

In Part 1, we built the foundation of a movie recommendation RAG pipeline. We took raw user preferences, generated 1024-dimension vector embeddings using bge-large, and stored them in a Qdrant vector database. The result was a semantic search engine that could surface preferences closely matching the user's interests and mood of the moment. ...
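
As a rough reminder of the Part 1 setup, here is a minimal sketch of embedding a preference with a bge-large model and storing it in Qdrant. The collection name, payload fields, and the sentence-transformers/qdrant-client usage are illustrative assumptions, not the exact Part 1 code.

```python
# Sketch only: assumes a local Qdrant instance and the
# BAAI/bge-large-en-v1.5 model (1024-dimensional embeddings).
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
client = QdrantClient(url="http://localhost:6333")

# Hypothetical collection name for stored user preferences.
client.recreate_collection(
    collection_name="user_preferences",
    vectors_config=VectorParams(size=1024, distance=Distance.COSINE),
)

preference = "Slow-burn sci-fi with strong character arcs, nothing too bleak"
vector = model.encode(preference).tolist()

# Store the preference text alongside its embedding.
client.upsert(
    collection_name="user_preferences",
    points=[PointStruct(id=1, vector=vector, payload={"text": preference})],
)

# Semantic search: find stored preferences closest to the current mood.
hits = client.search(
    collection_name="user_preferences",
    query_vector=model.encode("thoughtful space dramas").tolist(),
    limit=3,
)
for hit in hits:
    print(hit.score, hit.payload["text"])
```

Part 2 picks up from these search hits and feeds them to the LLM as context for the final recommendation step.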

September 6, 2025 · 15 min · 3076 words · Mark Holton