Blog | About | Book | Adventure

RAG

Retrieval Augmented Generation (RAG) is a technique to ground Large Language Model (LLM) responses in external, up-to-date, or private data. By retrieving relevant information (often using vector search) and feeding it into the model's context, RAG enables more accurate and fact-based answers.
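The retrieve-then-generate flow can be sketched in a few lines. This is a toy illustration, not any particular library's API: the documents and the keyword-overlap scorer are hypothetical stand-ins for a real corpus and a real vector search.

```python
# Minimal RAG sketch. A real system would embed the query and documents
# and use vector search; here a toy word-overlap score stands in for it.
# All document text below is hypothetical example data.
docs = [
    "Cloud Run now supports direct VPC egress.",
    "Gemini models accept long context windows.",
    "LangChain provides chains for retrieval pipelines.",
]

def score(query: str, doc: str) -> int:
    # Toy relevance score: count of shared lowercase words.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, k: int = 1) -> str:
    # Retrieve the top-k documents and place them in the model's context.
    top = sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]
    return "Context:\n" + "\n".join(top) + f"\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What does Cloud Run support?"))
```

The assembled prompt, with the retrieved snippet ahead of the question, is what gets sent to the LLM; grounding comes from the model answering against that context rather than from memory alone.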

Is Cosine Similarity Always the Best Choice for Text Embedding Search?

For normalized text embeddings, the choice between cosine similarity, dot product, and Euclidean distance is simpler than it appears, as all three produce identical search rankings, which simplifies building RAG systems.

May 29, 2024
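The equivalence claimed above is easy to check numerically: for unit vectors, cosine similarity equals the dot product, and Euclidean distance is a monotone function of it (d = sqrt(2 - 2s)), so all three order results the same way. A small sketch with made-up example vectors:

```python
import math

def normalize(v):
    # Scale a vector to unit length.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    # For unit vectors, this is also the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 3-d "embeddings", normalized to unit length.
query = normalize([0.2, 0.9, 0.1])
docs = [normalize(v) for v in ([0.1, 1.0, 0.0],
                               [0.9, 0.1, 0.3],
                               [0.5, 0.5, 0.5])]

# Rank documents by descending dot product and by ascending distance:
# d = sqrt(2 - 2 * dot), so the two orderings must agree.
by_dot = sorted(range(len(docs)), key=lambda i: -dot(query, docs[i]))
by_euc = sorted(range(len(docs)), key=lambda i: euclidean(query, docs[i]))
assert by_dot == by_euc
```

Real embeddings have hundreds of dimensions, but the identity d² = 2 - 2s holds in any dimension once both vectors are normalized.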

TEQNation 2024: Using RAG and ReACT to Augment Your LLM App

At TEQNation 2024, I demonstrated how to build a powerful LLM app using RAG and ReACT, grounded in over 20 million Hacker News comments, and shared my journey of data scraping, indexing, and implementation with LangChain.

May 03, 2024

Build and Deploy a LangChain App with a Vector Database

Learn to build a Retrieval Augmented Generation (RAG) app with LangChain and Gemini on Cloud Run, using a vector database to answer questions from Cloud Run release notes.

April 27, 2024

RSS Feed | Sitemap