Organizations needed to query multi-format documents with fast retrieval and contextual reasoning across different file types.
Developed full-stack RAG application with FastAPI and React.js. Implemented hybrid search pipeline using Pinecone and ChromaDB, reducing retrieval latency by 15% through optimized metadata filtering. Integrated Google Gemini API for contextual reasoning.
-15% retrieval latency for multi-format document queries with hybrid search.
User-uploaded files (PDF, image, text) are routed through a file-type dispatcher — PyMuPDF for native PDFs, Tesseract OCR for scanned images. Extracted text is chunked with overlap and embedded via Google Gemini embeddings, stored in both Pinecone (cloud vector search) and ChromaDB (local hybrid retrieval). At query time, both stores are queried in parallel using asyncio, results merged with Reciprocal Rank Fusion, and the top-k chunks passed to Gemini for contextual answer generation.
Retrieval quality dominates RAG performance — a well-ranked retrieval pipeline with a smaller model beats a poorly-ranked one with a larger model every time. I also learned that hybrid search requires careful weight tuning per document type: the RRF weights that work well for legal PDFs don't generalize to technical diagrams or spreadsheets.