To power true document retrieval with semantic search, connect this frontend to a backend. Recommended stack:
Extract text with pdfplumber or pypdf. Chunk into ~400-token segments with 50-token overlap. Store with metadata: filename, page, doc type.
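The chunking step above can be sketched as follows. This is a minimal sketch that approximates tokens by whitespace words; a real pipeline would count tokens with a proper tokenizer (e.g. tiktoken). The function names are illustrative, not from any library.

```python
def chunk_text(text, chunk_size=400, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size tokens.

    Tokens are approximated by whitespace-separated words here.
    """
    words = text.split()
    step = chunk_size - overlap  # advance 350 words per chunk
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + chunk_size]
        if piece:
            chunks.append(" ".join(piece))
        if start + chunk_size >= len(words):
            break  # last chunk already covers the tail
    return chunks

def chunk_document(text, filename, page, doc_type):
    """Attach the metadata the doc calls for: filename, page, doc type."""
    return [
        {"text": c, "filename": filename, "page": page, "doc_type": doc_type}
        for c in chunk_text(text)
    ]
```

Consecutive chunks share their last/first 50 words, so a sentence split at a boundary still appears whole in at least one chunk.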
Embed each chunk via voyage-3 (Voyage AI, the provider Anthropic recommends) or text-embedding-3-small (OpenAI). Store vectors in Pinecone, Chroma, or Supabase pgvector.
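The embed-and-store step might look like the sketch below. The embedding call is injected as a plain function so the same code works with voyage-3 or text-embedding-3-small; the in-memory store is a stand-in for Pinecone/Chroma/pgvector, and `VectorStore` is an illustrative name, not a library class.

```python
class VectorStore:
    """In-memory stand-in for a real vector database.

    embed_fn: any callable mapping a string to a list of floats,
    e.g. a wrapper around the Voyage AI or OpenAI embeddings API.
    """

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.records = []  # list of (vector, metadata) pairs

    def upsert(self, chunks):
        """chunks: dicts with 'text', 'filename', 'page', 'doc_type' keys."""
        for chunk in chunks:
            vector = self.embed_fn(chunk["text"])
            self.records.append((vector, chunk))
        return len(self.records)
```

In production you would batch the embedding calls and write the vectors plus metadata to the hosted store instead of a Python list; the interface stays the same.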
At query time: embed the user message, run a top-k=6 cosine-similarity search, and return the matching chunks with their source filenames.
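The retrieval step reduces to cosine similarity over stored vectors. A minimal sketch, assuming records are (vector, metadata) pairs as in the storage step; a hosted vector DB would do this search server-side:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, records, k=6):
    """Return the metadata of the k chunks most similar to the query vector."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r[0]), reverse=True)
    return [meta for _, meta in ranked[:k]]
```

Each returned metadata dict carries its source filename, which is exactly what the citation step downstream needs.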
Inject the retrieved chunks into Claude's system prompt and instruct it to "cite sources as [Source: filename]". The frontend then parses and renders the citation tags automatically.
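Both halves of this step (prompt assembly and citation parsing) can be sketched as below. The prompt wording and function names are illustrative; only the `[Source: filename]` tag format is taken from the doc.

```python
import re

def build_system_prompt(chunks):
    """Format retrieved chunks into a system prompt with citation instructions.

    chunks: dicts with at least 'text' and 'filename' keys.
    """
    context = "\n\n".join(
        f"[Source: {c['filename']}]\n{c['text']}" for c in chunks
    )
    return (
        "Answer using only the context below. "
        "Cite sources as [Source: filename].\n\n" + context
    )

# Matches the [Source: filename] tags the model is instructed to emit.
CITATION = re.compile(r"\[Source:\s*([^\]]+)\]")

def extract_citations(answer):
    """Return the distinct source filenames cited in a model answer, in order."""
    seen = []
    for name in CITATION.findall(answer):
        name = name.strip()
        if name not in seen:
            seen.append(name)
    return seen
```

The same regex can drive the frontend rendering: replace each tag with a styled citation chip linking back to the source document.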
Frontend: this file · API: FastAPI (Python) · Vector DB: Pinecone or Supabase · LLM: claude-sonnet-4