How Matthew's Codex Works
This project is a compact retrieval-augmented generation app: it turns a personal document set into a conversational portfolio assistant with source attribution, reusable API boundaries, and a safe demo mode for public deployment.
Document Ingestion
PDF, Markdown, and text documents are collected through the admin surface or local data folder.
Chunking & Metadata
The ingestion script extracts text, creates retrievable chunks, and keeps title/source metadata for attribution.
Semantic Retrieval
Production mode embeds queries and retrieves relevant chunks from Pinecone before answer generation.
Mode-Aware Prompting
A shared system prompt plus mode preambles reshape the same evidence into interview, story, TL;DR, brag, or reflection styles.
Why the architecture is interesting
The UI, API routes, retrieval layer, prompt strategy, and ingestion script are separate enough that production infrastructure can be swapped for deterministic fixtures without changing the user flow. That is what makes the project easy to demo, test, and explain.