Meet Rizzo: The AI Assistant Living on This Website
How I built an AI chatbot that actually knows my work, using Spring AI, RAG, and an open-source starter you can fork today
You might have noticed a chat bubble in the bottom-right corner of this site, or the banner on the home page. That’s Rizzo — an AI assistant that can answer questions about my blog posts, CV, publications, and speaking engagements. Unlike a generic chatbot, Rizzo doesn’t guess. It retrieves answers directly from the content on this site using a technique called Retrieval-Augmented Generation, or RAG.
How RAG Keeps Rizzo Honest
Every piece of content on this site — posts, pages, my curriculum vitae — gets split into small chunks, converted into numerical vectors (embeddings), and stored in a searchable index. When you ask Rizzo a question, your query gets converted into a vector too. The system finds the chunks most similar to your question, then hands those specific chunks to Anthropic’s Claude as context. Claude generates an answer grounded in that context, not from its general training data.
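To make the chunk, embed, and retrieve steps concrete, here is a minimal sketch in plain Java. It is not Rizzo's actual code. The class name `RagRetrievalSketch`, its methods, and the sample text are all hypothetical, and the "embedding" is a toy word-count vector standing in for a real embedding model. The shape of the pipeline is the same, though: turn text into vectors, score each chunk against the question by cosine similarity, and keep the closest matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RagRetrievalSketch {

    // Split a document into chunks of at most wordsPerChunk words.
    static List<String> chunk(String document, int wordsPerChunk) {
        List<String> chunks = new ArrayList<>();
        if (document.isBlank()) {
            return chunks;
        }
        String[] words = document.trim().split("\\s+");
        for (int start = 0; start < words.length; start += wordsPerChunk) {
            int end = Math.min(start + wordsPerChunk, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    // Toy stand-in for an embedding model (assumption): a sparse vector of word counts.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine similarity between two sparse vectors; 0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double lengths = norm(a) * norm(b);
        return lengths == 0 ? 0.0 : dot / lengths;
    }

    // Return the topK chunks most similar to the question, best match first.
    static List<String> retrieve(List<String> chunks, String question, int topK) {
        Map<String, Integer> queryVector = embed(question);
        return chunks.stream()
                .sorted(Comparator.comparingDouble(
                        (String c) -> cosine(embed(c), queryVector)).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Spring AI makes building RAG applications simple",
                "I presented at the Cincinnati Java Users Group",
                "Recent posts cover agentic AI and tool calling");
        // The retrieved chunks are what would be handed to the model as context.
        retrieve(chunks, "posts about agentic AI", 2).forEach(System.out::println);
    }
}
```

A real system would embed each chunk once at indexing time and store the vectors, rather than re-embedding on every query as this sketch does for brevity.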
The result is a chatbot that stays current with the site and is far less likely to hallucinate than a model answering from memory alone. If the answer isn’t in the content, Rizzo says so rather than making something up. Right now the index covers 277 document chunks across every post and page on the site, and it re-indexes automatically whenever I publish new content.
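One common way to get the "say so rather than make something up" behavior is to state it in the prompt and to skip the model call entirely when retrieval returns nothing. The sketch below illustrates that idea in plain Java. The class `GroundedPromptSketch`, the prompt wording, and the fallback message are assumptions made for illustration, not the prompt Rizzo actually uses.

```java
import java.util.List;

public class GroundedPromptSketch {

    // Hypothetical fallback message returned when the content has no answer.
    static final String FALLBACK = "I couldn't find that in the site's content.";

    // Only call the model when retrieval produced at least one chunk.
    static boolean shouldCallModel(List<String> contextChunks) {
        return !contextChunks.isEmpty();
    }

    // Assemble a prompt that restricts the model to the retrieved chunks.
    static String buildPrompt(List<String> contextChunks, String question) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Answer the question using only the context below.\n");
        prompt.append("If the context does not contain the answer, reply exactly: ")
              .append(FALLBACK)
              .append("\n\n");
        prompt.append("Context:\n");
        for (int i = 0; i < contextChunks.size(); i++) {
            // Number each chunk so an answer can point back to its source.
            prompt.append("[").append(i + 1).append("] ")
                  .append(contextChunks.get(i))
                  .append("\n");
        }
        prompt.append("\nQuestion: ").append(question).append("\n");
        return prompt.toString();
    }

    public static void main(String[] args) {
        List<String> context = List.of("Recent posts cover agentic AI and tool calling");
        String question = "What have you written about agentic AI?";
        if (shouldCallModel(context)) {
            // In a real application this string would be sent to the language model.
            System.out.println(buildPrompt(context, question));
        } else {
            System.out.println(FALLBACK);
        }
    }
}
```

An instruction like this reduces made-up answers but cannot rule them out, so a fixed fallback for the empty-retrieval case is a cheap extra guard.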
Built on Open Source
Rizzo started life as a demo I built for a Cincinnati Java Users Group (CinJUG) presentation on Spring AI and RAG. That demo is freely available on GitHub:
spring-ai-rag-demo — a complete, working RAG application built with Spring Boot, Spring AI 1.0.0, and Docker Compose. Drop your own markdown files into the docs folder, run docker-compose up, and you have a working chatbot. It supports both Anthropic Claude and Ollama, so you can run it entirely locally without any API keys.
The production version powering Rizzo swaps a few components for lighter-weight alternatives suited to a single GCP e2-micro VM with 1 GB of RAM. The vector store is Spring AI’s built-in SimpleVectorStore instead of Qdrant, embeddings come from Google’s Vertex AI, and a crawl pipeline reads the Jekyll source files directly. But the core RAG pattern — chunk, embed, retrieve, generate — is identical to the demo. If you can run the demo, you understand how Rizzo works.
Try It
Click the chat button and ask Rizzo something — what I’ve written about agentic AI, where I’ve spoken, what my background is. If you’re a developer interested in building your own, fork the demo and have it running in minutes.