
Meet Rizzo: The AI Assistant Living on This Website

How I built an AI chatbot that actually knows my work, using Spring AI, RAG, and an open-source starter you can fork today

You might have noticed a chat bubble in the bottom-right corner of this site, or the banner on the home page. That’s Rizzo — an AI assistant that can answer questions about my blog posts, CV, publications, and speaking engagements. Unlike a generic chatbot, Rizzo doesn’t guess. It retrieves answers directly from the content on this site using a technique called Retrieval-Augmented Generation, or RAG.

How RAG Keeps Rizzo Honest

Every piece of content on this site, including posts, pages, and my curriculum vitae, is split into small chunks, converted into numerical vectors (embeddings), and stored in a searchable index. When you ask Rizzo a question, your query is converted into a vector too. The system finds the chunks most similar to your question, then hands those specific chunks to Anthropic’s Claude as context. Claude generates an answer grounded in that context rather than drawing on its general training data.
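To make that flow concrete, here is a minimal plain-Java sketch of the chunk, embed, and retrieve steps, ending with the prompt that would be sent to the model. It is an illustration only, not the code behind Rizzo: the class and method names (`RagSketch`, `buildIndex`, `retrieve`, and so on) are hypothetical, and the "embedding" is a toy word-count vector standing in for a real embedding model. It assumes Java 17 or later and leaves out the actual call to Claude.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RagSketch {

    /** A chunk of content paired with its (toy) embedding vector. */
    record Chunk(String text, Map<String, Double> vector) {}

    /** A chunk together with its similarity to the current query. */
    record Scored(Chunk chunk, double score) {}

    /** Split text into chunks of at most maxWords whitespace-separated words. */
    static List<String> chunk(String text, int maxWords) {
        List<String> chunks = new ArrayList<>();
        if (text.isBlank()) {
            return chunks;
        }
        String[] words = text.trim().split("\\s+");
        for (int start = 0; start < words.length; start += maxWords) {
            int end = Math.min(start + maxWords, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    /** Toy embedding (assumption): lowercase word counts instead of a model call. */
    static Map<String, Double> embed(String text) {
        Map<String, Double> vector = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                vector.merge(word, 1.0, Double::sum);
            }
        }
        return vector;
    }

    /** Euclidean length of a sparse vector. */
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    /** Cosine similarity between two sparse vectors; 0 if either is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double denominator = norm(a) * norm(b);
        return denominator == 0.0 ? 0.0 : dot / denominator;
    }

    /** Chunk and embed every document, producing the searchable index. */
    static List<Chunk> buildIndex(List<String> documents, int maxWords) {
        List<Chunk> index = new ArrayList<>();
        for (String document : documents) {
            for (String piece : chunk(document, maxWords)) {
                index.add(new Chunk(piece, embed(piece)));
            }
        }
        return index;
    }

    /** Embed the query and return the topK most similar chunks, best first. */
    static List<Scored> retrieve(List<Chunk> index, String query, int topK) {
        Map<String, Double> queryVector = embed(query);
        return index.stream()
                .map(c -> new Scored(c, cosine(queryVector, c.vector())))
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(topK)
                .toList();
    }

    /** Assemble the prompt that would be handed to the language model. */
    static String buildPrompt(List<Scored> hits, String question) {
        StringBuilder prompt = new StringBuilder("Answer using only this context:\n");
        for (Scored hit : hits) {
            prompt.append("- ").append(hit.chunk().text()).append('\n');
        }
        return prompt.append("Question: ").append(question).toString();
    }

    public static void main(String[] args) {
        List<Chunk> index = buildIndex(List.of(
                "Rizzo answers questions about blog posts and speaking engagements.",
                "The demo runs with Docker Compose and supports Ollama for local use."), 40);
        String question = "Can I run the demo locally with Ollama?";
        System.out.println(buildPrompt(retrieve(index, question, 1), question));
    }
}
```

In a real pipeline, `embed` would call an embedding model and the index would live in a vector store. The shape of the flow stays the same: only the best-matching chunks reach the model.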

The result is a chatbot that stays current with the site and is far less prone to hallucination than a model answering from memory. If the answer isn’t in the content, Rizzo says so rather than making something up. Right now the index covers 277 document chunks across every post and page on the site, and the site is re-indexed automatically whenever I publish new content.
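The post does not spell out how the "says so" behavior is enforced. One common approach, sketched here purely as an assumption, combines two guards: drop retrieved chunks whose similarity falls below a threshold, and tell the model to answer only from the supplied context. The class name `GroundedPrompt`, the threshold value, and the prompt wording are all hypothetical. The sketch assumes Java 17 or later.

```java
import java.util.List;
import java.util.Optional;

class GroundedPrompt {

    /** A retrieved chunk and its similarity score to the question. */
    record ScoredChunk(String text, double similarity) {}

    /** Reply used when nothing relevant was retrieved (wording is illustrative). */
    static final String FALLBACK = "I couldn't find that in the site content.";

    /**
     * Build a context-restricted prompt, or return empty when no chunk
     * clears minSimilarity, so the caller can skip the model call entirely.
     */
    static Optional<String> build(String question, List<ScoredChunk> retrieved, double minSimilarity) {
        List<String> relevant = retrieved.stream()
                .filter(c -> c.similarity() >= minSimilarity)
                .map(ScoredChunk::text)
                .toList();
        if (relevant.isEmpty()) {
            return Optional.empty();
        }
        String context = String.join("\n---\n", relevant);
        String prompt = """
                Answer the question using only the context below.
                If the context does not contain the answer, say you don't know.

                Context:
                %s

                Question: %s
                """.formatted(context, question);
        return Optional.of(prompt);
    }

    public static void main(String[] args) {
        List<ScoredChunk> retrieved = List.of(
                new ScoredChunk("Rizzo runs on a single GCP e2-micro VM.", 0.82),
                new ScoredChunk("An unrelated paragraph.", 0.05));
        // Threshold of 0.3 is an arbitrary value chosen for the example.
        System.out.println(build("Where does Rizzo run?", retrieved, 0.3).orElse(FALLBACK));
    }
}
```

The threshold catches questions that match nothing on the site before any tokens are spent, while the prompt instruction covers the case where chunks were retrieved but do not actually answer the question.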

Built on Open Source

Rizzo started life as a demo I built for a Cincinnati Java Users Group (CinJUG) presentation on Spring AI and RAG. That demo is freely available on GitHub:

spring-ai-rag-demo — a complete, working RAG application built with Spring Boot, Spring AI 1.0.0, and Docker Compose. Drop your own markdown files into the docs folder, run docker-compose up, and you have a working chatbot. It supports both Anthropic Claude and Ollama, so you can run it entirely locally without any API keys.

The production version powering Rizzo swaps a few components for lighter-weight alternatives suited to a single GCP e2-micro VM with 1 GB of RAM. The vector store is Spring AI’s built-in SimpleVectorStore instead of Qdrant, embeddings come from Google’s Vertex AI, and a crawl pipeline reads the Jekyll source files directly. But the core RAG pattern — chunk, embed, retrieve, generate — is identical to the demo. If you can run the demo, you understand how Rizzo works.

Try It

Click the chat button and ask Rizzo something — what I’ve written about agentic AI, where I’ve spoken, what my background is. If you’re a developer interested in building your own, fork the demo and have it running in minutes.


This post is licensed under CC BY 4.0 by the author.