Buzzkeeper Update: Persistent Intelligent Classifiable Memory

in #hivedevs · 12 hours ago

In case you missed the announcement, I released an open-source Discord bot written in Rust called Buzzkeeper (a play on Buzzkiller).


Buzzkeeper started with a very simple form of memory: persist a list of notes, then shove the latest few into the prompt.

That worked for early testing, but it breaks down fast in practice:

  • the newest memories are not always the relevant ones
  • the bot can remember facts, but it struggles to recall the right fact at the right time
  • state-linked history like "what did I say when I was absolutely cooked?" needs more than recency
  • running a separate vector database felt like operational overkill for a Discord bot

So the memory system was upgraded to use a hybrid SQLite-backed RAG layer.

What Changed

Buzzkeeper now keeps two durable stores:

  • JSON for guild state, persona state, permissions, drink/action config, and conversation state
  • SQLite for long-term memory indexing and retrieval

That SQLite memory index uses:

  • FTS5 for lexical recall
  • sqlite-vec for embedded vector search
  • model-assisted memory enrichment for summaries, tags, entities, categories, and importance
  • query analysis for topics, entities, requested memory categories, and requested bot states
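To make the index concrete, here is a minimal sketch of what an FTS5-backed memory table can look like, using Python's stdlib `sqlite3` for brevity (the bot itself is Rust). Table and column names are my own illustration, not Buzzkeeper's actual schema, and the sqlite-vec vector column is omitted since that extension is not part of stock SQLite:

```python
import sqlite3

# Hypothetical schema sketch; names are illustrative, not Buzzkeeper's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    guild_id TEXT NOT NULL,
    category TEXT,      -- model- or heuristic-assigned sink
    stage TEXT,         -- bot state when the memory was formed
    importance REAL,    -- salience in 0.0..1.0
    body TEXT NOT NULL
);
-- FTS5 index over the searchable text, for lexical recall.
CREATE VIRTUAL TABLE memories_fts
    USING fts5(body, content='memories', content_rowid='id');
""")
conn.execute(
    "INSERT INTO memories (guild_id, category, stage, importance, body) "
    "VALUES (?, ?, ?, ?, ?)",
    ("guild-1", "bot_state", "cooked", 0.9, "claimed the moon is a disco ball"),
)
# Populate the external-content FTS index from the base table.
conn.execute("INSERT INTO memories_fts (rowid, body) SELECT id, body FROM memories")

# One query combining a structured filter with an FTS5 match.
hits = conn.execute(
    "SELECT m.body FROM memories_fts JOIN memories m ON m.id = memories_fts.rowid "
    "WHERE memories_fts MATCH ? AND m.stage = ?",
    ("disco", "cooked"),
).fetchall()
```

The last query is the point: lexical search and metadata filters compose in a single SQL statement, inside one embedded database.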

Why SQLite Instead Of A Separate Vector Database

The goal was better retrieval, not more infrastructure.

SQLite gives us:

  • one local file
  • simple backups
  • no extra service to deploy
  • good enough performance for a bot memory workload
  • the ability to combine structured filters, FTS, and vector search in one embedded runtime

For this project, that tradeoff is much better than introducing a remote vector store.

How The New Memory Flow Works

1. Memory creation

When a user adds a memory with /remember, or when the bot stores its own state-linked memories, the memory is enriched at write time.

If model-assisted memory analysis is enabled, the bot asks the model for:

  • category
  • summary
  • tags
  • entities
  • importance

If the model is unavailable, Buzzkeeper falls back to heuristics. Either way, the idea is that the bot classifies its own memories into specific sinks (categories).
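A heuristic fallback can be as simple as keyword and capitalisation rules. This is a rough sketch of the idea only; Buzzkeeper's actual heuristics are not shown here, so the rules and field names below are assumptions:

```python
import re

def enrich_heuristically(text: str) -> dict:
    """Fallback enrichment when no model is available.
    Illustrative rules, not Buzzkeeper's real heuristics."""
    words = re.findall(r"[A-Za-z']+", text)
    # Capitalised words after the first one are crude entity candidates.
    entities = sorted({w for w in words[1:] if w[0].isupper()})
    # Longer notes get a slightly higher salience score, capped at 1.0.
    importance = min(1.0, 0.3 + 0.01 * len(words))
    # Route state-flavoured notes into the bot_state sink.
    stage_words = {"tipsy", "buzzing", "cooked", "hungover"}
    category = "bot_state" if any(w.lower() in stage_words for w in words) else "user_fact"
    return {
        "category": category,
        "summary": " ".join(words[:12]),
        "tags": [w.lower() for w in entities],
        "entities": entities,
        "importance": round(importance, 2),
    }
```

The output mirrors the model-assisted fields (category, summary, tags, entities, importance), so downstream code does not care which path produced the enrichment.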

2. Mirroring into SQLite

The canonical guild state still lives in JSON, but every guild's memory set is mirrored into SQLite.

That mirror stores:

  • searchable text
  • metadata like category, source, stage, and importance
  • a vector embedding for similarity search in sqlite-vec

The embedding path is now configurable:

  • by default, Buzzkeeper uses deterministic local embeddings, so the memory index still works offline
  • if MEMORY_EMBEDDING_MODEL is configured, Buzzkeeper will try provider-backed embeddings and fit them to the SQLite index dimensions
  • unchanged memories are skipped during sync, so the bot does not keep re-embedding the same corpus on every reply
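The offline-friendly default and the skip-unchanged sync can both be sketched with a content hash. Everything below is a stand-in: the hash-bucket embedding is my own toy example of a deterministic local embedding, not Buzzkeeper's actual algorithm:

```python
import hashlib
import math

def local_embedding(text: str, dims: int = 128) -> list[float]:
    """Deterministic, offline embedding: hash each token into a bucket,
    then L2-normalise. A toy stand-in for the real local embedder."""
    vec = [0.0] * dims
    for token in text.lower().split():
        h = int.from_bytes(hashlib.sha256(token.encode()).digest()[:4], "big")
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sync(memories, index):
    """Mirror (id, text) pairs into `index`, skipping unchanged rows.
    `index` maps memory id -> (content_hash, embedding)."""
    for mem_id, text in memories:
        digest = hashlib.sha256(text.encode()).hexdigest()
        cached = index.get(mem_id)
        if cached and cached[0] == digest:
            continue  # unchanged memory: no re-embedding
        index[mem_id] = (digest, local_embedding(text))
```

Because the hash is compared before any embedding work happens, a sync over an unchanged corpus costs one digest per memory rather than one embedding call per memory.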

3. Query analysis

Before recall, the incoming message is analysed into:

  • topics
  • entities
  • requested memory categories
  • requested bot stage, if any
  • whether the user is asking about the bot's own past words or actions
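The analysis step above can be approximated with keywords, which is roughly what a heuristic fallback looks like. The real analyser is model-assisted; the phrases, stage names, and field names below are assumptions for illustration:

```python
def analyse_query(message: str) -> dict:
    """Heuristic sketch of pre-recall query analysis.
    Keywords and output fields are illustrative assumptions."""
    lower = message.lower()
    # Detect a requested bot stage, if any.
    stages = [s for s in ("tipsy", "buzzing", "cooked", "hungover") if s in lower]
    # Is the user asking about the bot's own past words or actions?
    about_self = any(p in lower for p in ("what did you say", "what did you do", "you said"))
    categories = ["bot_state"] if (stages or about_self) else []
    return {
        "topics": [w.strip("?,.!") for w in lower.split() if len(w) > 4],
        # Crude entity guess: capitalised words, minus the sentence-initial one.
        "entities": [w for w in message.split() if w[:1].isupper()][1:],
        "categories": categories,
        "stage": stages[0] if stages else None,
        "about_self": about_self,
    }
```

The structured output is what matters: retrieval downstream can filter on `categories` and `stage` instead of hoping the right memory happens to be lexically similar to the question.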

4. Hybrid retrieval

Recall then combines:

  • lexical candidates from FTS5
  • vector candidates from sqlite-vec
  • existing behavioral scoring like speaker match, stage affinity, recency, recall frequency, and importance

That means the final prompt gets a smaller, more relevant recalled memory slice instead of just "the latest notes."
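The merge step can be sketched as a weighted union of the two candidate sets plus behavioral boosts. The weights and metadata fields here are illustrative assumptions, not Buzzkeeper's actual tuning:

```python
def hybrid_rank(lexical, vector, meta, top_k=3):
    """Merge FTS5 and vector candidates, then apply behavioral scoring.
    lexical / vector: {memory_id: score in 0..1}; meta: {memory_id: dict}.
    Weights are illustrative, not Buzzkeeper's real values."""
    merged = {}
    for mem_id in set(lexical) | set(vector):
        # Blend the two retrieval signals; a miss in one channel scores 0.
        score = 0.5 * lexical.get(mem_id, 0.0) + 0.5 * vector.get(mem_id, 0.0)
        m = meta.get(mem_id, {})
        score += 0.2 * m.get("importance", 0.0)      # enrichment-assigned salience
        score += 0.1 if m.get("stage_match") else 0.0  # stage affinity boost
        score += 0.1 * m.get("recency", 0.0)         # 1.0 = just stored
        merged[mem_id] = score
    return sorted(merged, key=merged.get, reverse=True)[:top_k]
```

A memory that is mediocre lexically but strongly stage-matched and important can outrank a fresher note that merely shares a word with the query, which is exactly the failure mode recency-only memory had.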

Better State-Aware Memory

This upgrade also improves self-memory.

Buzzkeeper can now store and recall things like:

  • what it said while tipsy
  • what it did while buzzing or cooked
  • incidents tied to stage shifts or hangovers
  • bot-state-linked memories grouped separately from ordinary user facts

That makes answering questions like "what did you say when you were super drunk?" much more realistic.
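With stage stored as metadata, state-linked recall reduces to a plain filter. Again using stdlib `sqlite3` with a hypothetical schema and made-up rows:

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, category TEXT, stage TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO memories (category, stage, body) VALUES (?, ?, ?)",
    [
        ("bot_state", "cooked", "swore the jukebox was haunted"),
        ("bot_state", "tipsy", "offered everyone a round"),
        ("user_fact", None, "Alice prefers cider"),
    ],
)
# "what did you say when you were super drunk?" becomes a category + stage filter.
rows = conn.execute(
    "SELECT body FROM memories WHERE category = ? AND stage = ?",
    ("bot_state", "cooked"),
).fetchall()
```

Grouping bot-state memories under their own category also keeps them from crowding out ordinary user facts during normal recall.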

Operational Changes

There are now two persistence paths to keep mounted:

STORAGE_PATH=/data/tavern-state.json
MEMORY_DB_PATH=/data/tavern-state.memories.sqlite3

Optional controls:

MEMORY_MODEL_ASSISTED=true
MEMORY_ANALYSIS_MAX_TOKENS=220
MEMORY_EMBEDDING_MODEL=nomic-embed-text
MEMORY_EMBEDDING_DIMENSIONS=128
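Reading those controls amounts to environment lookups with fallbacks. The variable names come from the post above; the defaults shown are assumptions, illustrated in Python rather than the bot's Rust:

```python
import os

# Variable names from the post; default values are assumptions.
model_assisted = os.environ.get("MEMORY_MODEL_ASSISTED", "false").lower() == "true"
analysis_max_tokens = int(os.environ.get("MEMORY_ANALYSIS_MAX_TOKENS", "220"))
# Unset model -> deterministic local embeddings, so the index works offline.
embedding_model = os.environ.get("MEMORY_EMBEDDING_MODEL")
embedding_dims = int(os.environ.get("MEMORY_EMBEDDING_DIMENSIONS", "128"))
```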

What This Unlocks Next

This architecture leaves room for future improvements without changing the deployment model:

  • stronger provider or local embedding generation
  • memory pruning or summarisation jobs
  • richer metadata filters
  • selective replay of long-running server lore

The main point is that Buzzkeeper now has a memory system that is still easy to deploy (no additional infra needed), but much more capable than naive prompt stuffing.

I also shipped a heap of performance improvements, bug fixes, and other polish, all up on GitHub.

Source code is here: https://github.com/Vheissu/buzzkeeper — contributions from developers of all skill levels are welcome (including AI-assisted/generated contributions).