Buzzkeeper Update: Persistent Intelligent Classifiable Memory

in #hivedevs · 12 hours ago

In case you missed the announcement, I released an open-source Discord bot written in Rust called Buzzkeeper (a play on Buzzkiller).


Buzzkeeper started with a very simple form of memory: persist a list of notes, then shove the latest few into the prompt.

That worked for early testing, but it breaks down fast in practice:

  • the newest memories are not always the relevant ones
  • the bot can remember facts, but it struggles to recall the right fact at the right time
  • state-linked history like "what did I say when I was absolutely cooked?" needs more than recency
  • running a separate vector database felt like operational overkill for a Discord bot

So the memory system was upgraded to use a hybrid SQLite-backed RAG layer.

What Changed

Buzzkeeper now keeps two durable stores:

  • JSON for guild state, persona state, permissions, drink/action config, and conversation state
  • SQLite for long-term memory indexing and retrieval

That SQLite memory index uses:

  • FTS5 for lexical recall
  • sqlite-vec for embedded vector search
  • model-assisted memory enrichment for summaries, tags, entities, categories, and importance
  • query analysis for topics, entities, requested memory categories, and requested bot states
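To make the index concrete, here is a minimal sketch of what an FTS5-backed memory table can look like, using Python's stdlib `sqlite3` for brevity (the bot itself is Rust). Table and column names are my own illustration, not Buzzkeeper's actual schema, and the sqlite-vec vector column is omitted since that extension is not part of stock SQLite:

```python
import sqlite3

# Hypothetical schema sketch; names are illustrative, not Buzzkeeper's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    guild_id TEXT NOT NULL,
    category TEXT,      -- model- or heuristic-assigned sink
    stage TEXT,         -- bot state when the memory was formed
    importance REAL,    -- salience in 0.0..1.0
    body TEXT NOT NULL
);
-- FTS5 index over the searchable text, for lexical recall.
CREATE VIRTUAL TABLE memories_fts
    USING fts5(body, content='memories', content_rowid='id');
""")
conn.execute(
    "INSERT INTO memories (guild_id, category, stage, importance, body) "
    "VALUES (?, ?, ?, ?, ?)",
    ("guild-1", "bot_state", "cooked", 0.9, "claimed the moon is a disco ball"),
)
# Populate the external-content FTS index from the base table.
conn.execute("INSERT INTO memories_fts (rowid, body) SELECT id, body FROM memories")

# One query combining a structured filter with an FTS5 match.
hits = conn.execute(
    "SELECT m.body FROM memories_fts JOIN memories m ON m.id = memories_fts.rowid "
    "WHERE memories_fts MATCH ? AND m.stage = ?",
    ("disco", "cooked"),
).fetchall()
```

The last query is the point: lexical search and metadata filters compose in a single SQL statement, inside one embedded database.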

Why SQLite Instead Of A Separate Vector Database

The goal was better retrieval, not more infrastructure.

SQLite gives us:

  • one local file
  • simple backups
  • no extra service to deploy
  • good enough performance for a bot memory workload
  • the ability to combine structured filters, FTS, and vector search in one embedded runtime

For this project, that tradeoff is much better than introducing a remote vector store.

How The New Memory Flow Works

1. Memory creation

When a user adds a memory with /remember, or when the bot stores its own state-linked memories, the memory is enriched at write time.

If model-assisted memory analysis is enabled, the bot asks the model for:

  • category
  • summary
  • tags
  • entities
  • importance

If the model is unavailable, Buzzkeeper falls back to heuristics. Either way, the idea is that the bot classifies its own memories into specific sinks (categories).
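A heuristic fallback can be as simple as keyword and capitalisation rules. This is a rough sketch of the idea only; Buzzkeeper's actual heuristics are not shown here, so the rules and field names below are assumptions:

```python
import re

def enrich_heuristically(text: str) -> dict:
    """Fallback enrichment when no model is available.
    Illustrative rules, not Buzzkeeper's real heuristics."""
    words = re.findall(r"[A-Za-z']+", text)
    # Capitalised words after the first one are crude entity candidates.
    entities = sorted({w for w in words[1:] if w[0].isupper()})
    # Longer notes get a slightly higher salience score, capped at 1.0.
    importance = min(1.0, 0.3 + 0.01 * len(words))
    # Route state-flavoured notes into the bot_state sink.
    stage_words = {"tipsy", "buzzing", "cooked", "hungover"}
    category = "bot_state" if any(w.lower() in stage_words for w in words) else "user_fact"
    return {
        "category": category,
        "summary": " ".join(words[:12]),
        "tags": [w.lower() for w in entities],
        "entities": entities,
        "importance": round(importance, 2),
    }
```

The output mirrors the model-assisted fields (category, summary, tags, entities, importance), so downstream code does not care which path produced the enrichment.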

2. Mirroring into SQLite

The canonical guild state still lives in JSON, but every guild's memory set is mirrored into SQLite.

That mirror stores:

  • searchable text
  • metadata like category, source, stage, and importance
  • a vector embedding for similarity search in sqlite-vec

The embedding path is now configurable:

  • by default, Buzzkeeper uses deterministic local embeddings, so the memory index still works offline
  • if MEMORY_EMBEDDING_MODEL is configured, Buzzkeeper will try provider-backed embeddings and fit them to the SQLite index dimensions
  • unchanged memories are skipped during sync, so the bot does not keep re-embedding the same corpus on every reply
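The offline-friendly default and the skip-unchanged sync can both be sketched with a content hash. Everything below is a stand-in: the hash-bucket embedding is my own toy example of a deterministic local embedding, not Buzzkeeper's actual algorithm:

```python
import hashlib
import math

def local_embedding(text: str, dims: int = 128) -> list[float]:
    """Deterministic, offline embedding: hash each token into a bucket,
    then L2-normalise. A toy stand-in for the real local embedder."""
    vec = [0.0] * dims
    for token in text.lower().split():
        h = int.from_bytes(hashlib.sha256(token.encode()).digest()[:4], "big")
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sync(memories, index):
    """Mirror (id, text) pairs into `index`, skipping unchanged rows.
    `index` maps memory id -> (content_hash, embedding)."""
    for mem_id, text in memories:
        digest = hashlib.sha256(text.encode()).hexdigest()
        cached = index.get(mem_id)
        if cached and cached[0] == digest:
            continue  # unchanged memory: no re-embedding
        index[mem_id] = (digest, local_embedding(text))
```

Because the hash is compared before any embedding work happens, a sync over an unchanged corpus costs one digest per memory rather than one embedding call per memory.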

3. Query analysis

Before recall, the incoming message is analysed into:

  • topics
  • entities
  • requested memory categories
  • requested bot stage, if any
  • whether the user is asking about the bot's own past words or actions
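The analysis step above can be approximated with keywords, which is roughly what a heuristic fallback looks like. The real analyser is model-assisted; the phrases, stage names, and field names below are assumptions for illustration:

```python
def analyse_query(message: str) -> dict:
    """Heuristic sketch of pre-recall query analysis.
    Keywords and output fields are illustrative assumptions."""
    lower = message.lower()
    # Detect a requested bot stage, if any.
    stages = [s for s in ("tipsy", "buzzing", "cooked", "hungover") if s in lower]
    # Is the user asking about the bot's own past words or actions?
    about_self = any(p in lower for p in ("what did you say", "what did you do", "you said"))
    categories = ["bot_state"] if (stages or about_self) else []
    return {
        "topics": [w.strip("?,.!") for w in lower.split() if len(w) > 4],
        # Crude entity guess: capitalised words, minus the sentence-initial one.
        "entities": [w for w in message.split() if w[:1].isupper()][1:],
        "categories": categories,
        "stage": stages[0] if stages else None,
        "about_self": about_self,
    }
```

The structured output is what matters: retrieval downstream can filter on `categories` and `stage` instead of hoping the right memory happens to be lexically similar to the question.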

4. Hybrid retrieval

Recall then combines:

  • lexical candidates from FTS5
  • vector candidates from sqlite-vec
  • existing behavioral scoring like speaker match, stage affinity, recency, recall frequency, and importance

That means the final prompt gets a smaller, more relevant recalled memory slice instead of just "the latest notes."
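The merge step can be sketched as a weighted union of the two candidate sets plus behavioral boosts. The weights and metadata fields here are illustrative assumptions, not Buzzkeeper's actual tuning:

```python
def hybrid_rank(lexical, vector, meta, top_k=3):
    """Merge FTS5 and vector candidates, then apply behavioral scoring.
    lexical / vector: {memory_id: score in 0..1}; meta: {memory_id: dict}.
    Weights are illustrative, not Buzzkeeper's real values."""
    merged = {}
    for mem_id in set(lexical) | set(vector):
        # Blend the two retrieval signals; a miss in one channel scores 0.
        score = 0.5 * lexical.get(mem_id, 0.0) + 0.5 * vector.get(mem_id, 0.0)
        m = meta.get(mem_id, {})
        score += 0.2 * m.get("importance", 0.0)      # enrichment-assigned salience
        score += 0.1 if m.get("stage_match") else 0.0  # stage affinity boost
        score += 0.1 * m.get("recency", 0.0)         # 1.0 = just stored
        merged[mem_id] = score
    return sorted(merged, key=merged.get, reverse=True)[:top_k]
```

A memory that is mediocre lexically but strongly stage-matched and important can outrank a fresher note that merely shares a word with the query, which is exactly the failure mode recency-only memory had.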

Better State-Aware Memory

This upgrade also improves self-memory.

Buzzkeeper can now store and recall things like:

  • what it said while tipsy
  • what it did while buzzing or cooked
  • incidents tied to stage shifts or hangovers
  • bot-state-linked memories grouped separately from ordinary user facts

That makes answering questions like "what did you say when you were super drunk?" much more realistic.
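With stage stored as metadata, state-linked recall reduces to a plain filter. Again using stdlib `sqlite3` with a hypothetical schema and made-up rows:

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, category TEXT, stage TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO memories (category, stage, body) VALUES (?, ?, ?)",
    [
        ("bot_state", "cooked", "swore the jukebox was haunted"),
        ("bot_state", "tipsy", "offered everyone a round"),
        ("user_fact", None, "Alice prefers cider"),
    ],
)
# "what did you say when you were super drunk?" becomes a category + stage filter.
rows = conn.execute(
    "SELECT body FROM memories WHERE category = ? AND stage = ?",
    ("bot_state", "cooked"),
).fetchall()
```

Grouping bot-state memories under their own category also keeps them from crowding out ordinary user facts during normal recall.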

Operational Changes

There are now two persistence paths to keep mounted:

STORAGE_PATH=/data/tavern-state.json
MEMORY_DB_PATH=/data/tavern-state.memories.sqlite3

Optional controls:

MEMORY_MODEL_ASSISTED=true
MEMORY_ANALYSIS_MAX_TOKENS=220
MEMORY_EMBEDDING_MODEL=nomic-embed-text
MEMORY_EMBEDDING_DIMENSIONS=128
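Reading those controls amounts to environment lookups with fallbacks. The variable names come from the post above; the defaults shown are assumptions, illustrated in Python rather than the bot's Rust:

```python
import os

# Variable names from the post; default values are assumptions.
model_assisted = os.environ.get("MEMORY_MODEL_ASSISTED", "false").lower() == "true"
analysis_max_tokens = int(os.environ.get("MEMORY_ANALYSIS_MAX_TOKENS", "220"))
# Unset model -> deterministic local embeddings, so the index works offline.
embedding_model = os.environ.get("MEMORY_EMBEDDING_MODEL")
embedding_dims = int(os.environ.get("MEMORY_EMBEDDING_DIMENSIONS", "128"))
```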

What This Unlocks Next

This architecture leaves room for future improvements without changing the deployment model:

  • stronger provider or local embedding generation
  • memory pruning or summarisation jobs
  • richer metadata filters
  • selective replay of long-running server lore

The main point is that Buzzkeeper now has a memory system that is still easy to deploy (no additional infra needed), but much more capable than naive prompt stuffing.

I also shipped a heap of performance improvements, bug fixes, and other polish, all up on GitHub.

Source code is here: https://github.com/Vheissu/buzzkeeper — contributions from developers of all skill levels are welcome (including AI-assisted/generated contributions).