Sort:  

We have several ideas for how to roll out the technology to public nodes, but it's too early to comment much until we've completed more performance tests on exactly which method we'll use to support api nodes that don't have their own GPUs (which is probably every api node right now, ours certainly doesn't have a GPU).

Just to give you an idea of some of the options: 1) use a zfs snapshot to avoid doing a massive sync, then use CPU-computed embeddings for live sync, 2) use a publish-subscribe model where the embeddings table is subscribed to, allowing nodes that can't compute the embeddings to fetch them from ones that can, and 3) api nodes that don't want to support it can just redirect such api calls to api nodes that do.