You are viewing a single comment's thread from:

RE: Standalone HAF Apps- A Request for Feedback

in HiveDevs · 8 months ago (edited)

I think all the features/benefits described here are already available through the publish/subscribe features of PostgreSQL. In other words, you can run a HAF server that publishes some of its tables, and you can create what you're calling "standalone HAF servers" by running separate databases on other servers that subscribe to those tables. It's all baked into PostgreSQL.
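As a rough sketch of what this looks like in practice (table and connection names here are illustrative, not the actual HAF schema):

```sql
-- On the publishing (full) HAF server: expose selected tables.
CREATE PUBLICATION haf_ops_pub
    FOR TABLE hive.operations, hive.blocks;

-- On the subscribing (standalone) server: tables with matching
-- definitions must already exist; the subscription then streams
-- row changes into them via logical replication.
CREATE SUBSCRIPTION haf_ops_sub
    CONNECTION 'host=haf.example.com dbname=haf user=replica'
    PUBLICATION haf_ops_pub;
```

Note that logical replication requires `wal_level = logical` on the publisher, and the subscriber keeps its own writable copy of the data.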


Absolutely, I think it's all baked into Postgres. I didn't know about publish/subscribe. I was thinking we could use Postgres LISTEN/NOTIFY, so that upon notification the standalone db can query the full HAF server for exactly the data it wants. My understanding of publish/subscribe is very limited, but does it allow you to selectively copy over only some of the data from a table, e.g. only custom_json ops with a certain ID? It didn't seem to allow for that from the docs I looked at. Also, I suspect that in many instances an API might want not replication but to select the data and then create its own derivative data, as with the example of the polls protocol, where the custom_jsons will be evaluated based on the protocol rules.
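For reference, the LISTEN/NOTIFY idea described here might look roughly like this (all names — the channel, function, trigger, and table — are illustrative assumptions, not actual HAF objects):

```sql
-- On the full HAF server: a trigger fires a NOTIFY as new ops arrive.
CREATE OR REPLACE FUNCTION notify_new_ops() RETURNS trigger AS $$
BEGIN
  -- Send the block number as the notification payload.
  PERFORM pg_notify('new_ops', NEW.block_num::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ops_inserted
  AFTER INSERT ON hive.operations
  FOR EACH ROW EXECUTE FUNCTION notify_new_ops();

-- The standalone database's client LISTENs on the channel and, on
-- each notification, queries the HAF server (e.g. over postgres_fdw)
-- for just the rows it wants, such as custom_json ops with its ID.
LISTEN new_ops;
```

The trade-off versus publish/subscribe is that the standalone side must do its own pulling and bookkeeping, but it gets full control over which rows it fetches.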

But in any case, I fully expect the mechanics of how it's done to undergo iterations as we (hopefully) build some useful real-world applications and apply the lessons learned. Of much more interest to me is any feedback on the proposed overall approach: creating an API layer made up of a large number of small databases, which get their data from a much smaller number of full HAF servers. Do you have any thoughts on that as an approach?

My understanding of publish/subscribe is very limited, but does it allow you to selectively copy over only some of the data from a table, e.g. only custom_json ops with a certain ID?

It doesn't, but there's a simple answer for this: run a HAF app on the server that just generates tables with the desired data, then publish those tables.
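A minimal sketch of that setup, assuming a hypothetical `polls_ops` table and an illustrative operation-type id (the real HAF column names and custom_json type id may differ):

```sql
-- A HAF app on the publishing server maintains a table containing
-- only the rows of interest -- here, custom_json ops carrying the
-- protocol's id -- and only that table is published.
CREATE TABLE polls_ops AS
  SELECT o.*
  FROM hive.operations o
  WHERE o.op_type_id = 18                          -- custom_json (assumed id)
    AND o.body::jsonb -> 'value' ->> 'id' = 'polls';

-- Subscribers now replicate only the pre-filtered data.
CREATE PUBLICATION polls_pub FOR TABLE polls_ops;
```

The subscribing database then only ever stores the filtered subset, which keeps its footprint small.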

Of much more interest to me is any feedback on the proposed overall approach: creating an API layer made up of a large number of small databases, which get their data from a much smaller number of full HAF servers. Do you have any thoughts on that as an approach?

I've always envisioned something like this approach, but I also suspect you're overestimating the cost of running a HAF server because you're only looking at "full" HAF servers that capture all blockchain data.

We're also developing quite lightweight HAF servers that will only require a few gigabytes of space. That's still not "tiny", of course, but it would be quite easy to use the publish/subscribe features to further reduce the amount of storage required on subscribing databases.

So what I ultimately see happening is people running lightweight HAF servers (i.e. ones that filter out the operations/transactions they are not interested in), with some amount of replication of these databases' tables on subscribing databases. To put this into perspective, just filtering out Splinterlands and Hive Engine transactions makes for a very comfortably sized database.

I also suspect you're overestimating the cost of running a HAF server because you're only looking at "full" HAF servers that capture all blockchain data

For sure. When we wanted to develop the polls protocol as a regular HAF app, the limiting factor wasn't HAF (which we could fill with only the data we need) but hived. We couldn't run a tiny HAF without hived, so we started looking for some other way. That's how the idea of the standalone HAF app came about, and the reason behind it.

So what I ultimately see happening is people running lightweight HAF servers (i.e. ones that filter out the operations/transactions they are not interested in), with some amount of replication of these databases' tables on subscribing databases.

Sounds like a viable way to do it. Just thinking it through for the polls protocol: if we used publish/subscribe, we'd need hived + a HAF filled with the polls custom_jsons and account data, and then we could subscribe to those particular tables. But we'd still end up needing hived. Is there any way you can see us avoiding hived in this kind of setup?