5th update on BlockTrades' Hive development progress


SQL account history plugin for modular hivemind

Lately our blockchain work has focused on the SQL account history plugin, which injects operations, transactions, and block data into a PostgreSQL database (for use by modular hivemind apps such as wallets, games, etc).
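
To make "injects into a PostgreSQL database" concrete, here's a minimal sketch of how an app could read that data once it lands. Note that the table and column names below (hive_operations, trx_in_block, etc.) are hypothetical stand-ins for illustration, not the plugin's actual schema:

```python
import psycopg2  # PostgreSQL driver

# Hypothetical table/column names for illustration only; the actual
# schema created by the SQL account history plugin may differ.
conn = psycopg2.connect("dbname=hivemind user=hive")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT block_num, trx_in_block, op_type, body
          FROM hive_operations
         WHERE block_num = %s
         ORDER BY trx_in_block
        """,
        (2889020,),
    )
    for block_num, trx_in_block, op_type, body in cur.fetchall():
        print(block_num, trx_in_block, op_type, body)
conn.close()
```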

We’ve been testing and fixing bugs related to this hived plugin and the associated hivemind sync. As mentioned last time, a hived replay that feeds the data into hivemind takes 11 hours (as opposed to the 15 hours required by the rocksdb-based plugin), so we already have a decent win on replay time.

We ran a hivemind sync using this data and it completed in 46 hours (just under 2 days), but unfortunately it failed near the end (probably because we used a different hived to supply some dynamic API data not yet provided by the plugin), so we still don’t have definitive data on the speed of a full hivemind sync using this method. Still, I wouldn’t be surprised if we cut the time for a full hivemind sync in half (i.e. from 4 days down to 2 days or even less) by the time we’re done. I say this because we’ve found that the hivemind sync using this method is currently CPU bound (with about 2/3 of the CPU used by the python-based hivemind indexer and 1/3 used by postgres), so I think we will be able to reduce the 46 hours after we’ve profiled the python code (and maybe the SQL code as well). My best guess right now is that we're CPU bound due to some data transformation performed prior to writing the data to the database.
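
For reference, the kind of profiling we have in mind is straightforward with Python's built-in cProfile. A minimal sketch (sync_batch is a hypothetical stand-in for the indexer's real entry point):

```python
import cProfile
import pstats

# "sync_batch" stands in for the hivemind indexer's actual entry
# point; the point here is the profiling technique, not the name.
def sync_batch():
    ...  # hypothetical: transform a batch of blocks and write to postgres

profiler = cProfile.Profile()
profiler.enable()
sync_batch()
profiler.disable()

# Print the 20 functions with the highest cumulative CPU time; if the
# bottleneck is data transformation, it should show up near the top.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```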

Performance results for SQL account history plugin

We’ve run some performance benchmarks comparing the new SQL account history plugin (ah-sql) against the rocksdb account history plugin (ah-rocksdb), and the performance gains are excellent:

API call:
{"jsonrpc":"2.0", "method":"account_history_api.get_ops_in_block", "params":{"block_num":2889020,"only_virtual":false}, "id":1}
ah-rocksdb: ranges from 26s to 64s
ah-sql: 0.9s

API call:
{"jsonrpc":"2.0", "method":"account_history_api.get_account_history", "params":{"account":"abit", "start":125000, "limit":1000}, "id":1}
ah-rocksdb: ranges from 0.3s to 1.2s
ah-sql: 0.03s

API call:
{"jsonrpc":"2.0", "method":"account_history_api.enum_virtual_ops", "params":{"block_range_begin": 4000000, "block_range_end": 4020000 }, "id":1}
ah-rocksdb: ranges from 36s to 45s
ah-sql: 0.8s
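
If you want to reproduce these measurements, each benchmark is just a JSON-RPC POST that you can time from any client. A minimal sketch using Python's requests library (api.hive.blog is just an example endpoint; timings will vary with the node's configuration and load):

```python
import time
import requests

# Any Hive API node will do; api.hive.blog is just an example endpoint.
URL = "https://api.hive.blog"

payload = {
    "jsonrpc": "2.0",
    "method": "account_history_api.get_ops_in_block",
    "params": {"block_num": 2889020, "only_virtual": False},
    "id": 1,
}

start = time.time()
response = requests.post(URL, json=payload)
print(f"elapsed: {time.time() - start:.2f}s")
print(list(response.json().keys()))  # expect "jsonrpc", "result", "id"
```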

These gains are good enough that we should be able to eliminate the artificial 2000-operation lookback limit currently imposed on the get_account_history API call when filtering for specific operation types (assuming the node is configured to serve the data from the ah-sql plugin instead of the ah-rocksdb plugin, of course).

Progress on modular hivemind

Now while the above sync and API response times represent substantial performance gains, that’s not the primary goal of this work. The primary goal is to support 2nd layer app development via “modular hivemind” deployments, allowing for the creation of unique Hive-based apps that can scale to large numbers of users with real-time responsiveness.

The first step in the development of the modular hivemind technology was to validate swapping to a model where we directly inject blockchain data into hivemind’s PostgreSQL database, and as we can see from the benchmark data above, this work is going well. We’ve proved the new model performs better, and we were also able to switch from “pulling” the data from hived to having hived push the data to us, using the existing hivemind indexer code with minor modifications.

The next step is a bit more challenging, design-wise: we plan to add support for automatic handling of forks in the hivemind sync process. This will allow modular hivemind apps to serve up data from the current headblock without any additional delay and cleanly revert whenever a micro-fork occurs (currently 2nd layer apps either have to accept a delay penalty or risk errors if they don’t properly revert the state of their internal data when a micro-fork occurs). Modular hivemind applications will gain a built-in micro-fork handler that automatically executes to undo the effects of blocks from the abandoned fork and replay blocks from the new fork, eliminating the need for their developers to manually write code for fork handling. We’re currently looking into various algorithms for this fork handling logic, and next we’ll be experimenting with some prototype implementations.
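
As a rough sketch of the control flow we're evaluating (not a committed design; every name below is hypothetical), the core of such a handler pops blocks back to the fork point using an undo log, then applies the blocks from the winning fork:

```python
def handle_micro_fork(db, old_head_num, new_fork_blocks):
    """Hypothetical sketch of automatic micro-fork resolution.

    Assumes the sync process records an "undo log" for each block:
    the inverse of every table change the block caused, so popping
    a block is just replaying its undo entries in reverse.
    """
    fork_point = new_fork_blocks[0].num - 1  # last common block

    # Roll back every block on the abandoned fork, newest first.
    for num in range(old_head_num, fork_point, -1):
        db.apply_undo_log(num)      # hypothetical API

    # Re-apply blocks from the winning fork in order.
    for block in new_fork_blocks:
        db.apply_block(block)       # hypothetical API
        db.write_undo_log(block)    # so this block can be popped too
```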

Hivemind-based account history API

We also recently completed a python-based account history API that reuses the SQL queries we developed for the hived SQL account history plugin (this could be used, for example, to reduce API loading on a hived instance). The performance of this API is roughly comparable to that of the ah-sql plugin, but preliminary tests show it to be a little slower (maybe 30% in some cases), probably due to more overhead during data conversions. Still, this could be an acceptable tradeoff in some situations, and we haven't done any tests under high load, where hivemind may perform better than hived.
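
A hedged sketch of what that reuse looks like (illustrative names only, not the actual hivemind code), including the Python-side row conversion we suspect accounts for part of the extra overhead:

```python
import psycopg2

# Illustrative SQL only; the real API shares the queries written for
# the hived ah-sql plugin, whose schema may differ from this sketch.
ACCOUNT_HISTORY_SQL = """
    SELECT block_num, op_type, body
      FROM account_operations
     WHERE account = %s
     ORDER BY account_op_seq
     LIMIT %s
"""

def get_account_history(conn, account, limit=1000):
    with conn.cursor() as cur:
        cur.execute(ACCOUNT_HISTORY_SQL, (account, limit))
        rows = cur.fetchall()
    # Converting each row into the JSON-RPC response shape happens in
    # Python here; this kind of per-row transformation is a plausible
    # source of the ~30% slowdown versus the in-process plugin.
    return [{"block": b, "op": {"type": t, "value": v}} for b, t, v in rows]
```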

Hivemind API (2nd layer microservice for social media apps)

We fixed bugs and created new tests related to the follow code and decentralized lists and merged them to the develop branch (https://gitlab.syncad.com/hive/hivemind/-/merge_requests/396).

We also merged in a few more remaining optimizations to hivemind queries: https://gitlab.syncad.com/hive/hivemind/-/merge_requests/470

And created some more tests for community API calls: https://gitlab.syncad.com/hive/hivemind/-/merge_requests/466

Condenser and wallet (https://hive.blog)

We made one change to fix a bug related to decentralized lists: https://gitlab.syncad.com/hive/condenser/-/merge_requests/209

We completed a preliminary review of vision (the ecency code base) versus condenser (the hive.blog code base). In general, the vision code looks more up-to-date in terms of the web technologies and libraries used, and the code is cleaner overall, but vision currently relies on too many closed-source APIs to operate easily as a white-labeled application yet. So before we begin contributing to this work, we’re awaiting changes from the @good-karma team.

Near-term work plans and work in progress

We’ll continue working on the modular hivemind code, with an emphasis in the next cycle on the design of the micro-fork handling logic. We’ll also continue testing and optimizing the SQL account history plugin and the hivemind sync process.

We began testing the current head of the develop branch of hivemind (this contains the follow and decentralized list fixes) on api.hive.blog today. If it performs well, we’ll merge it to master on Monday so that other API node operators can deploy it.

I’ve also given some thought in the last week to ways to improve the upward side of the HBD peg (when HBD goes above $1 USD). I have one idea that looks relatively simple to implement that could probably be fit into HF25. I’ll write more about it later in a separate post in the Hive Improvements community so that there can be some discussion about the economic considerations involved.


I have heard about hivemind a lot, but I still have difficulty understanding how one can use it. I see you mention SQL a lot in connection with hivemind. I have been using HiveSQL and python to learn and experiment with things on Hive.

My question is: are there ways to use SQL to get data from hivemind, similar to HiveSQL? Are there resources for beginner programmers to learn from and to utilize hivemind using SQL and/or python?

While using HiveSQL and making experimental apps, I came across references to Steem and SBD in Hive Developer Portal. For example here: https://developers.hive.io/docs/tutorials-recipes/understanding-dynamic-global-properties.html

Somebody probably needs to review and get those things up to date. I can help if necessary. My knowledge and skills are limited though.

Thanks for all the work you and your team are doing.

Hivemind as it stands today is essentially a python application coupled with a postgres database that stores blockchain data. It has various APIs (implemented internally using SQL queries on the blockchain data) that are used by Hive apps such as condenser (hive.blog), peakd, vision (ecency), leofinance, dpoll, quello, etc. These APIs are documented here: https://developers.hive.io/ under APPBASE API (although it documents both hived and hivemind APIs together). So hivemind is essentially a microservice that can serve up blockchain data using REST API calls. You can also add your own new API calls by writing SQL queries.
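
For example (a hedged sketch with made-up table names, not hivemind's real schema), a custom API call can be little more than a SQL query exposed through a small Python handler:

```python
import psycopg2

def get_post_count(conn, author):
    """Hypothetical custom API call: count a user's root-level posts.

    "hive_posts" and its columns are illustrative names only; check
    hivemind's actual schema before writing real queries.
    """
    with conn.cursor() as cur:
        cur.execute(
            "SELECT count(*) FROM hive_posts "
            "WHERE author = %s AND depth = 0",
            (author,),
        )
        return cur.fetchone()[0]
```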

We're also developing a new microservice called "modular hivemind". The goal is to replace the monolithic design of hivemind with a customizable microservice. A basic implementation of modular hivemind will ingest all the blockchain data and have a very compact API that can serve up basic data: account info, transaction history, and block data. An operator of a modular hivemind node can then optionally select additional "modules" that create additional tables to support other categories of API data (for example, installing the social module would allow that hivemind instance to support APIs for post data and post voting information).

The idea behind this is that new Hive apps can be built just by adding SQL queries over the blockchain data and the other modules installed in the hivemind instance, creating whatever APIs are needed for the application. And to do this in such a way that the app developer doesn't have to worry about low-level details such as how to get efficient access to the blockchain data or how to handle forks (this is when blocks that have already been applied get discarded and replaced by new blocks; it requires the ability to "undo" operations that have already been processed).

The account history SQL for get_account_history interests me VERY much. It's one of the slowest parts of most of what I do on a lot of projects, and this should speed it up GREATLY. Will you be changing it so that, instead of looking only at the last 2000 entries of a person's history when nothing is found with a certain filter, it'll keep looking?

Yes, that's the plan.

Good to see great progress. Hoping to start my dapp ASAP; big plans to work with you guys on this project, let me know anything further. https://peakd.com/c/hive-123620/created is my community; I plan to start a dapp through here, just getting started.

Welcome, it's always great to have new devs join the Hive ecosystem!

I check out r/hiphopheads sometimes, welcome to Hive!
What kind of dapp have you been planning? I run a community incubation program and would be happy to help your community with curation.

I want a dapp where rappers can promote themselves, producers can sell beats, and they can all interact with each other like on Hive. I do want to get to a point where we can have advertising too and make it broad. I believe Hive would be the perfect network to do this on. I would appreciate the curation; as of right now, I'm focused on getting people in here from outside the chain to get this going.

Feel free to shoot me a dm on Discord when you feel you're ready! Acidyo #2501

Those performance numbers are very impressive.

Will you be done with code refactoring and API restructuring by the end of June?
What are the next priorities?
How many devs are currently working on the code (in total, with regular contributions to the repo)?

Code refactoring of hivemind has been done for a while. A week or so ago we refactored some code in hived to make it easier to update the API without requiring as many boilerplate code changes. But in general, code refactoring is never done; you often find ways to improve the code when you work in it.

There hasn't been a lot of API restructuring. We've optimized the performance of the API calls and also brought some of the responses from the various APIs into greater conformity with each other (mainly in error cases). But restructuring the API itself is a big task, because many 3rd party dapps depend on the API, so any major changes would require coordinated changes to all those dapps. So we've limited our work of that type.

Now, as part of our modular hivemind work, we're likely to start introducing new orthogonal sets of API calls depending on the configuration of the hivemind selected by the API node operator. For example, a simple "bare-bones" modular hivemind would only have API support for blockchain operation data and account information. And then we'll likely introduce new APIs for expanded versions of a modular hivemind that have access to additional data like posts.

But that will probably have lower priority than the smart contract platform that will be built on top of the base modular hivemind layer. Our priority for now is to 1) get out the base-layer modular hivemind and then 2) create the smart contract layer on top of that.

I'll check back on the dev count later; I need to go for dinner now. I think it's still roughly the same at the moment, around 10-12 devs.

"create the smart contract layer on top of that." ethereum +defi+no fees= hive to the moon ? :D

Thanks for the detailed reply🙂!
I know this is all very technical stuff (hard to break down in laymen's terms) and you are still working on the foundation, but investors like to hear more about timelines and near-term projects and priorities.
Maybe try to include more of that in future updates.
Edit: after rereading your post I see that you kinda did that in the last paragraph, but the details are not really exciting from an outsider's perspective. It reads like: "minor improvements/updates coming soon - we are doing a lot of testing right now - nothing fancy in the near- to mid-term"

The performance results seem really good, but you are not highlighting them like that. I actually nearly missed them (didn't get it) on the first read.

There's a reason for that: I'm not really writing these posts for investors. These posts are primarily intended to communicate technical changes as we make them and they are mostly aimed at other developers.

I'm not directly involved in marketing for Hive, although I'm happy to share technical details with the various folks handling that (it's just a bunch of volunteers who are willing to commit their time right now, as there is no funded marketing proposal yet to compensate them for their work). Since the BlockTrades team is already contributing a lot on the technical side, I've always felt it was best for other people to handle marketing for Hive, although I'm happy to contribute technical info to help them in their endeavors.

In addition to these progress reports, I do plan to occasionally make roadmap reports for our team's longer term plans (note this covers just the work done by our team; other dev teams have their own long term plans as well). My last such roadmap report was here: https://hive.blog/hive-139531/@blocktrades/roadmap-for-hive-related-work-by-blocktrades-in-the-next-6-months

I'll probably be making an update on our overall roadmap progress in a few more weeks (essentially a midway progress report).

Whenever we have something extremely good to share with the outside world, make sure the whole Hive community is aware and can spread a pre-fabbed message on social media.
With 6000 coins out there, we have to stand out some way.

These posts are not meant to be marketing messages, just progress reports and technical info dumps for other devs. Marketing folks will pick data out of these posts to create the types of messages you're talking about.

I'm wondering how things are going with respect to this issue: https://hive.blog/hive-139531/@ammonite/thanksfortheseupdatesise-hb77nv1g54av362o8devkx5p5op46tl3#@ammonite/thanksfortheseupdatesise-hb77nv1g54av362o8devkx5p5op46tl3

It seems that comments are similarly affected now as well.

Edit: On hive.blog using Firefox on either Windows or Android.

The fork resolution logic I talk about in this post will fix this issue; the lack of fork resolution in hivemind is the ultimate reason we have the delay. But this will be a technically challenging task, so I guess it will take a few months to complete.
