Below is a list of Hive-related programming issues worked on by the BlockTrades team during the last week or so:
Hived work (blockchain node software)
Further optimization of sql_serializer plugin
The sql_serializer plugin is responsible for reading blockchain data and pushing that data to a HAF server’s Postgres database. As a practical matter, this means the speed of this plugin sets an upper limit on how fast a HAF app can operate (especially during a replay of a hived node), so it is important that this process is as fast as possible.
After our latest optimizations (and bug fixes), we’ve brought the time required to do a full replay with the sql_serializer plugin down to 5.5 hours (previously about 7 hours, a speedup of roughly 20%). Note that both the old and the new benchmarks were performed on one of our fastest systems (a Ryzen 9 5950X with 4 NVMe drives in a RAID 0 configuration).
The latest changes for the sql_serializer are now merged into the develop branch.
Cleanup display of results for several hive CLI wallet commands
The hive CLI wallet is a command-line interface for creating hive transactions. It is mostly used by Hive apps, exchanges, and Hive “power users”.
The most recent improvements include minor fixes to the display of wallet command results, such as case consistency, indentation for tables, headers, and other miscellaneous display fixes for get_order_book, etc. There were also various improvements and fixes to the CLI wallet’s internal docs.
We eliminated some false errors reported by GCC 9.3.0 and fixed a CMake-related issue (compilation of the sql_serializer plugin requires the Postgres dev libraries as a dependency).
Hivemind (2nd layer applications + social media middleware)
The most recent hivemind work was focused on optimization of the hivemind indexer (the process that builds social data from the raw blockchain data).
Reduced peak database disk usage (20% reduction in peak storage requirement)
We noticed that the Postgres database size temporarily grew to around 700GB during the massive sync phase (when a hivemind server is first being initialized with old blockchain data). We determined that this temporary increase occurred during record cleanup resulting from the reputation calculation process. By switching from DELETE calls to DROP TABLE and TRUNCATE calls, we were able to eliminate this temporary peak, resulting in a 20% reduction in peak storage requirements (approximately 141GB less storage used at the current headblock). We’re also hoping this will further speed up the post-massive sync initialization of table indexes, but we haven’t had a chance to benchmark that yet. https://gitlab.syncad.com/hive/hivemind/-/merge_requests/509
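The storage effect behind this change can be illustrated with a small, self-contained sketch. Here SQLite (via Python's sqlite3 module) stands in for Postgres purely for illustration, and the table name is made up, but the principle is the same: DELETE marks rows dead while keeping their pages allocated until a vacuum runs, so peak disk usage stays high, whereas dropping or truncating the table releases the space immediately.

```python
import os
import sqlite3
import tempfile

# Illustrative only: SQLite stands in for Postgres, and "reputation_scratch"
# is a hypothetical name for a temporary cleanup table.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE reputation_scratch (account TEXT, rshares INTEGER)")
db.executemany(
    "INSERT INTO reputation_scratch VALUES (?, ?)",
    [(f"account-{i}", i) for i in range(100_000)],
)
db.commit()

def page_count(conn):
    # Total pages in the database file, including freed-but-unreclaimed pages.
    return conn.execute("PRAGMA page_count").fetchone()[0]

before = page_count(db)
db.execute("DELETE FROM reputation_scratch")  # rows gone, space NOT reclaimed
db.commit()
after_delete = page_count(db)

db.execute("VACUUM")  # rewrites the file, analogous to TRUNCATE's immediate reclaim
after_vacuum = page_count(db)

print(before, after_delete, after_vacuum)
```

After the DELETE, the file is just as large as before; only the rebuild shrinks it. This is why replacing DELETE-based cleanup with DROP TABLE/TRUNCATE flattens the temporary storage peak.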
Elimination of the post active property sped up hivemind sync by 20%
We discovered that none of the Hive apps used the active field for posts (this field indicated that a post was still actively competing for rewards from the reward pool), so we removed the field from the database schema and eliminated it from API responses. This turned out to be surprisingly beneficial to hivemind’s full sync time, at least according to our latest benchmarks: we completed a full sync in 10.5 hours, approximately 20% faster than our previous time on the same system. https://gitlab.syncad.com/hive/hivemind/-/merge_requests/511
Completed optimization of update_post_rshares for Postgres 12
We’re planning to move to Ubuntu 20 (from Ubuntu 18) as the recommended Hive development platform soon, and this also entails a move to Postgres 12 (from Postgres 10), because that’s the default version of Postgres that ships with Ubuntu 20. So we’re working through performance regressions associated with differences in the way the Postgres 12 query planner works. Most recently, we changed the query for update_post_rshares to fix a performance killer and changed when we execute some vacuum analyze calls.
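The reason the timing of those analyze calls matters is that a cost-based planner only makes good choices when its table statistics are fresh: running an analyze after a bulk load (rather than before) is what makes the stats reflect the real table contents. A minimal sketch of the mechanism, using SQLite's ANALYZE as a stand-in for Postgres's VACUUM ANALYZE (table and index names here are illustrative, not the real hivemind schema):

```python
import sqlite3

# Illustrative only: SQLite's ANALYZE stands in for Postgres's VACUUM ANALYZE.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, rshares INTEGER)")
db.execute("CREATE INDEX posts_rshares_ix ON posts (rshares)")
db.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(i, i % 100) for i in range(10_000)],
)

# ANALYZE gathers per-table/per-index statistics into sqlite_stat1,
# which the query planner consults when costing index lookups vs. scans.
db.execute("ANALYZE")
stats = db.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE tbl = 'posts'"
).fetchall()
print(stats)  # row counts the planner will use when choosing a plan
```

Run the equivalent statistics pass before the bulk insert instead of after, and the planner costs queries against a table it believes is empty, which is one way the kind of performance regression described above can appear.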
Condenser (open-source codebase for hive.blog, etc)
We tested and deployed various fixes by @quochuy to https://hive.blog.
Hive Image Server
We’re in the process of doing a major upgrade to the hardware that runs the image server for Hive. The new system has a LOT more disk space, faster drives, more CPUs, more memory, etc. Currently we’re moving the huge collection of existing images over to the new server.
Hive Application Framework: framework for building robust and scalable Hive apps
Optimizing HAF-based account history app
We’re currently optimizing and testing our first HAF-based app (codenamed Hafah), which emulates the functionality of hived’s account history plugin (and ultimately will replace it). Our initial benchmarks at 5M blocks showed good performance, but we’ve seen a slowdown in indexing performance when operating with a fully populated database (i.e. 50M blocks), so we’re now working to optimize the speed of the queries. We’re also preparing benchmarks for the speed of the API calls.
Database diff tool for testing HAF-based apps
We're developing a new testing system to test the port of hivemind to a HAF-based application: essentially it is a "database diff" tool that will allow us to detect differences (via SQL statements) between the tables created by the old hivemind app and the upcoming HAF-based hivemind.
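The core of such a diff can be expressed directly in SQL: a symmetric EXCEPT between the two versions of a table yields exactly the rows that differ. A minimal sketch of the idea, with SQLite standing in for Postgres and made-up table/column names rather than the actual hivemind schemas:

```python
import sqlite3

# Illustrative only: "old_posts"/"new_posts" are hypothetical stand-ins for
# a table produced by old hivemind vs. the HAF-based port.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE old_posts (id INTEGER, payout REAL);
CREATE TABLE new_posts (id INTEGER, payout REAL);
INSERT INTO old_posts VALUES (1, 10.0), (2, 5.0), (3, 7.5);
INSERT INTO new_posts VALUES (1, 10.0), (2, 5.5), (3, 7.5);
""")

def table_diff(conn, a, b):
    """Rows present in one table but not the other, tagged with their side."""
    query = (
        f"SELECT '{a}' AS side, * FROM (SELECT * FROM {a} EXCEPT SELECT * FROM {b}) "
        f"UNION ALL "
        f"SELECT '{b}' AS side, * FROM (SELECT * FROM {b} EXCEPT SELECT * FROM {a})"
    )
    return conn.execute(query).fetchall()

diff = table_diff(db, "old_posts", "new_posts")
print(diff)  # only the row whose payout changed, reported from both sides
```

A real tool would additionally need to handle row ordering, NULL semantics, and type differences between the two schemas, but the EXCEPT-based core scales naturally to per-table comparisons driven off the catalog.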
On the hived side, we’ll be adding support to the sql_serializer for directly injecting “impacted account” data. After that, we can compare the relative performance of this method of inserting the data into Postgres versus using the C-based Postgres extension for computing this data, in order to make a decision about the best design alternative. We’ll also likely merge in the first hardfork-related changes that 1) allow more than one vote per three seconds by an account and 2) don’t kill curation rewards entirely when someone edits their vote strength on a post.
Our HAF work will continue to focus on our first example HAF apps (account history app and HAF-based hivemind implementation).