Below is a list of Hive-related programming issues worked on by the BlockTrades team during the last week:
Hived work (blockchain node software)
We moved hived tavern tests and improved the get_account_history test to allow testing results against a golden reference server. This will enable us to verify a HAF-based account history server against a known-good hived-based account history server.
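To illustrate the idea of checking against a golden reference, here is a minimal sketch of comparing two account history responses entry by entry. It is not the actual tavern test: the function name `diff_history` and the sample entries are hypothetical, and a real test would fetch both lists from the two servers.

```python
from typing import Any, List, Tuple


def diff_history(golden: List[Any], candidate: List[Any]) -> List[Tuple[int, Any, Any]]:
    """Return (index, golden_entry, candidate_entry) for every position that differs."""
    mismatches = []
    for i in range(max(len(golden), len(candidate))):
        # A missing entry on either side is reported as None.
        g = golden[i] if i < len(golden) else None
        c = candidate[i] if i < len(candidate) else None
        if g != c:
            mismatches.append((i, g, c))
    return mismatches


# Hypothetical history entries: [sequence number, operation payload]
golden = [[0, {"op": "transfer", "amount": "1.000 HIVE"}], [1, {"op": "vote", "weight": 10000}]]
candidate = [[0, {"op": "transfer", "amount": "1.000 HIVE"}], [1, {"op": "vote", "weight": 5000}]]

for index, expected, actual in diff_history(golden, candidate):
    print(f"entry {index}: expected {expected}, got {actual}")
```

Reporting the index of each mismatch, rather than a bare pass/fail, makes it easier to see where a candidate server starts to diverge from the reference.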
TestTools work (python-based test system for hived)
Visible changes for test developers
Added support for test parallelization (in processes with pytest-xdist)
- Rewrote loggers system
- Rewrote managing of directories for data generated in tests
- Run following jobs in parallel:
Improved interaction with snapshots
- Added support for loading snapshots from a custom path
- Added snapshots comparison
- Removed generated snapshot files (after automated tests only)
- Added remote node (Use remote node for
- Extended documentation
- Added tutorials about replays and snapshots
- Document cleanup policies
- Checked TestTools sources and tests with pylint
- Fixed or temporarily suppressed all reported problems
- Added CI job running linter
- Introduced scopes, which allow easier resource management. Use them for loggers and directories for data generated in tests.
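Running tests in parallel processes with pytest-xdist means each process needs its own place for logs and generated data. The sketch below shows one way this kind of isolation can be done. It is not the TestTools implementation: the helper `worker_data_dir` and the `generated_` prefix are hypothetical. It relies only on the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in each worker process.

```python
import os
from pathlib import Path


def worker_data_dir(base: Path) -> Path:
    """Return a directory that is unique to the current pytest-xdist worker."""
    # pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ... in each worker
    # process; the variable is absent when tests run without xdist.
    worker = os.environ.get("PYTEST_XDIST_WORKER", "master")
    path = base / f"generated_{worker}"
    path.mkdir(parents=True, exist_ok=True)
    return path
```

With a layout like this, a run started with `pytest -n 4` would write into four separate directories, so workers cannot overwrite each other's files.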
Hivemind (2nd layer applications + social media middleware)
We’re planning to move to Ubuntu 20 (from Ubuntu 18) as the recommended Hive development platform soon, and this also entails a move to Postgres 12 (from Postgres 10) because that’s the default version of Postgres that ships with Ubuntu 20. So we’re working through performance regressions associated with differences in the way the Postgres 12 query planner works.
As reported last week, we changed the query for update_post_rshares to fix a performance killer and changed when we executed some vacuum analyze calls. One of the fixes involved temporarily disabling just-in-time compiling for this query (https://gitlab.syncad.com/hive/hivemind/-/merge_requests/513).
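For readers unfamiliar with the technique, here is a minimal sketch of disabling JIT for a single query from Python. It is an assumption about how this can be done, not a copy of the linked merge request: the helper `run_without_jit` is hypothetical and the query is a placeholder. It uses the Postgres `jit` setting with `SET LOCAL`, which only lasts until the end of the current transaction.

```python
def run_without_jit(cursor, query, params=None):
    """Execute `query` with Postgres JIT compilation disabled for this transaction only."""
    # SET LOCAL lasts until the end of the current transaction, so other
    # queries on the same connection keep the server's default jit setting.
    cursor.execute("SET LOCAL jit = off")
    cursor.execute(query, params)


class RecordingCursor:
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, params=None):
        self.statements.append(sql)


cursor = RecordingCursor()
run_without_jit(cursor, "SELECT 1")  # placeholder for the slow query
print(cursor.statements)
```

Note that `SET LOCAL` has no effect outside a transaction, so with a real driver the call needs to run inside one.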
We now have benchmarks for the overall performance improvement for update_post_rshares after massive sync (where this performance regression was particularly noticeable, although it is also an important enhancement for live sync performance):
- Old performance: 6.9 hours
- New performance: 38 minutes
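As a quick check of what these two figures mean as a ratio, using only the numbers above:

```python
old_minutes = 6.9 * 60   # 6.9 hours expressed in minutes
new_minutes = 38

speedup = old_minutes / new_minutes
print(f"old: {old_minutes:.0f} min, new: {new_minutes} min, speedup: {speedup:.1f}x")
# prints "old: 414 min, new: 38 min, speedup: 10.9x"
```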
Hive Application Framework: framework for building robust and scalable Hive apps
Fixing/Optimizing HAF-based account history app (Hafah)
We’re currently optimizing and testing our first HAF-based app (codenamed Hafah) that emulates the functionality of hived’s account history plugin (and ultimately will replace it). Our initial benchmarks at 5M blocks had good performance, but we saw a slowdown in indexing performance when operating with a fully populated database (i.e. 50M blocks).
Benchmarking time to fill a Hafah database
This week we’ve fixed this slowdown and have some preliminary performance benchmarks for filling up a Hafah database from scratch. As reported previously, a full replay to fill up a HAF database with 57M blocks (i.e. the entire Hive blockchain) for Hafah usage takes 5.5 hours. Next, we run the Hafah indexer, which creates all its tables in 4.3 hours, so the entire process takes 5.5 + 4.3 = 9.8 hours. This compares very favorably against the time required for a hived account history node to replay: ~17 hours. Also, we haven’t yet tried to run both of these tasks concurrently, but there’s reason to believe that this will allow us to further reduce the time required to fill up a Hafah database.
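The arithmetic above, plus a rough bound on what concurrency could buy, can be written out as follows. The concurrent figure is only an idealized lower bound under the assumption of perfect overlap with no resource contention. It is not a measured result.

```python
replay_hours = 5.5       # hived replay filling the HAF database (from the text)
hafah_index_hours = 4.3  # Hafah building its own tables (from the text)
hived_ah_hours = 17.0    # approximate replay time of a hived account history node

sequential = replay_hours + hafah_index_hours
# Idealized assumption: perfect overlap with no resource contention, so the
# slower of the two tasks sets the total. Real runs will land somewhere between.
concurrent_lower_bound = max(replay_hours, hafah_index_hours)

print(f"sequential: {sequential:.1f} h ({hived_ah_hours / sequential:.2f}x faster than hived)")
print(f"concurrent lower bound: {concurrent_lower_bound:.1f} h")
```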
Benchmarking API performance for Hafah
We also need to benchmark the API performance of a Hafah server. We’ve created a script that uses jmeter to measure how quickly Hafah can process the various account_history API calls under heavy load. The script currently compares the performance of three types of servers: a) a direct query to the postgres server holding Hafah data, b) a json-rpc call to Hafah’s python-based jsonrpc server, and c) a hived node.
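To give a feel for what is being measured, here is a small Python sketch of building an account history request and timing repeated calls. It is not the jmeter script: it times calls one after another rather than generating concurrent load, the helpers `account_history_request`, `time_calls`, and `fake_send` are hypothetical, and the method name and parameter names are my assumption of the usual hived `account_history_api` request shape.

```python
import json
import statistics
import time
from typing import Callable, List


def account_history_request(account: str, start: int = -1, limit: int = 1000) -> str:
    """Build a JSON-RPC 2.0 body for an account history query."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        # Assumed method and parameter names; check against the server's API docs.
        "method": "account_history_api.get_account_history",
        "params": {"account": account, "start": start, "limit": limit},
    })


def time_calls(send: Callable[[str], object], body: str, repeats: int = 20) -> List[float]:
    """Call send(body) `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        started = time.perf_counter()
        send(body)
        durations.append(time.perf_counter() - started)
    return durations


# Stand-in transport so the sketch runs offline; a real run would POST `body`
# to each server under test.
def fake_send(body: str) -> dict:
    return {"echo": json.loads(body)["method"]}


body = account_history_request("blocktrades")
samples = time_calls(fake_send, body)
print(f"median latency: {statistics.median(samples) * 1e6:.1f} microseconds")
```

Passing the transport in as a function means the same timing loop can be pointed at each of the three server types for a like-for-like comparison.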
Preliminary benchmarks show that the Hafah queries are very fast when served directly from postgres itself, but under load, we have observed that the python-based jsonrpc server is restricted to one CPU and becomes a performance bottleneck. It is also worth noting that this is essentially the same code used by hivemind to handle jsonrpc calls, so this bottleneck probably also exists in hivemind, but just wasn’t noticed because the query times for a typical hivemind API call are much longer than the query time for a Hafah API call. In any event, we’ll be investigating ways to eliminate this bottleneck in the coming week, and hopefully it will allow for further scaling of hivemind API performance as well.
Conversion of hivemind to HAF-based app
We’ve completed the first step in converting Hivemind to a HAF-based app (converted hivemind’s massive sync code to use HAF methods). I’ve been told massive sync indexing time is already faster than old-style hivemind, but I don’t have firm numbers yet to report. Also, I expect further improvements as we restructure hivemind’s massive sync procedures to take better advantage of the new way it is being fed data.
Upcoming work for next week
For hived, we’re adding a command-line option to allow a hived node to wait during a replay if it loses contact with a HAF database that it is filling (this issue arose when one of our devs restarted our HAF postgres server mid-replay, but it seems like a generally useful feature).
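The behavior described here, waiting instead of aborting when the database goes away, can be illustrated with a small sketch. This is Python for illustration only and is not how hived implements it: hived is not written in Python, and the names `call_with_reconnect` and `flaky_write` and the fixed retry delay are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_reconnect(
    operation: Callable[[], T],
    retry_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, pausing and retrying for as long as the connection is down."""
    while True:
        try:
            return operation()
        except ConnectionError:
            # Database unreachable: wait instead of aborting the replay.
            sleep(retry_delay_s)


# Simulated writer that fails twice (database restarting) and then succeeds.
attempts = {"count": 0}


def flaky_write() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("database restarting")
    return "block written"


print(call_with_reconnect(flaky_write, sleep=lambda s: None))
```

The `sleep` parameter is injectable so the demo and tests can skip the real delay.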
For HAF testing, we’ll be using the hived fork generator to verify that Hafah functions robustly under heavy forking activity on the blockchain. Once we’re further along with HAF-based hivemind, we’ll likely test it the same way.
For Hafah, we’ll be 1) investigating the jsonrpc bottleneck, 2) further benchmarking API performance, 3) verifying results against a hived account history node, 4) benchmarking concurrent hived replay and Hafah massive sync, and 5) setting up continuous integration testing for Hafah.
For HAF-based hivemind, we plan to restructure its massive sync process to simplify it and optimize performance by taking advantage of the HAF-based design. Next we’ll modify live sync operation to only use HAF data (currently it still makes some calls to hived during live sync).