27th update of 2021 on BlockTrades work on Hive software

in HiveDevs · 3 years ago (edited)

Below is a list of some of the Hive-related programming issues worked on by BlockTrades team during the past week:

Hived work (blockchain node software)

Most of our work in the past week has focused on HAF, and the hived portion of HAF (sql_serializer plugin) is mostly done now, so we didn’t work on hived much this week.

We did more testing of the sql_serializer plugin on different servers, but we lost some time to hardware problems on the new server (finally resolved now, we had to replace a bad IO card used by the 4xnvme raid drive subsystem because one of the 4 channels was operating too slowly and destroying the performance of the overall raid0 drive).

We merged in a fix to ensure that the account_created_operation is generated in all cases where an account is created: https://gitlab.syncad.com/hive/hive/-/merge_requests/296

We fixed a problem with the order in which operations were stored inside account history: normal operations contained in a block's transactions are now stored before the virtual operations generated by processing of the block (previously the order was reversed).

Also as part of HAF-related work, we modified the build process for hived to ensure that shared object targets are created for several libraries (fc, protocol, and schema): https://gitlab.syncad.com/hive/hive/-/merge_requests/298
This was done because these libraries are re-used by the sql_serializer plugin (which has been moved from the hived repo to the new haf repo discussed further down in this post).

As part of continuing work on the Hive Command-Line Interface (CLI) wallet, we added a ctrl-c handler for the non-daemon mode to allow for more graceful shutdown:
https://gitlab.syncad.com/hive/hive/-/merge_requests/295

We also did a further review of @howo's work on rc delegations for hived.

And finally, although no code was written yet, we began reviewing the resource credits code, to see if we can make some quick improvements. So far I can only say that this code is a little more complex mathematically than we originally anticipated.

Condenser

We deployed a new version of hive.blog where @quochuy had reverted the cache buster, and confirmed that this resulted in a good speedup in the average render time of account avatars. We’re also testing a change he made to show RC levels for accounts, which we’ll probably deploy soon.

Hive Application Framework: framework for building robust and scalable Hive apps

We continued to do a lot of testing and performance studies of HAF this week, mostly using the HAF account history app. We completed CI tests for HAF itself (build test, linting, plus unit tests for the postgres extension).

The HAF framework is performing well in our tests so far, so in the past week we have been working to move out of “prototype” mode and into “production” mode with this tech:

Created haf repo to ensure compatibility of components used for HAF-based apps

One difficulty we noticed during testing was ensuring that our testers deployed compatible versions of multiple components of the framework (for example, changes in the sql_serializer or psql_tools could break a HAF application such as the HAF-based account history app). This didn’t pose serious problems for us, because we have relatively tight communication between our internal team members, but we could see this would be a problem once external developers started to create HAF-based apps. To solve this problem, we’ve created a new repo called “haf”.

A HAF app developer will create a repo for his application, then add the haf repo as a submodule of his repo. In this way, anyone deploying the app can be sure they are also using the appropriate version of the HAF library (correct version of sql_serializer plugin and psql_tools). In a similar way, the HAF repo has a submodule to a version of hived that it is guaranteed to work with.

Improving app initialization process (required for efficient app development and testing)

Another issue that we saw when repeatedly testing the account history app was the importance of being able to easily reset a HAF application after a test run. A HAF app walks through a stream of transaction data provided in HAF-generated tables and produces its own tables from this data. After making a change in the way a HAF app processes this stream of data, it is important to have an efficient way to throw away the results of the previous run and start a new one from scratch, retaining the blockchain table data and discarding only the application's table data. To keep this process simple, especially when working on a server with multiple HAF apps, each HAF app is expected to maintain its data in one or more schemas distinct from those of other HAF apps.

We’ve streamlined and standardized this process now, so that it is very simple to reset any HAF-based app’s processing state back to block 0:

  • select * from hive.app_remove_context('my_app_context_name'); clears the app’s context object, which the HAF framework uses to track the last block processed by the app.
  • drop schema my_app_schema cascade; clears only the database tables generated by the app while processing the blockchain transactions.
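The two reset steps above can be wrapped in a small helper. This is a minimal illustrative sketch in Python; the context and schema names are hypothetical, and only the two SQL statements themselves come from the framework:

```python
# Hypothetical helper that builds the two reset statements described above.
# 'my_app_context_name' and 'my_app_schema' are placeholder names.

def build_reset_statements(context_name, schema_name):
    """Return the SQL needed to reset a HAF app back to block 0:
    first remove the app's context, then drop its private schema."""
    return [
        f"SELECT * FROM hive.app_remove_context('{context_name}');",
        f"DROP SCHEMA {schema_name} CASCADE;",
    ]

stmts = build_reset_statements('my_app_context_name', 'my_app_schema')
for s in stmts:
    print(s)
```

In practice these statements would simply be executed against the HAF database with whatever Postgres client the app already uses.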

Considerations on combining HAF-based apps

We also started thinking this week about how we can potentially combine the functionality of HAF apps.

The simplest way that two HAF apps can communicate with each other is via their respective APIs (one app can simply make an API call to another app). This is the primary way most Hive-based apps communicate today. But this method does have some weaknesses.

Problems can arise when communicating apps are out-of-sync

One potential problem is that the two apps can be out-of-sync with respect to which block they are currently processing. Depending on the use being made of the data, this may or may not cause a problem for the app consuming the data from the source app. This is a well-known issue. For example, as a health check, some Hive apps like hivemind can be asked via API for the block number of the last block they’ve processed, allowing the calling app to switch to another API data supplier if the hivemind app is too far behind in its block processing to supply reliable data.
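The failover logic sketched above can be expressed as a small pure function. The endpoint names and the lag threshold below are illustrative assumptions, not part of any Hive API:

```python
# Sketch of the health-check failover described above: pick the first API
# supplier whose last-processed block is within an acceptable lag of the
# chain head. MAX_LAG and the endpoints are made-up examples.

MAX_LAG = 20  # blocks; how far behind a supplier may be and still be "reliable"

def pick_supplier(head_block, suppliers):
    """suppliers maps an API endpoint to the last block it reports processing.
    Returns the first sufficiently in-sync endpoint, or None."""
    for endpoint, last_block in suppliers.items():
        if head_block - last_block <= MAX_LAG:
            return endpoint
    return None  # no supplier is close enough to head to trust

chosen = pick_supplier(
    head_block=58_000_000,
    suppliers={
        "https://api.node-a.example": 57_999_500,  # 500 blocks behind: too stale
        "https://api.node-b.example": 57_999_995,  # 5 blocks behind: acceptable
    },
)
```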

High performance communication via custom SQL queries

Another potential problem of inter-app communication is performance-related: collecting data from an app via its API can be much slower than directly reading and processing its internal data via SQL queries. Since all HAF apps will employ the same primary data representation (SQL tables) and they will often run on the same HAF server, it will generally be feasible for such apps to have this highly performant access to each other’s tables, assuming the HAF server admin grants read privileges on one app’s schemas to another. But with such internal access, it is likely even more important that the two apps are well synced in terms of their internal state by block number, so I’ve been considering ways of ensuring a 1-to-1 block sync between HAF apps.

Maintaining lockstep sync via a “super app” (blame Bartek for the name of this one)

The simplest way to ensure that two or more HAF apps on a server stay in sync is to create a “super app” that has its own context and calls the block-processing routines for its sub-apps in lockstep. To make it easy to combine HAF apps in this fashion, it would be best to define a standard naming convention and parameter interface for the top-level block processing routine of any HAF app that might be usefully combined with others (probably just about every HAF app, in practice).

In this methodology, the super app would fetch the next block number to be processed, then pass this block number to each sub-app in an order determined by their inter-dependencies. For example, if app B depends on data from app A, then the super app would first call the block processing routine of app A, then the block processing routine of app B. For obvious reasons, this method can’t work if the two apps are interdependent.
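The lockstep scheme above can be sketched as follows, with a uniform process_block() entry point standing in for the proposed standard interface (the method name and the app names are assumptions, not an established HAF convention):

```python
# Minimal sketch of the "super app" idea: sub-apps expose a uniform
# block-processing entry point, and the super app calls them in dependency
# order for each block, so they always agree on the current block.

class HafApp:
    def __init__(self, name):
        self.name = name
        self.last_block = 0

    def process_block(self, block_num):  # the standardized entry point
        self.last_block = block_num      # a real app would update its tables here

def run_super_app(apps_in_dependency_order, first_block, last_block):
    """Drive all sub-apps in lockstep, one block at a time.
    If app B depends on app A, A must come earlier in the list."""
    for block_num in range(first_block, last_block + 1):
        for app in apps_in_dependency_order:
            app.process_block(block_num)

app_a = HafApp("balance_tracker")   # produces data
app_b = HafApp("balance_reporter")  # consumes app_a's data, so runs second
run_super_app([app_a, app_b], first_block=1, last_block=100)
```

Note that because the super app hands each sub-app a single block number, this sketch shares the limitation discussed next: it cannot exploit any "massive sync" mode a sub-app might have.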

One weakness of this approach is that by only passing a single block number, it is not possible to rapidly initialize the sub apps via any “massive sync” capability they have (“massive sync” is a mode where a HAF app processes multiple historical blocks at once to improve the speed at which the app can sync up to the current head block).

It also means that a fast app can’t be run any faster than the apps that depend on it while it is syncing up (for example, it would be nice to run such a fast app in parallel with its dependent apps and let it get ahead of some of them).

“State per block” and “single state” HAF apps

While thinking about the issue of ensuring block sync between HAF apps and the limitations incurred by lockstep operation, it occurred to me that there are two potentially distinct types of HAF apps: 1) a “state per block” app that keeps a history of its internal state as it processes each block and 2) a “single state” app that only maintains its current state as of the last block it has processed.

Even for the app itself, there can be benefits to each approach: the first type of app can do rapid meta-analysis on its state (for example, an app that tracked the balance of an account could quickly provide the data to graph the historical change in the value of that account), whereas the “single state” app benefits from much smaller data storage requirements.

In practice, we can see that only apps that maintain small amounts of state data can reasonably be operated as “state per block” apps due to the potential storage requirements.

But despite this limitation, I think these types of apps may be a useful subset of HAF apps when they generate small amounts of data that is likely to be used by many other HAF apps. Normally, incorporating an app into a super app requires incorporating a copy of that app’s data into the super app. By contrast, the data for a state-per-block app can be shared across multiple super apps operating asynchronously to each other. These state-per-block apps function similarly to the way the sql_serializer generates the raw blockchain data tables, which are also shared across all HAF apps on the server.
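The contrast between the two app types can be illustrated with a toy balance tracker (the classes and the example data are assumptions for illustration, not actual HAF apps):

```python
# "State per block" vs "single state": the first keeps its full state history,
# the second keeps only the latest value. A per-account token balance is the
# (assumed) running example.

class StatePerBlockApp:
    """Stores the balance after every block: fast historical queries,
    but storage grows with every block processed."""
    def __init__(self):
        self.balance = 0
        self.balance_by_block = {}

    def process_block(self, block_num, delta):
        self.balance += delta
        self.balance_by_block[block_num] = self.balance

class SingleStateApp:
    """Stores only the current balance: minimal storage, no history."""
    def __init__(self):
        self.balance = 0

    def process_block(self, block_num, delta):
        self.balance += delta

history_app, compact_app = StatePerBlockApp(), SingleStateApp()
for block_num, delta in [(1, 10), (2, -3), (3, 5)]:
    history_app.process_block(block_num, delta)
    compact_app.process_block(block_num, delta)

# The history app can answer "what was the balance at block 2?" directly;
# the compact app only knows the final balance.
```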

Optimizing HAF-based account history app (Hafah)

This week we continued to do performance testing of hafah. We resolved one issue where the python app was consuming more memory than the C++ app, which was causing the python version to fail at somewhat random points in time. Along similar lines, we’ll probably do a little more work to manage how much memory the app uses, in order to allow it to safely run on systems with smaller amounts of memory (we’ve been testing on relatively powerful systems with 128GB of RAM plus swap space).

We’re still looking into why the python version of hafah is somewhat slower than the C++ version. From observations so far, we can see the python process occasionally bottlenecking the app’s overall performance. One possibility is differences between the libraries used to access the database (pqxx for the C++ app vs sqlalchemy for the python app), so we need to look into the relative performance of these two libraries.

Hivemind (social media middleware app used by social media frontends like hive.blog)

We made one improvement to hivemind this week: a speedup in the time to compute account notifications inside the hivemind indexer process (aka the hive sync process). The average time required to process notifications in a block was decreased to 100ms from a previous average of 170ms.

Work in progress and upcoming work

  • Release a final official version of hivemind with postgres 10 support, then update hivemind CI to start testing using postgres 12 instead of 10. This week we deployed the new version to production and we’ve observed one possibly new bug (an occasional notification missing a valid timestamp) that needs further investigation (the dev who is investigating suspects it was actually introduced earlier when changes were made to how hived processes reversible operations, but it is prudent to delay the new release until we’re sure about that). We’ll probably release an official update to hived soon as well (just for API node operators).
  • Run tests to compare results between account history plugin and HAF-based account history apps.
  • Clean up and create more documentation for the HAF code, as well as recommended guidelines for creating HAF-based apps.
  • Finish setup of continuous integration testing for HAF account history app.
  • Experiment with generating the “impacted accounts” table directly from sql_serializer to see if it is faster than our current method where hafah generates this data on demand as it needs it.
  • Test and benchmark multi-threaded jsonrpc server for HAF apps.
  • Finish conversion of hivemind to HAF-based app. Once we’re further along with HAF-based hivemind, we’ll test it using the fork-inducing tool.
  • Continue work on speedup of TestTools-based tests.

Congratulations @blocktrades! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s) :

You received more than 1390000 HP as payout for your posts and comments.
Your next payout target is 1395000 HP.
The unit is Hive Power equivalent because your rewards can be split into HP and HBD

You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP

To support your work, I also upvoted your post!

Check out the last post from @hivebuzz:

Hive Power Up Day - November 1st 2021 - Hive Power Delegation
Bee ready for the 2nd Hive Power Up Month challenge!
Trick or Treat - Share your scariest story and get your Halloween badge
Support the HiveBuzz project. Vote for our proposal!

Thank you for your work in developing our beautiful site!

Hello. I just made a transfer and it was multiplied by 3. I only received one, and the amount for the other two never arrived in the wallet.

I just broke my raspberry-pi out again for winter program learning, and I did see that Python is now up to ver 3.10.0. I tried to read through the what's new and I saw several mentions of old stuff being removed, so I don't know if maybe that is part of the slower time versus C++.

Anyway I continue to appreciate the update post, even if I do not understand much of it, understanding only comes when one tries to learn.

Figuring out "why something is slower" is a tough task. We can make some guesses, but in the end we'll probably just have to do a bunch of experiments to determine the answer.


Sounds like great work being done here and maybe some day I will figure it out :)
But know that we appreciate all that you do for us.

I read every one of these hoping that some knowledge will sink in as to how all this magic works. Seems like some great progress is being made and I thank all the developers for that.

I did have one question that maybe you or the @peakd team could answer. Why do votes show up twice in my notifications with one dating from before I was even born?


Hope you don't mind me asking this here.

That's the bug I'm referring to here:

This week we deployed the new version to production and we’ve observed one possibly new bug (an occasional notification missing a valid timestamp) that needs further investigation (the dev who is investigating suspects it was actually introduced earlier when changes were made to how hived processes reversible operations, but it is prudent to delay the new release until we’re sure about that).

It started happening after we deployed this experimental version to our api node, and I haven't seen the bug appearing on any other node, so I think it is a problem with the new code that our testing didn't detect. It'll probably take us a couple of days to figure out the issue and fix it. We're leaving the bad code running on our node so that we can analyze it better. In the meantime, if it's bothering you, you should be able to swap to using a different API node.

What's scary about this bug is it says on my feed that votes were from 52 years ago. I then realised that's referring to 1970, which is a default date. It's before my time, but it's scary that 1970 is nearly 52 years ago 😕

Imagine how I feel, since it's not before my time...

I was wondering if this is what you were referring to in the post. From my notifications, the bug started 20 days ago so seems to predate your changes made last week, maybe that is useful to help track it down.
I was wondering whether it was creating digital dust on the blockchain or, worse still, forking it backwards in time.

Do you have a link to an error from that time? It would be helpful if so, since I think we deployed the new code on the 19th (i.e. 10 days ago). But it was probably running on another server prior to that and we initialized our server from that one, so I would also need to get the date when that one was set up. Never mind, I guess I can just look at your notification history; I didn't read closely enough the first time!

Here is when the bug seemed to start for me. 20 days ago as far as I can tell. I am connected to hived.emre.sh node run by @emrebeyler .


Ah, but that's on emre's node, not ours. On ours, it doesn't happen till later. I can only guess that maybe he updated to the experimental code earlier than we did (or possibly it is a different issue in that case). For sure our node doesn't show a similar issue until we updated.

I see. I will try another node and see if it resolves it. Thanks for the interaction.

I checked your notification history, and unless I misread it, the first occurrence was 9 days ago, which fits in the window for the code update on our server. Still, that's helpful since it seems to confirm it was probably the new code.

First one that I saw was 9 days ago too.

Thx for the confirmation.

Hello @blocktrades… I have chosen your post about “-27th update of 2021 on BlockTrades work on Hive software-” for my daily initiative to re-blog - vote and comment…
Let's keep working and supporting each other to grow at Hive!...

Congratulations @blocktrades! Your post has been a top performer on the Hive blockchain and you have been rewarded with the following badge:

Post with the highest payout of the day.

You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP


All the work you have been doing seems excellent to me. I loved reading all of this. Congratulations and much success.