This is a quick update on some of our work on HF24 this week. As always, this is a fairly detailed report, so you can skip to the end if you are only interested in our updated expectations for the hardfork date.
Last week we continued testing and fixing discrepancies between the output of the old hivemind and the eclipse version of hivemind.
One of the more time-consuming fixes required us to modify hived itself to report some post rewards information as virtual operations so that the hivemind indexer could store this information into its database.
We also found during testing that we needed to add an additional virtual operation to hived to inform hivemind when a comment was deleted. The related hived and hivemind changes for this are expected to be completed in the next couple of days.
Another “discovery” we made is that we now need to implement hot and trending tracking directly in hivemind (previously hivemind obtained these lists via realtime calls to hived). Our first attempt at this had some performance problems, but we’re planning to implement a faster solution tomorrow.
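To illustrate what computing these rankings inside hivemind could look like, here is a minimal sketch of a time-decayed score. Everything in it is an assumption made for illustration: the function names, the constants, and the scaling are hypothetical and are not taken from the actual hived or hivemind code.

```python
import math

# Hypothetical constants (illustrative only). A larger timescale means age
# matters less ("trending"); a smaller one means newer posts win ("hot").
TRENDING_TIMESCALE = 480_000
HOT_TIMESCALE = 10_000
RSHARES_SCALE = 10_000_000


def score(rshares: int, created_epoch: int, timescale: int) -> float:
    """Log-scaled vote weight plus a recency bonus based on creation time."""
    magnitude = max(abs(rshares) / RSHARES_SCALE, 1.0)
    sign = 1 if rshares > 0 else -1 if rshares < 0 else 0
    return sign * math.log10(magnitude) + created_epoch / timescale


def rank(posts, timescale, limit=20):
    """posts: iterable of (post_id, rshares, created_epoch) tuples."""
    scored = [(score(r, c, timescale), pid) for pid, r, c in posts]
    scored.sort(reverse=True)  # highest score first
    return [pid for _, pid in scored[:limit]]


posts = [
    ("a", 10**9, 1_600_000_000),            # newer, fewer votes
    ("b", 10**12, 1_600_000_000 - 40_000),  # older, more votes
    ("c", 0, 1_600_000_000),                # newer, no votes
]
hot = rank(posts, HOT_TIMESCALE)
trending = rank(posts, TRENDING_TIMESCALE)
```

In a database-backed indexer, a score like this would more likely be stored per post and indexed, rather than recomputed for every request, but that choice is beyond what this sketch shows.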
Until now, we’ve been working with a relatively small number of tests for hivemind (60 tests currently), which were sufficient to keep us busy fixing bugs initially, but we recently added two new programmers to our hivemind team who will focus on creating a more comprehensive set of tests.
Investigating solutions to SQL view issue
In my previous post, I mentioned a performance problem that arose when we used a query that referenced a SQL view instead of inlining joins directly in the query. We traced this slowdown to a default limit in Postgres on how “deep” the Postgres planner will go when looking for an optimal query plan. Introducing the intermediary view exceeded the default join_collapse_limit. By increasing this value from the default value of 8 to 16, we were able to obtain the originally expected performance. So at this point we’re just investigating the best way to make this change.
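For reference, Postgres allows this setting to be changed at several scopes: per session, per database, per role, or globally in postgresql.conf. The sketch below only builds the SQL statements for the first three scopes so they can be compared side by side. The database name "hive" and role name "hivemind" are placeholders assumed for illustration, not names taken from our deployment.

```python
def join_collapse_limit_statements(limit: int = 16,
                                   database: str = "hive",      # hypothetical name
                                   role: str = "hivemind"):     # hypothetical name
    """Return SQL statements that set join_collapse_limit at different scopes."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("join_collapse_limit must be a positive integer")
    return {
        # Affects only the current connection.
        "session": f"SET join_collapse_limit = {limit};",
        # Becomes the default for new sessions on this database.
        "database": f'ALTER DATABASE "{database}" SET join_collapse_limit = {limit};',
        # Becomes the default for new sessions opened by this role.
        "role": f'ALTER ROLE "{role}" SET join_collapse_limit = {limit};',
    }


statements = join_collapse_limit_statements()
for scope, sql in statements.items():
    print(f"{scope}: {sql}")
```

The session-level form is the least invasive, since it leaves other users of the same Postgres instance unaffected, while the database-level and role-level forms avoid having to issue the command on every connection.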
Initial hivemind sync performance testing
We’ve started measuring the time it takes for an eclipse hivemind to sync. Our automated test system already tests this for 5 million blocks, and it takes around 40 minutes there. But when we tried this on another similarly configured machine that we were planning to use to measure the time to sync to the chain’s current head block, it took nearly twice as long to sync to 5 million blocks, so we need to figure out what’s causing the discrepancy.
At this point, we have a few possible explanations for the discrepancy: 1) the database instances are “tuned” differently, 2) the slower system may be overheating, causing the CPU to be throttled, or 3) the slower system is configured with NVMe drives rather than SSD drives (which should be faster, but possibly there is some driver problem).
Plans for upcoming week
While doing more testing of the new snapshot feature that allows fast saving/reloading of state information, we found that we weren’t storing the state information from plugins (we were only storing the state information in the core code), so we’ll be adding a mechanism for signaling plugins to save off their state data and implementing this functionality for the account history rocksdb plugin.
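Conceptually, the missing piece is a way for each plugin to contribute its own state whenever a snapshot is saved or loaded. The sketch below shows that idea only. hived is not written in Python, and the class, method, and plugin names here are hypothetical, not hived’s actual interfaces.

```python
from typing import Callable, Dict


class SnapshotManager:
    """Collects core state plus the state of every registered plugin."""

    def __init__(self) -> None:
        self._savers: Dict[str, Callable[[], dict]] = {}
        self._loaders: Dict[str, Callable[[dict], None]] = {}

    def register_plugin(self, name: str,
                        save: Callable[[], dict],
                        load: Callable[[dict], None]) -> None:
        # Each plugin supplies a pair of callbacks for its own state.
        self._savers[name] = save
        self._loaders[name] = load

    def dump(self, core_state: dict) -> dict:
        # Signal every plugin to save its state alongside the core state.
        plugins = {name: save() for name, save in self._savers.items()}
        return {"core": dict(core_state), "plugins": plugins}

    def load(self, snapshot: dict) -> dict:
        # Hand each plugin its saved state back, then return the core state.
        for name, state in snapshot["plugins"].items():
            self._loaders[name](state)
        return dict(snapshot["core"])


# Usage with a stand-in for an account history style plugin.
history = {"alice": ["op1", "op2"]}
manager = SnapshotManager()
manager.register_plugin(
    "account_history",
    save=lambda: {k: list(v) for k, v in history.items()},
    load=lambda state: (history.clear(), history.update(state)),
)
snapshot = manager.dump({"head_block": 5_000_000})
history.clear()                      # simulate a fresh node
core = manager.load(snapshot)        # plugin state is restored too
```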
As previously mentioned, we’ll also be adding virtual operations for deletion of comments and creating more hivemind tests.
Micro-fork handling in hivemind
While working in hivemind, we found that hivemind’s current handling of forks is less than ideal. The forked blocks themselves are discarded from the hivemind database, but the effects of the forked-out blocks are retained in the database. With the movement of more data from hived to hivemind, this “solution” becomes ever more problematic since hived’s internal fork recovery mechanism would previously mask some of these problems, so we’re planning to look at a more robust method of handling forks in hivemind.
After a couple of days of discussion, we have at least two viable methods to investigate: 1) storing additional data in the tables to allow the effects of blocks to be rolled back as far as the last irreversible block, or 2) using a second copy of the database that’s filled up to the last irreversible block, which we can switch to in the case of a fork and then replay the fork’s blocks onto (the mechanics of this are a bit more complicated, but that’s the general idea). We’re generally favoring method 1 right now, since it seems likely to be faster, and we currently think it can be done without too much difficulty and without requiring ongoing code maintenance to support it (the solution is generic to the types of data being saved and updated in the database).
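The general idea behind method 1 can be sketched with an in-memory store that records, for each reversible block, the previous value of every key that block changed. This is a simplified illustration under assumed names (ReversibleStore and its methods are hypothetical), not the design we will actually implement in the hivemind database.

```python
_MISSING = object()  # marks keys that did not exist before a block touched them


class ReversibleStore:
    """Key-value store that can undo blocks newer than the irreversible one."""

    def __init__(self):
        self.data = {}
        self._undo = []  # list of (block_num, {key: previous value})

    def begin_block(self, block_num):
        self._undo.append((block_num, {}))

    def set(self, key, value):
        _, previous = self._undo[-1]
        # Remember only the value from before this block's first write.
        if key not in previous:
            previous[key] = self.data.get(key, _MISSING)
        self.data[key] = value

    def rollback_to(self, block_num):
        """Undo every block with a number greater than block_num."""
        while self._undo and self._undo[-1][0] > block_num:
            _, previous = self._undo.pop()
            for key, old in previous.items():
                if old is _MISSING:
                    self.data.pop(key, None)
                else:
                    self.data[key] = old

    def mark_irreversible(self, block_num):
        """Undo records at or below the irreversible block can be dropped."""
        self._undo = [(n, p) for n, p in self._undo if n > block_num]


store = ReversibleStore()
store.begin_block(1)
store.set("votes", 1)
store.begin_block(2)          # this block will be forked out
store.set("votes", 2)
store.set("post", "x")
store.mark_irreversible(1)
store.rollback_to(1)          # fork detected: discard the effects of block 2
after_rollback = dict(store.data)
store.begin_block(2)          # apply the replacement block from the new fork
store.set("votes", 5)
```

Because the undo records only store "key and previous value", the same mechanism works regardless of what kind of data is being written, which is the property that makes this approach generic.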
We don’t plan to work on improved micro-fork handling until after we’ve released a fully tested hivemind, since it’s not a feature that will be needed by third party apps during their own testing work.
Updated release estimates (TL;DR)
I’d hoped we would have a release candidate for hived out this past week, but we had several hivemind bugs that required hived changes AND we still need to make the fix to the snapshot feature, so we’re pushing out the expected date for the release candidate by another week (next Monday). We’re planning to release hivemind within a few days thereafter, barring any further major bugs.
As mentioned in my previous report, we plan to schedule the hardfork itself for 3 weeks after the date we publish the release candidate.