Update on Hardfork 24 work by BlockTrades team

in HiveDevs, 4 years ago (edited)

Below is a summary of the Hive core work done by the BlockTrades team last week and an update on hardfork 24 timing.

Hived (blockchain network software) testing and bug fixes

Earlier last week, @gtg found an intermittent error in the new snapshot functionality (non-consensus related feature). This error was fixed here:
https://gitlab.syncad.com/hive/hive/-/merge_requests/117

Yesterday, @good-karma began testing ecency against our Eclipse API service hosted by @gtg (https://beta.openhive.network/) and reported a problem: the API method find_account_names doesn’t return the post_json_metadata field.

We discovered that this was because this API call was previously handled by the fat node, which kept this data in memory. We no longer require a fat node in Eclipse-based API servers, so this call is now handled by the account history node (which is a low memory node, so it doesn’t collect the data to return this field in the response).

In theory, Eclipse-based apps should use the get_profile API method to get this field now, but as a temporary workaround to avoid requiring apps to update their code before the hardfork, we’ve added a compile flag COLLECT_ACCOUNT_METADATA to allow account history nodes for API services to collect this data:
https://gitlab.syncad.com/hive/hive/-/merge_requests/116

This change will require an 18-hour replay of our hived node, which began about 10 hours ago, so the data won't be available on our Eclipse API service until the replay completes.

This is a temporary fix and will be deprecated in the coming weeks. In the long term, all apps should update to use get_profile instead to get post_json_metadata.
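For app developers making this switch, here is a minimal Python sketch of requesting profile metadata over JSON-RPC. It rests on assumptions not stated in this post: that the method is exposed as bridge.get_profile, that it takes an "account" parameter, and that the metadata is returned under result.metadata.profile. The function names are hypothetical; check the response shape against the API service you are testing before relying on it.

```python
import json
import urllib.request

# Testing endpoint named in this post.
ECLIPSE_API_URL = "https://beta.openhive.network"


def build_get_profile_request(account, request_id=1):
    """Build a JSON-RPC 2.0 payload for the get_profile call.

    Assumption: the method is exposed as "bridge.get_profile" and takes
    an "account" parameter; the post only names it get_profile.
    """
    return {
        "jsonrpc": "2.0",
        "method": "bridge.get_profile",
        "params": {"account": account},
        "id": request_id,
    }


def extract_profile_metadata(response):
    """Return the profile metadata dict from a response, or {} if absent.

    Assumption: the metadata sits under result["metadata"]["profile"].
    """
    result = response.get("result") or {}
    metadata = result.get("metadata") or {}
    return metadata.get("profile") or {}


def fetch_profile_metadata(account, url=ECLIPSE_API_URL, timeout=10):
    """POST the request to the API node and return the profile metadata."""
    body = json.dumps(build_get_profile_request(account)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return extract_profile_metadata(json.load(reply))
```

Calling fetch_profile_metadata("some-account") performs the network request; the two helper functions can be exercised offline.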

Note that the COLLECT_ACCOUNT_METADATA flag should only be set to ON when building an account history node that will be used as part of an API service configuration. This is not a consensus-related change and there’s no need for update/replay by any other types of hived nodes. Also note that this flag is currently set to ON by default, so if you’re not building a node for a Hive API service, you will want to set it to OFF.

There are no currently known issues with hived.

Hivemind (social layer microservice) progress

Most of our effort last week was spent on further testing, bug fixing, and optimization of hivemind performance, with an emphasis on testing using condenser (hive.blog) and condenser’s wallet app to perform “real-world” style testing.

Changes that may potentially impact API method usage:
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/222
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/223

Fixes and improvements to notification methods (especially related to mentions):
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/185
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/213
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/212
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/224
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/225
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/220

Various bug fixes and optimizations:
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/210
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/214
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/217
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/218
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/219
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/215
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/184
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/227
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/228
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/229

Fix performance problem with get_profile reported by @good-karma:
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/233

The above problem was representative of one of the trickier issues we often face with Postgres: on databases with different histories, we sometimes see Postgres selecting a bad query plan, which leads to bad performance. In this case, the query performed well on our development database, but was slow on our testing database. We worked around the problem by eliminating the index that was being used for the bad plan. We don’t think that elimination of this index will impact performance of any other queries, but we’re enhancing our hivemind CI system to detect such potential regressions in performance.
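As an illustration of the kind of check a CI system could run, here is a hypothetical Python sketch that times each query several times and flags any whose median runtime exceeds a budget. It is not the hivemind CI code; the function names and the budget values are assumptions made for the example. Because the bad plan only appeared on one of two databases, a check like this would need to run against more than one database to be useful.

```python
import statistics
import time


def median_runtime(run_query, repeats=5):
    """Run the callable several times; return the median wall time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def check_budgets(queries, budgets, repeats=5):
    """Return {name: median_seconds} for every query that exceeds its budget.

    queries: name -> zero-argument callable that executes the query
    budgets: name -> maximum acceptable median runtime in seconds
    """
    regressions = {}
    for name, run_query in queries.items():
        elapsed = median_runtime(run_query, repeats)
        if elapsed > budgets[name]:
            regressions[name] = elapsed
    return regressions
```

In practice each callable would execute one API query against a given database, and a non-empty result would fail the CI job.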

Allow integration with pghero (SQL database profiling/monitoring tool):
https://gitlab.syncad.com/hive/hivemind/-/merge_requests/231

Hive API service maintenance (api.hive.blog)

Currently, the API servers api.hive.blog and anyx.io handle the majority of Hive API network traffic.

Last week, the anyx.io API service was temporarily out of service due to problems in the datacenter where it’s hosted, and several other API services were also down for various reasons (some were updating to Eclipse).

This caused most of Hive’s API traffic to be handled by api.hive.blog (the production API service maintained by blocktrades) for several days. When this happened, we started to see “database timeouts” on our hived nodes due to overloading, which caused intermittent failures in transaction broadcasts by users.

In an attempt to alleviate this situation, we added some Eclipse-based hived nodes to our API service to spread the load. We used Eclipse-based nodes to do this because these consume much less memory, so we could fit them on our existing servers.

In the process, we discovered that neither condenser nor the web wallet had yet been updated to use the version of hive-js that is compatible with Eclipse. This caused the “Transaction broadcast error: Obsolete form of transaction detected, update your wallet” error that many hive.blog users experienced when posting or commenting.

As a temporary measure to resolve this problem, we reconfigured our jussi to only use the Eclipse nodes to serve blocks and use pre-Eclipse nodes for broadcasting transactions, until we had time to update condenser and the web wallet and test the changes.
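The split described above can be pictured as routing by API method name. The Python sketch below is not jussi's configuration format or its internals: the upstream addresses are hypothetical, the longest-prefix lookup is an assumption chosen for illustration, and sending all other calls to the pre-Eclipse nodes by default is also an assumption, since only block serving and transaction broadcasting are covered above.

```python
# Hypothetical upstream addresses; the real ones are not given in this post.
ECLIPSE_UPSTREAM = "http://eclipse-hived:8091"
LEGACY_UPSTREAM = "http://legacy-hived:8091"

# Method-name prefixes mapped to upstreams. The empty prefix is the default
# and must be present so that every method matches at least one entry.
ROUTES = {
    "": LEGACY_UPSTREAM,  # assumption: everything else stays on pre-Eclipse
    "block_api.": ECLIPSE_UPSTREAM,
    "condenser_api.get_block": ECLIPSE_UPSTREAM,
    "condenser_api.broadcast_transaction": LEGACY_UPSTREAM,
    "network_broadcast_api.": LEGACY_UPSTREAM,
}


def pick_upstream(method, routes=ROUTES):
    """Return the upstream whose prefix is the longest match for the method."""
    matches = [prefix for prefix in routes if method.startswith(prefix)]
    return routes[max(matches, key=len)]
```

With this table, block requests go to the Eclipse nodes while broadcasts stay on the pre-Eclipse nodes, so wallets that still produce the older transaction form keep working.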

In the long term, we will be looking into replacing chainbase, which is the root cause of these database timeouts, with a full database implementation such as rocksdb (i.e. effectively re-implementing MIRA more efficiently) because rocksdb has a more robust and efficient database locking mechanism.

Condenser (hive.blog) and web wallet (wallet.hive.blog) work

As a result of the above-mentioned issues, we discovered that neither condenser nor the web wallet had been updated to the latest version of hive-js. We made the necessary changes here:
https://gitlab.syncad.com/hive/condenser/-/merge_requests/110
https://gitlab.syncad.com/hive/wallet/-/merge_requests/28

We’ve been doing limited testing of condenser and the web wallet starting early last week, but we began more intense efforts this weekend, after we discovered that hive-js hadn’t yet been updated. To simplify the testing of condenser and the web wallet in the develop branch against Eclipse, we’ve modified them to ease use of the Eclipse testing API service:
https://gitlab.syncad.com/hive/condenser/-/merge_requests/110
https://gitlab.syncad.com/hive/condenser/-/merge_requests/114
https://gitlab.syncad.com/hive/condenser/-/merge_requests/115
https://gitlab.syncad.com/hive/wallet/-/merge_requests/31
https://gitlab.syncad.com/hive/wallet/-/merge_requests/32

We also merged in one change to condenser that we couldn’t merge in until we were only testing against Eclipse nodes:
https://gitlab.syncad.com/hive/condenser/-/merge_requests/113

We’ve also seen some references to a hardcoded chain_id in the configuration of condenser and the web wallet. We don’t believe these references currently have any functional effects, so we’re going to remove them today and test to make sure there is no impact from the removal.

Call to arms for Hive Apps developers

We really need all Hive apps developers to begin testing their apps against our Eclipse API service as soon as possible.

We don’t want to execute HF24 until all major Hive apps are confirmed to work against Eclipse API services. @roelandp has created a spreadsheet tracking Hive apps where you can indicate the current state of your testing for Eclipse-compatibility:
https://docs.google.com/spreadsheets/d/10ahdZMR6AgYd6_dXQnioFN8FJAuXteEoWobhgClsGbI/edit?usp=sharing
If your Hive app is not listed already, please add it; it’s an open list.

Also, before you begin testing, please make sure you've made the changes described in this post at a minimum:
https://hive.blog/hive-139531/@mahdiyari/how-to-prepare-your-applications-for-hf24

And as a reminder, our testing node for the Eclipse API service is located at: https://beta.openhive.network

Plans for coming week

The top 20 witnesses are delaying the final hardfork date until all major Hive apps have signaled that they are Eclipse-compatible. Based on current apps testing levels, this means the hardfork will NOT happen on Oct 6th.

The current plan is for witnesses to review the status of apps compatibility on Oct 8th and decide at that point if most of the Hive apps are ready for HF24. This will require no changes to the hived blockchain software, as the hardfork time is ultimately controlled by when a super-majority (at least 16 of the top 20 witnesses) signals for hardfork 24 to happen.

We’re also planning to create a dump file of our hivemind database, to allow other Hive API services to get their own Eclipse API services operational as quickly as possible (otherwise, they would need to do a full hivemind replay which takes 4 days on fast hardware).

PeakD.com has been testing and hasn't come up with any issues as of yet. Thanks for the hard work on your side with the blockchain code and performance upgrades on Hivemind, etc.

Thanks for the feedback, I was hoping you guys had been testing already, but good to know for sure.

Keychain is ready and the new version (Eclipse compatible) is now available on Chrome and Firefox stores

Will it reset the extension if I download and apply the update?

If you use the extension from a store, it will update by itself. If locally installed from github, you can update locally and keep the data. As long as you don't uninstall, the extension won't reset.

Thanks for the information. It was quite helpful. 😄

I noticed the active_votes array returned by 'hive.api.getContent()' no longer includes the 'time' and 'weight' fields. Are these omitted on purpose?

I use these on hiveblockexplorer.com. For example:
https://hiveblockexplorer.com/@blocktrades/update-on-hardfork-24-work-by-blocktrades-team

hive.api.getActiveVotes() returns an error (RPCError: Server error) using beta.openhive.network. It works properly when I use a non-eclipse node.

I think this error is similar to this one:

Yesterday, @good-karma began testing ecency against our Eclipse API service hosted by @gtg (https://beta.openhive.network/) and reported a problem: the API method find_account_names doesn’t return the post_json_metadata field.

Since most metadata is now being moved to hivemind, the eclipse nodes don't have it, but I am just guessing.

Yeah, looks like there are some fields being dropped in get_content call

From a layperson's perspective this sounds a bit unsettling. Hopefully things improve soon.

Excellent job testing and finding bugs before the go-live, as well as exercising proper caution and requiring the major apps to re-test. Thanks to all the devs involved for this extra diligence to keep the platform stable!

Sorry to hear about the delay, but as some of the apps have been testing it did bring one issue to light, so hopefully not many more will be found on or prior to the 8th.

I think the spreadsheet was a great idea, it lets all of us, even non developers get a small glimpse of where we are at progress wise, and what apps are putting out solid efforts.

Hive-Roller.com has tested on the new Eclipse API and had no issues.

@blocktrades, my posts are not showing up on hive.blog but show just fine on peakd.com. I've been told you run the hive.blog front end, so I'm reaching out to you to ask why this would be.

Never mind... it just became visible. I'm guessing it was delayed because of back-ground operations of the server?

Our api server was extremely loaded for a bit today, so that was probably the issue.

Pity about the probable delay, but I am glad that there is a chance for more Apps to get up to speed, as I think there may have been some issues surrounding this already.

are there any nerves out back, or is everyone expecting a smooth run?

I don't expect any problems with the actual hardfork.

Good to hear.

This is great news. :)

Last week, the anyx.io API service was temporarily out of service ...

Never ever a better example of a blessing in disguise!

Sounds like the work of professionals. Thanks for the updates - they're real confidence builders.

Yeah, despite the headaches it entailed, we definitely collected some useful info as a result.

Note that the COLLECT_ACCOUNT_METADATA flag should only be set to [ON?] when building an account history node that will be used as part of a API service configuration.

Thanks for all work you and team have put into this hard fork.

The top 20 witnesses are delaying the final hardfork date until all major Hive apps have signaled that they are Eclipse-compatible. Based on current apps testing levels, this means the hardfork will NOT happen on Oct 6th.

The current plan is for witnesses to review the status of apps compatibility on Oct 8th and decide at that point if most of the Hive apps are ready for HF24.

Well, which one is it? You want all apps updated (not gonna happen)

Or most of them? (Also, not gonna happen)

The phrase I used was "all major Hive apps", not "all apps". This was referring to the apps used daily by most Hivers. They shouldn't require many changes per se, mostly it's a matter of testing for any problems.

Shouldn't the onus be on those apps to be up to date prior to your (the 20w) deadline? This wasn't sold as a delay due to problems. It's a delay due to apps not up to date on their code. That's their problem. Not ours.

It's easy to say that, but those apps are how most Hivers interact with the blockchain. Delaying a couple of days is worth it, IMO.

And we also can't be sure they won't find real problems with hived/hivemind, when they do their testing. We've been developing automated tests as rapidly as we can, but there's a huge number of Hive API methods to test. At this point, the largest set of "tests" is the Hive apps themselves.

Man it's really disappointing to get this far and hear that some apps still need to do hivemind testing. Will @crimsonclad even get to make a post in 2020?

Sorry I come off so aggro, but this is kinda bs. You guys have set multiple deadlines and every one has been nixed. How can this many of our "killer apps" be so far off the ball!

I'm going to say that part of this is me (I really can't manage like a lot of these people to do the personal stuff the way I want alongside the community stuff) but also, something that we are learning on the fly about "decentralized ecosystems". We thought and hoped that enough of us were working on testing or were spinning up nodes, but at the last minute, realized everyone was waiting for everyone else, and part of that is a legacy of the ecosystem we come from. Now, I at least am committing myself to doing some extra legwork to help us realize that we have to take personal responsibility and encourage all our dapps to do the same... there's only so much blame we can place knowing that in the past it was just all about waiting until you were told "yep this works" by a team who didn't totally have the freedom to care if it actually did.

Like, you gave them 2 extra days. If they couldn't do it in the time it's been since Oct 6th was set in stone and announced, what makes you think they can get their head out of their ass in 48h?

Previously, no one was actively hounding apps to make sure they were testing. Several people are now doing that. I think this will make a big difference.

I think some apps devs held off because they wanted to see no changes to the hivemind code before they started testing, because they didn't want to waste any time on their side. But that doesn't work very well right now, because we don't have tests for all the API calls that an app may use, so if they don't test, we don't know there's a problem. So at least for this hardfork, and probably the next one as well, this mindset needs to change.

Woof, that sounds a bit rough. But that makes a LOT of sense. Thanks for taking the time to break it down for me. Have a good one out there and keep up the hard work!

We really appreciate the time you dedicate to summarizing information and posting these updates. Delaying the HF a couple of days or more when it is necessary is a good decision.

People want things now, and sometimes it is better to wait.

So...how do the exchanges feel about this?

It doesn't really impact the exchanges much, as they only needed to upgrade to hived 1.24.2 (which several have successfully completed). A couple are still working through issues with their node.

Congratulations @blocktrades! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s) :

Your post got the highest payout of the day

Thank you for providing us with accurate information on what is happening with the new enhancements.

Every change takes time to test, so that there are as few errors as possible after activation.

Greetings. @blocktrades

Okili Dokili.^^

Our Project will go to the HF too.

Looks like we reopened our doors at the perfect time for changes. :-)

Thanks for the update and good luck to all with the codes.

Salve
Alucian

How wonderful to know that you are working hard to keep things sound, so that the Hive family trusts our developers more every day. Many of the people we have brought into the Hive world ask us what is going on, and sometimes we are left with our arms crossed, searching for answers to give them. And here, just in time, #blocktrades clears everything up, so that we can help put our fellow Hivers at ease... Huge thanks to everyone who does this great work...

Sorry to say this, but this info about the delay should have gone out much earlier than 20 hours or so before the hardfork was supposed to happen. Plans that can't easily be changed had already been made for the projects that I run, including a pause today that could not be avoided because of the short notice.

We will not make plans for a pause for when the hardfork now actually happens, even if a date and time will be released, but treat it like shit happens deal with it.

I would suggest having a dedicated person to handle communications; it's not just developers that need to know.

Love these detailed posts, thanks for your work.

I'm grateful for all the work you and the coders/witnesses put in. I took a rudimentary programming course once, so I know this isn't magic, but what all of you create is awesome. And it's important. I never worry about the bumps along the way. You'll sort it out and at the end we'll have our platform, stronger and healthier.

Here's hoping the ride to the other side of the hardfork is relatively smooth.

Yes, we will wait a little longer.

I'm a newbie and this is my first time coming to this HiveDev community. But I must confess that I'm impressed by your reports, and I want to encourage you guys: you must understand that nothing good comes easy. I can say that the reason for this delay is so that the hardfork can be executed perfectly. I believe in this team; with my little knowledge, I can say that you guys will deliver a great job that everyone will be proud of. Good luck.

I hope something like this doesn't interrupt what's improved the platform.

excellent work and grace to all developers