This one was an interesting real case... 🤔 introspecting!

in #hive · 9 months ago

Yeah... probably my longest time away from validating the chain. 🤦‍♂️ More than 24 hours! 😬😩 - on HIVE I am already back, but on Hive-Engine this takes a bit more time... but soon!

$ cat .pm2/logs/hived-error.log* |grep "Broadcasting block"
2171624ms p2p_plugin.cpp:574            broadcast_block      ] Broadcasting block #77473815 with 20 transactions
3182607ms p2p_plugin.cpp:574            broadcast_block      ] Broadcasting block #77425010 with 40 transactions

For those who don't follow the above output: my usual block distance (the time/blocks between two of my block productions), for my place, is around 5k-6k blocks. The gap above was almost 50k blocks! 😤
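A rough sketch of that arithmetic: the snippet below pulls the block numbers out of the two "Broadcasting block" lines above and converts the gap into hours. It assumes Hive's 3-second block interval, and the helper name `block_gap` is made up for illustration.

```python
import re

# Assumption: Hive targets one block every 3 seconds.
BLOCK_INTERVAL_SECONDS = 3

# The two lines returned by the grep above, copied as-is.
LOG_LINES = """\
2171624ms p2p_plugin.cpp:574            broadcast_block      ] Broadcasting block #77473815 with 20 transactions
3182607ms p2p_plugin.cpp:574            broadcast_block      ] Broadcasting block #77425010 with 40 transactions
"""


def block_gap(log_text):
    """Return (gap_in_blocks, gap_in_hours) between the lowest and highest broadcast block."""
    numbers = [int(n) for n in re.findall(r"Broadcasting block #(\d+)", log_text)]
    if len(numbers) < 2:
        raise ValueError("need at least two 'Broadcasting block' lines")
    gap = max(numbers) - min(numbers)
    hours = gap * BLOCK_INTERVAL_SECONDS / 3600
    return gap, hours


gap, hours = block_gap(LOG_LINES)
print(f"{gap} blocks, roughly {hours:.1f} hours")
```

For the two lines above this gives 48805 blocks, which at 3 seconds per block is about 40.7 hours between produced blocks.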


(My UpTimeRobot stats)

How?

Well, due to the recent AMD security bulletin (Bulletin ID: AMD-SB-7005), I decided to update my motherboard firmware to cover the first wave of problems detected...

Guess...

yep...

Well, ...more or less. The firmware did manage to upgrade, but what my board manufacturer listed as "performance improvements" was where the hidden pain was, without me really understanding what I was getting into... and this can actually happen with any vendor. That's why, with firmware, if you are OK, don't change it unless you have a really strong reason to.

These "performance improvements" were probably for newer CPUs... and I got lured by them without any need, because I could have just waited for them to be confirmed, as usually "performance improvements" means something was "hardened" or "tightened" to allow for shorter timings. Whatever it was, it caused a whole new pain on stability...

VERY long story short: the BIOS got updated, and then you need to redo some of the manual configuration you might have done a while ago. Simply because the settings you had under the older firmware might not always be "migratable", and the usual and safest approach is to force them to reset (general consumer products do this by default).

This means that if you use XMP or EXPO, whatever training your system had done is lost. And unless you dived deep into whatever long-term stable settings you had, when you move to another firmware you will be starting from zero again (or relying on automatic detection).

That was sort of what happened to me this weekend. And because the more memory you have and the bigger the DIMMs, the more complex it becomes, it all went sideways. My system became unstable, and on top of that I decided to meddle with the tunables in an attempt to fix the instability. While playing with them, something might have happened (or I did something), because after a couple of changes it stopped POSTing (getting to the BIOS), even after a CMOS reset. Which I found very strange... but the reality is that I was going nowhere...

After almost 8 hours of troubleshooting, I raised the RIP flag! At least until I have time to dig in without keeping my stuff offline any longer.

I then had to do one more thing... check if it was just the motherboard or more... (at the time I didn't have a clue yet). So, using other hardware I knew was working fine, it became a game of switching parts and testing each one individually. CPU, PSU, GPU, RAM, NVMes... etc. (I also have a SATA expansion card). All good! 😬 Lucky... it was just the motherboard. And one that had no warranty anymore 😠

But the good part of all this (yes, there is a good part!) was that I spent literally the entire weekend with my oldest son, playing with this stuff, like mates!

Felt really nice... and this is why I am writing too. To record a memory on the chain. 😉

The fix?

Next day, got a cheap replacement from the store, and re-validated the entire thing. I got a less expensive one because to be honest this was not in my plans, and my budget (accruing) was to go for a DDR5 mobo (but because things are still a bit expensive for the high capacity DIMMs, I was waiting a bit more).


(over exposure photos from an already very old kit - with a HIVE 💓 heart!)

The result: I have fewer PCIe lanes and need to reduce the number of disks. So I had to go and consolidate a few "data" files in order to boot the VMs.

Done! ✔️

I am now re-synchronizing on the Hive-Engine infrastructure, already up for a couple hours on HIVE, and even produced block 77473815 to verify all this stuff was ok.

2171624ms p2p_plugin.cpp:574            broadcast_block      ] Broadcasting block #77473815 with 20 transactions
2171627ms block_flow_control.cpp:61     on_worker_done       ] Block stats:{"num":77473815,"lib":77473814,"type":"gen","id":"049e28170efb629cf5be9284667043e235cf2fb6","ts":"2023-08-13T08:36:12","bp":"atexoras.witness","txs":20,"size":3780,"offset":-375585,"before":{"inc":39,"ok":39,"auth":0,"rc":0},"after":{"exp":0,"fail":0,"appl":0,"post":0},"exec":{"offset":-380017,"pre":136,"work":4296,"post":2775,"all":7207}}
2171627ms witness_plugin.cpp:443        block_production_loo ] Generated block #77473815 with timestamp 2023-08-13T08:36:12 at time 2023-08-13T08:36:12
2172241ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness blocktrades for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172415ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness pizza.witness for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172459ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness gtg for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172475ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness ocd-witness for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172507ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness quochuy for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172508ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness deathwing for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172517ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness smooth.witness for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172518ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness arcange for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172546ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness therealwolf for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172566ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness roelandp for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172587ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness threespeak for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172590ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness abit for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172591ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness steempeak for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172627ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness emrebeyler for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172672ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness stoodkev for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172673ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness steempress for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172674ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness themarkymark for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172674ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness yabapmatt for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172675ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness guiltyparties for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172675ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness good-karma for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172896ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness ausbitbank for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2175061ms block_flow_control.cpp:61     on_worker_done       ] Block stats:{"num":77473816,"lib":77473815,"type":"p2p","id":"049e2818021bb9216a267a8dc874e589a3784f27","ts":"2023-08-13T08:36:15","bp":"threespeak","txs":27,"size":7741,"offset":60491,"before":{"inc":51,"ok":51,"auth":0,"rc":0},"after":{"exp":0,"fail":0,"appl":2,"post":0},"exec":{"offset":57041,"pre":165,"work":3285,"post":604,"all":4054}}
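If you want to check this kind of log without reading it line by line, here is a minimal sketch that parses the "Block stats" JSON and lists the witnesses that fast-confirmed the block. The sample holds a few lines copied from the log above (only the first three fast-confirms, to keep it short). The function names are made up for illustration, and the regular expressions only assume the line layout visible in this log.

```python
import json
import re

# Lines copied from the hived log above (shortened to three fast-confirms).
SAMPLE = """\
2171627ms block_flow_control.cpp:61     on_worker_done       ] Block stats:{"num":77473815,"lib":77473814,"type":"gen","id":"049e28170efb629cf5be9284667043e235cf2fb6","ts":"2023-08-13T08:36:12","bp":"atexoras.witness","txs":20,"size":3780,"offset":-375585,"before":{"inc":39,"ok":39,"auth":0,"rc":0},"after":{"exp":0,"fail":0,"appl":0,"post":0},"exec":{"offset":-380017,"pre":136,"work":4296,"post":2775,"all":7207}}
2172241ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness blocktrades for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172415ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness pizza.witness for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
2172459ms database.cpp:5535             process_fast_confirm ] Accepted fast-confirm from witness gtg for block #77473815 (049e28170efb629cf5be9284667043e235cf2fb6)
"""


def block_stats(log_text):
    """Decode the JSON payload of every 'Block stats:' line."""
    payloads = re.findall(r"Block stats:(\{.*\})\s*$", log_text, flags=re.MULTILINE)
    return [json.loads(p) for p in payloads]


def fast_confirms(log_text, block_num):
    """List the witnesses whose fast-confirm was accepted for block_num."""
    pattern = r"Accepted fast-confirm from witness (\S+) for block #(\d+)"
    return [name for name, num in re.findall(pattern, log_text) if int(num) == block_num]


stats = block_stats(SAMPLE)[0]
witnesses = fast_confirms(SAMPLE, stats["num"])
print(f'block {stats["num"]} by {stats["bp"]} ({stats["type"]}, {stats["txs"]} txs)')
print(f"fast-confirmed by {len(witnesses)} witnesses: {', '.join(witnesses)}")
```

Run against the full log excerpt above instead of the shortened sample, the same function would list all 21 witnesses that confirmed block 77473815.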

All this mess will cause a bit of a delay (due to costs), but it's not even a concern in my mind. Just a hiccup... If anyone is running node infrastructure from home and doesn't account for these things, I would advise you to reconsider.

Still not the end of the world, but better to be stable and here than gone because of a lack of planning. That is one reason why I like to keep a buffer of BEE before going nuts spending it. Well, now it's inevitable... I will have to cover some costs.

Still on track though... just a month or two behind probably. I will get there.

New 😍 candidate code incoming...

While doing all this, I managed to take a peek at the new release candidate (rc0) branch of the current HIVE code... (no fork), version 1.27.5. I did start compiling it... but haven't gone through testing yet. I will wait for others in this case... but if I get some quality time in between I might give it a go. I always enjoy testing stuff for my own personal experience.


My 🖐️ motivation!

Have fun, play games, learn, and when possible, teach new things to newcomers, showing them how interesting this place can be.

🤝 Follow me on Twitter

@forkyishere 😈 (@forykw dark side) is a character I created, which emerged from Crimsonclad's imagination 🙏 while dwelling in the dark dungeons of the Discord chat levels.

Follow for #news about the #HIVE #Blockchain, and other stuff. I sometimes get crazy with what happens around social media. I am following all HIVE users! No promises of behavior. 😁

In addition, if you are looking for a nice place to either reach out, share or just have a great time, come along to @atexoras.pub gatherings. We welcome everyone on the blockchain.

👉 Vote for Witnesses

(⚡Vote) - Hive-Engine here - voting uses staked WORKERBEE
(⚡Vote) - HIVE here - voting uses staked HIVE

(✍Delegate) - 3Speak Network - You win 0.015% SPK tokens if you delegate LARYNX to other nodes, as opposed to only 0.010% from your powered LARYNX!

@forykw is running 😎 @atexoras.witness on all the above 💪

📰 Public Hive Engine Infrastructure

Any feedback/problems, feel free to contact me! My stuff is being monitored via UpTimeRobot where you can find their current status or just come along to the ATX Discord server.

😎 Looking for some #HIVE merch? 👉 GO HERE 👈



Ahh, sounds like you had a real battle with the hardware there! It's been ages since I had a battle with my PC; I think the last time it was just a PSU problem.

But as you said it was an experience with the little guy and that is worth more than anything ey!

!BEER

PSUs are actually the easier part of my life =) I know how to test them very easily!

CPUs... RAM... BOARDSSSSS!!! 😱 A completely different game!

PS: Babysitting other processsessss.... and reading emails....

Reminds me of one of the basic sysop rules: "If it ain't broke, don't fix it".

yep... 🤣 and I should know it! 🤦‍♂️

The good thing is to see you back after a long time 😌

💪 I was around... but yep 2 days is "a long time for me".

Wow, I had similar problems with mining rigs. I also found out the hard way that many times we are better off not updating anything at all heheh

Those guys... fun times... But nah, not anymore.

It is good to have you back
Sorry about what you went through. It can be very annoying
You're welcome

2 days issss such a long time! !LOL

Did you know that protons have mass?
I didn't even know they were catholic.

Credit: reddit
@rafzat, I sent you an $LOLZ on behalf of forykw

(1/10)

Was this a case of “keep fixing it until it breaks”?

I do it all the time.

More or less! !LOL

In this case, the craziness of thinking my hardware was exploitable was stronger than me.

I keep hitting roadblocks trying to get a witness node working. I give up, then come back to it, hit another roadblock, give up. Rinse and repeat till I get it working, I guess.

Is yours on docker or native Linux?

Native. I like to compile my own stuff and be sure I can reproduce some stuff independently.

Where are you stuck?

What do you say to comfort a friend who's struggling with grammar?
There, their, they're.

Credit: reddit
@ctrpch, I sent you an $LOLZ on behalf of forykw

(2/10)

Sometimes the hardware world has problems that we don't expect. Too bad about what you went through, but I feel that sometimes these things just happen.

All they need is a !BEER and some good troubleshooting hours 😜


Hey @jsph, here is a little bit of BEER from @forykw for you. Enjoy it!