Initial Witness and Full Node Innovations – @gridcoin.science

in #budget • 8 years ago

Since we announced our Steem witness @gridcoin.science, @dutch and I have been making live enhancements to steemd, the Steem daemon.

Our goal is to run a reliable witness, and then go above and beyond by improving how witnesses (and other Steem nodes) are run.

Tutorials
We believe in sharing our implementations once they are stable enough to run on @gridcoin.science, so that all Steem nodes can benefit.

Here's what we've released so far:

Make Your Steem Server Last Longer With Memory Compression
Python STEEM Price Feed Updater
We would like to gauge the community's interest in what tutorials they want to see next. Check out what else we've done in the "Innovations" section below and let us know what you'd like us to expand on and write more about.

Innovations
Memory
Our greatest achievement so far in making live enhancements to Steem has been reducing the amount of memory needed to run a node (witnesses and full nodes alike). Any Steem node can make use of zram to compress the in-memory database of steemd.

The first success report aside from our own witness came from @someguy123 and his RPC node with 256GB of RAM. You can read about this at the top of this article.

Since then, the community has begun adopting zram as a best practice for Steem nodes. We are pleased to see that @themarkymark has begun rolling out zram on his witness nodes and full nodes.
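For readers who want the general idea, the setup looks roughly like this. This is an illustrative sketch only; device names, sizes, and priorities are assumptions, not necessarily what our tutorial or the nodes mentioned above use.

```
# Create a zram block device and use it as high-priority compressed swap,
# so cold steemd database pages get compressed in RAM instead of hitting disk.
sudo modprobe zram num_devices=1
echo lz4 | sudo tee /sys/block/zram0/comp_algorithm   # fast compression algorithm
echo 16G | sudo tee /sys/block/zram0/disksize         # uncompressed capacity (example)
sudo mkswap /dev/zram0
sudo swapon --priority 100 /dev/zram0                 # prefer zram over any disk swap
```

Because zram swap has a higher priority than disk swap, memory pressure spills into compressed RAM first, which is far faster than paging to disk.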

Storage
Our Steem nodes' storage infrastructure is based on ZFS, which offers a lot:

Reduced disk usage: We compress the Steem blockchain to a ratio of approximately 1.58 (37% space savings) by storing it on an LZ4-compressed ZFS dataset.

One day, the blockchain is going to be 100GiB large, and under our current setup, we expect to be storing all of that in only 63GiB. (The compression ratio hasn't been changing much.)

If 100 active Steem nodes used the same compression at that point in time, they'd collectively save about 3700GiB of storage. This is especially meaningful for people running on SSDs, because SSD storage is typically expensive.
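The numbers above follow directly from the compression ratio (which ZFS reports via `zfs get compressratio`). A quick back-of-the-envelope check, assuming a hypothetical 100 GiB blockchain and the observed 1.58 ratio:

```shell
# Projected on-disk size and savings at a ZFS compressratio of 1.58.
ratio=1.58
size_gib=100
compressed=$(awk -v s="$size_gib" -v r="$ratio" 'BEGIN { printf "%.0f", s / r }')
saved=$((size_gib - compressed))
echo "On disk: ${compressed} GiB (${saved} GiB saved per node)"
echo "Across 100 nodes: $((saved * 100)) GiB saved"
# → On disk: 63 GiB (37 GiB saved per node)
# → Across 100 nodes: 3700 GiB saved
```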

Quick mishap recovery: Also thanks to ZFS, we are able to roll back to snapshots of our Steem nodes in case we corrupt our copy of the blockchain or even accidentally delete everything on the node.

We've actually had this kind of mishap scenario twice already. Our backup witness took over, we rolled back the primary witness in a few seconds, then the primary witness caught up. The alternative would have been waiting hours to download the blocks and more hours to replay the blockchain, which would certainly have led to missed blocks.

As a bonus, thanks to how the blockchain is stored, snapshots hardly take up any additional space at all.
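The recovery flow above boils down to two commands (dataset and snapshot names here are hypothetical):

```
# Snapshots are instant and take almost no space while the data is unchanged.
zfs snapshot tank/steem@before-upgrade

# If the blockchain copy is corrupted or deleted, roll back in seconds
# instead of re-downloading and replaying for hours.
zfs rollback -r tank/steem@before-upgrade
```

The `-r` flag discards any snapshots taken after the one being restored, which is what you want when rolling back past a bad state.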

Easy backups: ZFS snapshots also let us back up our Steem nodes efficiently. We can stream the datasets (zfs send/zfs receive) to whatever backup location we want (currently a nearby NAS over NFS). Snapshots are also incremental, which means only changes need to be sent over to backup storage.
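As an illustration of the incremental workflow (dataset names and mount points are hypothetical; we happen to write to an NFS-mounted NAS, shown here as plain file redirects):

```
# Initial full backup: stream the whole dataset to the backup mount.
zfs snapshot tank/steem@base
zfs send tank/steem@base > /mnt/backup-nas/steem-base.zfs

# Later backups send only the blocks changed since the previous snapshot.
zfs snapshot tank/steem@daily
zfs send -i tank/steem@base tank/steem@daily > /mnt/backup-nas/steem-daily.zfs

# To restore, replay the streams into a dataset:
#   zfs receive tank/steem-restored < /mnt/backup-nas/steem-base.zfs
#   zfs receive tank/steem-restored < /mnt/backup-nas/steem-daily.zfs
```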

Accelerated performance: The best practice advice is to store the blockchain on an SSD, but we are able to achieve adequate performance on an HDD because of the ZFS Adaptive Replacement Cache (ARC), which speeds up access to the most frequently and most recently used data on disk.

Data corruption prevention: Yet another benefit of ZFS is how it checksums all data that it stores. Even if one (or both!) of our hard drives silently flips some bits and corrupts something, ZFS will very likely be able to recover instantly upon reading the mismatched bits.

At just 152 hours into service, the server we're hosting on already had a data corruption scenario where some disk sectors were unreadable. The ZFS mirror (equivalent to RAID 1) did exactly what it was supposed to and corrected the unreadable bits. We replaced the disk, and the data seamlessly "resilvered" onto the new hard drive.
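The commands involved in that scenario look roughly like this (pool and device names are illustrative):

```
# Read every stored block and verify it against its checksum; on a mirror,
# bad copies are repaired automatically from the healthy disk.
zpool scrub tank
zpool status -v tank          # reports checksum errors and repaired data

# Swap out the failing disk; ZFS rebuilds ("resilvers") onto the new drive.
zpool replace tank /dev/sdb /dev/sdc
```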

Virtualization
Virtualization lets us leverage the technologies outlined above. Our witness nodes run on virtual machines, which in turn run on dedicated hardware that we control. This lets us tailor the software stack to the hardware however we want.

Aside from facilitating the memory and storage innovations, virtualization also has benefits of its own:

Isolation: The witness nodes run in their own environments so that they can't interfere with the supporting infrastructure below.
Security: If the nodes are compromised by some unforeseen vulnerability, we can stop it in its tracks, roll back in time, and patch the vulnerability before it's exploited again.
Overhead: Virtual machines don't have the hardware startup overhead of dedicated servers, so booting up is much faster.
Portability: In combination with ZFS snapshots, we've opened the possibility of migrating the witness to another physical server if we ever need to remodel.
STEEM Price Feed
Many top witnesses use the same STEEM price feed software, which means that if they're configured similarly, they can all go down at the same time. This could soon lead to stale price feeds from major influencers.

We don't have a catch-all solution for price feed downtime (yet?), but we do have an alternative option: @deltik's Python STEEM Price Feed Updater (python-steemfeed).

This is not a "better" price feed updater. It's just another one, and the point is to introduce a bit of heterogeneity and diversity into price feed updates.

Here's what makes python-steemfeed v0.1.0 different:

Simple: The script does just one thing: update your witness's price feed from CoinMarketCap data.
Uses the official STEEM library: The script uses steem-python, the official Python STEEM library, to interface with the wallet and witness.
Batteries included: For 64-bit Debian and Ubuntu users, Python 3.6 (required by steem-python) and all Python dependencies are bundled in the repository. There are also installation instructions for required Python shared libraries (if necessary) and other distros.
Future Innovations
Storage
Even less disk usage: Our early tests have revealed that the blockchain could reach a 1.83 compression ratio (45% space savings) with only a small loss in throughput. We are planning on setting this up with Linux kernel 4.14 or newer and the Zstandard compression algorithm.

If the blockchain were 100GiB large, the current LZ4 compression algorithm would squash that into about 63GiB, but Zstandard could potentially condense it to 55GiB.
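Comparing the two ratios side by side, again assuming a hypothetical 100 GiB blockchain (1.58 is our measured LZ4 ratio, 1.83 the Zstandard ratio from our early tests):

```shell
# Projected on-disk size and savings for each compression ratio.
for ratio in 1.58 1.83; do
  awk -v r="$ratio" \
    'BEGIN { printf "ratio %.2f: 100 GiB -> %.0f GiB on disk (%.0f%% saved)\n",
             r, 100 / r, (1 - 1 / r) * 100 }'
done
# → ratio 1.58: 100 GiB -> 63 GiB on disk (37% saved)
# → ratio 1.83: 100 GiB -> 55 GiB on disk (45% saved)
```

Once a ZFS release with Zstandard support is in place, switching would be a one-liner along the lines of `zfs set compression=zstd tank/steem` (dataset name illustrative); note that only newly written data gets the new algorithm, so existing blocks would need to be rewritten to benefit.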

Shared storage: Budget permitting, we'd like to set up distributed/clustered storage on which we'd run our Steem nodes, so that physical servers can be taken down while the virtual machines stay up.

Steem Daemon
Active/active cluster: There could be a way for two or more witnesses to run with the same signing key without causing a fork. The end result would be that your fastest witness stakes a block and your other witnesses that attempt to stake a block are gracefully ignored. If implemented, missed blocks could become a thing of the past.

One possibility could be a proxy to the RPC nodes that intercepts the late blocks and rejects them, so that the minority witnesses can abort their fork and continue on the accepted chain before they're called upon again to stake a block.

Eliminating blockchain replays: If steemd's in-memory database is preserved, it can be used to resume a new instance of the daemon without replaying the blockchain (--replay-blockchain). The problem is that because the database is designed to live in memory, it gets erased on server crash or reboot.

We intend to explore ways to persist the data on disk asynchronously so that steemd can resume without a blockchain replay, even after an unexpected crash.
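One direction, sketched here rather than implemented: steemd already lets you choose where the shared-memory file lives via its configuration. Pointing it at a persistent, snapshotted dataset instead of tmpfs could allow resuming after a clean shutdown; surviving an unclean crash would still require the asynchronous flushing described above. Paths and sizes below are illustrative assumptions.

```
# config.ini excerpt (illustrative): keep the shared-memory file on a
# persistent ZFS dataset instead of /dev/shm (tmpfs).
shared-file-dir = /tank/steem/shm
shared-file-size = 54G
```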

Server Specifications
If you want to know what specs we've got, here they are:

Dedicated Server
CPU: Intel Xeon Processor D-1521 @ 2.40GHz
Memory: 64GB DDR4-2666 ECC
Disk: 2×480GB SSD and 2×2TB HDD
Bandwidth: 250Mbps
Operating System: Ubuntu 16.04 LTS

Steem Witness Virtual Machine
CPU: 4 logical cores of Intel Xeon D-1521
Memory: 16GiB plus 16GiB zram
Disk: 400GiB ZFS volume
Bandwidth: Unmetered
Operating System: Ubuntu 16.04 LTS

Acknowledgements
@jerrybanfield has identified @gridcoin.science as a low-ranked witness that qualifies for a generous donation.

We are publishing this initial update to show what has already been implemented and what is coming up. @jerrybanfield's donation would help us develop better ways to run Steem and Graphene nodes, which we would then give back to the community.

The witness @gridcoin.science is behind all the improvements outlined in this article. To support this witness, visit https://steemit.com/~witnesses, add gridcoin.science to the box at the bottom of the page, click vote, and confirm using your Active Key.

We want to continue innovating and sharing our discoveries. Please let me or @dutch know what other topics you'd like us to explore.


Is this translated from another language? I WANT to understand but it's very hard :S