[GUIDE] Optimise your RPC server for better performance

What is an RPC server?

An RPC server is a type of Steem server usually run by witnesses like myself (@someguy123). They are used by third-party applications such as MinnowBooster, voting bots, and AnonSteem, and by interfaces such as Busy and ChainBB.

They provide an API (Application Programming Interface) to the Steem network, allowing developers to create applications that read data from the network, such as posts and transfers, and broadcast transactions such as votes and comments.

These servers are extremely expensive, as they currently require 512GB of RAM to operate, and that requirement grows every day. The majority are operated by top 20 witnesses due to the high cost of operation.

It can be difficult to find a 512GB server, but Privex Inc. sells them for just $600/mo (DISCLAIMER: I am the CEO of Privex) and accepts cryptocurrencies such as Bitcoin, Litecoin, STEEM, and SBD.

What causes RPC servers to run slowly?

There is a mix of issues at hand:

  • steemd is single-threaded while resyncing, and does not make good use of cores even after syncing. Single-core performance is extremely important.
  • RPC nodes require a huge amount of RAM to operate at good speed. Running the shared memory file on NVMe or SSD instead will cause them to perform very poorly. RAM speed may also influence performance.
  • SSDs are necessary for the blockchain, if not NVMe. RAID 0 is strongly recommended for increased performance.
  • Public nodes run into various networking problems from the high load they suffer.

Hardware

The first and foremost thing is to obtain good hardware. You want a CPU with good single-core performance rather than a CPU with tens of cores. This cuts replay time and improves performance.

Storing the shared memory file in RAM massively reduces replay time and improves stability. Alternatively, you will want several NVMe drives in RAID 0 dedicated to the shared memory file. Due to the heavy reads and writes, it is advisable to use a high-performance filesystem such as XFS on the NVMe drives, disable access times, and have it write the journal to a different disk (e.g. a SATA SSD).
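
As a rough sketch of what that could look like (the device names, mount point, and journal size below are illustrative assumptions, not the exact Privex setup):

# Assemble two NVMe drives into a RAID 0 array (example device names)
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1

# Format as XFS with the journal (log) placed on a SATA SSD partition
mkfs.xfs -l logdev=/dev/sda3,size=64m /dev/md0

# Mount with access times disabled, pointing at the external journal
mkdir -p /steem/shm
mount -o noatime,nodiratime,logdev=/dev/sda3 /dev/md0 /steem/shm

# Then point steemd's shared memory file at it (shared-file-dir in config.ini)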

Public RPC nodes can chew through massive amounts of bandwidth. The public Privex load balancer (steemd.privex.io) has gone through 20TB in just under 3 months. Network speeds of 300mbps+ are recommended (Privex sells 1gbps (1000mbps) servers), with at least 5TB of bandwidth per month.

Network Optimization

It's recommended to put NGINX in front of your RPC node, disable access logs, and set up rate limiting as follows:

# ----------------------
# nginx.conf
limit_req_zone $binary_remote_addr zone=ws:10m rate=1r/s;
# ----------------------
# sites-enabled/default.conf
limit_req zone=ws burst=5;
access_log off;
keepalive_timeout 65;
keepalive_requests 100000;
sendfile on;
tcp_nopush on;
tcp_nodelay on;

This will restrict the rate at which individual users can send requests, preventing abuse.
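
To show where these directives sit in context, here is a minimal sketch of a server block proxying to steemd (the upstream address 127.0.0.1:8090, the server name, and the omitted SSL certificate paths are assumptions; adjust them to your own node):

# sites-enabled/default.conf (sketch only)
upstream steemd {
    server 127.0.0.1:8090;
}

server {
    listen 443 ssl;
    server_name steemd.example.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity

    access_log off;
    keepalive_timeout 65;
    keepalive_requests 100000;

    location / {
        limit_req zone=ws burst=5;
        proxy_pass http://steemd;
        # allow websocket upgrades through the proxy
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}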

Further Network Optimization

One problem is that some applications open an excessive number of connections. This has a detrimental effect on performance.

To protect against this, you can use iptables (with iptables-persistent/netfilter-persistent to keep the rules across reboots) to restrict each IP to 10 connections at a time.

iptables -A INPUT -p tcp --syn --dport 443 -m connlimit --connlimit-above 10 --connlimit-mask 32 -j REJECT --reject-with tcp-reset
iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 10 --connlimit-mask 32 -j REJECT --reject-with tcp-reset
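
If you use iptables-persistent / netfilter-persistent (the Debian/Ubuntu package names), saving the rules so they come back after a reboot looks roughly like this:

apt-get install -y iptables-persistent
# save the currently loaded rules so they are restored on boot
netfilter-persistent save
# alternatively: iptables-save > /etc/iptables/rules.v4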


Notice the massive drop in connections from adding these iptables rules. This dramatically freed up connections and improved response times.

Another issue many RPC nodes face is stale connections. This may be related to poor networking code within steemd or in third-party libraries for interfacing with Steem.

This can be resolved by tweaking TIME_WAIT reuse, recycling, and timeouts:

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

To retain these settings across reboots, place the following in /etc/sysctl.conf (taken from LinuxBrigade):

# Decrease TIME_WAIT seconds
net.ipv4.tcp_fin_timeout = 30
 
# Recycle and Reuse TIME_WAIT sockets faster
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
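
Once the file is in place, the settings can be applied immediately (no reboot needed) and verified:

# reload the values from /etc/sysctl.conf
sysctl -p

# confirm they took effect
sysctl net.ipv4.tcp_fin_timeout net.ipv4.tcp_tw_recycle net.ipv4.tcp_tw_reuse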

As can be seen above, tweaking these networking flags helped reduce TIME_WAIT connections massively, further cleaning up the connection pool and improving response times.

BEFORE network adjustments

 curl -w "@curl-format.txt" -s --data '{"jsonrpc": "2.0", "method": "get_dynamic_global_properties", "params": [], "id": 1 }' https://direct.steemd.privex.io
    time_namelookup:  0.067882
       time_connect:  0.098762
    time_appconnect:  0.173686
   time_pretransfer:  0.173719
      time_redirect:  0.000000
 time_starttransfer:  0.469058
                    ----------
         time_total:  0.469133

AFTER network adjustments

curl -w "@curl-format.txt" -s --data '{"jsonrpc": "2.0", "method": "get_dynamic_global_properties", "params": [], "id": 1 }' https://direct.steemd.privex.io

    time_namelookup:  0.004555
       time_connect:  0.033890
    time_appconnect:  0.105844
   time_pretransfer:  0.105878
      time_redirect:  0.000000
 time_starttransfer:  0.137760
                    ----------
         time_total:  0.137781
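
For reference, the curl-format.txt passed to curl -w above is simply a text file of curl's timing variables. The original file isn't shown in the post, but a template producing output in the same shape would look like this:

    time_namelookup:  %{time_namelookup}\n
       time_connect:  %{time_connect}\n
    time_appconnect:  %{time_appconnect}\n
   time_pretransfer:  %{time_pretransfer}\n
      time_redirect:  %{time_redirect}\n
 time_starttransfer:  %{time_starttransfer}\n
                    ----------\n
         time_total:  %{time_total}\n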

Conclusion

As can be seen, the network fixes produced a roughly four-fold improvement in response time. The Privex RPC server was ranging from 400ms to 600ms before the fixes were applied. After enabling connection restrictions and cleaning up TIME_WAITs, the response time was stable between 120ms and 150ms.

I hope these tips will help you improve your server's performance.


GIF Avatar by @stellabelle


Do you like what I'm doing for STEEM/Steemit?

Vote for me @someguy123 to be a witness - every vote counts.

Don't forget to follow me for more like this.


Have you ever thought about being a witness yourself? Join the witness channel. We're happy to guide you! Join in shaping the STEEM economy.

Are you looking for a new server provider? My company @privex offers highly-reliable and affordable dedicated and virtual servers for STEEM, LTC, and BTC! Check out our website at https://www.privex.io



Upvoting for the effort of making things better, but I can't agree with the tips. Name lookup being part of the benchmark is crazy long and irrelevant. The nginx config is more of a generic use-case optimization; we are using it for reverse proxies, so sendfile on is irrelevant.
Network-level optimization is what the devices in front of your RPC node are for.

time_namelookup:    0.004
time_connect:       0.004
time_appconnect:    0.087
time_pretransfer:   0.087
time_redirect:      0.000
time_starttransfer: 0.088
time_total:         0.088

That's my endpoint under load, but queried from a VIP zone. No TCP tweaking on the host. 128GB RAM with RAID 0 NVMe.

BTW, yeah, I know that my node's performance for the general public is currently not as good as it used to be, but I'm still trying to serve some high-frequency requests to service providers (despite multiple notices of deprecation / making that endpoint obsolete).
I will switch the endpoint sometime in May to what I currently have under test. Same hardware, new software, you will see the difference :-)

The name lookup part is strange, yes, but it was staying between 400ms and 600ms regardless of that until I made the optimisations.

I found that one of the most common reasons for RPCs being slow was too many connections. I saw some IPs making hundreds of connections regardless of the nginx rate limiting. I also saw an ungodly number of TIME_WAIT (waiting-to-close) connections that were not being cleaned up.

I copied the sendfile on part from our Privex config; I was a little confused as to why it was there too, but I left it since it didn't seem to be hurting anything.

If you take a look at the graphs, you can see the insane number of open connections that my RPC node and the minnowsupport one were suffering from. This is partly due to asshats using get_block over HTTP rather than websockets, thus opening hundreds of connections (which by default Linux takes 4 minutes to close... which is why the TIME_WAIT optimisation is needed). This slows down public RPC nodes because the network scheduler has to deal with 2000 connections despite the fact that fewer than 300 of them are actually active.

Yeah... I was considering disabling get_block entirely and using a separate, smaller, and much more robust instance for that (pre-jussi times), but there were also troubles with vops. I'm planning improvements for June; there's no point in wasting time on temporary solutions.

Hi, will GraphQL at least save those poor RPC nodes from too many requests? Maybe websockets + GraphQL? Even Facebook serves its clients using GraphQL. It's way better than a REST API.

Well, yes, using GraphQL would be a perfect way of interacting with various microservices; moreover, it can live alongside standard REST routes.
Oh, by the way, June turned out to be July or even August ;-( Time flies.

Well yeah :) time flies. I learned a lot from coding. Dude, you should totally have GraphQL set up :P so I can query :D I'm not technical enough to set one up. Do I need to be a witness? I thought of relaying RPC nodes to a GraphQL server, but that's just redundant, no? So better to be a witness?

No need to be a witness to run your own API endpoints; however, because witnesses are compensated for block production, they are expected to provide infrastructure and services for the community.
If you have a good idea for a Steem-related service that needs to run its own high-performance API server, I can help you set that up.

This is partly due to asshats using get_block over HTTP rather than websockets

@someguy123 As a witness you have a strong knowledge base about the stuff working behind the scenes, so perhaps you have some preferences / advice aimed at API users?

I mean, people using the API don't tend to worry about performance, but perhaps your hints (such as the one quoted above) make a difference? I'm thinking of a simple list of do's and don'ts.

Agreed. For people using an API, what would be helpful?

Upvoted myself because those comments from voting bots are super annoying. Nobody is scrolling below that trash.

So what's the tl;dr?

A hardware firewall in front to deal with network-level hassle, nginx with SSL termination, jussi for caching, then specialized nodes; appbase + RocksDB, enterprise NVMes, and 640kB should be enough for everybody ;-)
Soon in your blockchain (June, after my vacation)

Thanks. Unfortunately unable to upvote at this moment.

hello dear i liked you're posst please follo me

I actually just followed you, please follow me back.

follow me and I will follow you. that's all.

Unfollow me unfollow you?

oh, me :(

Oh my!

Oh d-d-dear!

Thanks for this, I just tweaked the time_wait params on rpc.steemviz.com and saw a nice boost in response times :)

Really helpful tips for the housekeepers. I will surely support you as a witness for this act of selflessness. Thanks for the heads up.

What you have done is important. RPC nodes greatly simplify Steem based apps!

Done pressing the witness button for you, sir. A person like you and your activities in the Steemit world have a big impact on many. Thanks for your work and God bless.

I have often wondered why more applications are not running their own dedicated server for their service. With that kind of price tag and the other issue of RAM prices, this paints a much clearer picture of why.

Is there a reason why steemd is single-threaded, or do they simply have plans to make it take advantage of multi-core CPUs in the future? It seems rather odd to a layman like myself that, with how many cores CPUs come with these days, things would still choose to be single-core focused.

Thanks for including the disclosure of your relation with Privex.

Making multi-threaded applications is extremely difficult, even more so with things like replaying a blockchain since every block relies on the previous one to be correct.

Check this post for more info: https://www.quora.com/Why-is-multi-threading-so-damn-hard?share=1

I hope that the developers can improve this in some way. It's possible many of these issues are already solved in EOS, but they may never see the light of day in Steem.

Hello there @someguy123, thanks for sharing this. May I ask you some questions about being a witness?

  1. Is it possible to rent a server and make the money you need to support it?
  2. Besides votes, what is the main factor to be within the top 20?
  3. Is it possible to get a special price if I am planning to run a witness in order to grow @steemitvenezuela? This account would be a supportive one like cervantes.

Thanks for your time.

STEEM ON.

The specifications are getting more serious all the time; I am guessing fast quad-channel memory would also be quite helpful.

I am curious, what is your opinion on the post from @steemitblog saying that it is not necessary to use a 512GB server for a full node? Specifically this part:

A technique that we have been using to lower the memory requirements on a “full node” (one with everything including account history), is to split the API node into two servers. One server runs only “account history,” and the other server runs everything else. This allows both servers to use less than 256 GB RAM, instead of running everything on a 512 GB RAM server.

You can do it with less than 512GB, but your replay times will suffer and are highly dependent on disk I/O speeds. I tried with RAID 0 NVMe at 3500MB/s and it took 3.5 weeks to do a replay.

As @themarkymark wrote, nobody believes @steemitblog's results.

I have never successfully gotten an NVMe server (without /dev/shm) to replay without crashing. It is also slow as molasses.

At @privex we're experimenting with high-quality NVMe drives and sourcing CPUs with good single-core performance to try to make this more scalable. We think it may be possible to get half-decent performance on a non-RAM node with 4 to 5 NVMe drives in RAID 0, using XFS as the filesystem, storing the blockchain on a separate SSD, keeping the boot drive on another SSD, and applying various tweaks to XFS, e.g. disabling access times and moving the journal onto the boot SSD so that it does not impact NVMe performance.

It is a lot more difficult than using RAM, but we're quickly approaching 512GB, and the next level can triple in price...

From their publication I had the impression that the scaling issues were not as severe as they have been portrayed in other blogs. At some point I considered setting up a full node, but I realized that I needed to learn a lot more, and the cost is now beyond my budget. I appreciate that you took the time to respond.

I think he touched on it with this:

Another issue many RPC nodes face, is stale connections. This may be related to poor networking code within steemd or third party libraries for interfacing with Steem.

...stale connections can eat RAM too. Having more RAM than necessary is always ideal.

I love your posts @someguy123 -- very detailed and they help people realize what they should be looking into and researching.

Wow. This is so helpful and something we should be researching; it will surely help improve server performance.
This is the first post I am reading from you, as I just started following you.

It's good information for me; maybe one day I'll work for Steemit.

Congratulations! This post has been upvoted from the communal account, @minnowsupport, by Someguy from the Minnow Support Project. It's a witness project run by aggroed, ausbitbank, teamsteem, theprophet0, someguy123, neoxian, followbtcnews, and netuoso. The goal is to help Steemit grow by supporting Minnows. Please find us at the Peace, Abundance, and Liberty Network (PALnet) Discord Channel. It's a completely public and open space to all members of the Steemit community who voluntarily choose to be there.

If you would like to delegate to the Minnow Support Project you can do so by clicking on the following links: 50SP, 100SP, 250SP, 500SP, 1000SP, 5000SP.
Be sure to leave at least 50SP undelegated on your account.

As a guy that is not very technical and has very basic computer skills, I understand almost nothing about this post, just that you fixed some shit that was causing the servers to slow down. For this, I salute you. Keep kicking ass

This is very helpful information... I love your posts @someguy123 -- very detailed and they help people realize what they should be looking into and researching... thanks very much for sharing this kind of server setup.

"1,2or3-GuessTheNumber" - Use your logic only in order to guess a secret number selected by the random number bot every day. If you wish to participate, please put your number (i.e. "1", "2" or "3") in memo and send exact 1 SBD to @monitorprivate Steemit account. Check at the end of the day if you guessed the correct number randomly selected by our bot. By giving the right answer you will be awarded with 2 SBD to your account. Anyone is welcome to participate only once every day. Good luck.

I have tried to optimize my equipment in many ways, and I cannot find a setup that holds up: it runs well for a few days and then the same problem starts again.

OT: Hey there, thanks a lot for the refund!! :-)

I was looking for reviews of your VPS service by googling, but to no avail. It seems like even on Reddit very few, if any, users have been able to provide some insight into the service. I guess that's due to the crypto payments, but I found a great amount of responses here on Steem.

This post alone gave me the right amount of confidence that I can go with Privex, as the owner seems to know his shit. Thanks, I enjoyed the read.
