Getting the List of Hive Usernames & Their Wallet Balances the Hard Way

in LeoFinance, 4 years ago

At this time there are 1405513 usernames registered on Hive. A few weeks ago I saw a conversation about how often the Savings feature of the Hive Wallet is used. I believe @ats-david suggested that someone should write a script to find out how much the Savings feature is used. I didn't commit, but I thought it was an interesting idea and a good project for experimenting more with Beem.

So, I came up with the hardest way possible to achieve that. I am sure there are easier, more efficient, and faster ways to accomplish this. I say this because the script I wrote takes a very long time to run. In fact, I started it on Monday and it is still running.

The goal of the project is to get all of the usernames on Hive, then get the wallet balances for each user, and store them in an Excel file. First I get the usernames by using the following lines of code:

from beem.blockchain import Blockchain

bc = Blockchain(blockchain_instance=hive)  # hive: a beem Hive instance
bc.get_all_accounts(start=starting, stop=stopping, limit=1000)

The get_all_accounts call is part of the for loop that gets each username one at a time. It returns usernames in alphabetical order, starting with 'a'. Then I need to get the balances for each user within a function, as follows:

from beem.account import Account

act = Account(act_name, blockchain_instance=hive)
balances = dict(act.balances)

Afterwards, all balances are stored in an Excel file with columns for Hive Power, Hive Liquid, HBD Liquid, Hive in Savings, and HBD in Savings.
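Here is a minimal sketch of how a balances dict could be flattened into one spreadsheet row. It assumes the dict has "available" and "savings" keys, each holding a list of amount-like objects with .amount and .symbol attributes. That shape, the balances_to_row helper, and the column names are assumptions for illustration, not the code from my script. Stand-in objects replace real account data, so it runs without a node connection.

```python
from types import SimpleNamespace

# Hypothetical column layout. Hive Power is not listed because the wallet
# is assumed to report it as VESTS, which would need a separate conversion.
COLUMNS = ["username", "HIVE liquid", "HBD liquid", "VESTS", "HIVE savings", "HBD savings"]


def balances_to_row(name, balances):
    """Flatten a balances dict into one row matching COLUMNS.

    Assumed shape: {"available": [...], "savings": [...]}, where every
    item exposes .amount and .symbol attributes.
    """
    available = {a.symbol: float(a.amount) for a in balances.get("available", [])}
    savings = {a.symbol: float(a.amount) for a in balances.get("savings", [])}
    return [
        name,
        available.get("HIVE", 0.0),
        available.get("HBD", 0.0),
        available.get("VESTS", 0.0),
        savings.get("HIVE", 0.0),
        savings.get("HBD", 0.0),
    ]


# Stand-in data so the sketch runs offline.
fake_balances = {
    "available": [
        SimpleNamespace(amount=12.5, symbol="HIVE"),
        SimpleNamespace(amount=3.0, symbol="HBD"),
        SimpleNamespace(amount=1000.0, symbol="VESTS"),
    ],
    "savings": [SimpleNamespace(amount=7.0, symbol="HBD")],
}

row = balances_to_row("alice", fake_balances)
print(row)
```

Missing symbols default to 0.0, so an account with nothing in savings still produces a full row.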

As I already mentioned, this is probably the least efficient way of accomplishing the goal, as the script is taking almost a week to run. I think it will finish tomorrow.

A fun fact I didn't know before starting the script: one Excel worksheet can only store 1048576 rows of data. The number of Hive usernames is 1405513, which is over that limit. I still wanted to see what would happen when the script went over the limit. However, yesterday I had a brief internet interruption which stopped the script. Luckily, everything it had processed until then was stored in the Excel file, which was more than 800K usernames.
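One way around the row limit is to split the rows across several worksheets (or files) before writing them. The sketch below shows only the splitting logic in plain Python. The helper names split_for_excel and sheets_needed are made up for illustration, and it assumes each sheet repeats a one-row header.

```python
import math

EXCEL_MAX_ROWS = 1_048_576  # per-worksheet row limit mentioned above


def split_for_excel(rows, header, max_rows=EXCEL_MAX_ROWS):
    """Yield one list of rows per sheet, each starting with the header."""
    per_sheet = max_rows - 1  # the header takes one row on every sheet
    for start in range(0, len(rows), per_sheet):
        yield [header] + rows[start:start + per_sheet]


def sheets_needed(n_rows, max_rows=EXCEL_MAX_ROWS):
    """Number of sheets required for n_rows data rows plus a header each."""
    return math.ceil(n_rows / (max_rows - 1))


print(sheets_needed(1_405_513))

# Tiny demonstration with a limit of 3 rows per sheet (header + 2 data rows).
demo = list(split_for_excel([["a"], ["b"], ["c"]], ["username"], max_rows=3))
print(demo)
```

With the numbers above, all usernames would fit in two sheets.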

I was able to restart the script from where it left off. Another issue now is that I have about half of the usernames in one file and the rest in another. This won't help with sorting the amounts and producing some sort of meaningful report. But I noticed there are a lot of accounts with a zero balance in every wallet compartment: Hive Power, Liquid Hive, Liquid HBD, Savings Hive, and Savings HBD. So I will probably remove the accounts that have zero in all of them programmatically and combine the remaining accounts into one Excel file.
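The clean-up step could look roughly like the sketch below. It assumes the two partial files have been exported to CSV with the hypothetical column names shown. In-memory text stands in for the real files so the example runs as is.

```python
import csv
import io

# Hypothetical column names for the five wallet compartments.
BALANCE_COLUMNS = ["hive_power", "hive_liquid", "hbd_liquid", "hive_savings", "hbd_savings"]


def nonzero_rows(*sources):
    """Yield rows from every source that have a nonzero value in any balance column."""
    for source in sources:
        for row in csv.DictReader(source):
            if any(float(row[col] or 0) != 0 for col in BALANCE_COLUMNS):
                yield row


# Stand-ins for the two partial files.
part_one = io.StringIO(
    "username,hive_power,hive_liquid,hbd_liquid,hive_savings,hbd_savings\n"
    "alice,10.5,0,0,0,0\n"
    "bob,0,0,0,0,0\n"
)
part_two = io.StringIO(
    "username,hive_power,hive_liquid,hbd_liquid,hive_savings,hbd_savings\n"
    "carol,0,0,0,0,2.0\n"
    "dave,0,0,0,0,0\n"
)

# Keep only the accounts with some balance and write them to one output.
kept = list(nonzero_rows(part_one, part_two))
combined = io.StringIO()
writer = csv.DictWriter(combined, fieldnames=["username"] + BALANCE_COLUMNS)
writer.writeheader()
writer.writerows(kept)

print([row["username"] for row in kept])
```

With real files, the StringIO objects would be replaced by files opened with open(path, newline="").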

I know this could have been achieved more efficiently and a lot faster using Hive SQL. I don't know any other options. If you know a faster way of doing this, please let me know in the comments.

Posted Using LeoFinance

The process you outline is not that different from how tinman does it:

https://gitlab.syncad.com/hive/tinman#taking-a-snapshot

If you have a local node, it's really not that time-consuming. Using tinman to do this, you end up with all accounts saved to a .json file.

Nice. Thank you @inertia. I will take a look.

@arcange

Kinda wish Python had an easy way to parallelise loops, like the parfor syntax in MATLAB.

There probably is something like multithreading, I don't know much. I just experiment with limited knowledge.

Yeah, you can do multithreading in Python, but it's quite a lot more involved than the parfor syntax in MATLAB.
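For what it's worth, the standard library's concurrent.futures module gets fairly close to a parallel for loop. Here is a minimal sketch, with fetch_balance as a hypothetical stand-in for the per-account lookup. Threads mainly help when the loop spends its time waiting on the network, and a public node may still limit how many requests it accepts.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_balance(username):
    """Hypothetical stand-in for a per-account network lookup."""
    return username, len(username)  # placeholder value instead of a real balance


usernames = ["alice", "bob", "carol"]

# pool.map runs fetch_balance on several threads and returns the
# results in the same order as the input list.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch_balance, usernames))

print(results)
```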

Thank You!

If only we had lots of Hive users, @geekgirl. For now Hive still has a lot of room for improvement, but as in the real world, there are many things that pull the value of Hive down.

You are right, there is always room for improvement and growth.

Finally I know the exact number of Hive users. Reblogged.

It is not actually the number of users, but rather the number of registered usernames. Many of us have multiple accounts. Many accounts are inactive.

It would be interesting to see how many registered usernames have never posted... I suspect the number of accounts set up just to reserve names is huge.

I didn't check for that. But if I had to guess it would be a lot. At least half of the accounts or more.

Two suggestions.

  1. Include a link to the code so people can just run it, or put it in text form using triple ` on the line before and after to create a code block.

  2. Use a shared instance so you don't have to keep passing it around.

from beem import Hive
from beem.instance import set_shared_hive_instance

hive = Hive(node=nodes)  # nodes: a list of Hive API node URLs
set_shared_hive_instance(hive)

Thank you for the suggestions @themarkymark! They are very helpful.

In the past I tried single ` which didn't help with indentation. Triple seems to work perfectly.

Will use shared instance. Thanks

Now you can target the whales for upvotes! ;)
