Crab Bucket - The STEEM Blockchain as a Local Database

in #steem, 8 years ago (edited)

CrabBucket - a Rails Plugin

SELECT * FROM bucket_blocks AS blocks
  INNER JOIN bucket_transactions transactions
    ON transactions.block_id = blocks.id
  WHERE witness = ?

The main purpose of this project is to give developers a way to build a copy of the STEEM blockchain to a local database for reporting and analysis.

It's not intended to copy the whole blockchain, but there's nothing stopping anyone from trying this. It just might take a while. If you do intend to copy the entire blockchain, you will need to use a local node. Otherwise, it'll take months (or years, if you're reading this in the Mysterious Future).
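To put "months" into numbers, here is a rough back-of-the-envelope sketch in Ruby. It is not part of CrabBucket. The figures are assumptions: a backlog of 5,000,000 blocks (the block number used in the replay example further down), a new block roughly every 3 seconds, about 1 imported block per second over a remote node, and about 400 per second against a local node (the last two are rates reported in the discussion at the end of this post). The helper name `catch_up_days` is made up for illustration.

```ruby
# Rough catch-up estimate for a full replay. All figures are assumptions.
BLOCK_INTERVAL_SECONDS = 3.0 # a new block is produced about every 3 seconds
SECONDS_PER_DAY = 86_400.0

# Days needed to reach the head block, given the current backlog (in blocks)
# and a sustained import rate (in blocks per second). The chain keeps growing
# while we import, so only the net rate counts.
def catch_up_days(backlog_blocks, import_rate)
  production_rate = 1.0 / BLOCK_INTERVAL_SECONDS
  net_rate = import_rate - production_rate
  return Float::INFINITY if net_rate <= 0 # the import never catches up

  backlog_blocks / net_rate / SECONDS_PER_DAY
end

puts catch_up_days(5_000_000, 1.0).round(1)   # remote node: 86.8 days
puts catch_up_days(5_000_000, 400.0).round(2) # local node: 0.14 days
```

The longer the chain gets, the larger the backlog, which is why the remote-node estimate only grows with time.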

How to use the CrabBucket Rails Plugin

There are two main uses. One is to embed it in your existing Rails project. The other is to run it in a new project for the sole purpose of building a database for external tools to use.

To start populating the database, you have a few options. The first one populates from the first block. Use it if you have a fast local node and you want to import the entire blockchain.

$ rake bucket:replay

Or, to start from a specific block:

$ rake bucket:replay[5000000]

Or, to start from the latest block and follow the head block forward:

$ rake bucket:stream_head

Or finally, if you would like to fill in any gaps ...

$ rake bucket:rebuild
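To illustrate what a "gap" means here, the following is a minimal sketch in plain Ruby, assuming a gap is any run of block numbers missing between the first and last block you expect to have. It is not CrabBucket's actual implementation, and `missing_ranges` is a hypothetical helper name.

```ruby
# Returns the ranges of block numbers missing from `stored`
# within the inclusive window first..last.
def missing_ranges(stored, first, last)
  gaps = []
  expected = first
  stored.uniq.sort.each do |n|
    next if n < first
    break if n > last

    # Everything between the next expected number and n is missing.
    gaps << (expected..n - 1) if n > expected
    expected = n + 1
  end
  # Anything after the last stored block, up to `last`, is also missing.
  gaps << (expected..last) if expected <= last
  gaps
end

p missing_ranges([1, 2, 3, 7, 8, 12], 1, 12) # prints [4..6, 9..11]
```

Inside a Rails console, the list of stored numbers could probably come from something like `Bucket::Block.pluck(:block_number)`, since `block_number` is a column on the blocks table, but that usage is untested here.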

You can always interrupt these tasks with ^C (Control-C) and resume them later. You can check your progress with:

$ rake bucket:info

Which will output the current state:

You are running in development environment.
Blocks: 4187
    Transactions: 237
        Operations: 237
            Pow: 237
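
If you want to track progress from a script, the output above is easy to parse. This is a small sketch in plain Ruby, assuming the output always has the "Label: count" shape shown in the sample. `parse_bucket_info` is a hypothetical helper, not something the plugin provides.

```ruby
# Parses "Label: count" lines into a hash of label => integer.
# Indentation is ignored, and lines that do not match are skipped.
def parse_bucket_info(text)
  text.each_line.with_object({}) do |line, counts|
    match = line.match(/\A\s*(\w+):\s*(\d+)\s*\z/)
    counts[match[1]] = match[2].to_i if match
  end
end

# Sample copied from the output shown above.
sample = <<~OUTPUT
  You are running in development environment.
  Blocks: 4187
      Transactions: 237
          Operations: 237
              Pow: 237
OUTPUT

counts = parse_bucket_info(sample)
puts "#{counts['Blocks']} blocks, #{counts['Transactions']} transactions"
```

In practice, the text would come from running the rake task and capturing its output, instead of from the hard-coded sample.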

Once the database is populated to your liking, you can query it with ActiveRecord:

$ rails console

Then type in the console:

Bucket::Block.first

Which will output:

  Bucket::Block Load (0.2ms)  SELECT  "bucket_blocks".* FROM "bucket_blocks"
ORDER BY "bucket_blocks"."id" ASC LIMIT ?  [["LIMIT", 1]]
 => #<Bucket::Block id: 1, block_number: 1, previous:
"0000000000000000000000000000000000000000", timestamp: "2016-03-24 16:05:00",
transaction_merkle_root: "0000000000000000000000000000000000000000", witness:
"initminer", witness_signature:
"204f8ad56a8f5cf722a02b035a61b500aa59b9519b2c33c77a...", created_at:
"2016-09-19 16:14:31", updated_at: "2016-09-19 16:14:31">

Another example:

Bucket::Operation.pow.first.work.worker

Output:

  Bucket::Operation Load (0.4ms)  SELECT  "bucket_operations".* FROM
  "bucket_operations" WHERE "bucket_operations"."type" = ? ORDER BY
  "bucket_operations"."id" ASC LIMIT ?  [["type", "Bucket::Operation::Pow"],
  ["LIMIT", 1]]
 => "STM65wH1LZ7BfSHcK69SShnqCAH5xdoSZpGkUjmzHJ5GCuxEK9V5G"

Or, if you just want to browse, start your rails server and browse to:

http://localhost:3000/crab_bucket/blocks

Or, you can use an external tool. The database can be found in:

/path/to/your/project/db/development.sqlite

Installation

For Existing Rails Projects

Assuming you already have a Rails project, add these lines to your application's Gemfile:

gem 'radiator', '~> 0.0.4', github: 'inertia186/radiator'
gem 'crab-bucket', '~> 0.0.4', github: 'inertia186/crab-bucket'

And then execute:

$ bundle

Add this to config/routes.rb, before the last end keyword:

mount Bucket::Engine, at: '/crab_bucket'

Install the migrations:

$ rails bucket:install:migrations
$ rake db:migrate
$ rake bucket:stream_head # optional, see above

For New Projects

Have a look at this article on setting up a new Rails project, "How to Write a Ruby on Rails App for STEEM", then use the steps above to enable this plug-in.

Tests

  • Basic tests can be invoked as follows:
    • rake
  • To run tests with parallelization and local code coverage:
    • HELL_ENABLED=true rake

Get in touch!

If you're using CrabBucket, I'd love to hear from you. Drop me a line and tell me what you think! I'm @inertia on STEEM.

License

I don't believe in intellectual "property". If you do, consider CrabBucket as licensed under a Creative Commons CC0 License.

I think it's great :).

I'm working on performance tuning for my golang implementation of this and similar ideas, and it does target ripping the whole chain. Tell me: Where did you find the limiting factor to be?

Your DB, or STEEMD?

I think it's steemd, but I'm not 100000% certain, so I'm curious what your experience has been so far.

I think my db is a bottleneck. I was only testing with sqlite. I can easily switch to postgres and see. But I have a feeling that steemd presents its own bottleneck. It's fine if you're streaming because the blocks come in once every 3 seconds. But replay/rebuild takes 1 second per block over node.steem.ws (not recommended), which is pretty dang slow.

I haven't gotten my local node to work quite right, so I haven't seen the perfect scenario yet.

Oh, I use a local node. Do you want a hand getting that straight?

For me, I top out at about 500 blocks per second and average about 350-400. Nonetheless, it takes a solid 24 hours to do the dump.

Cool. Blockchain has a lot of applications.