Seed - Development Design - Transaction & Block Storage


Overview

Whether a cryptocurrency is based upon a blockchain or a directed acyclic graph, one aspect is certain: blocks and transactions are meant to be stored. An app is not meant to stay open forever; users should be able to close an app without worrying about losing their data. Losing data would mean requesting it again over the network whenever a user logs back on. We want to minimize network requirements, especially if Seed is to achieve its goal of being a high-throughput, low-latency blockchain solution.

This article will discuss long-term storage, differing requirements across differing clients, our 'database' schemas, and compression.


Browser vs. Desktop

What type of storage we have access to differs greatly from system to system. The Seed LLAPI is a lightweight JavaScript API with minimal dependencies outside of NodeJS. It can be run anywhere NodeJS can be run, such as in a web browser, on a server, or in an Electron app. The Seed Launcher and HLAPI form an Electron app, with extra dependencies users must install, such as Electron itself.

Therefore, our launcher is a proper desktop app with full access to a desktop's file system, while the LLAPI must be expected to run in a web browser. This means we must expect some clients will exist which do not have access to a desktop's file system, and must instead rely on local storage or cookies.

Modular Storage

This adds the constraint of modular storage. The act of "storing a block" must not care which platform is being used; however, the implementation behind that action must be able to choose which database provider suits the current platform.

Storage Size

On Desktop, the size of our storage is not a concern. Through transaction squashing, Seed should not grow nearly as fast as traditional blockchains or tangles, despite aiming to be a high-throughput, low-latency blockchain.

Web browsers are where the constraints become more difficult. Local storage and cookies have very limited capacity, therefore we may not be able to store everything in local storage. Despite transaction squashing helping keep the entanglement lean, it will eventually surpass local storage's maximum capacity of roughly 5MB. This cannot be avoided, therefore a work-around is required.

A simple solution is to simply not download all Modules, and therefore not validate all transactions. This has the drawback of potentially creating temporary chains, as some DApps may have their users only validating each other and not regrouping with the core chain as often as ideal. However, it would help tremendously if a DApp's view of the DAG only saw transactions relevant to the DApp, and excluded all transaction data in blocks related to other DApps.

This should be easy enough to implement. Transactions state which module checksum their code execution belongs to, and this data is extracted into blocks. Therefore, it would be trivial to locate and remove transactions whose checksum does not match our loaded modules' checksums. This is the only section of a block's storage which grows linearly with transactions, therefore pruning it would be the greatest change we could make to minimize data storage. The other portion that may become large, the 'ChangeSet' object which represents change in the blockchain, is divided at its root by module to keep the data separate. Removing the other modules from storage would be very easy, however it may not be required, as this segment of the block grows logarithmically over time.
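As a rough sketch of this pruning idea (field names like moduleChecksum are assumptions for illustration, not the actual Seed schema):

```javascript
// Sketch only: keep the transactions whose module checksum matches one of
// the modules this client has loaded, dropping all others from storage.
// The "moduleChecksum" and "hash" field names are illustrative assumptions.
function pruneTransactions(transactions, loadedModuleChecksums) {
    const keep = new Set(loadedModuleChecksums);
    return transactions.filter((tx) => keep.has(tx.moduleChecksum));
}

const txs = [
    { hash: "A1", moduleChecksum: "seedChecksum" },
    { hash: "B2", moduleChecksum: "otherDAppChecksum" },
    { hash: "C3", moduleChecksum: "seedChecksum" }
];
console.log(pruneTransactions(txs, ["seedChecksum"]).map((tx) => tx.hash));
// → [ 'A1', 'C3' ]
```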

Storage Format

As the constraints lie on the JavaScript web browser side, not the desktop side, it seems best to keep the data in JSON format. If the storage is still too large, we can switch from JSON object storage to comma-delimited storage. However, for the time being, storage will be in the form of JSON objects.

A /data/ subdirectory will consist of two subfolders, /blockchains/ and /entanglement/. Within these folders will be the stored JSON objects, organized by two separate schemas.


Blockchain Storage Schema

Blockchains are organized by the generation of blocks within the chains. Within each chain is an array of blocks named by their block hash. Within each block is the data it represents.

Desktop

This relationship can be easily represented with files and folders. In the base folder, /data/blockchains/, will be strictly subfolders named by which generation of blocks resides within them. If all our blocks are first-generation blocks, there would simply be one folder, /data/blockchains/1/.

Within each subfolder will be a collection of files in the .json format, each representing blob storage of an object. Each file contains a single block and is named after that block's hash.

Web Client

On the web browser side, local storage will be targeted as our preferred storage.

As local storage follows a key/value pattern, the subfolder approach is not viable. Instead, an object will be stored under a special key called "index" to aid in navigation. The "index" object will contain the subfolder navigation of the data folder. Each block will be named after its block hash, just like for file storage.

A sample index object may look like the following:

{
    "blockchain" : {
        "1" : ["BlockHash1", "BlockHash2", "BlockHash3"]
    }
}
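To sketch how this index might be maintained (this is illustrative, not the actual Seed code; "store" stands in for window.localStorage, using the same getItem/setItem API so the sketch also runs against a simple stub):

```javascript
// Sketch: record a new block hash under its generation in the "index"
// object described above, then write the index back to local storage.
function indexBlock(store, generation, blockHash) {
    const index = JSON.parse(store.getItem("index") || "{}");
    if (!index.blockchain) {
        index.blockchain = {};
    }
    if (!index.blockchain[generation]) {
        index.blockchain[generation] = [];
    }
    if (!index.blockchain[generation].includes(blockHash)) {
        index.blockchain[generation].push(blockHash);
    }
    store.setItem("index", JSON.stringify(index));
}
```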

Entanglement Storage Schema

The entanglement is simply a directed acyclic graph of transactions. These transactions can be saved in any order; however, it is best to load them sequentially for the validation process.

Desktop

Because it is best to read the transactions in order, we cannot simply name the .json representations of transactions after their transaction hashes without an extra step, as they would then be read in the ascending order of their hashes, NOT the order of the transactions.

We could name them by the transaction timestamps; however, there is nothing stopping two transactions from having the same timestamp. Alternatively, they could be named after their timestamp with their transaction hash appended afterwards. This would make it easy to read the transactions in the proper order while handling collisions in the timestamps.
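A hedged sketch of this "timestamp_hash" naming scheme, sorting numerically on the timestamp prefix with the hash as a tie-breaker (function names are assumptions for illustration):

```javascript
// Sketch of the "<timestamp>_<transactionHash>" naming described above.
function transactionFileName(transaction) {
    return transaction.timestamp + "_" + transaction.hash;
}

// Recover transaction order from file names: sort numerically by the
// timestamp prefix, falling back to the full name when timestamps tie.
function sortTransactionFiles(fileNames) {
    return fileNames.slice().sort((a, b) => {
        const tsA = Number(a.split("_")[0]);
        const tsB = Number(b.split("_")[0]);
        return tsA - tsB || a.localeCompare(b);
    });
}

console.log(sortTransactionFiles([
    "192345312_TransactionHash3",
    "192345234_TransactionHash1",
    "192345253_TransactionHash2"
])); // prints the three names in ascending timestamp order
```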

Web Client

Differing from the desktop route, the web browser implementation using local storage will follow the same approach as the blockchain storage did. In the same index object, the entanglement's transactions and transaction order will be stored. However, the entries will follow the same naming convention as in the desktop approach. This keeps the naming consistent, as well as keeping the storage size down, as hashes are much longer than timestamps and checksums.

A sample index object may look like the following:

{
    "entanglement" : [
        "192345234_TransactionHash1", "192345253_TransactionHash2", "192345312_TransactionHash3"
    ]
}

Compression

When storing lots of historical data in the form of text, compression is a common solution to the storage size problem. The primary benefit of compression is a large reduction in storage size, as that is its intended purpose. However, this comes at the cost of performance, as compression can be computationally expensive, depending on how often it must be done.

Compression Benefits

The full list of compression benefits comes down to storage size and reduced networking requirements. On top of reducing how much data will be stored, we can send the compressed versions of transactions and blocks to users when sharing memory. For example, when a user connects to the network and asks for all transactions and blocks, sending a compressed version will surely be beneficial.

Compression Costs

There are usually one, potentially two, costs associated with compression. The first and foremost, as stated above, is the computational cost. Compressing and decompressing data is costly. If compression only occurred when reading the initial state or when saving blocks, this cost would be negligible, as a small increase in a one-time loading time is understandable. However, if compression and decompression were occurring multiple times per second, there may be an obvious performance drop visible to the user.

The other possible flaw is that compression does not always work well. Not all data is the same, with some being more compressible than others. For example, a comma-delimited array of numbers which follows any sort of pattern will compress very easily, while random data will compress much worse. Our data will most likely compress well; however, most of the data is in hashes and cryptographic signatures, which may be random enough to worsen our compression ratios.

Turn On/Off Compression

The simple solution is to implement compression, and see how it performs. This could be an option users can turn on and off, depending on whether they want performance or reduced storage size.


Implementation Design

Storage

The "Storage" file will be added to the Seed LLAPI, offering an export for saving and loading data. This export will be decoupled from the actual database injector implementation, allowing differing clients to use differing database injectors. This decoupling will be done by defining an interface of the required functions for a "database injector".
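A minimal sketch of this decoupling, assuming illustrative function names rather than the actual Seed Storage API:

```javascript
// Sketch: the Storage export delegates every save/load to whichever
// database injector the client provides, so the caller never knows
// which platform-specific implementation is behind it.
function createStorage(databaseInjector) {
    return {
        saveBlock(block) {
            databaseInjector.writeBlock(block.hash, block);
        },
        loadBlock(blockHash) {
            return databaseInjector.readBlock(blockHash);
        }
    };
}
```

Any object exposing writeBlock/readBlock can be passed in, whether it is backed by the file system, local storage, or an in-memory stub for testing.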

Database Injector Interface

The database injector interface will handle the acts of storing and reading data. Regardless of whether storage is done on a traditional file system, in local storage, or even by an external web service such as MongoDB, many actions are shared between them. Technically, JavaScript does not have interfaces built into it; however, with JavaScript being a dynamic language, our objects can simply follow a consistent function naming convention and work similarly to an interface. Interfacing is a good practice for creating this convention.

Below is a simple design of our interface:

IDatabaseInjector:
    writeBlock(storageName, storageObject)
    writeBlockchain(generation, blocks)
    writeBlockchains(generation, blockchains)
    writeTransaction(storageName, storageObject)
    writeEntanglement(entanglement)
    readBlock(storageName)
    readBlockchain(generation)
    readBlockchains()
    readTransaction(storageName)
    readEntanglement()
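Since JavaScript has no built-in interfaces, one hedged way to enforce this convention (a sketch, not the actual Seed code) is to check at injection time that a candidate injector exposes every required function:

```javascript
// The function names from the IDatabaseInjector design above.
const injectorFunctions = [
    "writeBlock", "writeBlockchain", "writeBlockchains",
    "writeTransaction", "writeEntanglement",
    "readBlock", "readBlockchain", "readBlockchains",
    "readTransaction", "readEntanglement"
];

// Duck-typed "implements" check: true only if the candidate object
// provides a function for every name in the interface.
function conformsToInjector(candidate) {
    return injectorFunctions.every((name) => typeof candidate[name] === "function");
}
```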

DatabaseLocalStorage

The Seed LLAPI will default to using the DatabaseLocalStorage implementation. This implementation will handle reading and writing to local storage, following the schemas listed above in the "Blockchain Storage Schema" and "Entanglement Storage Schema" segments.

DatabaseFileSystem

The Electron launcher and HLAPI will offer an alternative, the DatabaseFileSystem implementation. This will be passed into the Storage provider upon launching the desktop application, allowing all DApps through the launcher to take advantage of the local file system.

Compression

The Storage LLAPI will also handle compression. The option to turn on or off compression will be available to the clients. The compression algorithm to be used will be gzip, as it is a fairly lightweight and fast general purpose compression algorithm for small individual files.
