Response to Cosmos white paper's claims on DPOS security

in #steem, 7 years ago

There was recently a discussion between the Cosmos team and some BitShares / Steem users regarding the security of DPOS compared to the claims made about it by a competitor's white paper. I wrote a long, detailed response and felt I should re-post it here.

In this thread the terms "witness", "miner", "validator", and "producer" are used in imprecise ways. Let's define some terms:

Block Producer

A block producer is responsible for grouping transactions into a block and broadcasting it to the network. The number of block producers per confirmation period is limited by block production frequency and incentive structures.

Block Validator

Anyone running a full node is a block validator. A block is considered valid if and only if it follows the open source rules of the blockchain. No matter how many block producers collude, a block validator (and the rest of the world) will reject all blocks that fail to conform to the consensus rules. In theory there are an unlimited number of block validators, limited only by the desire of individuals and businesses to independently verify block producer behavior. This motivation is often to prevent a business from being defrauded by following a fork that violates the consensus rules.

Last Irreversible Block

A block that has been widely acknowledged as being valid and immutable. This block must be accepted by all validators you trust and confirmed by the majority of block producers.

Now that those terms are out of the way we can conclude that the act of producing a block is independent of the act of validating a block. Under DPOS every produced block can be viewed as a proposal and nothing more. This is similar to the first step in a multi-party block producing process such as Tendermint or Ripple.

Under DPOS every subsequent block is both a confirmation of multiple prior blocks and a proposal for the next block. The set of blocks between the last irreversible block (confirmed by 2/3 of producers) and the head block is like a pipeline of pending block proposals.
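As a rough illustration of the pipeline, here is a minimal Python sketch of how a node might locate the last irreversible block. It assumes that a producer confirms a block by signing that block or any later one, and that the threshold is 2/3 of producers rounded up. The function name and the list-of-producers representation are hypothetical and not taken from the Steem or BitShares code.

```python
def last_irreversible_block(producers_by_height, num_producers):
    """Return the height of the newest block confirmed by 2/3 of producers.

    producers_by_height[h] is the name of the producer that signed the
    block at height h. Assumption: a producer confirms a block by signing
    that block or any block built on top of it.
    """
    # Smallest integer >= 2/3 of the producer count, using integer math.
    threshold = (2 * num_producers + 2) // 3
    confirmers = set()
    # Walk from the head block backwards, collecting distinct signers.
    for height in range(len(producers_by_height) - 1, -1, -1):
        confirmers.add(producers_by_height[height])
        if len(confirmers) >= threshold:
            return height
    return None  # not enough distinct producers have signed yet


# Two full rounds of 21 hypothetical producers taking turns.
producers = [f"p{i}" for i in range(21)]
chain = [producers[h % 21] for h in range(42)]
print(last_irreversible_block(chain, 21))  # 28
```

In this example the head is at height 41 and the newest block with 14 distinct confirmations is at height 28, so the blocks in between form the pipeline of pending proposals.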

By using a pipeline approach, DPOS has an average latency until irreversibility of 2 * BLOCK_INTERVAL * WITNESSES / 3, which for STEEM is roughly 40 seconds. While the latency for a single transaction is high, the pipeline allows for higher throughput, with new transactions made irreversible every 3 seconds. Furthermore, the pipeline gives everyone a variable probability of irreversibility as blocks grow from 1 signature to the required 2/3 of signatures.
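The latency formula can be checked with a few lines of Python. The 3 second block interval comes from the paragraph above. The producer count of 21 is an assumption, based on the 21-block schedule mentioned later in the comments.

```python
def irreversibility_latency(block_interval_s, witnesses):
    """Average seconds until a block becomes irreversible, per the formula above."""
    return 2 * block_interval_s * witnesses / 3


# Assumed STEEM parameters: 3 second blocks, 21 producers per round.
print(irreversibility_latency(3, 21))  # 42.0
```

With these assumed parameters the result is 42 seconds, which is in line with the figure of about 40 seconds quoted above.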

What we can conclude is that the last irreversible block is identical to advancing blocks in Ripple and Tendermint in that it is a proposal accepted by 2/3 of block producers.

Security Provided by Non Producing Validators

The act of producing blocks means nothing if the produced blocks are not accepted by everyone else in the world. There can be 100,000 independent validators all talking with each other (relaying blocks and transactions). Each of these validators is running a state machine that will not allow them to roll back past their own perception of the last irreversible block.

If the majority of the block producers collude to produce a longer chain in an attempt to fork beyond the last irreversible block, then no exchange, block explorer, merchant, or other validator would switch to that fork. The entire world will agree on the "first fork seen".
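The rule that a validator never rolls back past its own last irreversible block can be sketched as follows. This is an illustrative model only: the `Validator` class, its method names, and the use of plain block ids are assumptions made for the example, not the actual node implementation.

```python
class Validator:
    """Toy full node that refuses forks conflicting with its irreversible history."""

    def __init__(self, chain):
        self.chain = list(chain)   # block ids of the fork we currently follow
        self.lib_height = -1       # height of our last irreversible block

    def mark_irreversible(self, height):
        # The last irreversible block only ever moves forward.
        self.lib_height = max(self.lib_height, height)

    def consider_fork(self, candidate):
        """Switch to `candidate` only if it is longer and keeps our irreversible prefix."""
        prefix = self.lib_height + 1
        if candidate[:prefix] != self.chain[:prefix]:
            return False           # would roll back past our LIB: reject
        if len(candidate) <= len(self.chain):
            return False           # not longer than what we already have
        self.chain = list(candidate)
        return True


node = Validator(["a", "b", "c"])
node.mark_irreversible(1)
print(node.consider_fork(["a", "x", "y", "z", "w"]))  # False: rewrites block "b"
print(node.consider_fork(["a", "b", "d", "e"]))       # True: keeps "a" and "b"
```

The longer colluding chain is rejected no matter how many producers signed it, because it disagrees with the node's own view of irreversible history.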

This means that the only way to effect a double spend / reversal beyond the last irreversible block is to isolate your victim by partitioning the network, combined with collusion of producers. With a small amount of non-consensus changes it would even be possible for a particular node to require proof of TaPoS from a majority of non-block-producing trusted peers that periodically broadcast transactions. Under this model, even colluding block producers who isolate one victim will not be able to incorporate transactions from the other parties. TaPoS does this organically, but it could be made explicit.
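One way to picture the explicit TaPoS check is the sketch below. It assumes each trusted peer's most recent transaction names a recent block id, and that a node only trusts a fork when a strict majority of its trusted peers reference blocks on that fork. The function and parameter names are hypothetical, and this is not a deployed feature.

```python
def fork_has_tapos_majority(fork_block_ids, latest_ref_by_peer, trusted_peers):
    """Return True if most trusted peers reference a block on this fork.

    latest_ref_by_peer maps a peer name to the block id referenced by that
    peer's most recent transaction. Peers with no known transaction do not
    count as endorsing the fork.
    """
    on_fork = set(fork_block_ids)
    endorsing = sum(
        1 for peer in trusted_peers
        if latest_ref_by_peer.get(peer) in on_fork
    )
    return endorsing * 2 > len(trusted_peers)


fork = ["a", "b", "c"]
peers = ["p1", "p2", "p3"]
print(fork_has_tapos_majority(fork, {"p1": "c", "p2": "b", "p3": "x"}, peers))  # True
print(fork_has_tapos_majority(fork, {"p1": "x"}, peers))                        # False
```

Colluding producers who isolate a victim cannot forge the trusted peers' signed transactions, so a fork built in isolation would fail this check.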

In terms of Collateral

A job has a net present value equal to the value of the future revenue stream combined with the sunk cost of campaigning. Losing the job has a real economic cost. Getting fired carries an even greater loss due to the value of reputation.

If there was any way for the majority of block producers to collude and cause actual harm to someone running a full validating node with good, long-standing connections to a large number of peers, then I could see the need for additional collateral. But considering every peer is able to independently verify that they and everyone they do business with are on the same irreversible fork, there is no ability to deceive a single node.

There are two things a block producer can do to "harm" the network:

  1. not produce a block
  2. skip the block producer before them (this will likely orphan the attacker rather than the attacked)

As a group the block producers can prevent the advancement of the last irreversible block until one of the potentially reversible forks is able to elect a majority of block producers who then advance the last irreversible block. This means that absent a clear majority, any minority of producers can successfully hold an election and keep the last irreversible block advancing.

Aside from halting advancement of the last irreversible block, the majority of block producers can also:

  1. Ignore all minority producers and effectively increase the average block interval by 50%
  2. Ignore / censor transactions / hinder election process
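The 50% figure follows from simple arithmetic if the colluding majority holds 2/3 of the slots and the ignored minority holds the remaining 1/3. That split is an assumption, since the text does not state the exact sizes. A short Python check, using a hypothetical helper name:

```python
from fractions import Fraction


def average_interval(slot_interval_s, skipped_fraction):
    """Average time between accepted blocks when a fraction of slots is ignored."""
    return slot_interval_s / (1 - skipped_fraction)


# Assumption: 3 second slots, and the ignored minority holds 1/3 of them.
interval = average_interval(3, Fraction(1, 3))
print(interval)          # 9/2, i.e. 4.5 seconds
print(interval / 3 - 1)  # 1/2, i.e. a 50% increase
```

A larger ignored minority would slow the chain further. With just under half of the slots ignored, the average interval would nearly double.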

In all conceivable attack scenarios, there is no potential for a double spend of a traditional transfer. Users are only ever at risk if they face financial loss due to the censorship of their transaction. This risk applies to every blockchain and is therefore pointless to consider.

Security, Accountability, and Liveness

I believe I have proven that the block producers cannot defraud thousands of independent validators without partitioning the network. While physical network partitions may be possible, a logical network defined by non-consensus TaPoS trust links cannot be partitioned. The most that could be said is that TaPoS trust links are not currently deployed as a proactive defense against a network partition attack.

The only reason for bonded block producers is to enforce a penalty for the network partition attack. If it is possible to prevent it in the first place by TaPoS links then no bond is necessary.

The network of thousands of validators provides accountability through the ability to detect and report production of fraudulent chains. The probability of getting caught is 100% and the consequences include job loss, reputation loss, and potentially legal consequences for theft/fraud (because the parties are known and the double spend involved an off-chain business transaction).

The blockchain will remain live (advancing the last irreversible block) so long as at least 1 block producer is able to process enough pending transactions to elect a new set of witnesses who then resume advancing the last irreversible block. Even a loss of 100% of block producers will not prevent the network from advancing, assuming a hard fork to enable one witness to hold a new election.

Conclusion

There are no known strategies by which a well-connected full validating node can be defrauded, and the damage any individual block producer or even a collusive majority group can do is so insignificant that a bonding requirement beyond job/reputation loss is unnecessary.

Do bonds actually make a platform more secure? They act as a barrier to entry that keeps block production (and therefore censorship rights) in the hands of the rich. The power to censor is far more valuable than any microscopic probability for producing two alternative chains in an effort to defraud an isolated victim.


Regarding IRB, I actually think a block accepted by a majority (by number) of block producers is not enough; instead, it should be a block accepted by a majority of stake.

The issue is about the stake voting process. It's a result of several factors combined.

Firstly, let's list the rules, using Steem as an example:

  • every account can vote for 30 witnesses
  • votes are transactions, which need to be included into blocks
  • the witness schedule updates every 21 blocks

Here is a scenario:

  • current active witnesses are X1, X2, ..., X21
  • one account, P, is voting for another set of witnesses Y1, Y2, ..., Y21
  • the current active witnesses produced blocks B1, B2, ..., B20; B20 is the head block, so there is one block left to be produced by the current witnesses
  • the @steemit account, which has a large amount of stake and was not voting, now decided to vote and proxied to P. It broadcast a voting transaction VT, which was then included in block B21, produced by X21. Under normal circumstances, the next block should be produced by one of the Y's.
  • due to a network issue or some other reason, the first Y (we assume it's Y0) missed its first block.
  • due to a network issue or censorship, neither VT nor B21 was received by X1, so it produced another block B21' without VT, and the active witness set is not changed according to this block. We assume the next witness is Xn. Note that the blocks following B21 and B21' will have the same timestamp.
  • due to coincidence or collusion, Xn decided to build a block following B21', and the other X's followed this chain. In the meantime, the Y's started producing blocks one by one. Now both forks have the same height and the same HEAD timestamp, and the IRB of both forks will advance. The validators will have to make a choice, but it can't be made automatically by the consensus code; it can only be made with a checkpoint, so automated systems that rely on IRB will have trouble.

The reason: currently witness rescheduling is not based on IRB. But it can't be based on IRB, because if 1/3 of witnesses go offline simultaneously (in the same round), IRB won't advance anymore, so new witnesses can't be voted in.

Thoughts?

Witness scheduling isn't based on IRB, only the extent that a witness will roll back is. Therefore, it is possible for votes cast in the pending state to impact the pending witness schedule and ultimately advance the LIB.

Can you please review this algorithm for me: https://github.com/tendermint/tendermint/wiki/Byzantine-Consensus-Algorithm

Under the model used by tendermint, no block can contain data that changes who can confirm it.

Good writeup, as always.

How difficult would it be to catch a single block producer who includes transactions in the wrong order? It seems like it would be pretty easy for a block producer to front-run market and voting transactions to skim a little more cash for himself. There's not a lot of money in voting transactions, but perhaps quite a bit in market transactions.

Obviously if you went looking for such activities with the proper statistical tools, it wouldn't be hard to find evidence of them if they're happening. But in the current implementation, are there any automatic safeguards for this? Would block validators know if this were occurring?

it would be fairly obvious to detect consistent manipulation of transaction order.

Well yeah, but is anybody looking for it? It would be particularly easy to detect it in realtime (rather than looking for correlations in the blockchain), but is anybody doing this? If it's not built into the code, it seems like someone could get away with it for a long time before they were found out.

We haven't looked for any code to check for it. To do it right would require instrumenting multiple nodes in different geographic areas and connected to many different peers. Then with this information calculate a weighted order of receipt.

Once you have those stats you could score each block producer based upon how close their order was to the weighted order. Over time you would discover if there was any significant bias among block producers.

Bias alone only indicates that a producer was not well connected to the network. Once bias is discovered, then it is a matter of looking for which accounts were most frequently biased earlier. This would identify potential sock puppets.
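A minimal sketch of such a score, assuming the weighted order of receipt has already been computed from the instrumented nodes: it measures the fraction of transaction pairs that a block orders differently from that reference (a normalized Kendall tau distance). The function name and inputs are hypothetical.

```python
from itertools import combinations


def order_bias(block_order, reference_order):
    """Fraction of transaction pairs ordered differently from the reference.

    0.0 means the block matches the reference order exactly and 1.0 means
    it is fully reversed. Transactions missing from the reference are ignored.
    """
    rank = {tx: i for i, tx in enumerate(reference_order)}
    txs = [tx for tx in block_order if tx in rank]
    pairs = list(combinations(range(len(txs)), 2))
    if not pairs:
        return 0.0
    inversions = sum(1 for i, j in pairs if rank[txs[i]] > rank[txs[j]])
    return inversions / len(pairs)


reference = ["a", "b", "c"]
print(order_bias(["a", "b", "c"], reference))  # 0.0
print(order_bias(["c", "b", "a"], reference))  # 1.0
```

Averaging this score per producer over many blocks would show which producers deviate consistently. As noted above, deviation alone may only mean a poorly connected node.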

Great! This kind of writing should be on the Steem whitepaper. It needs to fully explain how the system works. Otherwise people have misunderstandings and they make false claims about Steem (and of course this applies to Bitshares also).

When will the Steem whitepaper be updated?

Very good and clear article. Thanks for sharing!

Censorship of transactions isn't much of a harm as long as you have at least one honest block producer that can include them. In the event that all producers collude, the only way out would be to use a POW block every round and require the first witness in the subsequent round to build on top of a POW round (unless the mining queue is empty, ofc).

BTW, if you read through the article again and keep in mind the current discussion about segwit/bu on bitcoin, you may realize how difficult a situation they are in:

  • miners decide which protocol/fork to build blocks for
  • every validating node operator having his own choice

This means that in order to "upgrade" bitcoin to either one of those, you not only need to convince a majority of miners (which doesn't seem to be happening) but also convince all the validating nodes to join as well (otherwise they would end up on a stuck fork).

Worst case scenario here would look similar to the ETH/ETC drama.

With DPOS, this is not as much of a problem. Validators and producers still need to switch over to the new protocol at the same time, but since shareholders have to vote on and approve a fork, we can actually coordinate a hard fork. This is one of the major advantages of DPOS over ANY other consensus scheme, namely that it comes with a stake-based governance mechanism! Thanks for inventing it @dantheman!


Simply said, I LOVE your response and it surely underlines the importance of being aware of such details that have been blurred by other authors. Thank you so very much for placing the power back in the hands of our system!

All for one and one for all!!! Namaste :)

Hi Dan. This is really interesting. I'm wondering - is it possible for the block producers to collude and produce a block with a fake transaction, and then proceed to produce subsequent blocks that confirm the one with the fake transaction, until 2/3 of them have validated it?

It sounds like what you are saying is that the network of validators would know / catch this, but how would they know that the fake transaction was fake?

because it wasn't properly signed and/or it violated the constraints defined by code.

All transactions are digitally signed. You can't create a valid transaction without the keys to sign it. All block validators can easily verify the transaction.

Upvote and resteemed.

Informative!

Awesome response and very informative Dan.

I was wondering, wouldn't steem be more secure and less corruptible if more than 19 witnesses got to decide the state of the blockchain? Why was this specific number chosen?

That would increase time to irreversible confirmation. Having runner up producers helps add some variety.

I have some questions and concerns with the section on "In Terms of Collateral," specifically with your use of majority. Could you comment on how 'majority' still achieves a solution to the Byzantine Generals Problem?

What if reputation loss or job loss is not enough damage to stop some malicious validators from colluding?

And do what? What do they gain?

What if it were an externally sponsored attack? I mean, bribed validators have little or nothing to lose. Maybe I need to study the graphene engine more. Why don't you give an easy explanation of TaPoS?
