Mempool Consolidation Bitcoin UTXO: Unspent Transaction Output

in LeoFinance, last month (edited)

On February 3rd I learned a valuable lesson.

This was a question I had for a while but never really bothered to follow up on until now. What exactly is a UTXO? And why do they need to be "consolidated"? What is the mempool and why is it important?

Let's start with the mempool.

This is probably the easiest concept to understand and explain. It's pretty basic. When a Bitcoin user signs their operation with their private key and broadcasts it publicly to the network, it gets thrown into the mempool. Why does it get thrown into the mempool? Well, it quite simply doesn't have anywhere else to go, because on average it's going to take 10 minutes for the network to mine a block. Any operation that hasn't been put into a block yet sits on deck inside the mempool.

When the user's public operation is sent to a node, that operation gets put into that node's mempool. However, that node is also expected to play nice and broadcast that operation to other nodes in the network, which will in turn get added to the mempools of those nodes, and so on until most nodes inside the Bitcoin network have received the information and understand that this operation should eventually get put into a block.

Some of these nodes are mining pools, who generate the vast majority of actual blocks. Mining pools will prioritize the transactions that pay them the highest fee for the block space they take up. Bigger fee equals more money for the miner. If the mempool is too big and the miner can't put it all into the next block... well then the mempool keeps getting bigger and bigger, and users have to keep paying higher and higher fees to get added to the next block. Many of us understand how this can become a problem... such as when fees spike to $50 a pop and everyone is complaining about them.
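To make the prioritization concrete, here is a minimal Python sketch of a miner filling a block from its mempool. It assumes a simple greedy strategy (highest fee per virtual byte first); the `PendingTx` class, the `fill_block` function, and all the numbers are made up for illustration and are not how any particular mining pool actually builds blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    """A transaction waiting in the mempool (hypothetical, simplified)."""
    txid: str
    size_vbytes: int
    fee_sats: int

    @property
    def fee_rate(self) -> float:
        # Fee paid per virtual byte of block space.
        return self.fee_sats / self.size_vbytes


def fill_block(mempool, max_block_vbytes):
    """Greedy selection: take the highest fee rate first until space runs out."""
    chosen = []
    used = 0
    for tx in sorted(mempool, key=lambda t: t.fee_rate, reverse=True):
        if used + tx.size_vbytes <= max_block_vbytes:
            chosen.append(tx)
            used += tx.size_vbytes
    return chosen


mempool = [
    PendingTx("a", 200, 200),    # 1 sat/vB
    PendingTx("b", 250, 12500),  # 50 sat/vB
    PendingTx("c", 400, 4000),   # 10 sat/vB
    PendingTx("d", 300, 8100),   # 27 sat/vB
]

# A toy block limit of 600 vbytes, chosen only so the example is easy to follow.
block = fill_block(mempool, max_block_vbytes=600)
included = [tx.txid for tx in block]
left_waiting = [tx.txid for tx in mempool if tx not in block]
print(included, left_waiting)
```

With this toy limit the two highest-paying transactions get in and the cheap ones stay in the mempool, which is exactly the position a 1 sat consolidation is in while the network is busy.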

image.png

Hm that's weird

The truly interesting thing about Bitcoin fees is that they don't really go up over time. It's just a crabwalk sideways between $1-$50. In 2017 everyone lost their minds because costs got up to that $50 level. The same thing happened in 2021 but only for a very brief period of time. More recently with Ordinals and Inscriptions we got another pump/dump on fees.

We would expect that fees should continue trending upward as the price of BTC goes up, but we can see right here on the chart that this isn't really a problem yet. Many of Bitcoin's scaling problems seem to be completely made up and highly theoretical in nature. Try to remember that going forward.

Final thought regarding alternative systems.

The interesting thing about a DPOS consensus model is that we don't necessarily have to broadcast pending transactions to the entire network. When a node in the network gets a signed operation from a user, it knows exactly where to send it: to the witness that's going to sign the next block within the premade schedule. It will be interesting to see whether this difference ever becomes a useful scaling advantage.

mempool-cycle-btc-transaction-2.png

UTXO-pro-con-account-bitcoin.jpg

So what's a UTXO?

An Unspent Transaction Output is how the Bitcoin network keeps track of everything and avoids double spending. It is completely different from an account-based system like Hive or a brick-and-mortar bank. Rather than try to keep track of user accounts, Bitcoin and various other chains keep track of exact inputs and outputs.

UTXOs are the outputs that track all the spendable money on the blockchain... while all the inputs are spent coins that can no longer be spent again. We can see the advantages and disadvantages of this system on the infographic above.

melt-gold.jpg

How do operations work?

Every transaction on the Bitcoin blockchain is allowed to have as many inputs as it wants and as many outputs as it wants. The inputs are UTXOs. We can think of them getting melted down like gold and consolidated into one place. Once this happens the UTXO is no longer unspent. A UTXO can never be partially melted/spent. It must be completely melted and spent or not at all. Those are the rules of the system.

many:many input:output

Once enough UTXOs have been melted as inputs to the transaction, their combined value is split up into new UTXOs. These new UTXOs represent new "gold bars", and each one can be locked to a different public key, so that only a signature from the matching private key can spend it. This is the foundation of how Bitcoin value is transferred within the network from one user to another.
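The melt-and-recast idea can be sketched in a few lines of Python. This is a toy model under loud assumptions: the UTXO set is a plain dictionary keyed by `(txid, output_index)`, owners are just names, and signature checking is skipped entirely. The `apply_transaction` function is hypothetical and is not taken from any Bitcoin library.

```python
def apply_transaction(utxo_set, txid, inputs, outputs):
    """Spend whole UTXOs and create new ones (toy model, no signatures).

    utxo_set: dict mapping (txid, index) -> (owner, amount_sats)
    inputs:   list of (txid, index) pairs naming the UTXOs to melt
    outputs:  list of (owner, amount_sats) pairs to create
    Returns whatever is left over, which is the implied miner fee.
    """
    if len(set(inputs)) != len(inputs):
        raise ValueError("the same UTXO is listed twice")
    for outpoint in inputs:
        if outpoint not in utxo_set:
            raise ValueError(f"{outpoint} is missing or already spent")

    total_in = sum(utxo_set[outpoint][1] for outpoint in inputs)
    total_out = sum(amount for _, amount in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")

    # Inputs are melted completely: there is no partial spend.
    for outpoint in inputs:
        del utxo_set[outpoint]
    # The combined value is recast into brand new UTXOs.
    for index, (owner, amount) in enumerate(outputs):
        utxo_set[(txid, index)] = (owner, amount)
    return total_in - total_out


utxos = {("tx0", 0): ("alice", 3000), ("tx1", 0): ("alice", 2500)}
fee = apply_transaction(
    utxos, "tx2",
    inputs=[("tx0", 0), ("tx1", 0)],
    outputs=[("bob", 4000), ("alice", 1000)],
)
print(fee, utxos)

# Trying to spend a melted UTXO again is rejected: that is the double-spend check.
try:
    apply_transaction(utxos, "tx3", [("tx0", 0)], [("mallory", 3000)])
except ValueError as err:
    print("rejected:", err)
```

Notice that nothing in the model tracks a balance per account. A wallet's "balance" is just the sum of the UTXOs it can sign for.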

The Refund UTXO

I've noticed that when I send Bitcoin there is usually one input and two outputs.
Why is that?

The reason almost all transactions are going to have at least two outputs is because one of those outputs is a refund on your own money. The idea here being that if you're forced to liquidate an entire UTXO... what are the odds that this is exactly the amount you wanted to send to someone else?

There must be a refund UTXO (usually called the "change" output) that's sent back to your own wallet. Oftentimes this output goes to a completely different pubkey that is still controlled by the same seed and associated passwords. This is done for privacy, so that it's harder for anyone to prove which output you sent back to yourself... and so that it isn't obvious how much money you have at any given time for anyone who cares to check on-chain.
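Here is a rough Python sketch of how a wallet might pick inputs and work out the refund. It assumes a naive largest-first selection and a fee that is already known; real wallets use smarter coin selection, and the `build_payment` name is made up for this example.

```python
def build_payment(wallet_utxos, amount_sats, fee_sats):
    """Pick whole UTXOs (largest first) until amount + fee is covered.

    wallet_utxos: list of UTXO values in sats
    Returns (selected_inputs, change_sats). The change goes back to the sender.
    """
    target = amount_sats + fee_sats
    selected = []
    total = 0
    for value in sorted(wallet_utxos, reverse=True):
        if total >= target:
            break
        selected.append(value)
        total += value
    if total < target:
        raise ValueError("insufficient funds")
    return selected, total - target


# One 5000 sat UTXO, paying 2000 sats with a 500 sat fee.
inputs, change = build_payment([5000], amount_sats=2000, fee_sats=500)
print(inputs, change)

# Several small UTXOs: more of them have to be melted to cover the same payment.
inputs_small, change_small = build_payment([900, 800, 700, 600], 2000, 500)
print(inputs_small, change_small)
```

In the first case the whole 5000 sat UTXO is melted and 2500 sats come back as a refund. In the second case all four small UTXOs are needed, which is where the extra data (and extra fee) in a real transaction comes from.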

It is highly recommended by Bitcoiners to not reuse addresses... although doing so can be very convenient for things like whitelisted centralized exchange withdrawals to a wallet in self-custody. At the end of the day it's up to the user to decide these priorities.

So why do they need to be consolidated?

Imagine you run a business and your business accepts Bitcoin as payment for goods and services. Now imagine you get paid something like $100-$300 per transaction. No big deal, right? Well, all those UTXOs can add up to a big fee.

It might look like a Bitcoin wallet has $20000 in it and moving that money should be no problem (maybe $5 average fee), but if that $20k consists of 200 separate UTXOs moving it could be incredibly expensive.

UTXO-bitcoin-fees-consolidation.png

UTXO-bitcoin-fees-consolidation-vaultec.png

This was my assumption.

I just assumed that if all the UTXOs existed on the same public address then it would not cost more to spend them all at once. This is absolutely false. Every UTXO you spend becomes another input in the transaction, every input adds data, and fees are charged per byte, so spending many UTXOs at once can end up being extremely expensive during times of high demand for block space.
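A back-of-the-envelope Python estimate shows how fast this adds up. The per-input, per-output, and overhead sizes below are rough figures I'm assuming for a common SegWit (P2WPKH) transaction, measured in virtual bytes; other address types are larger, so treat the output as a ballpark rather than a quote.

```python
# Assumed approximate sizes in virtual bytes (vbytes) for P2WPKH.
INPUT_VBYTES = 68
OUTPUT_VBYTES = 31
OVERHEAD_VBYTES = 11


def estimate_vbytes(n_inputs, n_outputs):
    """Rough transaction size: fixed overhead plus a cost per input and output."""
    return OVERHEAD_VBYTES + n_inputs * INPUT_VBYTES + n_outputs * OUTPUT_VBYTES


def estimate_fee_sats(n_inputs, n_outputs, sats_per_vbyte):
    """Fee is size multiplied by the fee rate."""
    return estimate_vbytes(n_inputs, n_outputs) * sats_per_vbyte


# Same payment (one recipient plus change) at 27 sats per vbyte.
fee_one_input = estimate_fee_sats(1, 2, 27)
fee_200_inputs = estimate_fee_sats(200, 2, 27)
print(fee_one_input, fee_200_inputs, round(fee_200_inputs / fee_one_input))
```

Under these assumptions the 200-input transaction costs roughly 97 times as much as the single-input one, even though both wallets hold the same amount of money.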

dust-magitek-hearthstone.jpg

Dust

Dust is a UTXO that holds fewer sats than the fee it would take to spend it, and a dusted account is one that doesn't have enough sats to pay the fee required to create a new UTXO. Given what we've just learned it should hopefully be obvious that even a wallet with 1 Bitcoin in it could be completely worthless if it had so many UTXOs that it would cost more than 1 BTC to melt them down and create a new one. Clearly this would be very uncommon but it is technically possible... especially for vendors accepting small amounts of coin.
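As a quick sanity check, this Python sketch flags UTXOs that cost more to spend than they are worth at a given fee rate. It uses the economic meaning of dust from this post rather than any node's relay policy, and the 68 vbyte input size is an assumed approximation for a P2WPKH input.

```python
INPUT_VBYTES = 68  # assumed approximate size of one P2WPKH input


def cost_to_spend(sats_per_vbyte):
    """Fee needed just to include one UTXO as an input."""
    return INPUT_VBYTES * sats_per_vbyte


def is_uneconomical(value_sats, sats_per_vbyte):
    """True when spending the UTXO costs at least as much as it holds."""
    return value_sats <= cost_to_spend(sats_per_vbyte)


def spendable_value(utxo_values, sats_per_vbyte):
    """Total value left after paying the input cost of every UTXO worth spending."""
    cost = cost_to_spend(sats_per_vbyte)
    return sum(value - cost for value in utxo_values if value > cost)


wallet = [500, 1500, 10000]
busy = spendable_value(wallet, 27)   # only the 10000 sat UTXO is worth spending
quiet = spendable_value(wallet, 1)   # all three are worth spending
print(busy, quiet)
```

The same wallet is worth noticeably more when the fee rate is low, which is why whether something counts as dust depends on the state of the mempool.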

~~~ embed:gwABJO-kaM8 youtube ~~~

How can we fix this?

It all comes back to the mempool and the demand to use the chain at that time. When the mempool is empty it is possible to get transactions on-chain for 1 sat per byte. It might even be possible to do it for free depending on the miner. Right now the cost is 27 sats per byte, so turning many UTXOs into one when the time is right can be dramatically cheaper: the same transaction at 1 sat per byte costs 27 times less.

How is this done?

Send all the Bitcoin in the wallet that has many UTXOs to yourself. This is all it takes. Set a custom fee of 1 sat per byte. If the mempool is ever empty then your transaction will go onto the blockchain, the many UTXOs will be consolidated into one, and this UTXO will point to a pubkey under your control. You can then spend this one UTXO later at a normal rate whenever you want rather than having to wait for the mempool to clear again.
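To see what the consolidation is worth, this Python sketch compares two paths: spending 200 UTXOs directly while fees are high, versus consolidating them at 1 sat per vbyte first and spending the single resulting UTXO later. The transaction sizes are assumed approximations for P2WPKH, and the `consolidation_savings` helper is invented for this example.

```python
# Assumed approximate sizes in virtual bytes (vbytes) for P2WPKH.
INPUT_VBYTES = 68
OUTPUT_VBYTES = 31
OVERHEAD_VBYTES = 11


def fee_sats(n_inputs, n_outputs, sats_per_vbyte):
    vbytes = OVERHEAD_VBYTES + n_inputs * INPUT_VBYTES + n_outputs * OUTPUT_VBYTES
    return vbytes * sats_per_vbyte


def consolidation_savings(n_utxos, quiet_rate, busy_rate):
    """Sats saved by consolidating at quiet_rate before paying at busy_rate."""
    # Path 1: pay someone during a busy period using every small UTXO.
    spend_directly = fee_sats(n_utxos, 2, busy_rate)
    # Path 2: send everything to yourself (one output) while it is quiet,
    # then make the payment later from the single consolidated UTXO.
    consolidate_first = fee_sats(n_utxos, 1, quiet_rate) + fee_sats(1, 2, busy_rate)
    return spend_directly - consolidate_first


savings = consolidation_savings(200, quiet_rate=1, busy_rate=27)
print(savings)
```

The consolidation itself is not free, but under these assumptions it costs about 13,600 sats instead of the roughly 369,000 sats the direct spend would need.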

Expiration

Eventually Bitcoin nodes will drop your transaction from the mempool if it sits in there too long. How long depends on each node's settings: the range I found was 24-72 hours, though Bitcoin Core's default -mempoolexpiry is 336 hours (two weeks), and a node whose mempool fills up can evict the lowest-fee transactions sooner than that. After that time the op will expire and be dropped from the mempool. This can easily happen during UTXO consolidation, and that's fine, because it doesn't matter whether the operation takes days; what matters is getting the op posted for the cheapest cost possible, which is normally going to be 1 sat per byte.

Mining hash lottery.

Because mining is a random lottery, it is very possible to sneak in a UTXO consolidation after 2 or 3 blocks are mined in rapid succession. On average, blocks should be 10 minutes apart, but this is only a statistical average based on the lottery's difficulty. In fact, when I look at the blockchain right now I can see that 3 of the last 5 blocks were mined within the same minute. If the mempool had been empty during this brief moment... any and all consolidations would have gone through for a cheap price... even if the average fee for that day was quite high.
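The lottery can be put into numbers with a short Python sketch. It assumes block discovery behaves like a Poisson process with a 10 minute average (steady hash rate, correctly tuned difficulty), which is a standard simplification rather than a measurement of the real chain.

```python
import math

MEAN_BLOCK_MINUTES = 10.0  # target average time between blocks


def prob_block_within(minutes):
    """Chance that the next block arrives within the given number of minutes."""
    return 1.0 - math.exp(-minutes / MEAN_BLOCK_MINUTES)


def prob_at_least_k_blocks(k, minutes):
    """Chance of k or more blocks in a window, using the Poisson distribution."""
    lam = minutes / MEAN_BLOCK_MINUTES  # expected number of blocks in the window
    prob_fewer = sum(
        math.exp(-lam) * lam**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - prob_fewer


p_next_in_one_minute = prob_block_within(1)      # about 0.095
p_two_in_ten_minutes = prob_at_least_k_blocks(2, 10)
p_three_in_ten_minutes = prob_at_least_k_blocks(3, 10)
print(round(p_next_in_one_minute, 3),
      round(p_two_in_ten_minutes, 3),
      round(p_three_in_ten_minutes, 3))
```

Under this model roughly one block in ten shows up within a minute of the previous one, and about a quarter of all 10 minute windows contain two or more blocks, so short bursts that drain the mempool are not rare.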

Conclusion

Bitcoin is not an account-based system. UTXOs track every single operation separately in order to make sure that the decentralized network doesn't accidentally allow an account to double-spend currency. This is extremely counterintuitive infrastructure that can lead to confusion and wasted money/fees if users don't know how to game the system.

At the end of the day in order to consolidate many UTXOs into one (to avoid egregious fees) the user must send themselves their own Bitcoin for a very cheap fee that could take many hours or even days to get on the blockchain. This is somewhat akin to melting down many tiny gold bars into one big gold bar; the big difference being that this bigger gold bar costs just as much to move around as a tiny one. That's the gist of it.

Is there any risk of replaying old transfers which are already committed to previous blocks if you reuse a Bitcoin address which you drained before? Or how are these replays prevented? Obviously one reason for draining the address is to prevent uncommitted old transactions from being added in the new blocks as these transactions don't have expiry time if I remember correctly.

It's also important to point out that this is impossible on Hive because operations on Hive do actually expire quite quickly (after like a couple hours). This is due to the ref_blocknum variables that all Hive operations have.

I remember reading about the time limit on Hive transactions and I'm confused when anyone mentions cold wallets and offline signing of transactions. Apparently there is plenty of time for moving the transaction in and out of the signing station.

https://bitcoin.stackexchange.com/questions/9709/do-unconfirmed-transactions-expire#:~:text=Oh%2C%20and%20I%20forgot%20the,but%20that%20is%20really%20unlikely.

Oh, and I forgot the most important part: transactions on Bitcoin (tx frames in the protocol) don't have a 'time' field, which means that transaction expiration can't be a feature of Bitcoin.

To summarize: yes, the transaction can expire, but that is really unlikely.

It looks like you're sort of right.

Someone could troll you by saving an expired operation and rebroadcasting it to the network later.

However, none of those UTXOs would be spendable if they've already been spent.
Again because it's not an account based system the pubkey itself is irrelevant.
Only the UTXO matters in this regard.

Therefore if you consolidated your UTXOs you could use the same pubkey and this would create a new UTXO that was completely unspendable by an expired transaction (because that expired transaction points to a spent TX). Basically if you spend that UTXO it can't be spent again which is why this model is so important in the first place.

I mixed up UTXO and the unique output addresses and assumed that one could just fill an input address to the same balance as on a previous drain and replay the old transaction from the ledger. It seems that the implementation is much better thought out and that aligns with the reality as we would have seen much news about those replay attacks already if it was that simple.

I'm trying to wrap my head around this.

Ignoring transaction fees, if I have a public key with 5000 sats, but only intend to send 2000 sats, I would send the transaction to address btc.....123 the intended recipient, and the system creates btc...456 which is my own wallet. The first public key gets the 2000 sats and the remaining 3000 sats get put back in my wallet at the newly created address, which is the refund.

So one UTXO is only good for spending 2000 sats and the other UTXO is only good for spending 3000 sats. Those UTXOs at the recipient or refund addresses can then be split up or combined in their respective wallets with others for the next transaction. This is how I understand it.

It makes sense that you want to completely destroy the previous container(s) of sats and forge new ones in exact amounts.

I read something similar with Dash staking in which you need to send your wallet balance to your own address so that all the individual pieces received at different addresses from the same wallet are combined into one so that the validator can confirm you have the required amount of Dash.

It sounds like you're understanding it correctly.

The one thing you don't mention is the fee... which goes to the miner of the block.
The interesting thing about the fee is that no separate UTXO gets created for it, because that would be immediately unsustainable (a UTXO that small is already dust). Rather, all the fees in that block get melted together and added to the UTXO that is the block reward itself (+6.25 BTC before halving). This number is not explicitly stated anywhere, and has to be calculated by subtracting all the outputs from all the inputs. Whatever number is left over is the fee.

In the example above if you sent 2000 to your friend and 2500 to yourself, the fee is implied to be 500 even though the blockchain doesn't actually write it down anywhere. This is called a "virtual" operation and it is simply expected that all nodes will do the math independently without putting it on the chain.
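The math every node does can be written out in a few lines of Python. This is only an illustration: transactions are reduced to lists of input and output values in sats, the 6.25 BTC subsidy is the pre-halving figure mentioned above, and the function names are made up.

```python
SUBSIDY_SATS = 625_000_000  # 6.25 BTC block subsidy, pre-halving


def implied_fee(input_values, output_values):
    """Fee is never written down: it is inputs minus outputs."""
    fee = sum(input_values) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


def max_coinbase_value(transactions, subsidy_sats=SUBSIDY_SATS):
    """Most the miner may claim: the subsidy plus every implied fee in the block."""
    return subsidy_sats + sum(implied_fee(ins, outs) for ins, outs in transactions)


# (inputs, outputs) for two transactions, all values in sats.
block_txs = [
    ([5000], [2000, 2500]),        # the example above: 500 sats left over
    ([3000, 2500], [4000, 1000]),  # another payment with a 500 sat fee
]
fees = [implied_fee(ins, outs) for ins, outs in block_txs]
coinbase = max_coinbase_value(block_txs)
print(fees, coinbase)
```

Everything stays in whole sats, so the check is plain integer addition and subtraction.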

Great. I had to really focus to get a mental picture. I’m glad I didn’t go off the path.

I can see now that all of this can be done with whole numbers of sats for simple addition and subtraction to calculate fees. No fractions needed. It’s like the accounting joke, “subtract the $20 and my pants, and there’s your profit”.

I learned all this deep in the weeds of Lightning.

If there's one thing Lightning is good at, it's consolidating transactions and avoiding small UTXOs.

A bit higher than my understanding. Maybe because I'm not a part of that system. It's good to learn about it

Thanks for letting me know... this seems to be the consensus on this one.

This is pretty interesting stuff. Way above my pay grade but interesting!

Great economics. Thank you for the analysis