Notes on reorgs and payments
This article was started before the last published one on addresses, and was intended to explore and capture what a blockchain reorganisation will mean in the context of an evolving ElectrumSV. These are notes I added to, refreshed and revised each time I returned to the task, which was set aside many times. I’ll wrap this up and post it.
There are two goals with the articles I write. The first is to engage people in discussion and get feedback. The second is to capture ideas as a kind of journal of notes. Due to the limited resources ElectrumSV has, it might be some time before I work on the things that the articles touch on, so being able to come back and revise the nuances and exploration of ideas has some value.
If you have thoughts/feedback on this article, please get in touch via the #electrumsv channel on either unwriter’s slack or metanet.icu’s slack. You can also add a comment here on this article, or write an article reasoning out what you find interesting or objectionable. And you can also tweet, but I will actively avoid engaging there as it is a terrible forum for discussion.
Currently, I am polishing a low-level change to ElectrumSV that switches our wallet storage to use a database. Part of this change fleshes out the unit tests, and the last remaining aspect that needs to be unit tested is the change to handling of “reorgs”.
Put briefly, a reorg is where there is a new longest chain based off a common parent, and the wallet switches to use that instead of the now shorter one it was previously using.
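As a rough sketch of that definition, assume each chain is simply a list of block hashes indexed by height. The function names here are hypothetical and this is not how ElectrumSV stores headers. Real software also compares accumulated proof-of-work rather than just length, which this toy model ignores.

```python
from typing import List, Optional


def common_parent_height(old_chain: List[str], new_chain: List[str]) -> Optional[int]:
    """Return the height of the last block both chains share, or None if none."""
    height = None
    for h, (old_hash, new_hash) in enumerate(zip(old_chain, new_chain)):
        if old_hash != new_hash:
            break
        height = h
    return height


def is_reorg(old_chain: List[str], new_chain: List[str]) -> bool:
    """True if new_chain is longer and branches off below the old chain's tip."""
    fork = common_parent_height(old_chain, new_chain)
    if fork is None:
        return False
    # If the common parent is the old tip, the chain was only extended.
    return len(new_chain) > len(old_chain) and fork < len(old_chain) - 1
```

With an old chain of g, a, b, c and a new chain of g, a, x, y, z the common parent is at height 1, and everything the wallet believed about heights 2 and 3 has to be revisited.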
The limited and buggy past
In the beginning there was Electrum for Bitcoin Core users, which we will refer to as Electrum Core. Then Bitcoin Core forked away from being Bitcoin with its Segwit alterations and we were left with Bitcoin Cash as the closest thing to the original Bitcoin. This resulted in the creation of Electron Cash as a fork of the Electrum wallet for Bitcoin Cash users. Next Bitcoin Cash forked off with its own questionable alterations, and we were left with Bitcoin SV as the closest thing to the original Bitcoin. ElectrumSV was then created by Neil and myself to provide a fork of the Electrum wallet for Bitcoin SV users.
As a developer who contributed to Electron Cash before it forked away from Bitcoin SV, I was looking at the blockchain synchronisation code and, while testing it, I encountered bugs. Looking at it further, it was clear it only worked in limited situations. One specific problem was that forks were tracked by height, and there could only be one fork per height. People reported corrupt blockchain state because of this during the period when Bitcoin Cash forked off from Bitcoin SV.
Electron Cash lacked recent Electrum Core changes
In the meantime, Electrum Core had rewritten their blockchain handling and quite a few other things, and Electron Cash had never updated their code to include these changes. So we were left with the same original Electrum Core bugs in ElectrumSV that Electron Cash had.
Neil has since completely replaced this code in ElectrumSV by creating the bitcoinx library, which is now used to manage blockchain state in ElectrumSV, among other things.
Outlining Electrum Core limitations
The Electrum Core wallet was limited and did not handle very much. In turn, Electron Cash and ElectrumSV inherited these limits. When your coin is based around holding until you decide to cash out your share of the Ponzi, then you only need something simplistic.
So what are the current inherited limitations? A wallet knows the addresses that belong to its keys. It knows a payment is made when it receives notification that one of those addresses was used. Then it requests the transaction using that address, which it gets along with the merkle proof that the transaction was in a given block, and then it verifies the transaction was in that block. Alternatively, the wallet might pay to an address, creating a transaction which it then broadcasts to the P2P network. Of course, it doesn’t keep a copy of that transaction, and only obtains a copy when its use of known past wallet addresses causes it to get notified that a transaction using one of those addresses was added to the mempool!
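The verification step can be sketched as below. This is a minimal illustration rather than ElectrumSV’s code. The function names are hypothetical. Hashes are assumed to be raw bytes in internal byte order. The branch is assumed to be the list of sibling hashes from the transaction up to the root, and the index is the transaction’s position in the block.

```python
import hashlib
from typing import List


def double_sha256(data: bytes) -> bytes:
    """Bitcoin's hash function for transactions and merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_from_branch(tx_hash: bytes, branch: List[bytes], index: int) -> bytes:
    """Fold the sibling hashes into the transaction hash to rebuild the root."""
    node = tx_hash
    for sibling in branch:
        if index & 1:
            # We are the right-hand child at this level.
            node = double_sha256(sibling + node)
        else:
            # We are the left-hand child at this level.
            node = double_sha256(node + sibling)
        index >>= 1
    return node


def verify_merkle_proof(tx_hash: bytes, branch: List[bytes], index: int,
                        header_merkle_root: bytes) -> bool:
    """The proof holds if the rebuilt root matches the one in the block header."""
    return merkle_root_from_branch(tx_hash, branch, index) == header_merkle_root
```

The point to take away is that the proof is tied to a specific block header. If that block is no longer on the longest chain, the proof no longer says anything useful about the transaction.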
It’s a very simple, and as stated, limited model. There are cases where the state of the wallet may not reflect the state of the network, if rare problems happen. You have no idea what coins are spent until you receive your spends back from the network after the broadcast. And let’s not forget that the modern wallet will be based on Paymail and BIP270 — which adds a whole slew of new requirements.
Increased needs from increased flexibility
If ElectrumSV were to stay in the limited form that it inherited, then it wouldn’t be usable for much. Take Paymail for instance. As it becomes more and more ubiquitous, not supporting it makes a wallet less usable, and perhaps even unusable. Extending and fleshing out ElectrumSV to support this among other things, which you might have read about in my past articles, will increase the flexibility to support further additional things.
To this end we’ve been making low level changes like supporting multiple key stores and child wallets within a parent wallet, and moving the data storage into the database so that, among other things, we do not need to keep all wallet data in memory at all times (including every transaction you’ve ever made, as we used to have to before the database changes). Then there’s the increased amount of data we keep about transactions, including new transaction states.
As ElectrumSV becomes more and more flexible, it needs to ensure that all the new things it tracks and can do behave correctly in edge cases. The edge case we are focusing on in this article is the event of a reorg.
Reorganisation needs — moving forward
So, moving forward what improvements do we need to consider, in order to ensure that we are correctly reacting to reorganisations?
All we did up to this point, is remove verification from any transactions above the common base height at which the new longest chain and the previous longest chain originate. Then the verifier would pick up the presence of unverified transactions that are present in the new longest chain, and obtain proofs and re-verify them. Is this still enough?
Improved transaction state tracking
One of the changes to transaction storage, which comes in with the database wallet storage, is keeping better track of transactions. Where before we had two states in which we kept track of transactions, we are currently looking to have five moving forward.
- Settled: A transaction received over the P2P network which is unconfirmed and in the mempool.
- Cleared: A transaction received over the P2P network which is verified as being in a block. We attach the merkle proof to the transaction, as part of the verification/clearing process.
- Received: A transaction received from another party which is unknown to the P2P network.
- Signed: A transaction you have not sent or given to anyone else, but are withholding and are considering the inputs it uses frozen.
- Dispatched: A transaction you have given to someone else, and are considering the inputs it uses frozen.
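To make the list concrete, here is one way the states could be modelled. It is only a sketch: the names TxState, inputs_frozen and carries_merkle_proof are hypothetical, and the representation ElectrumSV ends up with in the database may well differ.

```python
from enum import Enum


class TxState(Enum):
    SETTLED = "settled"        # From the P2P network, unconfirmed, in the mempool.
    CLEARED = "cleared"        # From the P2P network, verified as being in a block.
    RECEIVED = "received"      # From another party, unknown to the P2P network.
    SIGNED = "signed"          # Not given to anyone else, being withheld.
    DISPATCHED = "dispatched"  # Given to someone else.


# States described above as treating the inputs the transaction uses as frozen.
FROZEN_INPUT_STATES = {TxState.SIGNED, TxState.DISPATCHED}


def inputs_frozen(state: TxState) -> bool:
    """True if the wallet should not select these inputs for other spends."""
    return state in FROZEN_INPUT_STATES


def carries_merkle_proof(state: TxState) -> bool:
    """Only a transaction verified as being in a block has a proof attached."""
    return state is TxState.CLEARED
```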
Sticking with these five states for now, only “cleared” is relevant to reorganisations. So what should we do with transactions in this state on the sidelined chain when one happens?
The simplest approach is to drop the merkle proof and reset the transaction back to “settled”. This is equivalent to what we do now. A concern might be that in a devtopian world we could hold all the state for the sidelined chain, and be ready to switch back if necessary. For the new longest chain, we will be re-verifying any transactions. Some of these might be the same transactions, but verified against their new position in a block and its differing merkle proof.
The simplest solution takes the least development work, and well, resources are short, so I’ll probably make it work.
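A minimal sketch of that simplest approach follows, under some assumptions of my own. The TxRecord type and handle_reorg function are hypothetical, states are plain strings, and records are held in a dictionary rather than the database.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TxRecord:
    state: str                            # "cleared", "settled", "signed", ...
    height: Optional[int] = None          # Block height, when cleared.
    merkle_proof: Optional[bytes] = None  # Attached during clearing.


def handle_reorg(records: Dict[str, TxRecord], common_height: int) -> List[str]:
    """Reset cleared transactions above the common parent back to settled.

    Returns the ids of the transactions that need to be re-verified
    against the new longest chain.
    """
    reset = []
    for tx_id, record in records.items():
        if (record.state == "cleared" and record.height is not None
                and record.height > common_height):
            record.state = "settled"
            record.height = None
            record.merkle_proof = None
            reset.append(tx_id)
    return reset
```

Nothing about the sidelined chain is retained here, which is exactly the trade-off: less work now, and no ability to switch back without re-verifying.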
Scripts and payment destinations
In addition to transaction states, perhaps we actually need to go further and track potentially unused scripts? Is there any case where we’ll hand off output scripts, perhaps as part of a direct wallet to wallet transaction? This gets more complicated.
Paymail has the receiver give the sender an output script to include in a transaction. With Paymail being a hosted service, this would happen on an ElectrumSV user’s behalf. The ElectrumSV user has delegated control of an address stream they own to the Paymail host, and the Paymail host then gives out addresses in the form of output scripts to people wanting to make payments to that Paymail user. Conventionally, these have an address, as the address stream will be P2PKH because that is how it currently is done and, for that matter, has to be done because of standard script limitations. But as we get updates to the Bitcoin SV protocol, addressable scripts will become less and less common.
An existing example of a type of non-addressable script, present on the blockchain back through to before the Bitcoin Cash fork from Bitcoin Core, is the bare multi-signature script. These do not have an address. You either have to search and filter for scripts which include the combination of public keys, with the given participant parameters, or you just have to know about them in the first place. The latter should come as a natural part of what falls under the contemporary form of Satoshi’s originally intended “IP2IP” model; the former is a tedious second class alternative. It’s interesting (to me at least) to note that ElectrumSV can actually register for notifications about bare multi-signature outputs (or almost any other non-addressable type of script), because its indexing server operates on script hashes, and a deterministic stream that produces identical output scripts allows registration of atomic script hashes.
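To illustrate why no address is needed, here is a sketch that builds a bare multi-signature output script and derives the script hash that the Electrum protocol uses for subscriptions (the SHA-256 of the output script, as hex with the bytes reversed). The helper names are hypothetical, and the keys in any usage would be whatever the wallet’s deterministic stream produces.

```python
import hashlib
from typing import List

OP_CHECKMULTISIG = 0xAE


def small_int_opcode(n: int) -> int:
    """OP_1 to OP_16 are the opcodes 0x51 to 0x60."""
    if not 1 <= n <= 16:
        raise ValueError("value must be between 1 and 16")
    return 0x50 + n


def bare_multisig_script(m: int, public_keys: List[bytes]) -> bytes:
    """Build: OP_m <pubkey 1> ... <pubkey n> OP_n OP_CHECKMULTISIG."""
    n = len(public_keys)
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and the number of keys")
    script = bytes([small_int_opcode(m)])
    for key in public_keys:
        if len(key) not in (33, 65):
            raise ValueError("expected a serialised public key")
        # Data of up to 75 bytes is pushed with a single length byte.
        script += bytes([len(key)]) + key
    script += bytes([small_int_opcode(n), OP_CHECKMULTISIG])
    return script


def electrum_script_hash(script: bytes) -> str:
    """SHA-256 of the output script, hex encoded with the byte order reversed."""
    return hashlib.sha256(script).digest()[::-1].hex()
```

Because the same keys in the same order always give the same script, and therefore the same script hash, the wallet can compute the hash ahead of time and register for it before any payment arrives.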
Returning to Paymail, a payment destination or script, given out by a Paymail host is a delegated action done in isolation from a user’s wallet. The user’s wallet cannot give out payment destinations from the same stream, as that would lead to key reuse (which would have historically also been address reuse). This is a bad thing primarily because it breaks a simple model and makes tracking wallet state significantly harder. Now that the development branch of ElectrumSV can differentiate between delegated payment destination streams and ones the user can dispense from themselves, this is not so much of an issue. But in the past people did do foolhardy things like load their Money Button keys into ElectrumSV, and then spend and perhaps receive from both places — breaking their Money Button wallet.
So the wallet has a payment destination stream it has delegated to a remote host, which hosts their Paymail identity for them. The wallet has perhaps two options for detecting payments received on their behalf remotely: it can either monitor the payment destination stream delegated to the Paymail host, or it can ask the Paymail host for the information it already has. At the least, that information might be the range of the stream that has been consumed, which would again require the wallet to poll the blockchain or register for notifications. But it might also be the payments made to them and received on their behalf by their Paymail host, which would be ideal. And it makes sense that a Paymail hosting service should provide an API, which is a subject for another time.
Are any of these scripts affected should a reorg happen? No, I think they are related to, and partially overlap, the transaction case, and should be treated the same way.
Doing things badly
Let’s assume that any payments that were made and mined, or that are pending mining whether in the mempool or not, are still present and valid and will be re-mined. This can be either the transactions, or the scripts. We can focus on all the ways this aspect can go wrong, but they’re loser topics. Just get the system working right, and they’re irrelevant. Why would you want it working wrong? You watched Coingeek Seoul too, right? Things have never been more on track for a Bitcoin ecosystem done right.
Maybe, just maybe, we will need a system that will monitor for the case of transactions being dropped from the mempool. But that’s a system that should be designed based on the secondary mempool, and a variety of other considerations. It’s very compelling that we should be able to diagnose and recover from wallet problems, and this is a possible one that might occur.
The secondary mempool
Someone went into the secondary mempool at Coingeek Seoul and covered it pretty comprehensively, but I’ll recap my understanding here. There was a case where double spending could happen: you could construct transactions so that, depending on node configuration, some nodes would add a transaction to their mempool and other nodes would reject it. This could be done intentionally, or even done accidentally. The secondary mempool is intended to include the dropped transactions, so that even a node that rejected a conflicting transaction has the information at hand to identify things like double spends.
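Going only by my recap above, the idea can be modelled as a toy like the following. This is not the node software’s design. The class and method names are hypothetical, and the only rejection rule modelled is a conflicting spend of the same outpoint.

```python
from typing import Dict, List, Set, Tuple

Outpoint = Tuple[str, int]  # (previous transaction id, output index)


class TwoTierMempool:
    def __init__(self) -> None:
        self.primary: Dict[str, List[Outpoint]] = {}
        self.secondary: Dict[str, List[Outpoint]] = {}
        self._spent_by: Dict[Outpoint, str] = {}

    def add(self, tx_id: str, inputs: List[Outpoint]) -> bool:
        """Accept into the primary pool, or retain a conflicting transaction."""
        if any(outpoint in self._spent_by for outpoint in inputs):
            # Rejected, but kept so the conflict can still be identified.
            self.secondary[tx_id] = list(inputs)
            return False
        self.primary[tx_id] = list(inputs)
        for outpoint in inputs:
            self._spent_by[outpoint] = tx_id
        return True

    def double_spends(self) -> Dict[Outpoint, Set[str]]:
        """Map each contested outpoint to the transactions trying to spend it."""
        contested: Dict[Outpoint, Set[str]] = {}
        for tx_id, inputs in self.secondary.items():
            for outpoint in inputs:
                if outpoint in self._spent_by:
                    spenders = contested.setdefault(outpoint, {self._spent_by[outpoint]})
                    spenders.add(tx_id)
        return contested
```

Without the secondary pool, the rejected transaction would simply be gone and the node would have no record that a competing spend was ever seen.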
The internals of ElectrumSV will change quite a bit as we modernise it for the new realistic model of Bitcoin usage, including non-addressable transactions and Paymail. For now, it is sufficient to just continue to do the same thing we currently do. It is problematic to try and project how ElectrumSV will change, and prepare for it.
It would be better if we could get closer to the final changes required, before the next release. The ideal situation for the database-based wallet is that we avoid doing releases where the data schema evolves on a regular basis. This would reduce the longer term maintenance requirements, and any resource obligations to support them.
Reorganisation needs are a neglected part of ElectrumSV, and we need to do a better job of factoring them in moving forward.