User-friendly wallet backup
The death of seed words, ensuring historical coin access and wallet backups moving into the future.
As always, I am not a Bitcoin expert. This document is the path forward for ElectrumSV based on my understanding of the subjects at hand. Consider it a draft proposal, soliciting feedback and additional ideas to help define a path to keep ElectrumSV on track to take part in Bitcoin SV’s continuing success as experienced through larger and larger blocks.
People have always been told that all they need to do to back up their wallet is write down their seed words. But how does this really work? If you are supposed to be able to take these seed words and restore them, that means that you should be able to go out there and find applications that will let you restore them. There are two ways this has worked historically: either the application processes the whole blockchain looking for transactions belonging to the wallet, or more commonly a user takes those seed words to ElectrumSV and attempts to restore them there.
ElectrumSV relies on third-party services provided by generous community members to restore those transactions. Those services have no future and as Bitcoin SV’s success increases and block sizes become larger and larger, they will disappear because the blocks will be too much for them to handle. How far away? Maybe six months, maybe longer, it is not possible to predict.
We need to work out what to do when these servers are gone. That is what this document is all about.
Examining seed words
When people say they support seed-based restoration what they generally mean is that you can take your seed words to ElectrumSV and restore your wallet there. But now that the blockchain is becoming so successful and seeing so much use that there is no way the indexing servers can continue to be run and provide ElectrumSV with the data it needs, we have to ask ourselves whether seed words have any purpose any more.
Anyone who claims seed-based restoration will always work for their users, both old and new, should be able to tell you how. My suspicion is that after they become aware of the indexer issue if they still promote seed-based restoration, they never really understood how it was supposed to work and are clinging to an appealing fantasy that was sold to them as fact.
The way seed-based wallet restoration works is that a wallet takes the seed words and turns them into a master key. From this master key it derives child keys that could have been used by your wallet, and uses the indexing services to try to find transactions on the blockchain that use these keys. This then gives you two very basic pieces of information:
- What transactions you have in the payment history.
- What unspent coins still remain in the wallet.
There is additional state in your wallet that does not get restored, like the description you might have given any transaction. Seed words have never restored this.
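The first step described above, turning seed words into a master key, can be sketched with nothing but the Python standard library. This is a minimal illustration that assumes the BIP39 and BIP32 conventions. Electrum-style seed words use a different salt, and deriving the child keys needs secp256k1 arithmetic that is left out here. The function names are my own, not ElectrumSV's.

```python
import hashlib
import hmac
import unicodedata
from typing import Tuple


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Stretch seed words into a 64-byte seed, following the BIP39 convention."""
    words = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", words, salt, 2048, dklen=64)


def bip32_master(seed: bytes) -> Tuple[bytes, bytes]:
    """Split HMAC-SHA512 of the seed into (master private key, chain code)."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return digest[:32], digest[32:]
```

Nothing in this step touches the blockchain. Everything after it, finding which derived keys were ever used, is the part that depends on an indexer.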
Direct blockchain restoration
One of the earliest forms of restoration, seen for instance in the open source BitcoinJ project, was direct blockchain restoration. What it would do was more or less examine every block in the blockchain looking for key usage. No-one does this any more. It was always slow and inferior to restoration through ElectrumSV, and as Bitcoin SV has become more and more successful, the idea anyone will be downloading the whole blockchain to look for their transactions is not realistic.
Restoring a wallet from seed words relies on the free-to-use indexing servers, which the operators maintain and make available at their own cost as a service to the community. These servers run an open source indexer named ElectrumX. ElectrumX was designed for 1 MB blocks, and we are now seeing 2 GB blocks. It struggles to keep up with anything above perhaps 200 MB, and it also has ever-increasing storage requirements to keep the restoration data from the blocks it processes.
Somewhere between six months and perhaps a year from now, ElectrumX will no longer be able to keep up with the blockchain. Bitcoin SV’s success will have rendered it outdated and technically insufficient to do what it used to.
In the past there has been a problematic rumour that you could take your seed words and load them into another wallet, and just continue using them there. This was never a good idea and often resulted in users who did not understand the repercussions corrupting their wallet. A better idea is to export your metadata from your current wallet (tax records or whatever) and send the funds to your new wallet and start afresh there. Ideally, if you are doing this right, you will move each coin independently rather than merging coins, which would lose privacy.
The future of seed words
There are several facets to this.
It is not acceptable to abandon all of ElectrumSV’s users who have been expecting their seed words to work, especially so because there are likely to be many users coming from both Bitcoin Core and Bitcoin Cash who have no other way to gain access to their transaction history let alone their unspent coins.
If seed words are only really useful because they are the way people expect to back up their wallets, and this no longer works because there are no indexing services, then keeping the concept of seed words for new accounts created into the future is just confusing and a negative user experience. They serve no purpose and only introduce confusion.
At this point, we know seed words were something created for a blockchain that had little to no usage, like Bitcoin Core or Bitcoin Cash. When you have a real world blockchain that is going to see more and more adoption by forward thinking businesses like Fyx who operate CryptoFights, WeatherSV recording valuable data as well as the many businesses still to come, it no longer works.
Historical coin access
The single most important goal of the ElectrumSV project has always been to keep the gate open for the people who have coins on Bitcoin Core and Bitcoin Cash, so that they can access those funds. This hints at the solution. We know that seed-based restoration has no future, but we can freeze the past. We can get businesses to run new limited indexers that only provide restoration data up to a certain blockchain height and not beyond. Anyone who restores a wallet using seed words may have to pay to use these limited indexers, to make running them viable for those businesses, but it fulfils the requirement of facilitating access to historical coins whenever their owners eventually arrive.
In order for this capped restoration height to make sense, we need a user-friendly wallet backup solution for users past that height. When ElectrumSV does a release that has this backup solution included, it will define the approximate capped restoration height.
Let me be very clear, user-friendly is not going above and beyond to do all sorts of complicated stuff so that users can take risks and hope that ElectrumSV can recover from it in the event of disaster. It is providing a simple and comprehensible model, so that users know what they have to do to ensure that their wallet is safely backed up, when they are taking risks, and what those risks are.
I think we’ve isolated two concrete and achievable requirements, so let’s move on to proposing how each of these will work. The goal is to propose viable solutions as a starting point, and hopefully people will point out flaws, additional requirements or even viable alternatives that meet the requirements.
Historical coin access
This will be a simple API, likely with paging, that allows access to pushdata within known standard script types up to the capped restoration block height. The wallet software would iterate through potential key uses in batches, and provide pushdata hashes of the public keys, public key hashes or script hashes to the restoration service. When it reaches a BIP32 gap it would stop, unless the user had for some reason overridden and extended the process to detect disconnected payments further out in the key derivation sequence.
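To make the batching and gap handling concrete, here is a rough sketch of the wallet-side loop. It is not a proposal for the actual API. `derive_pushdata` and `lookup_batch` are hypothetical callables standing in for key derivation and the restoration service, and the use of SHA-256 for the pushdata hash, the gap limit of 20 and the batch size are all assumptions.

```python
import hashlib
from typing import Callable, List, Set


def scan_for_key_usage(
    derive_pushdata: Callable[[int], bytes],
    lookup_batch: Callable[[List[bytes]], Set[bytes]],
    gap_limit: int = 20,
    batch_size: int = 10,
) -> List[int]:
    """Return the derivation indexes the service reports as used.

    Scanning stops once `gap_limit` consecutive indexes are unused.
    """
    used: List[int] = []
    index = 0
    unused_run = 0
    while unused_run < gap_limit:
        indexes = list(range(index, index + batch_size))
        # Only hashes are sent to the service, never the pushdata itself.
        hashes = [hashlib.sha256(derive_pushdata(i)).digest() for i in indexes]
        matches = lookup_batch(hashes)
        for i, pushdata_hash in zip(indexes, hashes):
            if pushdata_hash in matches:
                used.append(i)
                unused_run = 0
            else:
                unused_run += 1
                if unused_run >= gap_limit:
                    break  # the gap has been reached, ignore the rest
        index += batch_size
    return used
```

A user who wants to look for disconnected payments further out in the sequence would simply run the same loop with a larger `gap_limit`.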
Restoration would only make sense for restoring a new account in a wallet. No incremental updates to restored data would be provided.
The current ElectrumX API is provided using a protocol called JSON-RPC, and it makes a lot of sense to switch to REST and if necessary web sockets.
This service will require businesses to operate servers and to store data. Ensuring that they are compensated for doing so should encourage those businesses, and ideally competition in providing the data. There should be some mechanism for compensation, and it will have to be something like PayPal where the user is not required to have Bitcoin SV. Remember that we need to support users coming from Bitcoin Core and Bitcoin Cash, and they are looking to gain access to their Bitcoin SV.
There’s no reason that Bitcoin SV couldn’t be accepted, and this could be automated as part of the restoration process or the page with the PayPal payment process could also offer BIP270-based payment. Ideally a user should be able to pay one of these invoices with whatever other wallet they are using, whether Handcash, RelayX or MoneyButton. Let’s not add every web wallet’s payment sliders, I’m not sure that’s the way forward.
The simplest way to come up with a reasonably safe way for ElectrumSV users to be in control of their own backups is to start from the top down. The user interface should have one place where the user can see the backup state of their entire wallet, with levels of danger displayed.
The user interface might indicate any number of different issues:
- The wallet backup has not been configured (critical).
- The current data is not fully written to backup storage (often danger or sometimes warning).
- The wallet backup is as far as ElectrumSV knows fully backed up (normal).
Additionally, if the user attempts to close the wallet and the current data is not fully written to backup storage, they would have to confirm the action.
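The states above map naturally onto a small status function that the user interface can poll. This is only a sketch. The text does not say what separates "danger" from "warning", so the `pending_affects_funds` flag is an assumption of mine, as are all the names.

```python
from enum import Enum


class BackupStatus(Enum):
    CRITICAL = "backup has not been configured"
    DANGER = "changes affecting funds are not yet backed up"
    WARNING = "changes are not yet backed up"
    NORMAL = "fully backed up as far as the wallet knows"


def backup_status(
    configured: bool, pending_writes: int, pending_affects_funds: bool
) -> BackupStatus:
    """Work out the single wallet-wide backup state to display."""
    if not configured:
        return BackupStatus.CRITICAL
    if pending_writes > 0:
        # Assumption: unwritten changes touching keys or coins are worse
        # than unwritten descriptions or labels.
        if pending_affects_funds:
            return BackupStatus.DANGER
        return BackupStatus.WARNING
    return BackupStatus.NORMAL


def must_confirm_close(status: BackupStatus) -> bool:
    """Closing needs confirmation while data is not fully written to backup."""
    return status in (BackupStatus.DANGER, BackupStatus.WARNING)
```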
At this point the user is fully in control of the responsibility for backing up their wallet. ElectrumSV does its best to provide context on why they need to back up, helps them set up backup options, and tracks what data is or is not backed up. The goal is that we make our best effort to help the user understand what they need to do, and to help them do it.
If they wish to disable this backup system and make copies of their wallet database onto different devices manually after they close ElectrumSV, they should be free to do that too.
If we are going to tell the user that their data is stored and completely backed up, we need to know that this is actually the case. That rules out Dropbox, OneDrive, iCloud and any similar products. Just because we put files in the synchronised folder does not mean we have any idea when and if those files have been replicated to the cloud. If we indicate to the user that their data is backed up, we need to know that the process is complete and that the data is remotely stored. Consider the case where the user is closing their wallet, and they opt to wait for the backup process to complete. If they close it and then shut down their computer and cloud storage hasn’t yet replicated the local files, they in effect have no backup for that data yet.
Better to have a remote API we write data to, where we know that successful response to writing a chunk of data means that it and all preceding chunks are in the custody of the storage provider.
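A sketch of what that acknowledgement contract could look like from the wallet's side follows. `InMemoryProvider` is a stand-in I made up so the example runs. A real provider would sit behind a network API whose shape is not yet defined.

```python
from typing import List


class InMemoryProvider:
    """Hypothetical storage provider that only accepts chunks in order."""

    def __init__(self) -> None:
        self.chunks: List[bytes] = []

    def write_chunk(self, sequence: int, data: bytes) -> int:
        # Refusing gaps is what lets a success for chunk N imply that
        # chunks 0..N-1 are also held.
        if sequence != len(self.chunks):
            raise ValueError("chunk out of sequence")
        self.chunks.append(data)
        return sequence


class BackupWriter:
    """Queues wallet changes and only forgets them once acknowledged."""

    def __init__(self, provider: InMemoryProvider) -> None:
        self.provider = provider
        self.pending: List[bytes] = []
        self.next_sequence = 0

    def queue(self, data: bytes) -> None:
        self.pending.append(data)

    def flush(self) -> None:
        while self.pending:
            acked = self.provider.write_chunk(self.next_sequence, self.pending[0])
            if acked != self.next_sequence:
                raise RuntimeError("unexpected acknowledgement")
            self.pending.pop(0)
            self.next_sequence += 1

    @property
    def fully_backed_up(self) -> bool:
        return not self.pending
```

The point of the design is that `fully_backed_up` is driven by the provider's responses and not by the fact that a file was written somewhere locally.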
The storage API should be an open protocol. Anyone that creates a service that exposes this API, should be able to have that API endpoint registered with an access key in their wallet, or in the wallets of whatever users can be convinced to use their service.
ElectrumSV would only bundle APIs of known businesses that have a track record, and not any API endpoint someone wants included. We would warn about the dangers of using an unknown storage provider but allow the user to add it and make use of it. It might be their own backup service they are running, or it might not.
Backing up the wallet to multiple providers would mean that we cannot know for sure which provider holds the correct state, so the plan is to disallow this. Instead, it is expected that there will be one provider and one backup channel on that provider for a given wallet file.
When a user reserves cloud storage with a service like Azure or Amazon S3, they can opt for a level of replication in their data storage. A storage provider can similarly offer a level of replication, although it is unlikely the user can verify that their data is actually stored with that level of replication. This suggests that if a user takes their wallet backup seriously, a less well-known provider is less attractive than a well-known provider with a good reputation.
Most users likely only have one copy of their wallet and it sits on their computer. Occasionally they start ElectrumSV up, and do something with it. They do not have multiple copies on different computers, especially not if they are giving their transactions descriptions and expect to have the payment history with all those descriptions in place.
We cannot guarantee correct restoration if a user has multiple copies of their wallet and uses any of them without concern for whether they are synchronised and whether the contents of the copy they opened are the latest correct state. Is there anything in the more up-to-date copy that is not in the older copy they have opened? What if it is something consumable like spent coins and they go to double spend them? What if there were received coins in the latest version that are not known to the older version they opened?
This then suggests the best approach might be an inverted model where it is expected that the backup channel with the storage provider is the authoritative state. If a wallet is opened and cannot connect to the backup channel to confirm it is the latest version, then it is considered to be in “dangerous mode” and the user has to accept full responsibility for data loss. We would want to do something on the level of colouring the UI red and have some large “POSSIBLE LOSS OF DATA AND MONEY” warning visible until the situation is rectified.
The wallet would be expected to have an active connection and be continually backing up the changes, and in the event it cannot back up, the user is in that dangerous territory. Then they are made fully aware of the consequences of not backing up any changes they make while in disconnected mode.
We can further shore up this model by having session state included in the backup channel. If one copy of a wallet connects to the backup channel, establishes that it is synchronised and proceeds to write changes made with any actions the user performs, but loses connection, then the session is not closed. With session metadata, a second copy of the wallet can detect if there is another copy of the wallet that started backing up and never closed out its session. However, having multiple copies of the wallet should be completely unsupported and the session state should be used to enforce this. We can never merge database state from multiple copies because the backed up state is representative of only one copy of those databases and is a sequence of iterative changes from that database we do not have to understand but merely replay to restore.
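The session idea can be illustrated with a toy channel that remembers which session, if any, was left open. All of the names here are hypothetical, and a real channel would keep this state with the storage provider and not in memory.

```python
import uuid
from typing import Optional


class SessionConflict(Exception):
    """Another copy of the wallet opened a session and never closed it."""


class BackupChannel:
    def __init__(self) -> None:
        self.active_session: Optional[str] = None

    def begin(self) -> str:
        # A session left open by a lost connection looks exactly like a
        # second copy of the wallet still being in use, so refuse either way.
        if self.active_session is not None:
            raise SessionConflict(self.active_session)
        self.active_session = uuid.uuid4().hex
        return self.active_session

    def end(self, session_id: str) -> None:
        if session_id != self.active_session:
            raise SessionConflict("not the active session")
        self.active_session = None
```

What the wallet does when it sees the conflict (refuse to open, or drop into "dangerous mode" with the warnings described earlier) is a user interface decision this sketch does not try to answer.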
The model has to be that there is one instance of the wallet, that it is authoritative on the wallet state, and that this is enforced by a requirement that it be compatible with the channel. We should make every effort we can to enforce this, and we can get pretty close to knowing for sure. One way to do this might be to bake the path and file name of the wallet database into the wallet database itself. Then if the wallet is opened and its storage has moved, the user has to view the “user-friendly” user interface that steps them through the possible consequences.
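Baking the location into the database is cheap to do with SQLite. The following is a sketch only. The `WalletMeta` table and both function names are invented for illustration and are not part of the ElectrumSV schema.

```python
import os
import sqlite3


def _ensure_meta_table(db: sqlite3.Connection) -> None:
    db.execute(
        "CREATE TABLE IF NOT EXISTS WalletMeta (key TEXT PRIMARY KEY, value TEXT)"
    )


def record_location(db: sqlite3.Connection, path: str) -> None:
    """Store the absolute path the wallet database was last opened from."""
    _ensure_meta_table(db)
    db.execute(
        "INSERT OR REPLACE INTO WalletMeta (key, value) VALUES ('path', ?)",
        (os.path.abspath(path),),
    )
    db.commit()


def has_moved(db: sqlite3.Connection, path: str) -> bool:
    """True if the recorded path differs from `path`, or was never recorded."""
    _ensure_meta_table(db)
    row = db.execute("SELECT value FROM WalletMeta WHERE key = 'path'").fetchone()
    return row is None or row[0] != os.path.abspath(path)
```

A copied or moved file would fail this check on open. It cannot catch a copy that is restored to the same path on another machine, which is one reason the session state is still needed.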
The wallet data should be encrypted on the provider. The provider should not know what is stored, nor have any ability to differentiate it from any other stored data. Besides the fact that the provider is storing data with a given amount of replication, and for some period of time, the provider does not need to know nor care about the data.
One might ask if this is best done with SPV channels and perhaps some modified or extended API.
We’re talking about a user-friendly backup. The whole wallet should be backed up. Every single description the user put on a transaction. Any contact or contact identity they added to the wallet. Their history of payments with any metadata. Account structure. Seed words, master keys or derivations for accounts or sub-accounts. Credentials and account information for services used by the account. Whitepaper 42-based derivation sequences between either the wallet and a service, the wallet and a contact or who knows what else. Changes to any of the above over time. Everything in the wallet, and anything that ever gets added to it needs to be in the incremental backup.
The Bitcoin Core and Bitcoin Cash blockchains had infrequent payments and very few ways to use the blockchain, Bitcoin SV removed all the roadblocks and the ways a wallet may need to help the user use the blockchain probably increase daily. If the wallet supports something, and has allowed the user to do it, it should back it up.
So how do we make sure that the wallet contents are backed up, and how do we restore it? The simplest answer to this for ElectrumSV is to store the database writes. We do not in any way need to be compatible with any other wallets, that is not scalable and will only cause problems. We should be explicitly incompatible with other wallets.
ElectrumSV stores the data within a wallet in a database, using a specific database technology called SQLite. Because of the way SQLite works, ElectrumSV writes to the database sequentially in a writer thread. If a wallet backup is considered to be the ordered sequence of these writes, then what replaying them should give is the state of the wallet at the time of the last backup.
Backing up a new wallet may just be the sequence of SQL statements with accompanying statement data. But backing up an existing wallet may be a dump of the wallet database as the first entry, and the SQL statement entries after that. Statements are already batched into work items that serve as a database transaction, like the importation of a Bitcoin transaction. But it is important to ensure that these work items are standalone and that if any ends up as the last entry in the restoration log, it leaves the wallet in a usable state with no loss of integrity.
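Here is a minimal sketch of a write log and its replay, to show how little the restore side needs to understand. It assumes each work item is a list of SQL statements with parameters, serialised as JSON. That format is my assumption, and JSON cannot carry binary parameters without extra encoding. The table in the usage example is made up.

```python
import json
import sqlite3
from typing import Iterable, List, Sequence, Tuple

Statement = Tuple[str, Sequence]
WorkItem = List[Statement]


def encode_work_item(item: WorkItem) -> bytes:
    """Serialise one work item as a single backup log entry."""
    return json.dumps(item).encode("utf-8")


def apply_work_item(db: sqlite3.Connection, item: WorkItem) -> None:
    # One database transaction per work item: either every statement in
    # it is applied or none are, so any entry can safely be the last.
    with db:
        for sql, params in item:
            db.execute(sql, params)


def restore(log: Iterable[bytes]) -> sqlite3.Connection:
    """Rebuild a wallet database by replaying log entries in order."""
    db = sqlite3.connect(":memory:")
    for entry in log:
        apply_work_item(db, [(sql, params) for sql, params in json.loads(entry)])
    return db
```

As a usage example, replaying three entries gives back the final description without the restore code knowing anything about descriptions:

```python
items = [
    [("CREATE TABLE Transactions (tx_hash TEXT PRIMARY KEY, description TEXT)", [])],
    [("INSERT INTO Transactions VALUES (?, ?)", ["aa", "rent"])],
    [("UPDATE Transactions SET description = ? WHERE tx_hash = ?", ["rent, March", "aa"])],
]
db = restore(encode_work_item(item) for item in items)
```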
Post-restoration it makes a lot of sense to do processing on various parts of the wallet state. This would include incoming expected payments, which might have expired. It would include BIP270 payment requests, which also might have expired. And a range of other things we have to identify.
It is probably worth having integrity checks on the stored data periodically. It might also be possible to combine this with stored data reorganisation. This process could be taking the stored data, restoring it into a complete database, and then comparing it to the existing wallet database for key data points. All the stored data could then be replaced with a new consolidated database as a checkpoint to do further incremental writes beyond.
Doing an integrity check that compares the current database to the existing restoration state requires freezing the wallet database, either for as long as the process is ongoing or just long enough to take a snapshot that would be expected to be aligned with the restoration state.
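One simple way to compare a restored database against a snapshot of the live one is to reduce each to a digest of its contents. This is a sketch under my own assumptions. It hashes every table, where the text only calls for comparing key data points, and it ignores schema differences such as indexes.

```python
import hashlib
import sqlite3


def database_digest(db: sqlite3.Connection) -> str:
    """Hash every row of every table, independent of insertion order."""
    digest = hashlib.sha256()
    tables = [
        row[0]
        for row in db.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for table in tables:
        digest.update(table.encode("utf-8"))
        # Sort the textual form of each row so that two databases with the
        # same contents hash the same whatever order rows were written in.
        rows = sorted(repr(row) for row in db.execute(f'SELECT * FROM "{table}"'))
        for row in rows:
            digest.update(row.encode("utf-8"))
    return digest.hexdigest()


def matches_backup(snapshot: sqlite3.Connection, restored: sqlite3.Connection) -> bool:
    return database_digest(snapshot) == database_digest(restored)
```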
If a user restores a new copy of the wallet from scratch from a backup channel, then to start the restoration process it needs to be known that it will be taking ownership of the channel. This will forever prevent the original copy from being used. It must warn the user that if they operated in “dangerous mode” in the original copy, correct restoration of data cannot be guaranteed.
The process beyond claiming the backup channel, should be quite straightforward. It would just create a new wallet database and open it.
Storage provider access
When a user has opened ElectrumSV and they select the option described along the lines of “Restore from service provider”, we need some bootstrapping process. The user has to be able to point the application at the correct provider and to be able to decrypt their wallet contents.
This implies there are two levels of credentials needed by the user. They need to have credentials that allow them access to the service provider, and they need to have credentials that allow them to decrypt and access the backup channel for the desired wallet. The best way to do this is at this time unclear, but it can obviously be done.
Sometimes a user may be offline and backup may not be possible. Let’s say that they delegate some coins from their wallet to a mobile wallet which then spends some in payments. But the wallet ran out of mobile data and these were payments they gave to a merchant who may broadcast in their own due course, and the user dropped the mobile device off a wharf or in front of a steam roller. There is a value in baking worst case scenario recovery information into the payment transactions the user makes.
This is easy enough to do with a service that monitors for flagged transactions and posts them to a user’s SPV channel. And such a service is another variation of what needs to be done to collect the capped restoration data that might also be hosted. But it also requires that when a user sends or receives a transaction, they have the ability to add arbitrary tagging outputs to it. This is not currently the case. Transactions are provided as-is, with no protocol for negotiating how they are constructed.
Additionally for the scope to be knowable, the resources exposed to disaster must be known and segregated beforehand.
If a service wants to provide backups for free, and you want to rely on them to store your data, then by all means use them. But good luck relying on their service to be there for your recovery if they have no incentive to ensure it is.
However, there should be a generic payment protocol for all the things that the wallet wants to do that involve interaction with a service on the user’s behalf. The user may want the service to be paid automatically so that they do not find themselves in the situation that the backup stops working and they need to deal with it.
I think this document provides a good overview of why seed words were a fantasy that only seemed to work for blockchains with little usage, small blocks and not very much data to index. I think it also provides a good overview of how user-friendly wallet backup is possible, and it highlights the most obvious problems even if it does not attempt to provide solutions.
Seed words rely on an indexer. The indexers are going away because no-one will be storing and indexing the data when ElectrumX chokes on the larger blocks. Without the indexers, what good are seed words for the free magic wallet restoration people were told they provide? None, they are worthless. Even if there are new indexers, they rely on a sequence of keys that can be identified and that every single piece of on-chain data is tagged with those keys, and doing that correctly if at all is probably related to the halting problem.
The key to user-friendly wallet backup is:
- Having a good intuitive user interface that keeps the user informed, so that they can take responsibility for their own backups.
- Having reliable services that have an incentive to keep the user’s backups.
- Having a way to trigger a recovery that both accesses the latest restoration data and allows the user to decrypt and restore it.
- Having a way to recover worst case scenario data losses.
At this point, we are in the same situation as we have been with similar designs that ElectrumSV has published. Hey look at a solution, it relies on someone to come along and provide services. But no-one ever does, and to be fair it is reasonable to expect that skilled developers have their own exciting ideas to work on.
This is a critical problem for ElectrumSV. We need to move beyond seed-based restoration before the indexers die. We cannot say, hey this is our need and this is how we want it solved, and wait for someone to provide a solution and then to hope it is a viable one. We need to take action. This is one of ElectrumSV’s highest priorities and we will be actively working on it in the coming weeks and months.
Opting out of backups
If a user opts out of backups we would just treat them as if they are in “dangerous mode”. It would be entirely on them to understand their wallet is not backed up, and they are responsible for only operating one copy. There’s only so much we can do.
Commercial wallets are in a better position because they can require a connection from their wallet application, and only allow its use when connected. However, they are inherently custodial whether they can spend the funds without the user’s interaction or not, because if they go down the user has no access to their funds. Is this the canonical or legal definition of custodial? I have no idea, but it is something to think about.
Thanks to AustEcon [SV] for the invaluable feedback that helped clarify this article.