ElectrumSV, Python and static typing

One of the things my fellow ElectrumSV developer AustEcon and I did a long time ago was hook up the mypy static type checker to our CI (continuous integration) process so that it gets run every time we push a change to our source code repository. If the type checker fails, the build is marked as failing. This then adds a strong incentive to run the type checker locally before pushing a commit.


TL;DR: All our code (with a narrow set of exclusions) has been updated to use strict type checking. Some bugs were found, but nothing anyone really cares about, such as aspects of the DigitalBitbox hardware wallet message signing support.

We started out with minimal, lax type checking. This did not check our unit tests, our GUI code, or a few other select files. The reason for this is that ElectrumSV is a large and relatively old code base, and it had a lot of messy, complicated, hard-to-read code, especially in the area of hardware wallet support.

We are at the tail end of a large scale refactoring of the code base, and before we can release that as a new version of ElectrumSV, we need to address all the loose ends and then test everything the application does. Any functionality the user may run has to be solidly tested. And as we find problems in the testing process, related parts of the testing may need to be done over and over to verify that any fixed problems are actually fixed and do not break anything else. Adding strict type checking before the manual testing means the manual testing also verifies that the type checking changes did not break anything, and removes the need to do time-consuming manual testing work more than once-ish.

With all that messy hard-to-read code, it is impossible to ever know for sure if there are problems in there. The best you can do is manually test it and cross your fingers. But if you add static typing to it all and use the strict mode, in order to do so you have to put in the work to specify what is being used by what, and the type checker will prove it is a lot more correct than it was.

There were probably 4000 strict typing errors that needed to be fixed as part of this work. Adding more typing to fix one error sometimes fixed other errors along with it, and sometimes revealed previously hidden errors elsewhere. There’s probably a ton of remaining strict typing errors in our GUI code and unit tests, but they’ll have to wait for another day.

If your editor makes use of the static typing information you are adding, then it can jump to whatever code is referenced by it. It now knows that the mystery object formerly known as keystore is the Ledger hardware’s subclass of the Hardware_KeyStore base class from the wider ElectrumSV code base.
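As a rough illustration of what an annotation gives the editor, here is a minimal sketch. Only the Hardware_KeyStore name comes from the text above. The Ledger_KeyStore name, the methods and the two helper functions are hypothetical stand-ins, not the actual ElectrumSV code.

```python
class Hardware_KeyStore:
    """Stand-in for the base class named above; the real one is far larger."""

    def get_device_label(self) -> str:
        return "generic hardware device"


class Ledger_KeyStore(Hardware_KeyStore):
    """Hypothetical subclass name and methods, for illustration only."""

    def get_device_label(self) -> str:
        return "Ledger"

    def get_firmware_hint(self) -> str:
        return "firmware details would come from the device"


def describe_untyped(keystore):
    # No annotation: the editor cannot tell what "keystore" is, so it cannot
    # offer completion or jump to the definition of get_device_label.
    return keystore.get_device_label()


def describe_typed(keystore: Ledger_KeyStore) -> str:
    # With the annotation, the editor can resolve both method calls to the
    # classes defined above, and a type checker can verify that they exist.
    return f"{keystore.get_device_label()}: {keystore.get_firmware_hint()}"
```

Both functions behave the same at runtime. The difference is only in what the editor and the type checker can work out before the code runs.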

Being type aware makes the editor a much more useful tool to the programmer when it is editing files that have this comprehensive typing support. If you can’t tell what type a variable is, let alone where it comes from, and your editor can’t tell you, then you’re in for a world of pain. If you want to experience this for yourself, I suggest you look at the hardware wallet code in our releases-1.3 branch, which predates the addition of static typing.

The editor I use is VSCode. VSCode uses Microsoft’s type checker, pylance or pyright or something. I don’t know, but it does not match the results from mypy, even if it is faster. This is not useful. Any errors shown in the source code that originate from type checking have to be ignored, and become noise, because you need your editor to apply the same rules as your code actually uses.

Out of all the errors that the strict type checking detected, it turned out only four or five were actual bugs, and they were in existing older code from before the refactoring effort. One example is our least used (if at all) hardware wallet type, DigitalBitbox: when signing messages, it passed bytes objects to functions that expected a string and then tried to convert that supposed string to a bytes object (which is a bug).
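The shape of that kind of bug can be sketched as follows. This is not the DigitalBitbox code. The function names are hypothetical and the "signing" step is reduced to the string-to-bytes conversion that goes wrong.

```python
def message_to_bytes(message: str) -> bytes:
    # Expects text and encodes it, as a signing helper might do.
    return message.encode("utf-8")


def prepare_message_buggy(message: bytes) -> bytes:
    # The caller already holds bytes but passes them to a function that
    # expects str. A type checker rejects this call as an incompatible
    # argument type. At runtime it raises AttributeError, because bytes
    # objects have no encode() method.
    return message_to_bytes(message)


def prepare_message_fixed(message: bytes) -> bytes:
    # The data is already bytes, so there is nothing left to encode.
    return message
```

Without annotations the buggy version looks perfectly reasonable, and it only fails when that code path is actually exercised, which for a rarely used device may be never.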

It did detect a couple of problems in preparing information for signing transactions for the hardware wallets. But this would have been detected in the manual testing that I will put every hardware wallet device through before release.

To recap, the biggest wins are:

  • Our text editors can understand the code now.
  • We can actually understand the code now, whereas before some of it was incomprehensible soup.
  • Our code can be better checked for correctness giving us another way to detect inadvertent bugs.

Type tainting

One of the things that pushed me to commit to strict type checking was that with the lax type checking, we were getting tainting with variables of unknown type. The unknown state seemed to propagate with the data and prevent the code that used the data from being type checked. This then meant I had no confidence the type checking was giving us the benefits it could, and it was either go as close to full static typing as strict checking could get us or.. well, what other option was there?
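A minimal sketch of how this tainting can happen, using hypothetical names. With lax checking, mypy treats the result of an unannotated function as being of unknown type (Any), and anything of that type is accepted wherever it is passed.

```python
from typing import Dict

_SETTINGS: Dict[str, str] = {"port": "8080"}


def load_setting(name):
    # No annotations, so the type checker treats the return value as Any.
    return _SETTINGS[name]


def next_port(port: int) -> int:
    return port + 1


def broken_caller() -> int:
    value = load_setting("port")  # unknown type as far as the checker knows
    # Accepted under lax checking, because Any is compatible with int.
    # At runtime this raises TypeError: the value is the string "8080".
    return next_port(value)
```

Strict mode complains about the unannotated definition and about calling it from typed code, which forces load_setting to be given a real return type. Once it has one, the bad call in broken_caller becomes visible to the checker.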

Untyped or incorrectly typed dependencies

Not every project is going to add static typing support. And those that do, may not have complete or correct coverage.

One example of an incorrectly typed dependency is PyQt5, which we use for our user interface. It has something called signals, which callbacks can be registered with, but the typing is wrong. So we need to cast those to the correct type. The amount of UI code affected in the hardware wallets is minimal, but if we were to start type checking our actual wallet UI code, it would be littered with these signals and require a lot of these casting workarounds.
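The casting workaround looks roughly like the sketch below. To keep it self-contained it does not import PyQt5. BoundSignal and Widget are hypothetical stand-ins for a dependency that declares an attribute with the wrong type, and only typing.cast is the real tool being shown.

```python
from typing import Callable, List, cast


class BoundSignal:
    """Stand-in for what a signal really is at runtime."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[str], None]] = []

    def connect(self, callback: Callable[[str], None]) -> None:
        self._callbacks.append(callback)

    def emit(self, value: str) -> None:
        for callback in self._callbacks:
            callback(value)


class Widget:
    """Stand-in for a dependency whose declared type is wrong."""

    def __init__(self) -> None:
        # Declared as plain "object", although it is really a BoundSignal.
        self.text_changed: object = BoundSignal()


received: List[str] = []
widget = Widget()

# Calling widget.text_changed.connect(...) directly would be rejected by the
# type checker, because "object" has no attribute "connect". The cast tells
# the checker what the value really is. It does nothing at runtime.
signal = cast(BoundSignal, widget.text_changed)
signal.connect(received.append)
signal.emit("hello")
```

The downside is that a cast is a promise the checker cannot verify, so every one of these is a small hole in the type checking.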

A problematic dependency is the attrs module. We use the bitcoinx Python library from Neil, which uses attrs but does not itself have typing information. This then confuses the type checkers and requires adding comments that disable type checking on the relevant lines of code.

If the base type does not have typing, then you’re probably seeing a source of type tainting data. It remains to be seen how effective strict typing is at detecting these, though it has detected some or perhaps all of them.

Python’s static typing

The benefits of static typing are clear. In my opinion a language that does not have it gets in the way of the programmer, and for that matter, the project. Python’s support seems adequate but problematic. It’s not as good as real static typing, but it gets a lot of the way to the same benefits.

One thing you might be able to see from this article is that, while it has given us solid benefits that made the work worthwhile, it is hard to know what is not correctly typed and does not trigger errors in the checker. This is a large flaw in the Python static typing approach, and it is hard to work out how to solve it, besides seeing if I can make VSCode use mypy instead of whatever it uses now and then trying to use the UI hints for insight. But this is not a real solution. A real solution would be to have mypy highlight these things, and I am not sure how it can.

ElectrumSV developer