On token contracts and balances

Jul 3, 2023

If this quote from the German mathematician Leopold Kronecker doesn’t put you off reading this article, then I don’t know what will. Consider yourself warned.

“Natural numbers were created by God, everything else is the work of human beings.”

A few days ago I posted on LinkedIn about an ERC20 token I deployed, called DETS, a hundred of which were airdropped into every single Ethereum address.

Farhan Khan declared in a comment that it was a trick, or a spoofed token, and Denis PΞtrovcic flagged the fact that Etherscan reports the token as only having eight holders. We had some good debates about this in the comments, and this article is a longer continuation of those debates.

I happen to disagree that it’s a spoof. And although it is a trick, it is no more of a trick than the standard way of implementing Ethereum ERC20 tokens, which ultimately makes it not a trick at all.

What’s more, the debate opens up the opportunity to explore the difference in computing between:

how something is implemented,
how we interpret it, and
what it means to us.

Those last two may seem identical at first, but trust me — they’re not.

Before we dig into smart contract code, I am going to go through a simple example to set the groundwork, using a byte.

Implementation

A byte in computing is the name we give a sequence of eight bits. And a bit is a value that can either be 0 or 1.

In our computer hardware, one of the ways bits are stored is in electronic components using billions of tiny capacitors, which you can think of as microscopic cups holding electrons. If one of those cups is “full” of electrons, the bit it represents is 1, and if it’s “empty” the cup represents a 0. Eight of those cups in a row represent a byte.

Through the miracle of modern microelectronics, you can look at the contents of the cups, and you can change the contents. These operations are known as “reading” and “writing”.

Here is an example of a byte, using ● to indicate that a cup is full, and ○ to indicate that it is empty: ●○○○○○●●.

It gets a bit awkward to use black and white circles, so from now on I’ll use 1 to indicate a full capacitor, and 0 to represent an empty one.

And so, finally, here is our byte represented with ones and zeroes:

10000011

As software developers we don’t care about capacitors and electrons. They’re for the hardware engineers.

Interpretation

How do we interpret it? One interpretation could be that it is a representation of a number in binary notation. If we use what is called “most significant bit” or MSB ordering, then it is:

(1*2⁷) + (0*2⁶) + (0*2⁵) + (0*2⁴) + (0*2³) + (0*2²) + (1*2¹) + (1*2⁰)

The first factor in each product is a bit, and you can see that, read left to right, they correspond to the bits of our original example byte. You can calculate that value, or just copy/paste it into Google Search, and get the answer: 131.

Or is it? There is another way of interpreting the byte, called “least significant bit”, or LSB ordering, in which the first bit is considered to have the lowest value, and the last bit has the most:

(1*2⁰) + (0*2¹) + (0*2²) + (0*2³) + (0*2⁴) + (0*2⁵) + (1*2⁶) + (1*2⁷)

When you check the value of this, it’s 193.
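Both sums can be checked in a few lines of Python. This is only a sketch of the arithmetic above; the variable names are mine, chosen for illustration.

```python
bits = "10000011"  # the example byte, written left to right

# MSB ordering: the leftmost bit carries the highest weight (2**7).
msb_value = sum(int(b) * 2 ** (7 - i) for i, b in enumerate(bits))

# LSB ordering: the leftmost bit carries the lowest weight (2**0).
lsb_value = sum(int(b) * 2 ** i for i, b in enumerate(bits))

print(msb_value)  # 131
print(lsb_value)  # 193
```

Same eight bits, two different numbers, depending only on which end we decide to start counting from.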

A third way of interpreting the byte is that it represents a positive or negative number, and that the first bit represents the sign: 0 if it’s a positive number, and 1 if it’s negative. This gives us one bit for the sign of the number, and seven more for the value of the number, and the byte can represent any whole number from -127 to +127.

To work out what the value of the byte is in this case, if we are using MSB ordering, we use:

(1 * -1) * ((0*2⁶) + (0*2⁵) + (0*2⁴) + (0*2³) + (0*2²) + (1*2¹) + (1*2⁰))

Now the byte is interpreted as representing -3. Oh, and if we use LSB ordering, keeping the first bit to represent the sign, it’s -96. Or perhaps -65 if we use the last bit for the sign.

One byte, five different values: 131, 193, -3, -96, or -65.
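The sign-plus-seven-bits reading with MSB ordering can be sketched in Python as follows. The function name `sign_magnitude_msb` is hypothetical, and the code only covers this one interpretation.

```python
def sign_magnitude_msb(bits: str) -> int:
    """Read an 8-bit string as a sign bit followed by a 7-bit magnitude (MSB ordering)."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of eight 0/1 characters")
    sign = -1 if bits[0] == "1" else 1   # first bit: 1 means negative
    magnitude = int(bits[1:], 2)         # remaining seven bits, highest weight first
    return sign * magnitude

print(sign_magnitude_msb("10000011"))  # -3
print(sign_magnitude_msb("01111111"))  # 127, the largest value
print(sign_magnitude_msb("11111111"))  # -127, the smallest value
```

The last two calls show where the range of -127 to +127 comes from: seven bits of magnitude can hold at most 127.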

None of these interpretations is objectively correct, incorrect, or indeed better or worse than the others. The important thing is that we use the same interpretation throughout, and take the interpretation into account when we add logic for handling arithmetic. If we don’t, we’re going to get wrong answers when we use our bytes to perform additions and subtractions.

As an aside, the only time a developer usually cares about MSB versus LSB is when they are obtaining raw data from one system that uses one interpretation, and are working with that raw data on another system that uses the other interpretation. In that circumstance, we have to convert between them, but the rest of the time this is all taking place under the hood, and we don’t really think about it.
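In day-to-day work this kind of conversion tends to show up as byte order across a multi-byte value (big-endian versus little-endian) rather than bit order inside a single byte. As an analogous sketch, using Python's built-in `int.from_bytes` and `int.to_bytes`, with a two-byte payload made up purely for illustration:

```python
raw = bytes([0b00000000, 0b10000011])  # two bytes as received from another system

# Same raw data, two readings, depending on which byte is taken as most significant.
as_big = int.from_bytes(raw, byteorder="big")        # first byte is most significant
as_little = int.from_bytes(raw, byteorder="little")  # first byte is least significant

print(as_big)     # 131
print(as_little)  # 33536

# Converting: re-encode the value in the byte order the local system expects.
converted = as_big.to_bytes(2, byteorder="little")
print(int.from_bytes(converted, byteorder="little"))  # 131
```

Once the conversion is done at the boundary, the rest of the program can forget about ordering again.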

As software developers we don’t care about the implementation, unless we’re writing compilers or translators. We just want the implementation to be consistent and correct so we get the right answer.

Meaning

We have a byte, and we have an interpretation that handles how the byte is used under the hood to perform arithmetic. Now for the big question — what does it mean?

That is entirely up to us, depending on what the use case for the byte is.

In one program I may use a byte of stored data to count how many times someone has clicked on a button.

In another program, let’s say it’s a smart contract on a truncated version of Ethereum that can’t handle very large numbers well, I may use a byte to represent how many DETS tokens someone has.

Let’s go with the second case. As long as what is happening under the hood is consistent (and preferably efficient) I would argue that it doesn’t matter how the implementation works. Provided adding 3 DETS and 4 DETS gives 7 DETS, and so on, anyone using the system is not going to care how the additive function is implemented or how the numbers are stored.
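To make that concrete, here is a small Python sketch of a one-byte DETS balance stored under two different bit orderings. All the function names are hypothetical; the point is only that 3 DETS plus 4 DETS comes out as 7 DETS either way, even though the stored bits differ.

```python
def encode_msb(n: int) -> str:
    return format(n, "08b")          # e.g. 3 -> "00000011"

def decode_msb(bits: str) -> int:
    return int(bits, 2)

def encode_lsb(n: int) -> str:
    return format(n, "08b")[::-1]    # e.g. 3 -> "11000000"

def decode_lsb(bits: str) -> int:
    return int(bits[::-1], 2)

def add_dets(a_bits: str, b_bits: str, encode, decode) -> str:
    """Add two stored balances using one consistent interpretation throughout."""
    total = decode(a_bits) + decode(b_bits)
    if total > 255:
        raise OverflowError("balance does not fit in one byte")
    return encode(total)

msb_sum = add_dets(encode_msb(3), encode_msb(4), encode_msb, decode_msb)
lsb_sum = add_dets(encode_lsb(3), encode_lsb(4), encode_lsb, decode_lsb)

print(msb_sum, decode_msb(msb_sum))  # 00000111 7
print(lsb_sum, decode_lsb(lsb_sum))  # 11100000 7
```

A user who only ever sees "7 DETS" has no way of telling which of the two storage schemes is in use, and no reason to care.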

As software developers we care deeply about the meaning of our data structures, because if the functional requirement and the specification don’t match, our code won’t do what the customer wants, and management will start bothering us.

Nothing is equal to zero

Having worked through the foundations of how numbers are created and worked with in computing, we can move up a layer to look at how smart contracts use data storage and numbers to allow us to deploy token contracts.

In Solidity, developers typically use a data structure called a “mapping” to store people’s balances. Conceptually, the mapping is a two-column list, with the first column containing addresses, and the second column containing the balances.

Of course, if you drill all the way down through the implementation of a mapping data structure you end up with capacitors holding or not holding electrons, and logic for changing the state of the capacitors, but we’re not thinking at that level any more.

We’ve abstracted our thinking, because it is a lot less cumbersome.

Back to mappings. You can give the mapping an address, and it will return a balance in the form of an integer. For example, ask the mapping shown in the diagram what the balance of address 0xF6B6…5D6FE is and it will return 100.

Ask for the balance of 0xCd3e…6506a and you get back 0.

What happens if we ask for the balance of an address such as 0x5124…f1C34, which isn’t in the list? Ethereum and Solidity, by default, return 0. Even though the balance is actually nothing.

There is a subtle difference between 0 and nothing. The first means “there is a balance, and that balance is zero”, and the second means “there is no balance”.

So the implementers of the system could have chosen to return something else equally meaningful, for example an error message, or 2²⁵⁶ - 1, which is the biggest number Ethereum can handle without extra programming work. Or indeed, some other number. None of these is more correct than the others; they differ only in how easy or difficult they are to work with.

I would argue that returning an “error: not found” would be the most strictly typed response in that it allows you to distinguish between “zero balance” and “no balance”, but generally speaking returning 0 is a good compromise.
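The three options can be modelled with a Python dictionary standing in for the mapping. This is a rough analogy rather than Solidity itself, and the function names are made up for illustration; the addresses are the truncated ones quoted above, used as plain labels.

```python
UINT256_MAX = 2**256 - 1  # the largest value a 256-bit unsigned integer can hold

balances = {
    "0xF6B6…5D6FE": 100,
    "0xCd3e…6506a": 0,
}

def balance_of(address: str) -> int:
    """The default-to-zero behaviour: a missing entry reads as 0."""
    return balances.get(address, 0)

def balance_of_strict(address: str) -> int:
    """The strict alternative: a missing entry is an error, not a zero."""
    if address not in balances:
        raise KeyError("error: not found")
    return balances[address]

def balance_of_sentinel(address: str) -> int:
    """Another alternative: a missing entry reads as the largest possible number."""
    return balances.get(address, UINT256_MAX)

print(balance_of("0xCd3e…6506a"))   # 0, a stored zero
print(balance_of("0x5124…f1C34"))   # 0, but nothing is actually stored
```

With `balance_of` the last two calls are indistinguishable, whereas `balance_of_strict` would raise for the second one and so keep "zero balance" and "no balance" apart.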

Unless, that is, you are implementing a DETS ERC20 token contract.

A hundred nothings

So what is the trick I used in the DETS contract?

Simple: in the balanceOf function of the contract I intercept the value returned from querying the balances mapping, and if it’s zero, I return 100 instead.

That way everyone initially has a balance of 100 tokens, because “there is no balance” has been replaced with “there is a balance of 100”.

Now you can argue that it isn’t really a balance for your address, because there isn’t some data somewhere in all the computers out there explicitly recording that you have that balance. But that just means you are demanding that “it’s not really a token unless there are some cups of electrons arranged in a certain order out there in all the computers running Ethereum nodes”.

It becomes a debate similar to whether a tree falling in the forest makes a sound when there is no one around to hear it.

As long as the tree always makes a sound if there is someone around to hear it, for all intents and purposes it doesn’t matter what noise the tree does or doesn’t make when there is no observer (or rather, listener) present.

And so, as long as my DETS contract acts correctly when you go to transfer some of your 100 tokens, you have them. And the contract does, because I also altered the transfer and transferFrom functions to achieve just that.
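The idea can be modelled in a few lines of Python. To be clear, this is not the DETS contract's Solidity code, which this article doesn't reproduce; the class name, method names and addresses are hypothetical. It also leans on an assumption: a Python dictionary can tell a missing key from a stored zero, so the sketch uses "missing" to mean "never touched", whereas a real contract would need its own way of marking that.

```python
DEFAULT_BALANCE = 100  # what an untouched address is reported to hold

class DetsSketch:
    def __init__(self) -> None:
        # Only addresses that have sent or received tokens get an entry.
        self._balances: dict[str, int] = {}

    def balance_of(self, address: str) -> int:
        # "There is no balance" is replaced with "there is a balance of 100".
        return self._balances.get(address, DEFAULT_BALANCE)

    def transfer(self, sender: str, recipient: str, amount: int) -> None:
        if amount < 0:
            raise ValueError("amount must not be negative")
        sender_balance = self.balance_of(sender)
        if sender_balance < amount:
            raise ValueError("insufficient balance")
        # Writing the results is the moment a balance becomes explicitly stored.
        self._balances[sender] = sender_balance - amount
        self._balances[recipient] = self.balance_of(recipient) + amount

token = DetsSketch()
print(token.balance_of("alice"))    # 100, although nothing is stored yet
token.transfer("alice", "bob", 30)
print(token.balance_of("alice"))    # 70
print(token.balance_of("bob"))      # 130
```

Nothing is written for an address until it takes part in a transfer, yet every read and every transfer behaves exactly as if each address had started with 100 tokens.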

The DETS specification matches the functional requirement for fungible tokens. It just implements it in a different way to a standard OpenZeppelin ERC20 implementation.

Flight of fancy

When I was an undergraduate one of my pastimes was to discuss philosophy with my fellow mathematicians and natural scientists after a few too many glasses of port wine. One of the interesting discussions I remember concerned a thought experiment that the universe might be some kind of very large computing machine, storing data and running programs. As inhabitants of that machine, we can’t ever know how it works because we can’t step outside it, so it’s not scientifically verifiable, but interestingly we can run our own “programs” in virtual machines within the universal machine to explore its behaviors in an empirical or even quasi-empirical manner.

And so, for example, the universe provides us with the natural numbers (the fancy mathematical name for “counting numbers”), and we can then investigate them empirically by reciting the equivalent of a little nursery rhyme (“one, two, three”) or moving around dried beans, and then hypothesize models for extending them into fractions, negative numbers, infinite numbers, and so on.

There’s (probably) no way of knowing whether the underlying implementation of numbers in the universe is based on the Peano postulates, Conway’s surreal numbers, or something else entirely.

We just work with what we’ve got if we’re engineers, and mathematicians and philosophers have fun arguing about it all in the meantime.

Similarly, there is no way of knowing what the “one true implementation of tokens on a blockchain” should be. Why should there be, if we can’t even work out what numbers really are? Ultimately, it boils down to:

✅ do I have a balance according to the contract?

✅ does that balance behave in a manner generally acceptable for tokens?

✅ can I verify that the code instantiating the token doesn’t have any back doors or bugs?

If all three boxes are checked, then for all intents and purposes you own those tokens.

Even if there aren’t any capacitors out there holding charge for you to point at and say, “There — those electrons are my tokens.”