Any information stored in a blockchain is supposed to be preserved forever: nobody should be able to change it, let alone erase it. But is this really true? Could governments or private groups with enough money to finance costly attacks delete information from a blockchain?
Imagine a journalist who is looking for a secure place to publish a very important investigation, one that will reveal illegal activities of a certain group. Would this group be able to delete the report once it is stored in a blockchain? We’ll see. After a short introduction to blockchain, we briefly describe how to store information in it, discuss the major possible attacks on blockchains, and examine what they imply for censoring the journalist’s article. Finally, we comment on blockchain use for publishing purposes and provide an outlook on possible alternatives.
Blockchain does not only store information in a database, it also allows the peers of the network to decide whether an update of the database state is valid. In a financial context, the new state corresponds to a new transaction, and a consensus protocol allows the peers to come to an agreement as to whether the transaction has enough funding and is properly authenticated by the sender. Validation requires peers to know the entire transaction history (at least the UTXO set) which they store and update in their local copy of the blockchain (see Fig. 1).
Figure 1. The blockchain network: Nodes have a local copy of the blockchain and communicate updates (source).
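The funding check described above can be pictured with a minimal sketch. The `Transaction` class, the `"id:index"` naming of outputs and the helper functions are illustrative assumptions, not the data model of any real blockchain, and the authentication of the sender (signature verification) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    inputs: tuple   # ids of the unspent outputs this transaction consumes
    outputs: tuple  # amounts of the new outputs it creates


def is_funded(tx, utxo_set):
    """Return True if all inputs are unspent and cover the outputs."""
    if len(set(tx.inputs)) != len(tx.inputs):
        return False  # the same output is referenced twice
    if any(ref not in utxo_set for ref in tx.inputs):
        return False  # unknown or already spent output
    return sum(utxo_set[ref] for ref in tx.inputs) >= sum(tx.outputs)


def apply_transaction(tx, utxo_set, tx_id):
    """Return the updated UTXO set, or raise if the transaction is invalid."""
    if not is_funded(tx, utxo_set):
        raise ValueError("transaction rejected")
    new_set = {ref: amount for ref, amount in utxo_set.items()
               if ref not in tx.inputs}
    for index, amount in enumerate(tx.outputs):
        new_set[f"{tx_id}:{index}"] = amount
    return new_set


utxo = {"a:0": 5, "b:0": 3}
tx = Transaction(inputs=("a:0",), outputs=(4,))
utxo_after = apply_transaction(tx, utxo, "c")
```

Once the transaction has been applied, the output `a:0` is no longer in the set, so presenting the same transaction a second time fails the check. This is the local bookkeeping every validating peer has to maintain.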
The essential properties that a blockchain must satisfy to run safely are the following:
- Immutability: Data integrity ensures that the transaction history cannot be changed. Any attempted change would be noticed and rejected.
- Consistency: All peers have a common view of the blockchain, i.e. once an honest user accepts a valid transaction, all others will insert this transaction into their local copy and consider it for future transactions.
- Censorship-Resistance: Valid transactions are always accepted regardless of the sender, the receiver and the content.
- Decentralization: No single point of failure. Decentralization is measured through the size of the network. The larger the network, the more secure against failure or malicious behaviour of a small group it will be. This is the pillar of the blockchain idea.
Immutability is achieved by a clever use of cryptographic functions called hash functions (the method is explained in our article Censorship of Harmful Data in Blockchains). Consistency and censorship-resistance have been theoretically proven for some selected blockchain protocols. The Bitcoin Backbone Protocol article and subsequent investigations demonstrate the security of the Bitcoin proof-of-work (PoW) blockchain. Other blockchain consensus protocols, e.g. proof of stake (PoS) protocols, have security proofs as well (e.g. Algorand, Cardano, etc.). Their proofs rely on three assumptions:
- Majority of Honest Network Peers: The protocol helps honest peers to communicate state transitions only when there are enough honest participants.
- Network Communication: The peers must be synchronized to reach consensus.
- Secure Cryptography: The peers can authenticate the broadcast transactions and can verify immutability of the blockchain data.
In theory, these conditions are enough to ensure the safety of a blockchain. In practice there are further threats, for example the exploitation of bugs in the implementation. Nevertheless, the assumptions are necessary conditions: if one of them is violated, the blockchain can effectively be attacked.
Why honest majority matters: Since there is no central authority which declares transactions as valid or not, the validity of a transaction becomes a matter for the entire community. The way blockchains handle this can be pictured as follows: valid transactions are packed into a stack, one on top of the other in chronological order. To revert a specific transaction, a malicious user first has to revert all transactions which are already above the one he is targeting. As long as honest nodes are in the majority and continue adding new valid transactions to the stack, the malicious user will not be able to do this. But if malicious users become the majority, transactions can be reverted and double spending can take place, which means that a transaction is taken out of the stack and its funds are spent again, i.e., consistency can no longer be guaranteed.
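The stack picture can be made quantitative with the simplified gambler's-ruin estimate used in the Bitcoin whitepaper: an attacker holding a share q of the power, with the honest nodes holding p = 1 - q, ever catches up from z blocks behind with probability (q/p)^z if q < p, and with certainty otherwise. The function below is a sketch of that simplified model only; its name and parameters are chosen for illustration.

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Probability that an attacker ever catches up from `blocks_behind` blocks.

    Simplified gambler's-ruin model: each new block is found by the attacker
    with probability `attacker_share`, otherwise by the honest nodes.
    """
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must lie between 0 and 1")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        return 1.0  # with half the power or more, catching up is certain
    return (attacker_share / honest_share) ** blocks_behind


minority = catch_up_probability(0.10, 6)  # small and shrinking with depth
majority = catch_up_probability(0.60, 6)  # 1.0
```

In this model the chance of success for a minority attacker drops exponentially with every block stacked on top of the targeted transaction, while a majority attacker always succeeds eventually.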
Why good network communication matters: If someone manages to split the blockchain network into two groups by not forwarding messages (see Fig. 2), these groups will not be able to communicate with each other, and hence two different transaction histories will be built. If the partition comes to an end, nodes will reach an agreement regarding which history is the valid one, namely the one which created a longer stack, but this implies that consistency does not hold for one of the two groups.
Figure 2. The red nodes do not forward the communication between the honest nodes which are marked in green. This leads to an effective split of the blockchain network.
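What happens when the partition ends can be sketched as follows. Blocks are represented by plain labels, and the tie-break in favour of the first argument is an arbitrary choice for this illustration; real clients compare accumulated work rather than the bare block count.

```python
def resolve_partition(chain_a, chain_b):
    """Apply the longest-chain rule to two histories with a common prefix.

    Returns the winning chain and the blocks of the losing side that are
    dropped from the valid history.
    """
    common = 0
    for block_a, block_b in zip(chain_a, chain_b):
        if block_a != block_b:
            break
        common += 1
    if len(chain_a) >= len(chain_b):
        winner, loser = chain_a, chain_b
    else:
        winner, loser = chain_b, chain_a
    return winner, loser[common:]


# Both groups share blocks b0 and b1, then build on their own.
group_1 = ["b0", "b1", "x2", "x3", "x4"]
group_2 = ["b0", "b1", "y2", "y3"]
winner, dropped = resolve_partition(group_1, group_2)
```

Here the blocks `y2` and `y3` fall out of the valid history, so any transaction confirmed only in the second group would have to be submitted again.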
Why secure cryptography matters: Network nodes appear as nothing more than a bank account number in the network. Nodes have a PIN for their account to authenticate transactions – in technical terms, this is their private key. Broken cryptography would allow everyone to deduce the private PIN of any bank account; in other words, it would enable anyone to steal PINs and move the funds of other peers.
How to Insert Text into the Blockchain
Before we analyze what could happen to the information should one of the three assumptions be violated, we will quickly go over the necessary steps our journalist should take for inserting his investigation into the blockchain.
First, he must connect to the blockchain network. This requires running a node in the network, which allows him to communicate with the rest of the peers and inform them about his new transaction. Public blockchains provide the necessary software, and new users can download their own blockchain copy. DappNode is one of the applications that can assist in this process. If the entire blockchain is too heavy for the journalist to run (see for example the hardware requirements for Bitcoin and Ethereum), there are other ways to communicate with the network: he could use either the “lightweight” alternative (check this for Ethereum) or an open public node.
Secondly, it is important to comply with the standard transaction format of the selected blockchain, otherwise the peers will not forward a malformed transaction to other peers. Inserting a text file is usually not a common procedure for a blockchain, therefore doing it may require some programming skills (more information can be found for example here).
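To give an idea of the programming involved, here is a sketch of one commonly used technique in Bitcoin, namely embedding data in OP_RETURN outputs. The article does not prescribe this method. The opcode values are those of Bitcoin Script, while the 80-byte payload size is an assumption based on a widely used relay-policy default, and the helper names are invented for this example. Building, signing and broadcasting the surrounding transactions is not shown.

```python
OP_RETURN = 0x6A     # script opcode marking a provably unspendable data output
OP_PUSHDATA1 = 0x4C  # opcode for pushing 76 to 255 bytes
MAX_PAYLOAD = 80     # assumed per-output policy limit, not a consensus rule


def op_return_script(payload: bytes) -> bytes:
    """Build the output script that carries one chunk of data."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the assumed policy limit")
    if len(payload) <= 75:
        push = bytes([len(payload)])                # direct push of 0..75 bytes
    else:
        push = bytes([OP_PUSHDATA1, len(payload)])  # one-byte length follows
    return bytes([OP_RETURN]) + push + payload


def split_document(text: str):
    """Split a text into chunks that each fit into one data output."""
    data = text.encode("utf-8")
    return [data[i:i + MAX_PAYLOAD] for i in range(0, len(data), MAX_PAYLOAD)]


article = "Investigation " * 20  # 280 bytes of placeholder text
scripts = [op_return_script(chunk) for chunk in split_document(article)]
```

Even this short placeholder text needs four separate data outputs, which hints at why publishing a full report directly on chain is laborious.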
If all this is too complicated for our journalist, he may take advantage of an existing service for publishing in a blockchain. Nevertheless, this would introduce a degree of dependency he may not be willing to accept.
Is It Possible to Delete Data from a Blockchain?
Before we analyze this possibility, it is important to understand that the meaning of “deletion” in this context is twofold: since there are two types of readers – those who have a local copy of the blockchain and those who access a node of the network to parse the blockchain for the text – deletion can mean either erasing the data file from the local database or, in the latter case, just preventing the reader from accessing the data.
Possible Censorship Through Majority
In this scenario the majority of the network nodes belong to the journalist’s adversary, who wants to delete the text file. Since permissionless blockchains are open networks, this can be achieved by acquiring a majority of the computational power in PoW-based blockchains or by possessing a majority of the stake in PoS-based consensus protocols (see below for cost estimates). In both cases, the security assumptions of the blockchain are broken.
How does this scenario affect immutability of data?
Since the entire blockchain is stored locally at every node, deleting a text file would require erasing this data from all existing copies of the blockchain. The adversary can delete the data only in those copies over which he has control. The remaining nodes, just by following the protocol rules and without having any political opinion, will not do so. Furthermore, they can also detect the adversary’s attempt to tamper with the data.
Erasing the text file from a previously accepted block would lead to an evident inconsistency, because the mechanism for immutability locks the blocks together and does not allow any later change. Therefore, the adversary may prefer to delete the entire sequence of blocks in the copies he manages, back to the block containing the text file, and build a new branch without the text file using his majority of power (see Fig. 3). When this branch reaches the same block height as the old branch, new users can be convinced that the text never existed in the blockchain (this is known as broken Sybil resistance, which we explain below). Yet the content would still be present in the orphaned blocks held by the honest nodes, i.e., those who had a copy before the change would still carry the text file in their copy.
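How honest nodes notice such tampering can be shown with a toy hash chain. The block layout, with just a previous hash, a payload and the block's own hash, is a strong simplification assumed for this sketch; real blocks also carry Merkle roots, timestamps and proof-of-work data.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder predecessor of the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's payload together with its predecessor's hash."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(payloads):
    chain, prev = [], GENESIS_PREV
    for data in payloads:
        current = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first inconsistent block, or None if all fit."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        recomputed = block_hash(block["prev"], block["data"])
        if block["prev"] != prev or block["hash"] != recomputed:
            return index
        prev = block["hash"]
    return None


chain = build_chain(["tx batch 1", "journalist's report", "tx batch 3"])
tampered = [dict(block) for block in chain]
tampered[1]["data"] = ""  # the adversary blanks the report
```

`first_invalid_block(chain)` returns `None`, whereas `first_invalid_block(tampered)` points at block 1. If the adversary also recomputes the hash of block 1, the mismatch simply moves to block 2, whose stored predecessor hash no longer fits. Hiding the change therefore means rebuilding every later block.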
Figure 3. When malicious nodes form the majority, they can build an alternative valid chain (in red) which does not contain the journalist’s article, and can delete the green blocks 39, 40, and 41. This would be the new valid chain presented to new users. Nevertheless, the text would still be in the shorter invalid branch of the blockchain.
Under such an attack, the replication of the journalist’s report can be stopped, but the report itself cannot be erased. Readers without a copy of the blockchain have a low probability of finding a copy that carries the text file, as it is held only by a minority of network nodes.
How does Non-Consistency affect Immutability?
Consistency allows peers to have a common opinion about valid transactions without any central authority. Although validity is not the primary interest of our journalist (he cares only about storage), it has an impact on his choice: Bitcoin users would be afraid of a situation in which consistency is threatened, because their holdings would no longer be safe. Attacks like double spending, as explained earlier, would be possible, and everybody would try to take their money out of the system and leave the Bitcoin network. Nodes maintaining the network, a.k.a. miners, would not be interested in serving a dead network.
Since honest nodes would leave the network, the adversary would get full control over it. As described before, the text file can then be deleted from all the local copies of the adversary, and no accessible source with the text would survive. This scenario is highly critical because it would effectively achieve the censoring of the journalist’s article.
How does Non-Censorship-Resistance affect Immutability?
A failure of censorship-resistance affects the insertion of data into the blockchain. Since our journalist has already placed his text into the ledger, this would not concern him directly. However, common users would experience rejection, or at least long delays in the acceptance, of their correct transactions. In an extreme scenario users would therefore stop using the blockchain, and finally the miners would also leave the network for lack of economic incentives, as the Bitcoin price would drop and make the reward for successful mining unattractive.
The consequences would be similar to the ones described in the previous paragraph: centralization of the network in the hands of the adversary. The danger is not as high as in the case of consistency problems (there is no loss of money), nevertheless this is also a critical scenario.
Can new users be deceived?
New users must bootstrap and synchronize their local copy with the blockchain. During this process the new nodes must ask their neighbouring nodes for the data. Sybil resistance refers to the ability of a peer-to-peer system to protect honest peers from malicious peers who try to deceive them by creating many entities which provide incorrect information. PoW blockchains are Sybil resistant, since it is enough to have only one honest node as a neighbour to know which blockchain information is the correct one: the longest chain which accumulates the majority of hashing power, i.e., the one of the honest majority (see Fig. 4).
Figure 4. New peers need to get the blockchain information from the other network peers. In Bitcoin, it is enough to have one honest peer to verify that the received information is valid (picture taken from CoinDesk).
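The reason why a single honest neighbour suffices can be sketched as follows: a bootstrapping node does not take a vote among its peers, it compares the chains they offer. The `work` field and the `is_valid` callback are simplifying assumptions for this example; a real client verifies the proof of work and the hash links of every block.

```python
def total_work(chain):
    """Accumulated work of a chain (here simply the sum of a per-block value)."""
    return sum(block["work"] for block in chain)


def choose_chain(peer_responses, is_valid):
    """Pick the valid chain with the most accumulated work.

    How many peers advertise a chain plays no role in the decision.
    """
    candidates = [chain for chain in peer_responses if is_valid(chain)]
    if not candidates:
        raise ValueError("no valid chain offered")
    return max(candidates, key=total_work)


honest_chain = [{"id": f"h{i}", "work": 1} for i in range(5)]
sybil_chain = [{"id": f"s{i}", "work": 1} for i in range(3)]

# Nine Sybil identities advertise the short chain, one honest peer the long one.
responses = [sybil_chain] * 9 + [honest_chain]
chosen = choose_chain(responses, is_valid=lambda chain: True)
```

The node ends up with `honest_chain` even though nine out of ten neighbours advertised something else. The same rule also shows the limit of this protection: a chain with more accumulated work wins regardless of who built it.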
PoS needs a different approach which we do not discuss here (more information can be found in this article about Ouroboros Genesis). If an honest majority is not given, and the adversary manages to construct a longer chain without the text, new users will opt for this one instead of the chain containing the text. But there are more subtleties to be considered, e.g. the Ethereum fast bootstrap mechanism which does not start from the genesis block (first block). This would require an extra study for each blockchain design.
What would this mean for our journalist? The replication of the text file containing his investigation can be stopped. New users would opt by default for the chain not containing the text.
Cost calculation for acquiring majority:
In blockchains based on PoW, the majority attack requires holding the majority of the hashing power. According to this source (accessed on July 22nd, 2019), the costs to run this attack are:
For example, to run a double spending attack on a transaction placed in a block which has 10 successor blocks, it would take an attacker possessing 70% of hashing power on average about 4 hours, which has a cost of more than $4 million in the Bitcoin network. In the case of Ethereum Classic, a majority attack already took place.
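The duration quoted in the Bitcoin example can be reproduced with a back-of-the-envelope model. This is one plausible reading and not necessarily the calculation behind the cited source: while the attacker mines in secret with a share q of the total power, the public chain keeps growing with the remaining 1 - q, so the attacker gains on average (q - (1 - q)) blocks per block interval.

```python
def expected_overtake_hours(blocks_to_make_up, attacker_share,
                            block_interval_minutes=10.0):
    """Expected time for a majority attacker to make up a block deficit.

    Assumes the difficulty stays fixed and the honest nodes keep mining on
    the public chain with the remaining share of the power.
    """
    honest_share = 1.0 - attacker_share
    if attacker_share <= honest_share:
        raise ValueError("without a majority the attacker falls behind on average")
    net_gain_per_interval = attacker_share - honest_share
    intervals = blocks_to_make_up / net_gain_per_interval
    return intervals * block_interval_minutes / 60.0


hours = expected_overtake_hours(blocks_to_make_up=10, attacker_share=0.7)
```

With ten blocks to make up and 70% of the hashing power this gives a little over four hours, in line with the figure above. Turning the duration into a cost would additionally require a price for the hashing power, which this sketch leaves open.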
These numbers are valid for a new single user with no previous hashing power. But the truth is that there are mining pools that already control 20% of the hashing power. Collusion between 4 of these pools could lead to an effective attack, and this is technically already possible. See the statistics below, in the section about decentralization.
In blockchains based on PoS, the attack requires gaining the majority of the entire stake. Blockchains which work in the spirit of Bitcoin require 51% (e.g. Cardano), and those which run a classical Byzantine Agreement protocol require only 33% for each round (e.g. Algorand). Measured against the total market cap (as accessed on July 22nd, 2019 from coinmarketcap), the calculations reach the following estimates:
Note that EOS and Tezos are not pure PoS protocols, but delegated Proof of Stake protocols, which are more vulnerable to Denial of Service attacks (described below) due to a more centralized validator set. PoS blockchains do not have the PoW problem of power concentration in big mining pools; yet they might suffer from powerful stakeholders who collude to run a majority attack. Notice that in the case of PoW it is possible to stop such a majority attack by adding new miners to the network. This is not the case for PoS blockchains, since a majority shift to the adversary cannot be reverted by the honest community. Supporters of PoS argue that such a threat is not realistic, as attackers would need to hold the majority of the stake. An attack would then imply a huge loss of their own capital, since the exchange rate between the cryptocoin and fiat money would drop immediately.
There are more refined attacks, e.g. selfish mining, which in Bitcoin requires only 33% of the power to effectively run double spending attacks, but these are complicated to run and have not yet been seen in practice. Finally, it should be noted that although permissioned blockchains are not prone to this kind of attack, they do require a high degree of trust in the validator set (between 6 and 100 entities must be trusted and protected in case of an attack).
Possible Censorship Through Network Attacks
Network attacks aim at cutting the communication between the nodes to achieve a partition of the network (see Fig. 2). Since the IP addresses of most blockchain users are public, such an attack is technically feasible, but it would be a huge logistical task that may even include cutting many fiber optic cables and blocking internet and satellite connections between the network nodes. Depending on the geographical location of the network nodes, this is more or less easily achievable. One defense strategy would be for nodes to use Tor to obfuscate their location, but the network wouldn’t be able to work if everybody did this.
What would this attack mean to our journalist? The attack affects the insertion of new data and the consistency of the database. Since our journalist has already published his text in the blockchain, the real aim of such an attack would be to make the network nodes leave the system, so that the data would disappear for those without a copy. Running such an attack would cause huge collateral damage, and because of the internet routing system it is considered very unlikely to succeed in geographical locations with a dense infrastructure.
Another possible network attack isolates selected nodes inside the network. In these so-called eclipse attacks, the adversary gains control of the neighbouring nodes in order to control the input and output of the attacked node. This attack does not target the entire network but individual nodes, e.g. to perform a double spending against one node. Since it is an individual attack, it is less likely to bring down the entire system. More information about this can be found here.
Denial of Service Attacks
In this attack the adversary tries to spam the blockchain communication channels with large amounts of information in order to slow the system down. The blockchain communication layer has methods to protect itself against such attacks, which makes them less feasible. Still, this could be a vulnerability that should be studied in detail. More information about how Bitcoin protects itself against DoS attacks can be found here, and the countermeasures of Cardano are described here.
The essence of this attack would be to make the network die, so that no node would be online anymore to share the text file. This can be considered a critical attack vector, although countermeasures are in place and no real threat has been seen to date.
Possible Censorship Through Broken Cryptography
There is currently no threat for blockchains which use standard cryptographic tools, i.e. standard hash functions for “chaining” the blocks together and digital signatures in order to authenticate transactions. Most blockchains rely on the standard ECDSA signature scheme and the hash function SHA-256 which are, mathematically speaking, very safe (notice what can take place if you do not comply with the standards: vulnerability in IOTA). Also, see our article about the security of the ECDSA and other signature schemes.
The only potential threat would be the advent of quantum computers, which would affect only the security of the digital signatures (Shor’s algorithm; see also our article), not that of the hash functions. How seriously this danger is taken can be seen from the NIST program for post-quantum secure cryptography.
Should this happen, signatures would no longer be secure, funds could be stolen, and users might flee the blockchain network. Nevertheless, quantum computers are still far from becoming a practical reality, and although in three generations this might be a different story, by that time it is also likely that quantum-resistant cryptography will already exist and can protect against such an attack through an update of the core software.
Importance of Decentralization
The general rule is: more nodes equals more security. The more nodes there are in the system, the more widely power is distributed, and the less likely it becomes that anyone gains a majority or that a network partition attack succeeds. Moreover, as more copies of the blockchain exist, it becomes less likely that access to the blockchain can be interrupted, because redundancy ensures that data cannot easily be deleted.
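The effect of redundancy can be illustrated with a deliberately simple model, in which every node holding a copy becomes unreachable independently with the same probability. Independence is a strong assumption made only for this sketch; coordinated attacks and nodes hosted by the same provider violate it.

```python
def availability(p_node_unreachable, copies):
    """Probability that at least one of `copies` independent nodes is reachable."""
    if not 0.0 <= p_node_unreachable <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return 1.0 - p_node_unreachable ** copies


few_copies = availability(0.5, 3)    # 0.875
many_copies = availability(0.5, 20)  # very close to 1
```

Even if every single node could be silenced with a probability of one half, twenty independent copies would leave the text reachable almost with certainty.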
Statistics for PoW
The statistics below show the number of publicly reachable nodes. Not all nodes are full nodes, i.e. nodes having a local copy of the entire blockchain. Usually only enterprises (exchanges, mining pools, etc.) run full nodes, since for a single private user there is no real economic incentive to do so.
Statistics for PoS
In the statistics below we include the numbers of nodes for EOS and Tezos which have special nodes called delegates. Only delegates can propose the next block; they are elected from the community, whereby anyone can become a delegate. It is important to mention that, in contrast to this, in Algorand and Cardano all peers can theoretically participate in the consensus.
Some Real World Limitations
Many network nodes store their local copy of the blockchain in a cloud, since running a full node requires a huge amount of storage space. In practice, this makes the system less decentralized, because many of the full nodes are hosted by a few service providers. Therefore, a network attack becomes more feasible and would affect the availability of data by cutting the connection between the reader and any local copy. Still, the blockchain system can recover if it reacts quickly.
Another important limitation to decentralization in PoW blockchains is the existence of mining pools. At present, the mining power of the entire Bitcoin network is in the hands of a few mining pools. According to these statistics, just 4 colluding mining pools can run the majority attack. Of course, this is not in their own economic interest, but it shows the feasibility of the threat. In this respect, PoS blockchains are superior to PoW blockchains, since they do not depend on external resources (energy, mining tools), but work completely intrinsically, based on the amount of stake. On the other hand, powerful stakeholders can collude to gain a majority of the stake.
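How many parties would have to collude can be read off any distribution of power with a few lines of code. The shares used below are hypothetical values chosen for illustration and are not the statistics referenced in the text; the function name is likewise an assumption.

```python
def min_colluding_parties(shares_percent, threshold_percent=50):
    """Smallest number of parties whose combined share exceeds the threshold.

    Returns None if even all listed parties together stay at or below it.
    """
    total = 0
    ranked = sorted(shares_percent, reverse=True)
    for count, share in enumerate(ranked, start=1):
        total += share
        if total > threshold_percent:
            return count
    return None


# Hypothetical hashing-power shares of the largest pools, in percent.
hypothetical_pools = [20, 15, 12, 10, 8, 7, 5]
pools_needed = min_colluding_parties(hypothetical_pools)
```

For this made-up distribution the four largest pools together exceed 50%. The same function can be applied to stake distributions, with a lower threshold for protocols based on classical Byzantine Agreement.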
Why to Search for Alternatives to Blockchain
Blockchain use goes far beyond safe storage of arbitrary data. It makes it possible, for example, to introduce digital identities and to have a consensus layer for reaching agreement in a decentralized fashion even in untrusted networks. But all this comes at the price of performance limitations, mostly in the number of transactions per second (this limitation became evident during the Crypto Kitties hype) and in the growth of the size of the entire blockchain. Since the purpose of blockchains is not storage, using them for large files becomes costly (around $180 for 50 KB in Bitcoin; see this note, p. 18). Moreover, a serious limitation for a journalist could be the maximum data size per transaction, which is 100 KB in the case of Bitcoin.
Although the widespread usage of some blockchains makes them a secure storage possibility, we suggest analyzing alternative distributed storage solutions. As long as there are sufficient redundancy guarantees, these solutions may be better suited for the journalist’s purpose. One reason for the broad acceptance of blockchains is the economic incentive to participate. Since distributed storage networks usually do not have an economic incentive system to keep the network “alive”, research into combining blockchains with distributed P2P storage to introduce economic incentives is under way (see e.g. SWARM, Filecoin, etc.).
Community decisions in the shape of hard forks are in their technical spirit a “51% attack”. The majority can be “broken” because the community agrees on protocol changes or because it wants to revert transaction history via a hard fork (e.g. after the DAO hack of Ethereum). Most miners have an economic interest, hence their incentives are not aligned with the incentive of our journalist. Or, even worse, miners may be against the publishing use, since it could violate the right to be forgotten and in an extreme scenario may cause governments to prohibit the use of blockchain.
As a final piece of advice, we would suggest that the journalist avoid small blockchains, since they are more vulnerable to majority attacks. We would definitely select a truly decentralized blockchain for replication of the document.
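The two figures quoted above are enough for a rough estimate of what publishing a document would involve. The sketch assumes that the cost scales linearly with the file size and ignores the overhead of the transactions themselves; fees fluctuate, so the numbers only indicate an order of magnitude.

```python
import math

COST_PER_KB_USD = 180 / 50  # derived from "around $180 for 50 KB"
MAX_TX_SIZE_KB = 100        # per-transaction limit quoted for Bitcoin


def storage_estimate(file_size_kb):
    """Return (number of transactions, estimated cost in USD) for a file."""
    if file_size_kb <= 0:
        raise ValueError("file size must be positive")
    transactions = math.ceil(file_size_kb / MAX_TX_SIZE_KB)
    cost_usd = file_size_kb * COST_PER_KB_USD
    return transactions, cost_usd


transactions, cost_usd = storage_estimate(250)
```

Under these assumptions a 250 KB report would have to be split over three transactions and would cost roughly $900.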
Portrait Image by Brian J. Matis. License creative commons.