Sharding is going to be the scalability solution in the upcoming Ethereum 2.0, and many other blockchains hope to benefit from it as well. Scalability is the key factor in a blockchain’s speed. The word covers all the technical solutions that share one major goal: processing more transactions per second. Currently Bitcoin processes about 3-7 transactions per second and Ethereum about 7-15. With shards these numbers should increase considerably.
What are the reasons for limited blockchain efficiency?
What’s the main, basic idea of sharding?
What’s the Blockchain Trilemma?
I explain all of this in this article.
The reasons for limited blockchain speed
A blockchain is a decentralized database that processes, saves and communicates data in real time. It consists of nodes, which can be described as processing units (just software) that perform all of these functions. What’s distinctive about the classical blockchain model is that all the processing is done on every node of the network. Communication, processing and saving of data in a blockchain are based on transactions. Transactions are like digital envelopes exchanged in the network, containing all the data to process and save.
When I speak about processing and saving I simplify a little (a blockchain has much more work to do, like reaching consensus and a lot of math calculations), but my intention is to underline how transactions differ between blockchains, especially blockchains like Bitcoin and Ethereum.
The Bitcoin blockchain offers only sending cryptocurrency (in theory it offers a little more, but these features are little used). In practice it is based on reading and saving data that represents ownership of an amount of cryptocurrency. These data chunks are called UTXOs (unspent transaction outputs) and can be spent only with the owner’s cryptographic keys. Processing them is what is generally described as sending cryptocurrency. This logic is simple enough that, simplifying, it can be described as data saving only.
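To make the UTXO idea more concrete, here is a minimal sketch of spending such data chunks. It is a toy model, not Bitcoin’s actual data format: the `UTXO` fields, the `spend` function and the plain string comparison that stands in for a real key signature check are all assumptions made for illustration, and fees are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UTXO:
    tx_id: str   # hypothetical id of the transaction that created this output
    index: int   # position of the output inside that transaction
    owner: str   # stands in for the key that is allowed to spend the output
    amount: int


def spend(utxo_set, inputs, sender, recipient, amount, new_tx_id):
    """Consume `inputs` and return a new UTXO set with the created outputs."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("the same input is listed twice")
    for utxo in inputs:
        if utxo not in utxo_set:
            raise ValueError("input is missing or already spent")
        if utxo.owner != sender:
            # A real node verifies a cryptographic signature here.
            raise ValueError("sender does not own this input")
    total = sum(utxo.amount for utxo in inputs)
    if total < amount:
        raise ValueError("insufficient funds")

    new_set = set(utxo_set) - set(inputs)               # spent outputs disappear
    new_set.add(UTXO(new_tx_id, 0, recipient, amount))  # payment output
    change = total - amount
    if change > 0:
        new_set.add(UTXO(new_tx_id, 1, sender, change))  # change goes back
    return new_set
```

Sending 7 units from a 10-unit output removes the old output and creates two new ones: 7 for the recipient and 3 of change for the sender. Nothing is computed beyond checking and rewriting ownership records, which is why I call it "data saving only".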
In the case of Ethereum we are talking about a much more sophisticated mechanism than data saving alone. Ethereum allows running smart contracts, which are software running on the blockchain. So Ethereum does more: it saves the data but also really processes it (through smart contract logic). And that is what every node in the network does.
Every Ethereum node processes transactions that send Ether and/or transactions that invoke smart contract functions. As you can probably imagine, the computational work that every node in the network must do is big, and even in theory the scalability of a system designed this way is limited. In such a model there is simply a cap on the number of transactions the network as a whole can process in a given period of time. It’s very hard to even guess the concrete number of transactions per second a blockchain can handle, because this number depends on a lot of factors: transaction type (sending or smart contract invocation), data size, network synchronization time, etc.
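The point that every node repeats the same work can be shown with a small sketch. This is a toy model under my own assumptions, not how an Ethereum client is built: transactions are plain dictionaries, a "smart contract" is just a Python function, and the names `apply_transaction`, `run_network` and `increment` are hypothetical.

```python
def apply_transaction(state, tx, contracts):
    """Apply one transaction to a single node's local copy of the state."""
    if tx["type"] == "transfer":
        balances = state["balances"]
        if balances.get(tx["sender"], 0) < tx["amount"]:
            raise ValueError("insufficient balance")
        balances[tx["sender"]] -= tx["amount"]
        balances[tx["to"]] = balances.get(tx["to"], 0) + tx["amount"]
    elif tx["type"] == "call":
        # Invoke a contract function; it reads and writes contract storage.
        contracts[tx["contract"]](state["storage"], *tx["args"])
    else:
        raise ValueError("unknown transaction type")


def run_network(num_nodes, transactions, contracts, initial_balances):
    """Every node applies every transaction, so the work is multiplied."""
    nodes = [{"balances": dict(initial_balances), "storage": {}}
             for _ in range(num_nodes)]
    executions = 0
    for tx in transactions:
        for state in nodes:
            apply_transaction(state, tx, contracts)
            executions += 1
    return nodes, executions


def increment(storage, key, by):
    """A stand-in for a smart contract function."""
    storage[key] = storage.get(key, 0) + by
```

With 4 nodes and 2 transactions the network performs 8 executions and all 4 nodes end up with an identical state. Adding more nodes adds more copies of the same work, not more capacity.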
There are several ideas for improving blockchain network scalability. Each of them has its pros and cons. Increasing the block size (so that more transactions can be saved to the blockchain in a given time) is one of them. This is what Bitcoin Cash does (generally not the best idea). A second layer is the idea of moving some transactions outside the main chain, so that fewer transactions are saved to the main blockchain (examples are the Lightning Network in Bitcoin and Plasma in Ethereum).
Sharding is the solution that should be present in a future version of Ethereum. Let’s take a look at it.
How does sharding work?
The idea of sharding in simple words:
Not every transaction should be processed by every node in the network. The network should be divided into groups (shards), and each group should independently process a subset of all transactions. This makes it possible to process all the transactions, split into subsets, in parallel (at the same time) by independent groups of nodes.
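One way to picture the split is a rule that sends each transaction to exactly one shard. The sketch below assigns transactions by hashing the sender’s address. This rule, and the names `shard_of` and `split_by_shard`, are my own assumptions for illustration and not the assignment scheme of Ethereum 2.0.

```python
import hashlib
from collections import defaultdict


def shard_of(sender, num_shards):
    """Map a sender address to a shard number in a deterministic way."""
    digest = hashlib.sha256(sender.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def split_by_shard(transactions, num_shards):
    """Group transactions so each shard gets its own independent subset."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_of(tx["sender"], num_shards)].append(tx)
    return dict(shards)
```

Because the rule depends only on the sender, every node can compute the same assignment on its own, and each group of nodes needs to process only the list that belongs to its shard.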
The diagram shows the idea:
Currently all the nodes process the same transactions, which makes the total speed of the entire network 5 tx/s. If we split the network into 10 shards, where every shard processes a similar number of transactions as the entire network does today, we get a throughput of about 50 tx/s (this is the perfect scenario, not taking into consideration other factors like the time needed for data exchange between nodes, consensus, etc.).
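The arithmetic behind these numbers is a simple multiplication, sketched below. The `efficiency` parameter is a hypothetical knob I added to stand in for the ignored overheads (data exchange between nodes, consensus); its values are not measurements.

```python
def ideal_throughput(per_shard_tx_per_s, num_shards=1, efficiency=1.0):
    """Estimated network throughput when shards work in parallel.

    efficiency=1.0 is the perfect scenario; lower values are a made-up
    way of modelling time lost to communication between shards.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return per_shard_tx_per_s * num_shards * efficiency


unsharded = ideal_throughput(5)               # every node does everything
sharded = ideal_throughput(5, num_shards=10)  # 10 shards in parallel
```

Here `unsharded` comes out at 5 tx/s and `sharded` at 50 tx/s. Any real deployment would land somewhere below the perfect figure.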
Problems of sharding and the Blockchain Trilemma
Sharding is not simple. In the distributed systems space (microservices are a common example) there is a rule called the CAP theorem. It describes 3 properties of a distributed system, related to the nodes that make up such a system. These properties are: consistency, availability and partition tolerance.
- Consistency – at any given moment all nodes have the same consistent data.
- Availability – the system as a whole always gives a response to a request.
- Partition tolerance – the system as a whole keeps working even if some nodes cannot communicate with each other.
The meaning of this theorem is: it’s impossible to build a system that has all 3 of these properties at once. A distributed system can have at most 2 of the 3. In other words: it’s impossible to create a perfect system. In real-world software, partition tolerance is chosen almost always, so systems are generally designed to be additionally either consistent or available.
I described this theorem because blockchain seems to have similar restrictions. The Blockchain Trilemma is similar to the CAP theorem. The term was introduced by Vitalik Buterin.
Blockchain Trilemma
A blockchain network can have at most 2 of these 3: decentralization, scalability, security.
The ultimate goal is to design a blockchain that has all 3 properties. This is something the Ethereum community is working on, with Vitalik at the front. As far as I know, in the case of blockchain this is more of a conjecture than a proven law, so it is probably easier to design a blockchain that has all 3 properties than a distributed system that fulfills all 3 CAP requirements.
Unfortunately, for now the Blockchain Trilemma shows up in sharding as, for example, security problems. Let’s imagine a blockchain based on shards. Such a system is decentralized and, thanks to sharding, efficient and scalable. However, there are plenty of security problems here (the third property of the Blockchain Trilemma). For example, while processing is divided between shards, an attacker may be able to control a whole shard. With such power the attacker could, in theory, lie to the entire network and tamper with transactions. Such a situation should never happen.
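A rough calculation shows why a single shard is an easier target than the whole network. The sketch below rests on simplifying assumptions of mine: nodes are split evenly between shards, controlling a group means holding a simple majority of its nodes, and the attacker can choose which shard their nodes end up in. The function names are hypothetical.

```python
def nodes_to_control(group_size):
    """Smallest number of nodes that forms a simple majority of a group."""
    return group_size // 2 + 1


def takeover_cost(total_nodes, num_shards):
    """Compare attacking the whole network with attacking one shard."""
    shard_size = total_nodes // num_shards
    return {
        "whole_network": nodes_to_control(total_nodes),
        "single_shard": nodes_to_control(shard_size),
    }


cost = takeover_cost(total_nodes=1000, num_shards=10)
```

Under these assumptions, a network of 1000 nodes needs 501 attacker nodes for a majority, while one of its 10 shards (100 nodes) needs only 51. The more shards there are, the smaller the group an attacker has to dominate.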
Challenges of sharding
In blockchain networks security is almost always the most important factor. I have already talked about it. Now I would like to say something about the challenges of “using” blockchains based on shards.
In Ethereum smart contracts can communicate with each other. A function of smart contract A can invoke a function of smart contract B. In the classical model, where each node holds a copy of the blockchain and processes all the transactions, an action called reverting changes is possible. If the function of smart contract A finishes with an error (no matter the reason) after it has already invoked the function of smart contract B, the changes made by the function of smart contract B are reverted. In the end there is no state/blockchain change at all, because the function of smart contract A failed. In an architecture based on sharding this feature will be much more complicated to provide. Why? Because invoking the function of smart contract B may require communication between shards.
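The revert behaviour described above can be sketched with a snapshot of the state. This is a simplified model under my own assumptions, not the way the Ethereum Virtual Machine implements reverts: the state is a single dictionary, contracts are Python functions, and `Revert`, `contract_a_update`, `contract_b_store` and `execute` are hypothetical names.

```python
import copy


class Revert(Exception):
    """Raised by a contract function that fails."""


def contract_b_store(state, value):
    state["b_value"] = value          # B changes the state first


def contract_a_update(state, value):
    contract_b_store(state, value)    # A calls into B
    if value < 0:
        raise Revert("negative values are not allowed")  # then A fails
    state["a_calls"] = state.get("a_calls", 0) + 1


def execute(state, contract_function, *args):
    """Run a call atomically: keep all changes or none of them."""
    snapshot = copy.deepcopy(state)
    try:
        contract_function(state, *args)
        return True
    except Revert:
        state.clear()
        state.update(snapshot)        # undo B's change as well
        return False
```

The snapshot works here only because A and B live in the same state on the same node. If B lived on another shard, its change would be made by a different group of nodes, and undoing it would require coordination between shards instead of a local copy.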
The Ethereum community has expressed some fear about the future of the smart contract programming model that exists today. However, Vitalik seems calm and assures that these mechanics will not change (or at least that the possibility that exists today will remain available in some form).
Summary
Ethereum 2.0 is coming. The full migration from 1.0 to 2.0 will probably take a few years. The change will touch many areas of the blockchain. The scalability optimization in the form of shards will probably be introduced after the switch to the Proof-of-Stake consensus algorithm. According to the plan, we can expect the first changes in 2020.
Vitalik Buterin, the leader of Ethereum, seems confident about the upcoming changes and is determined to introduce the best possible solution. It’s a very responsible and pioneering role. Luckily a lot of ideas have been proposed and the brainstorming keeps going.
What do you think about sharding? Is it a good approach? Or maybe you think there are better solutions?
Let me know in the comments – Przemek.