Key Takeaways
- Blockchains can be either public or private, permissioned or trustless
- IBM Hyperledger and R3 Corda are two of the most widely used enterprise blockchains
- Deployment of real solutions is still limited and patchy
- The space is continuing to evolve and is in its early stages
- Enterprise adoption is still cautious
Navigating the blockchain space can be very challenging.
Many articles have been written about the subject, and a good number of them are filled with hot air and hype, as well as specialist technical jargon.
In this article, we will explain the difference between the two major branches of blockchain projects (public and private) as well as some fundamental technical terms related to the area.
This will allow us to address a fundamental question in the current discussion of blockchains and related solutions: What are the valid use cases for using a public, trustless blockchain vs a distributed private ledger vs a traditional database?
Some of the most important terms that are used in the blockchain space are:
- Trusted third-party - A system where certain facts (e.g. the identity of participants) cannot be verified except by referring to a privileged (often centralized) authority.
- Trustless - A system which relies on no trusted third-party for any aspect of its operation, including transaction confirmation or identity verification.
- Proof-of-work (POW) - Finding the solution to a mathematical puzzle (typically a hashing problem) that has no short-cut algorithm and so must be solved by brute computational force.
- Unspent transaction output (UTXO) - In some blockchains (e.g. Bitcoin), transactions consume inputs and leave some (the outputs) _unspent_. These unspent outputs are then available to become the inputs for future transactions.
- Virtual machine model - Some blockchains (notably Ethereum) have an abstract model of how the overall state of the system is represented and updated. This can usually be described formally as a virtual state machine, for example the Ethereum Virtual Machine (EVM).
- Smart contract - A small, event-driven program that can be deployed into a blockchain that supports program execution. Once deployed, the program will continue to execute, using blockchain transactions as inputs, and can take actions that cause further transactions to be executed. The code of the program is protected cryptographically.
- Turing-complete - A Computer Science term that can be read as "fully capable programming language". All mainstream languages, such as Java, JavaScript, Python, Ruby and Go, are Turing-complete. For technical reasons, some blockchains may choose not to make the full power of Turing-complete languages available to smart contract writers.
There is a lot more detail about each of these terms, but some of the most important aspects of them include the following:
- Most transaction systems are completely or partially reliant on some trusted third parties. Trustlessness, on the other hand, is a quite remarkable property, but it is not obtained cheaply - considerable extra complexity and effort must be expended to make a system behave trustlessly. The key to this is the POW algorithm for consensus used by many blockchains.
- Once a solution to a POW problem has been found (essentially by trial-and-error on a vast scale) then the correctness of the solution can be demonstrated by any participant immediately. Good POW problems have statistical properties that allow any observer to estimate reliably how much computation time was expended to stumble across a solution. This makes them suitable for use as a distributed consensus mechanism in public blockchains (e.g. Bitcoin).
- The UTXO model provides one simple route to ensure transaction integrity and prevent the same Bitcoins from being spent in two separate transactions (the _double-spending problem_). It does this by ensuring that any input to any transaction must appear in the collection of unspent outputs - the UTXO database.
- By contrast, the virtual machine model (notably implemented by Ethereum) offers a significant extension - the ability to store arbitrary state and run simple programs within the network in a trustless and fully decentralized manner.
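The double-spending rule from the UTXO model can be illustrated with a small sketch. This is a toy under stated assumptions, not Bitcoin's actual data structures: the names `Output`, `UtxoSet`, `add_genesis` and `apply` are hypothetical, and signatures, fees and scripts are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    tx_id: str   # id of the transaction that created this output (hypothetical format)
    index: int   # position of the output within that transaction
    amount: int


class UtxoSet:
    """Toy UTXO database: maps (tx_id, index) references to unspent outputs."""

    def __init__(self):
        self.unspent = {}

    def add_genesis(self, output):
        # Seed the set with an output that has no inputs (illustration only).
        self.unspent[(output.tx_id, output.index)] = output

    def apply(self, tx_id, input_refs, output_amounts):
        # An input may not be listed twice within one transaction.
        if len(set(input_refs)) != len(input_refs):
            raise ValueError("duplicate input reference")
        # Every input must currently be unspent; this is the double-spend check.
        for ref in input_refs:
            if ref not in self.unspent:
                raise ValueError(f"input {ref} is not in the UTXO set")
        total_in = sum(self.unspent[ref].amount for ref in input_refs)
        if sum(output_amounts) > total_in:
            raise ValueError("outputs exceed inputs")
        # Consume the inputs and record the new unspent outputs.
        for ref in input_refs:
            del self.unspent[ref]
        for i, amount in enumerate(output_amounts):
            self.unspent[(tx_id, i)] = Output(tx_id, i, amount)
```

Once a transaction has consumed an output, a second transaction that refers to the same output is rejected, because the reference is no longer present in the set.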
With these main definitions (and some consequences) clarified, we can now take a look at some of the main blockchain-based technologies in use in the world today.
Bitcoin
The original cryptocurrency, which uses the unspent transaction output (UTXO) model for the ledger. It uses a simple POW algorithm for mining, based on guessing a random string which, when combined with the last transaction block, causes the SHA-256 hash of the composite to be numerically less than a small threshold value.
The participant that successfully guessed the answer is said to have "mined a block", and the transactions that were contained in the block are added to the ledger.
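The guessing process can be sketched in a few lines. This is a deliberately simplified illustration, following the description above rather than the real protocol: Bitcoin itself hashes a structured block header with SHA-256 applied twice, whereas this toy simply concatenates some previous block bytes with a counter. The function names and the threshold used are assumptions made for the example.

```python
import hashlib


def block_hash(previous_block: bytes, nonce: int) -> int:
    """Hash the previous block bytes together with a candidate nonce."""
    data = previous_block + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def mine(previous_block: bytes, threshold: int) -> int:
    """Brute-force search: try nonces in order until the hash is below the threshold."""
    nonce = 0
    while block_hash(previous_block, nonce) >= threshold:
        nonce += 1
    return nonce


def verify(previous_block: bytes, nonce: int, threshold: int) -> bool:
    """Checking a claimed solution takes a single hash computation."""
    return block_hash(previous_block, nonce) < threshold
```

With a threshold of `2**244` (the top 12 bits of the 256-bit hash must be zero), `mine` needs a few thousand attempts on average, while `verify` always needs exactly one. Lowering the threshold makes the search harder without making verification any more expensive.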
This then allows a very simple method of determining consensus - participants simply agree that the longest chain should be regarded as the basis for mining the next transaction block.
Bitcoin was not designed to accommodate smart contracts. As a result, extremely limited functionality is available, mostly through the novel use of side channels that happen to exist within the protocol. The resulting functionality is in no way Turing-complete, and most analyses of Bitcoin simply ignore it.
The resulting system is therefore purely that of a cryptographically secure ledger.
It has no identity semantics beyond the use of cryptographic signatures to verify the origin of transactions, and is completely trustless.
Ethereum
This ledger builds on some of the ideas of Bitcoin, but models the state of a single, global virtual machine rather than using the UTXO model. The key innovation is the addition of a Turing-complete smart contract capability. This is the Ethereum virtual machine (EVM), a VM created specifically for use in the context of a distributed ledger with smart contracts.
In Ethereum, program state is private and belongs to individual contract addresses, and is altered by a series of EVM bytecode instructions, which are the contents of smart contracts.
The overall, global state is then derived by aggregating the program state of each contract address.
All full nodes in the Ethereum network follow the model's rules. Each node can calculate the state of any contract address on its own machine, and as long as the nodes use the same transactions (which constitute the input data within the Ethereum model), they will arrive at the same result.
Because Ethereum uses a global consensus algorithm and has a concept of a globally latest block, the overall transaction processing rate (i.e. the effective _clock speed_) of the Ethereum virtual machine is limited by the block production rate. Adding more hardware and computing power to the Ethereum network does not make it any quicker or more powerful, merely more tamper-proof.
The use of Turing-complete smart contracts allows additional functionality to be added on top of the network without all participants needing to be aware of it. This allows, for example, the Ethereum network to issue software tokens that are held as additional state within the Ethereum virtual machine. This forms the basis of so-called Initial Coin Offerings (ICOs).
The EVM is superficially similar to the JVM and similar environments, but makes different design choices in some important areas. In particular, the design of EVM bytecode makes static analysis of compiled code much harder than for established alternatives. This is not a selling point for an execution environment that requires a very high degree of transparency and verifiability.
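The idea that replaying the same transactions always yields the same state can be shown with a toy state machine. This is not the EVM: the transaction format `(address, key, delta)` and the functions `apply_transaction` and `replay` are hypothetical, chosen only to show a deterministic state transition over per-address storage.

```python
def apply_transaction(state, tx):
    """Return a new global state with one toy transaction applied.

    The state maps a contract address to that contract's private storage.
    A transaction adds `delta` to one storage slot of one address.
    """
    address, key, delta = tx
    storage = dict(state.get(address, {}))   # copy, so the old state is untouched
    storage[key] = storage.get(key, 0) + delta
    new_state = dict(state)
    new_state[address] = storage
    return new_state


def replay(transactions):
    """Derive the global state by applying every transaction in order."""
    state = {}
    for tx in transactions:
        state = apply_transaction(state, tx)
    return state
```

Because `apply_transaction` depends only on the previous state and the transaction, any two nodes that call `replay` on the same ordered list of transactions end up with identical state, which is the property that lets independent nodes agree without trusting each other's results.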
The low-level EVM environment is also not a particularly human-friendly programming environment. As a result, several higher-level languages have been created that compile down to EVM bytecode. Of these, the most well-known is Solidity.
Inspired by more mainstream programming languages, such as JavaScript and Java, the Solidity language also incorporates novel features for interacting with the Ethereum blockchain.
In some respects, Ethereum is a victim of its own success. Its emergence as the platform of choice for ICOs led to a high price for ETH (the Ethereum token) in early 2018. The amount of this cryptocurrency held by the Ethereum team themselves led to a situation where many of the major players have a large vested interest in the status quo, and in wanting to realize their paper profits.
Corda
R3 Corda uses the UTXO model (like Bitcoin) but also includes Turing-complete smart contracts as part of the design. These contracts are represented as JVM bytecode, with optional additional determinism guarantees that restrict contract semantics.
The approach does not use a single global lock (block height) to control advancement of ledger state, but instead allows non-conflicting transactions to proceed in parallel. This effectively fine-grains the lock, at the cost of requiring a more complex and subtle notion of time and clocks. In the Corda model, the simple "longest chain wins" rule is no longer sufficient as a consensus algorithm.
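One way to picture a fine-grained lock is a check that looks only at the states a transaction consumes. The sketch below is an illustration of that idea under our own assumptions, not Corda's API: the class `UniquenessChecker` and its `try_commit` method are hypothetical names.

```python
class UniquenessChecker:
    """Toy conflict check: a transaction is accepted only if none of the
    input states it consumes has already been consumed by another one."""

    def __init__(self):
        self.consumed = {}  # input state reference -> id of the consuming transaction

    def try_commit(self, tx_id, inputs):
        # A clash on any single input means the whole transaction is rejected.
        if any(ref in self.consumed for ref in inputs):
            return False
        for ref in inputs:
            self.consumed[ref] = tx_id
        return True
```

Two transactions that consume disjoint sets of states never interact in this check, so they can be accepted in either order without reference to a global block height. Only transactions that compete for the same state need to be ordered relative to one another.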
Participants are known and trusted, as third-party identity verification is a natural part of the Corda architecture.
As a side effect of the desire to remove the global ledger lock, Corda includes exit / entry semantics to disentangle transactions from the main chain and to prevent all transactions from becoming entwined over time. This is possible as the architecture strongly depends upon the trusted third-party model.
As a result, the authority of certain participants to retire ledger items (the equivalent of coins in cryptocurrencies) and replace them with freshly minted equivalents that have no transaction history can be guaranteed by the identity authorities.
The creators of Corda do not see it as being that similar to a cryptocurrency.
Instead, they regard the technology as forming the basis of shared infrastructure at the whole-market level, rather than at the level of an individual company.
This positions Corda as an enterprise blockchain intended for use by organisations that can benefit from common infrastructure and a shared view of the state of the world, rather than maintaining separate versions of records, which inevitably lead to reconciliation problems.
Hyperledger
The Hyperledger project, in which IBM has been a leading contributor, is another of the leading enterprise blockchain solutions.
The Hyperledger Fabric is a blockchain framework implementation and one of the Hyperledger projects hosted by the Linux Foundation.
IBM's primary design goals for the project include confidentiality, resiliency, flexibility, and scalability.
Like Corda, Hyperledger uses a permissioned architecture.
It implements a deterministic Practical Byzantine Fault Tolerance (PBFT) algorithm, which ensures that once a transaction-completed notification is received, the transaction is genuinely done.
IBM have invested in solid Docker integration, including testing within containers.
Smart contracts for Hyperledger can be written in Java, with an SDK available (Go contracts are also possible).
Hyperledger separates nodes by roles, which include full peers, certificate authority nodes (needed for the permissioning) and orderers that group transactions into blocks.
The Hyperledger blockchain state is modelled as a versioned key-value store (KVS), where keys are names (strings) and values are arbitrary blobs.
This is a very low-level interface, above which Hyperledger provides a layer called Ledger, which provides a verifiable history of all successful state changes.
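The combination of a versioned key-value store and a verifiable history can be sketched as follows. This is a toy under stated assumptions rather than Hyperledger's implementation: the class `VersionedStore`, its method names and the use of a SHA-256 hash chain to make the history checkable are all choices made for this example.

```python
import hashlib

GENESIS_HASH = b"\x00" * 32  # placeholder hash that precedes the first log entry


class VersionedStore:
    """Toy versioned key-value store with a hash-chained change log."""

    def __init__(self):
        self.state = {}  # key (str) -> (version, value bytes)
        self.log = []    # entries of (key, version, value, entry_hash)

    @staticmethod
    def _entry_hash(prev_hash, key, version, value):
        digest = hashlib.sha256()
        digest.update(prev_hash)
        digest.update(key.encode("utf-8"))
        digest.update(version.to_bytes(8, "big"))
        digest.update(value)
        return digest.digest()

    def put(self, key, value):
        """Store a new value, bump the key's version and append to the log."""
        version = self.state[key][0] + 1 if key in self.state else 1
        prev_hash = self.log[-1][3] if self.log else GENESIS_HASH
        entry_hash = self._entry_hash(prev_hash, key, version, value)
        self.log.append((key, version, value, entry_hash))
        self.state[key] = (version, value)
        return version

    def get(self, key):
        """Return (version, value) for the key, or None if it was never set."""
        return self.state.get(key)

    def verify_log(self):
        """Recompute the hash chain; any edited entry breaks the check."""
        prev_hash = GENESIS_HASH
        for key, version, value, entry_hash in self.log:
            if self._entry_hash(prev_hash, key, version, value) != entry_hash:
                return False
            prev_hash = entry_hash
        return True
```

Each call to `put` increments the version of the key and extends the log, so the current state in `state` can always be cross-checked against the full sequence of changes that produced it.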
The code and architecture of Hyperledger are still evolving quickly, but actual production trials are starting to appear, and it is emerging alongside Corda as one of the solutions more likely to be used by enterprises.
Use cases
The use cases for blockchains are still being hotly debated.
There is the obvious example of censorship-resistant digital currencies.
However, the volatility and fragmentation seen in the cryptocurrency market during 2018 seem to suggest that the actual applicability of trustless digital currencies is limited.
From the enterprise perspective, it is becoming clear that blockchains can also be used to create systems or networks that are shared between multiple entities that don't necessarily trust each other, yet want to share data and maintain a form of consensus about concerns that all parties care about.
These use cases, where a centralized authority is unacceptable to the participants, or too costly to set up, are still emerging.
This is despite the time, effort and venture capital that have been deployed into the wide array of blockchain projects created to date.
As more projects come to market as we move into 2019, it remains to be seen whether the promise of blockchain will ever amount to the major impact that its advocates have now been promising for quite some time.
About the Author
Ben Evans is a co-founder of jClarity, a JVM performance optimization company. He is an organizer for the LJC (London's JUG) and a member of the JCP Executive Committee, helping define standards for the Java ecosystem. Ben is a Java Champion, a 3-time JavaOne Rockstar Speaker, and the author of "The Well-Grounded Java Developer", the new edition of "Java in a Nutshell" and "Optimizing Java". He is a regular speaker on the Java platform, performance, architecture, concurrency, startups and related topics. Ben is sometimes available for speaking, teaching, writing and consultancy engagements - please contact for details.