Intrigued by Corda? Look no further: this is a compendium of learning resources to take you from novice to specialist.
Like many Enterprise Distributed Ledger Technologies (DLT), Corda is nascent but burgeoning, and is quickly gaining market share, particularly in financial & capital markets. To understand why, check out my article on Choosing an Enterprise Blockchain.
Some notable heavyweights currently working with Corda include Nasdaq, SIX (SDX), HSBC, and B3i as well as numerous consortiums such as Marco Polo and Contour. There are hundreds more initiatives in stealth mode, either working directly with R3 or with startups building business networks and applications on Corda — such as Ivno and many others.
Those paying attention will likely be keen to capitalise on the growing demand within the Corda ecosystem for talented technologists, executives, marketers and serial entrepreneurs. Similarly, there are no doubt those who count a wealth of experience in financial markets and are eager to be part of the newest paradigm in the industry.
My intention here is to curate links on a number of topics as the industry advances. This page should be considered a perpetual work in progress that is updated over time. Those with an interest should bookmark it, then cherry-pick the subjects relevant to their goals and upskill on those. We will cover everything from high-level concepts down to line-by-line code tutorials, so use your intuition to gauge your current level of skill and your own critical path to advanced knowledge.
For example, experienced executives are likely to benefit greatly from understanding macro topics, following industry blogs and the like, but are likely to experience diminishing returns from understanding the finer details of how to apply a Reference State.
On the flipside a senior developer or architect should understand Corda architectural concepts in detail, and have an understanding of how to apply complex notions such as Confidential Identities through application and testing.
To aid you in your selection of relevant topics, I will designate each link with the following key(s): [ A = Architect; E = Executive; D = Developer ]
(note: I won’t split developers into sub-groups such as QA, Infra, etc — if you are already a specialist you will be qualified to do this yourself, and each subgroup will benefit from learning the macro subjects regardless).
Do not be afraid to diverge from your intended end goal into other topics or keys: the generalist will become the most important archetype in your current or future organisation. By all means specialise, but compound your knowledge by drawing on other verticals. Trust me or David Epstein on this.
- Start by learning Java and the Java Virtual Machine, or Kotlin — or both!
[ A, D ]
- Serialisation using the AMQP protocol, and why this is vitally important in a network which cannot assume constant connectivity.
[ A, D ]
- Structured Query Language, AKA SQL. Corda supports PostgreSQL (Open Source Corda), as well as SQL Server, Oracle, Azure SQL (Corda Enterprise).
[ A, D ]
- Remote Procedure Call Protocol (or RPC), used by Corda, as well as Representational State Transfer (REST), and the important differences between the two when building APIs.
[ A, D ]
- Asynchronous programming and State Machines. Corda uses ‘checkpoints’ written to a Corda Node database to ensure asynchronous flows are not thread-blocking and may persist — able to wait dormant for a response until the State Machine is ready to be transitioned. [ A , D ]
- Comparisons of Corda with other platforms, including public Blockchains (Bitcoin & Ethereum) (p.18) as well as Enterprise Blockchains (Enterprise Ethereum & Hyperledger Fabric) (p.19) [ A, D, E ]
- The UTXO computational model. [ A, D ]
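To make the UTXO model a little more concrete, here is a minimal Java sketch. It is a toy ledger rather than Corda code: the names `UtxoSketch`, `OutputRef`, `Output` and `Transaction` are illustrative assumptions of mine, and signatures, contracts and value conservation are deliberately left out. It shows the two ideas that matter most: a balance is simply the sum of the unspent outputs you own, and an output can be consumed only once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Toy UTXO ledger. Illustrative only; none of these types are Corda APIs.
final class UtxoSketch {
    // An output is addressed by the transaction that created it and its position there.
    record OutputRef(String txId, int index) {}

    record Output(String owner, long amount) {}

    record Transaction(String id, List<OutputRef> inputs, List<Output> outputs) {}

    // The whole "state of the world" is just the set of outputs not yet consumed.
    private final Map<OutputRef, Output> unspent = new HashMap<>();

    // Accepts a transaction only if every input is distinct and currently unspent.
    boolean apply(Transaction tx) {
        boolean distinct = new HashSet<>(tx.inputs()).size() == tx.inputs().size();
        if (!distinct || !unspent.keySet().containsAll(tx.inputs())) {
            return false; // unknown input or attempted double-spend
        }
        for (OutputRef ref : tx.inputs()) {
            unspent.remove(ref); // consume the inputs
        }
        for (int i = 0; i < tx.outputs().size(); i++) {
            unspent.put(new OutputRef(tx.id(), i), tx.outputs().get(i)); // create new outputs
        }
        return true;
    }

    // There is no stored account balance: it is derived from unspent outputs.
    long balanceOf(String owner) {
        return unspent.values().stream()
                .filter(o -> o.owner().equals(owner))
                .mapToLong(Output::amount)
                .sum();
    }

    public static void main(String[] args) {
        UtxoSketch ledger = new UtxoSketch();
        // Issuance: no inputs, one output.
        ledger.apply(new Transaction("tx1", List.of(), List.of(new Output("Alice", 100))));

        // Alice pays Bob 60 and returns 40 to herself as change.
        Transaction pay = new Transaction("tx2",
                List.of(new OutputRef("tx1", 0)),
                List.of(new Output("Bob", 60), new Output("Alice", 40)));

        System.out.println(ledger.apply(pay));          // true: first spend is accepted
        System.out.println(ledger.apply(pay));          // false: the input is already consumed
        System.out.println(ledger.balanceOf("Alice"));  // 40
        System.out.println(ledger.balanceOf("Bob"));    // 60
    }
}
```

The rejected replay is the same property that a notary's uniqueness check protects in Corda, although the real platform works with signed transactions and States rather than a simple in-memory map.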
Work towards becoming Corda Certified, understand the principles and vision behind the Corda platform, and understand how to utilise the key concepts (Flows, States, Contracts, etc) to begin building your own CorDapps. Getting hands-on with code paired with a healthy amount of theory is the best way to progress here.
- Reading the Corda Whitepaper will impart the vision, features and concepts within Corda. [ A, D, E ]
- Richard Gendal Brown explains the genesis of Corda and its original tenets. [ A, E ]
- The Corda Documentation is of course the holy bible of reference material for Corda. It covers high-level concepts as well as low-level implementation detail. [ A, D ]
- Read the Corda Training materials [ A, D, E ]
- Sign up for a Corda Bootcamp for scheduled in-person or online training. If the timings don’t work, peruse the bootcamp material yourself or try the Udemy course. [ A, D, E ]
- If face-to-face classroom schedules are preferred, or you have a budget available, try LearnQuest or B9lab. [ A, D, E ]
- A Corda Node ‘Vault’, or relational database. [ A, D ]
- State Objects represent an agreement between two or more parties, governed by machine-readable contract code (below). [ A , D ]
- Contract Code takes a transaction (below) as input, and determines whether the transaction may be considered valid based on defined contract rules. [ A, D ]
- Commands indicate the intent of a transaction, so the same input state may lead to different output states depending on the command used. A command specifies what a transaction may do. [ A, D ]
- The Flow Framework — a highly extensible and re-usable framework which enables parties to coordinate actions in a decentralised but agreed fashion. [ A, D ]
- Transactions transition State objects through a lifecycle. [ A, D ]
- Attachments allow transactions to specify an ordered list of zip file hashes, which may contain code or reference data (written contracts or documents, perhaps) relevant to the transaction. [ A, D ]
- Timestamps and time restrictions within a distributed technology are a famously difficult problem, because each node relies on its own unsynchronised clock and because of issues that might arise around latency or downtime.
Corda solves this with the use of time windows, with the source of truth being the transaction notary. [ A, D ]
- Consensus, which in Corda comes in two forms:
Validity consensus, whereby parties to a transaction can agree that a proposed lifecycle update is valid.
Uniqueness consensus, whereby parties can be certain that a proposed transaction is the unique consumer of its input states. [ A, D, E ]
- Validating & Non-validating Notaries.
Validating notaries are essentially a counterparty within the transaction, as they will fully resolve transaction chains and all of the accompanying data. Be conscious that validating notaries will have a copy of potentially sensitive data from your CorDapp (if the application lacks mitigating privacy techniques).
Non-validating notaries assume the validity of a transaction and only provide double-spend protection, ensuring the States presented within the transaction are unspent.
Be aware of a security concern with non-validating notaries. Because such a notary is uninterested in the validity of a transaction or in who is consuming a State, anybody who knows the hash and index of a State can consume it without in fact being its rightful owner. [ A, D, E ]
- Determinism & the Corda Sandbox JVM. [ A, D ]
- The Global Corda Network, which may be thought of as Corda’s ‘internet’ & The Corda Network Foundation. James Carlyle outlines the top 5 facts about the Corda Network. [ A, D, E ]
- A network map is needed to publish information on how to connect to other nodes on a Corda network. It is used as a means of abstracting network IPs in favour of well-known identities or LEIs.
Some further background on the introduction of a network map in Corda 3, by Mike Hearn. More by Joel Dudley. [ A, D ]
- Business Networks. Corda 4.6 has introduced R3’s Business Network Framework. Martin Jee has a concise explainer of the difference between Corda Networks and Business Networks. Richard Brown, R3’s CTO, elucidates further. [ A, D, E ]
- CorDapp Design Language (CDL), diagram-driven CorDapp representations which should be considered a must-learn for any Corda developer or technologist. [ A, D ]
- Quasar as a transformation engine to enable suspendable fibers & checkpointing of Flows. This is a vital part of Corda’s networking function. In the event your Flow counterparty is unable to respond immediately, Quasar ensures that the thread doesn’t simply hang awaiting a response. Clearly this would slow down and eventually stop the node altogether. Quasar allows flows to be checkpointed and serialised to disk, freeing the thread until a response is returned from the counterparty. [ D ]
- Corda is architected to support volumes in excess of billions of daily transactions across a single network. The most high profile example of heavy throughput so far was achieved by Accenture & DTCC with R3’s support — achieving 115 million requests in a single day with only 170 nodes. [ A, E ]
- A State should be entrusted to a single Notary throughout its lifecycle to ensure the protection of uniqueness consensus. A Notary without the most recent, unspent version of a State is at risk of facilitating a double-spend attack. Therefore, to use more than one Notary you must use a Notary change Transaction. A technical explanation from Joel Dudley. [ A, D ]
- Contracts in Corda often represent real-life business logic, which is usually complex and changes regularly as workflows become more efficient. For this reason CorDapps are upgradeable.
Contract Constraints ensure that upgrades are safer from malicious developers who might look to co-opt your CorDapp and take control of it.
A Hash Constraint ensures that exactly one version of the CorDapp can be used, preventing it from being upgraded.
Signature Constraints allow upgrades to new versions only when they are signed by explicit keys, or by Composite Keys. [ A, D ]
- Networking & messaging protocols in Corda on a need-to-know basis (lazy propagation), and why Corda avoids gossip protocol for proprietary data.
[ A, D ]
- Message delivery in Corda does not assume an always-live implementation, allowing for node downtime without loss of transaction messaging. Messages are written to disk and delivery is retried to offline entities until the counterparty acknowledges receipt of the message.
Corda uses Apache Artemis as a message broker (speaking the AMQP 1.0 network protocol), combined with TLS to ensure message security in transit and authentication between endpoints. [ A, D ]
- Corda Settler, a method to integrate off-ledger payment rails such as SWIFT GPI or XRP. The Settler design can be used to integrate existing non Corda-native settlement options. Of course, if you’re looking for an on-ledger settlement capability, reach out to Ivno! [ A, D, E ]
- Token SDK for issuing Corda native interoperable tokens. Factsheet, Corda docs and Github, and a great tutorial by ecosystem heavyweight Adel Rustum (make sure to follow Adel on Medium). [ A, D ]
- API layer generators, such as Braid & OpenAPI, under the expert guidance of Farzad Pezeshkpour & the accomplished team at LAB577. Gitlab here.
[ A, D ]
- Corda Accounts SDK opens up the possibility for multi-tenanted nodes, increasing scale and greatly reducing the cost of infrastructure and deployment within organisations, or allowing for digital custody for Corda assets and many other use cases. Github. [ A, D, E ]
- Reference States allow a transaction to include shared ledger facts (States) without consuming them or executing their contract logic, thus allowing for constant re-use and continued reference.
Obvious use-cases include KYC attestations and Financial Instrument data.
Chain of provenance and dependent transactions are still verified to satisfy transaction counterparties of the Reference State’s validity.
Still confused? Allow Roger Willis to explain more lucidly than I ever could! [ A, D ]
- Notary Pools (or clusters) & “Pluggable” consensus.
Notaries may be clustered to utilise consensus algorithms such as RAFT (high-speed, high-trust) or BFT (low-speed, low-trust), for example. Corda is not tied to any particular consensus algorithm, instead allowing for immense flexibility, hence ‘plugging’ consensus algorithms to suit the business case. [ A, D ]
- Corda Oracles, allowing for ledger-external data to be used to determine contract validity whilst retaining determinism. Dan Newton fleshes out the concept offering when, how and why to use an Oracle service. [ A, D ]
- A Corda network is always ‘permissioned’, in the sense that access must be granted. Further to this, a Corda network can be mostly public — such as the canonical Corda Network (an ‘internet’)— or mostly private — for example, within an organisation and equivalent to an ‘intranet’.
Corda networks offer various tuneable parameters such as an acceptable list of notaries, minimum accepted Corda version or the network horizon period — which determines the span of time allowed to elapse before an offline node is considered no longer a member of the network. [ A, D ]
- The Flow Hospital is where ‘stopped’ flows are sent. In some cases they may retry automatically (e.g. after database deadlocks), but in many cases they require developer input to fix them or to force a retry. Corda 4.6 has added a HospitalizeFlowException, which gives a developer much more control: flows may be sent to the hospital deliberately for subsequent retries as desired. [ D ]
- Dependency resolution (or backchain-checking) is a complex subject. For a State to be considered valid, all counterparties within the transaction must be certain of the validity of the chain of provenance right back to issuance. Corda does this using the ResolveTransactions flow.
Be conscious that with certain highly-liquid assets, these chains of transactions are liable to become lengthy and will thus take longer to validate.
Consider backchain privacy also. For a transaction to be resolved, future and existing holders of the asset must also see previous asset movements in order to truly validate the chain. This might present a privacy leak and provide data on market positions. Confidential Identities or “Tear-offs” as well as State re-issuance can be considered, but each solution has drawbacks.
It is likely Zero Knowledge Proofs or Conclave will provide effective ways of shielding backchain privacy in future versions of Corda. [ A , D ]
- Within Ethereum, famously “code is law”. Within Corda, law is law. Use Attachments to include legal prose, along with @LegalProseReference annotations, so that established legal procedures may underlie your CorDapp functionality. This will often provide comfort to large enterprises.
[ A, D, E ]
- Scheduling flows to start at specific times can be useful for a number of reasons, such as expiry of an option contract or various other financial contracts. States can implement the SchedulableState interface for this functionality. [ A, D ]
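As a rough illustration of the scheduling idea just described, here is a small Java sketch. It does not use the Corda API: the `Schedulable` interface, the `OptionContract` record and the `due` helper are hypothetical stand-ins of mine. The assumption is simply that a state can report when it next needs attention, and that something on the node periodically asks which states are due.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy model of time-based scheduling. Illustrative only; not the Corda SchedulableState API.
final class SchedulingSketch {
    // Hypothetical stand-in: a state that can ask to be woken up at a given time.
    interface Schedulable {
        Optional<Instant> nextActivity();
    }

    // An option contract wants attention at expiry, unless it has already been exercised.
    record OptionContract(String id, Instant expiry, boolean exercised) implements Schedulable {
        @Override
        public Optional<Instant> nextActivity() {
            return exercised ? Optional.empty() : Optional.of(expiry);
        }
    }

    // Returns the states whose next activity time is at or before 'now'.
    static <T extends Schedulable> List<T> due(List<T> states, Instant now) {
        List<T> result = new ArrayList<>();
        for (T state : states) {
            Optional<Instant> at = state.nextActivity();
            if (at.isPresent() && !at.get().isAfter(now)) {
                result.add(state);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<OptionContract> vault = List.of(
                new OptionContract("A", Instant.parse("2021-01-01T00:00:00Z"), false),
                new OptionContract("B", Instant.parse("2021-06-01T00:00:00Z"), false),
                new OptionContract("C", Instant.parse("2021-01-01T00:00:00Z"), true));

        Instant now = Instant.parse("2021-03-01T00:00:00Z");
        // Only "A" is due: "B" has not expired yet and "C" was already exercised.
        for (OptionContract contract : due(vault, now)) {
            System.out.println("Start expiry flow for " + contract.id());
        }
    }
}
```

In real Corda the node does the polling for you and starts a flow when the time arrives; consult the SchedulableState documentation for the actual interface.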
Testing, Samples & Tools
- Corda Samples in Kotlin and Java: split into Accounts, Advanced, Basic, Features & Tokens. [ D ]
- MockNetworkDriver and writing Flow Tests. [ D ]
- Corda Samples. [ D ]
- Node Explorer & Github. [ D ]
- Corda Performance Sampling using JMeter. [ D ]
- Testacles, Corda’s Node Driver & JUnit5, hanging in unison. [ D ]
- VSCode Corda extension. [ D ]
- Cordaptor and a description of Corda integration options by Igor Lobanov. [ D ]
- Chainstack offer a speedy deployment onto Corda Networks. [ A, D ]
Key management & security
- Non-verified keys are typically used to assign keys not stored on the node. For example, signing keys can be stored in off-node hardware in order to move signing authority away from the node infrastructure provider or operator, and back into the hands of the business owner.
This does of course mean that proper key storage, generation and backup needs to be undertaken by the off-ledger entity and is no longer governed by the Corda framework. [ A, D, E ]
- Corda supports the use of Hardware Security Modules, or HSMs, which are in many cases already used by large entities or financial institutions. HSMs are often seen as the gold standard for key storage and tamper-proof protection, but are not a cure-all.
HSMs rarely provide mechanisms to detect improper usage of the key, so control over the Corda node is still implicit control over the underlying signing key stored in the HSM. They also often lack support for multi-signature or quorum key signing, and can be cumbersome to maintain (for example, with ledger-specific curve upgrades). Be aware of such limitations before utilising them within Corda.
[ A, D, E ]
- Composite Keys enable certain on-ledger actions to require a complex quorum of keys utilising AND/OR logic.
There are a number of use-cases for composite keys. For example, a transaction could be considered signed once the full quorum of (Alice AND Bob AND Charlie AND Dilbert) has signed it, OR once Egbert alone has signed it. Composite keys can therefore be used to reflect internal company structure and management (Egbert), or to provide a cryptographic four-eyes check through key signatures (the quorum).
Composite keys should also be used within Contract Constraints when upgrading CorDapps for extra protection against rogue developers. [ A, D ]
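The AND/OR logic of the Alice-to-Egbert example above can be sketched in a few lines of Java. This is a toy model, not Corda's CompositeKey class: the `Key`, `Leaf` and `Threshold` types and the `allOf` / `anyOf` helpers are names I have made up, and signers are represented by plain names rather than real cryptographic signatures.

```java
import java.util.List;
import java.util.Set;

// Toy composite key tree. Illustrative only; not Corda's CompositeKey implementation.
final class CompositeKeySketch {
    interface Key {
        boolean isFulfilledBy(Set<String> signers);
    }

    // A single party's key, satisfied when that party has signed.
    record Leaf(String owner) implements Key {
        @Override
        public boolean isFulfilledBy(Set<String> signers) {
            return signers.contains(owner);
        }
    }

    // Satisfied when at least 'threshold' of the child keys are satisfied.
    record Threshold(int threshold, List<Key> children) implements Key {
        @Override
        public boolean isFulfilledBy(Set<String> signers) {
            long satisfied = children.stream().filter(c -> c.isFulfilledBy(signers)).count();
            return satisfied >= threshold;
        }
    }

    // AND: every child must be satisfied.
    static Key allOf(Key... keys) {
        return new Threshold(keys.length, List.of(keys));
    }

    // OR: any one child is enough.
    static Key anyOf(Key... keys) {
        return new Threshold(1, List.of(keys));
    }

    // (Alice AND Bob AND Charlie AND Dilbert) OR Egbert
    static Key quorumOrEgbert() {
        Key quorum = allOf(new Leaf("Alice"), new Leaf("Bob"), new Leaf("Charlie"), new Leaf("Dilbert"));
        return anyOf(quorum, new Leaf("Egbert"));
    }

    public static void main(String[] args) {
        Key key = quorumOrEgbert();
        System.out.println(key.isFulfilledBy(Set.of("Egbert")));                              // true
        System.out.println(key.isFulfilledBy(Set.of("Alice", "Bob", "Charlie", "Dilbert")));  // true
        System.out.println(key.isFulfilledBy(Set.of("Alice", "Bob")));                        // false
    }
}
```

Because a threshold node can itself be a child of another threshold node, fairly elaborate signing policies can be built from these two building blocks.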
Identity management & Permissions
- Tooling to create and maintain private Corda Networks or testing environments: Corda Enterprise Network Manager, and Cordite’s NMS
[ A, D ]
- Corda on Kubernetes: Part 1 & Part 2, & Github. [ A, D ]
- Governance in Corda, and more acutely governing within the canonical Corda Network. Also Why Public-Permissioned Blockchains are not an oxymoron. [ A, E ]
The Future of Corda
- Conclave. I cannot overstate the potential importance of Conclave for Corda, and indeed for a number of other non-Corda use cases. See a factsheet, and a blog entry by Mike Hearn, Conclave’s Lead Platform Engineer as well as one of the R3 OGs. The CordaCon 2020 presentation should be viewed also.
- Micronodes and machine identities are detailed in the Corda Technical Whitepaper, as well as during Mike Hearn’s 2019 CordaCon keynote.
- Key recovery via Shamir’s secret sharing scheme or key rotation, whereby backup keys may be generated and rotated should the initial key become compromised or lost.
- Physical hardware-signing keys held by the node operator can offer significant protection via multi-factor authentication.
- Dynamic Data Distribution Groups are an attempt to facilitate dynamic membership and subsequent distribution of (transaction) data to said membership.
When joining a network, a new member may need existing data to be disseminated to it in order to ‘catch up’ with other participants and thus participate efficiently in the network. The traditional approach to achieving this would be to include all members as participants of each and every transaction. Beyond the privacy issues, this also becomes cumbersome and near impossible to maintain at high transaction velocities or with large membership groups.
Streaming observable transactions to all members is a much more efficient way of achieving this.
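Of the items in this list, key recovery via Shamir's secret sharing scheme is the easiest to make concrete. Below is a minimal Java sketch of the scheme itself. It is not taken from Corda or any R3 product: the class name `ShamirSketch`, the choice of the prime 2^127 - 1 as the field, and the 3-of-5 parameters are all assumptions for illustration, and production code should use a vetted library instead.

```java
import java.math.BigInteger;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;

// Toy Shamir secret sharing over a prime field. Illustrative only; not production cryptography.
final class ShamirSketch {
    // Field modulus (assumed for this sketch): the Mersenne prime 2^127 - 1.
    static final BigInteger PRIME = BigInteger.TWO.pow(127).subtract(BigInteger.ONE);

    record Share(BigInteger x, BigInteger y) {}

    // Splits 'secret' into 'shares' points on a random polynomial of degree threshold - 1.
    static List<Share> split(BigInteger secret, int shares, int threshold, SecureRandom rng) {
        if (threshold < 1 || threshold > shares
                || secret.signum() < 0 || secret.compareTo(PRIME) >= 0) {
            throw new IllegalArgumentException("invalid parameters");
        }
        BigInteger[] coeffs = new BigInteger[threshold];
        coeffs[0] = secret; // the constant term is the secret
        for (int i = 1; i < threshold; i++) {
            // Reducing mod PRIME introduces a negligible bias, acceptable for a sketch.
            coeffs[i] = new BigInteger(PRIME.bitLength(), rng).mod(PRIME);
        }
        List<Share> result = new ArrayList<>();
        for (int x = 1; x <= shares; x++) {
            BigInteger bx = BigInteger.valueOf(x);
            BigInteger y = BigInteger.ZERO;
            for (int i = threshold - 1; i >= 0; i--) {
                y = y.multiply(bx).add(coeffs[i]).mod(PRIME); // Horner evaluation
            }
            result.add(new Share(bx, y));
        }
        return result;
    }

    // Recovers the secret by Lagrange interpolation at x = 0.
    static BigInteger combine(List<Share> shares) {
        BigInteger secret = BigInteger.ZERO;
        for (Share i : shares) {
            BigInteger num = BigInteger.ONE;
            BigInteger den = BigInteger.ONE;
            for (Share j : shares) {
                if (i.x().equals(j.x())) continue;
                num = num.multiply(j.x().negate()).mod(PRIME);
                den = den.multiply(i.x().subtract(j.x())).mod(PRIME);
            }
            BigInteger term = i.y().multiply(num).multiply(den.modInverse(PRIME)).mod(PRIME);
            secret = secret.add(term).mod(PRIME);
        }
        return secret;
    }

    public static void main(String[] args) {
        BigInteger key = new BigInteger("123456789012345678901234567890");
        List<Share> shares = split(key, 5, 3, new SecureRandom());
        // Any three of the five shares are enough to rebuild the key.
        BigInteger recovered = combine(List.of(shares.get(0), shares.get(2), shares.get(4)));
        System.out.println(recovered.equals(key));
    }
}
```

The appeal for key recovery is that shares can be handed to separate custodians: a threshold of them can rebuild a lost key, while fewer than the threshold cannot reconstruct it.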