DISCIPLINA Trustless Data Trade Protocol
One of the major goals of the DISCIPLINA blockchain is to provide a way for Educators to monetize the data on the achievements of their students. What if recruiters could search through validated educational records and buy verified transcripts of the students they are interested in? This would open up a lot of new possibilities and might transform the world of recruiting as we know it.
It is quite difficult to implement this idea correctly, because there are many opportunities for fraud. For example, imagine that an educational institution announces it has lots of data, and a recruiter pays lots of money for it. The institution receives the money and can then send the recruiter useless data, because it never actually had the data it advertised. How do we avoid such situations?
The solution is trivial if there is a trusted intermediary that can observe the data transfer and hold the money. There are also so-called “optimistic fair exchange” algorithms, where the intermediary acts as a judge and does not always have to be online. However, data exchange in a trustless environment has so far been only probabilistic: the two parties exchange chunks of data one at a time, and each exchanged chunk increases the certainty that the other party is playing fair.
We at DISCIPLINA like challenging tasks. We needed to make a deterministic two-party fair data trade protocol, and we knew that we had blockchain technology at our disposal.
Let’s start by defining the problem more formally. The protocol has two parties: the Buyer (B) and the Seller (S). There is also an intermediary that can hold money and validate the data but, unlike in conventional data exchange algorithms, this intermediary is a blockchain: a decentralized ledger with a consensus mechanism that can ensure the validity of all the transactions (assuming an honest majority). Note that the nodes of this blockchain do not have to observe the data as long as the parties agree on the fairness of the deal. We also assume that there is a way for all the parties to determine whether any meaningful chunk of data is valid (e.g. in DISCIPLINA we check whether the data chunks along with their Merkle paths match the Merkle roots on the public chain). In the rare case of a dispute, we use the blockchain as the final judge: the nodes of the network observe the chunk the Buyer claims to be invalid and decide whether the accusation is justified.
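To make the Merkle check mentioned above concrete, here is a minimal Python sketch of how a single chunk and its Merkle path can be tested against a known root. It is an illustration only: the function names, the use of SHA-256, the leaf/node prefixes and the "duplicate the last node" padding rule are all assumptions made for this example, and the actual DISCIPLINA tree construction may differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf(chunk: bytes) -> bytes:
    # Prefix leaves and inner nodes differently so one cannot pose as the other.
    return _h(b"\x00" + chunk)


def _parents(level):
    # Hash adjacent pairs; `level` must have even length.
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(chunks):
    """Root of a non-empty list of chunks (assumed padding: repeat last node)."""
    level = [_leaf(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = _parents(level)
    return level[0]


def merkle_path(chunks, index):
    """Sibling hashes from leaf `index` up to the root, with their side."""
    level = [_leaf(c) for c in chunks]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = _parents(level)
        index //= 2
    return path


def verify_chunk(chunk, path, root):
    """True if `chunk` plus `path` hashes up to `root`."""
    node = _leaf(chunk)
    for sibling, sibling_is_left in path:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root
```

Anyone who knows only the root can run `verify_chunk` on one chunk and its path, so a single disputed chunk can be checked without revealing the rest of the dataset.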
How do the nodes do it? Well, the general idea is as follows (we omit the price negotiation and some exit points in this description):
1. Prior to the trade, the Buyer notifies the nodes of the validation function that will be used for the trade. Every chunk of the data sold must be valid according to this validation function.
2. The Seller then encrypts each chunk with a session key and computes a Merkle root of the encrypted data. He announces this Merkle root so that the nodes in the network can see it.
3. The Buyer creates a session keypair and sends the session public key to the Seller on-chain so that everyone can observe it. With the same transaction he also sends the coins, which are locked in the smart contract.
4. The Seller sends a fixed security deposit to the contract and waits for a pre-agreed confirmation period in order to be sure there is no chain rollback.
5. The Seller transfers the encrypted data chunks to the Buyer off-chain. Thus, we avoid polluting the blockchain and disclosing the entire dataset in case of dispute.
6. After the Buyer acknowledges that he has received the encrypted chunks, the Seller publishes the session key encrypted with the public session key of the Buyer.
7. Now the Buyer can decrypt the data.
7a. If the Buyer finds out that some data chunk is invalid, he identifies that chunk and reveals his private session key. Everyone can then decrypt the chunk and apply the validation function. In order to prove that this chunk was indeed among the data that the Seller sent to the Buyer off-chain, the Buyer also provides the Merkle path of this chunk. If the Buyer can prove the data is invalid, he receives his money back, along with the security deposit of the Seller.
7b. If all the data is valid, the Buyer acknowledges it with an on-chain transaction. If he fails to do so, the contract sends the money to the Seller automatically after some time has passed.
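The settlement rules in the steps above can be sketched as a small escrow state machine. This is a hypothetical Python model, not the DISCIPLINA contract: the class and method names, the use of block heights as time, the return of the deposit to the Seller on success, and the outcome of a failed dispute (the Seller is paid) are all assumptions made for illustration. The two booleans passed to `dispute` stand for what the nodes find after decrypting the disputed chunk.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    FUNDED = auto()         # Buyer's payment is locked in the contract
    DEPOSITED = auto()      # Seller's security deposit is locked too
    KEY_PUBLISHED = auto()  # encrypted session key is on-chain
    SETTLED = auto()        # funds have been paid out


@dataclass
class Escrow:
    price: int
    deposit: int
    dispute_window: int  # blocks the Buyer has to accept or dispute
    state: State = State.FUNDED
    key_published_at: int = 0

    def _require(self, condition, message):
        if not condition:
            raise ValueError(message)

    def _settle(self, buyer, seller):
        self.state = State.SETTLED
        return {"buyer": buyer, "seller": seller}

    def seller_deposit(self):
        self._require(self.state is State.FUNDED, "payment not locked")
        self.state = State.DEPOSITED

    def publish_key(self, now):
        self._require(self.state is State.DEPOSITED, "deposit not locked")
        self.key_published_at = now
        self.state = State.KEY_PUBLISHED

    def accept(self):
        # Buyer confirms the data: Seller gets the payment (and, by
        # assumption, his deposit back).
        self._require(self.state is State.KEY_PUBLISHED, "key not published")
        return self._settle(0, self.price + self.deposit)

    def claim_timeout(self, now):
        # Buyer stayed silent: same payout once the window has passed.
        self._require(self.state is State.KEY_PUBLISHED, "key not published")
        deadline = self.key_published_at + self.dispute_window
        self._require(now >= deadline, "dispute window still open")
        return self._settle(0, self.price + self.deposit)

    def dispute(self, now, chunk_in_root, chunk_valid):
        # chunk_in_root: the Merkle path matches the announced root.
        # chunk_valid: the decrypted chunk passes the validation function.
        self._require(self.state is State.KEY_PUBLISHED, "key not published")
        deadline = self.key_published_at + self.dispute_window
        self._require(now < deadline, "dispute window closed")
        if chunk_in_root and not chunk_valid:
            return self._settle(self.price + self.deposit, 0)
        # Assumed rule: an unfounded dispute settles in the Seller's favour.
        return self._settle(0, self.price + self.deposit)
```

For example, with `Escrow(price=100, deposit=20, dispute_window=10)`, a successful dispute pays all 120 coins to the Buyer, while an acknowledgement or a timeout pays 120 to the Seller.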
The protocol described above provides a way to trade validated information for cryptocurrency, and guarantees that neither party can gain an advantage over the other. We use this protocol in DISCIPLINA to guarantee that only valid educational records leave the educational institutions upon the Recruiter’s request. However, the protocol can be adapted to trade almost any kind of data, as long as the data can be divided into meaningful chunks and the validity of the chunks can be determined from on-chain sources of information.
You can find the detailed description of the protocol in the DISCIPLINA technical paper: https://disciplina.io/yellowpaper.pdf (see sec. 3.7 Data Disclosure). Please feel free to contact us if you have any questions or feedback!