Buying data is a lengthy process on its own. Finding the right vendor and type
of data, and passing all the regulatory and legal procedures, takes time and
costs money. In practice, the procedure is even lengthier because it starts
before the decision to buy the data.
Identifying which dataset has the desired profiles and statistical qualities
often can only be done after starting the acquisition process because of
confidentiality and privacy concerns. This multiplies the costs and time needed
by the number of datasets that are explored.
The way this problem is dealt with today is described in the image below. The
data acquirer and the data holder both give their datasets (or the requirements)
to a trusted third party, which then aggregates all the data and manually
identifies the profile pairing.
The data sharing process today is pretty ancient
While this is cheaper and faster than getting through the whole acquisition
process, it still takes weeks and costs tens of thousands of dollars.
One of the most prominent examples is the healthcare industry and the case for
linking patient data. There are undeniable benefits in augmenting a dataset with
different data about the same profiles (1). Examples range from linking images
with prescription data to following patients from a clinical trial into the
"real world". With this kind of dataset, data analytics experts can accelerate
drug development and help healthcare companies deliver much better care by
better understanding their patients and the effects of their drugs. The
combination and acquisition of data, however, runs into two blockers:
1. Finding the data that contain the relevant profiles is very costly and
2. Patient profile linking is time-consuming because of the privacy and
It's a match! – from 2 weeks to 2 minutes
Following our commitment to make the lives of our customers simpler while
preserving privacy and security, we created the avato platform, which
includes the Private Set Intersection Instance (PSII). The PSII is based on the
latest advancements in cryptography and enables a software-only solution that
does not involve a third party. It runs on any browser that can securely and
1. Identify the overlap between an owned dataset and a target dataset on
chosen keys (for example, patient profiles).
2. Ensure that after the purchase of the dataset, the linkage of the profiles
will be as frictionless as possible by anonymously linking the datasets, without
any human ever seeing the whole dataset.
3. Allow paying only when the intersection threshold is met.
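To make the idea of a private set intersection concrete, here is a minimal Python sketch of one classic construction: a Diffie-Hellman-style protocol in which each party blinds its hashed keys with a secret exponent. It is not a description of how the PSII works internally, since this article does not specify the protocol. The function names (`hash_to_group`, `blind`, `reblind`, `private_overlap`, `meets_threshold`), the toy prime and the 50% threshold are all assumptions chosen for illustration, and the parameters are not secure for real use.

```python
import hashlib
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1). Assumption for illustration only:
# real deployments use vetted groups or elliptic curves.
P = 2**127 - 1


def hash_to_group(key: str) -> int:
    """Map a record key (e.g. a patient identifier) to a group element."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Squaring lands the value in the subgroup of quadratic residues mod P.
    return pow(int.from_bytes(digest, "big") % P, 2, P)


def new_secret() -> int:
    """Draw a private exponent in [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


def blind(keys, secret):
    """First pass: hash each key and raise it to the owner's secret exponent."""
    return [pow(hash_to_group(k), secret, P) for k in keys]


def reblind(values, secret):
    """Second pass: raise already-blinded values to the other party's secret."""
    return [pow(v, secret, P) for v in values]


def private_overlap(buyer_keys, seller_keys):
    """Simulate both parties in one process and return the buyer's matched keys.

    In a real exchange only blinded values would cross the wire; neither side
    would see the other's raw keys.
    """
    a, b = new_secret(), new_secret()

    # Buyer -> seller: H(x)^a. Seller answers with H(x)^(a*b), order preserved.
    buyer_double = reblind(blind(buyer_keys, a), b)

    # Seller -> buyer: H(y)^b. Buyer computes H(y)^(b*a).
    # (A real seller would shuffle its values before sending them.)
    seller_double = set(reblind(blind(seller_keys, b), a))

    # Equal keys give equal double-blinded values because exponentiation commutes.
    return [k for k, v in zip(buyer_keys, buyer_double) if v in seller_double]


def meets_threshold(matched, buyer_keys, threshold=0.5):
    """Check whether the share of the buyer's profiles found meets the threshold."""
    return len(matched) / len(buyer_keys) >= threshold
```

For a buyer holding four keys and a seller that shares two of them, `private_overlap` returns those two keys and `meets_threshold` with a 0.5 threshold is satisfied. The pay-only-above-threshold step then reduces to a single comparison on the overlap share.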
Private Set Intersection Instance - User story
With the Private Set Intersection Instance from the avato platform, no data is
revealed or moved before the purchase.
1. Buyer Inc identifies a dataset which is potentially interesting for its needs
from:
a. Partnerships with data providers that are already in place
b. decentriq’s listing of available datasets/providers
2. After getting in touch with the data holder, both Buyer Inc and Seller Inc
download and open decentriq’s PSII and simply drag and drop their datasets onto
the instance. The software extracts the metadata from both companies' datasets,
encrypts the datasets and compares them to identify the linkage/overlap
potential.
3. The users get a percentage score of how many profiles they have in common,
and the data buyer receives some anonymized statistics regarding the target
data.
4. If the parties decide to transact, our tool anonymously links their datasets
without any party (including decentriq) ever seeing the complete, de-anonymized
dataset. The data always
The promise of trusted analytics in healthcare
Artificial Intelligence has long been promised to revolutionize how the
healthcare industry operates. While change has indeed happened in the past
years, adoption has not been as extensive as some expected. The sensitivity of
the underlying data, and the privacy-breaching power that their combination
holds, are some of the biggest hurdles to data cooperation in the industry. We
demonstrate that this dilemma no longer holds.