At decentriq we aim to solve a simple problem. We want to make data
collaborations simple and safe. Combining datasets across organizations can
unlock huge value, but companies are reluctant to share data due to sensitivity
concerns. This usually means either involving a trusted third party through a
complicated process, or not doing the project at all.
In this blog we explore how we remove the need for a trusted third party (even
decentriq!), by using Intel® Software Guard Extensions (Intel® SGX). This is the
first in a series of posts explaining this technology and how it allows us to
make data collaborations simple and safe. This post aims at giving a high-level
overview of the concepts before becoming more technical in the following posts.
Ready? Let’s dive in!
A dive into Secure Enclaves
Secure Enclaves, also known as trusted execution environments
[https://en.wikipedia.org/wiki/Trusted_execution_environment], are computer
programs with additional privacy and security guarantees. Figuratively speaking,
a Secure Enclave is a safe box inside a computer processor: data gets in, is
locked away and computed on, and then the results are shipped out. While this
process sounds simple (and in principle it is) there are multiple aspects that
make the end-to-end implementation, let’s say… interesting.
At decentriq, we use the most advanced type of Secure Enclave, Intel’s SGX, to
guarantee the safety and confidentiality of our users’ data. Intel SGX is
basically an extension of the instructions a CPU can perform, coupled with some
extra circuits on the actual processor. These instructions and circuits enable
running software “safe boxes” (aka enclaves) that are isolated from all the
other processes. In simple terms, this means that it is not possible to look at
the data being processed inside an enclave. This is achieved by enclave-memory
encryption: all enclave data is garbled whenever it leaves the CPU, so that no
other process can read it. This is often referred to as encryption of data in use
(while computing on it), as opposed to only at rest or in transit (on a hard disk
or when being sent over a network).
Additionally, these instructions allow the enclaves to identify themselves and
their code to remote users. But what does “identify themselves” mean? Alongside
the isolation mentioned before, this is the core concept that any good Secure
Enclave needs before a user can trust it. Simply put, it means the enclave can
prove that it is doing what the user expects. As a user, before I send my data
to a remote program, I want to be sure that this program is what I expect.
Without enclaves, there is no way of getting such a proof. Intel SGX enables an
enclave to provide this proof through a process that is called remote
attestation.
Still here? Good! We know from experience that hearing these concepts for the
first time can be somewhat intimidating. This is why we will take an even deeper
dive into enclave-memory encryption and remote attestation in a second blog
post.
For now, let’s take these concepts for granted and look at how they translate
into the guarantees our avato platform offers.
The avato guarantees
Remember that avato aims to make data collaborations simple and safe. On the
highest level, a data collaboration consists of multiple parties putting data
into avato and one or more parties performing computations inside avato and
receiving the results. To make it as safe as possible for our users, we provide some
guarantees to them. Which we. take. very. seriously.
The high-level process of working with avato
Guarantee 1 - The datasets can only be decrypted inside the enclave.
This guarantee means that nobody can look at the data on its way to the enclave.
It is achieved for each user by encrypting their data with well-known (in fact,
the internet runs on it) public-key encryption
[https://en.wikipedia.org/wiki/Public-key_cryptography] using the enclave’s
public key. This public key is received as part of the remote attestation
process when the enclave identifies itself to the user.
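To make the idea a bit more tangible, here is a minimal sketch in Python. It is a toy: textbook RSA with tiny primes, which is not secure and is not meant to describe how avato or Intel SGX actually implement this. All function names are hypothetical and invented for illustration. The only point is that anyone holding the public key can encrypt, while decryption needs the private key, which in this sketch we assume stays inside the enclave.

```python
# Toy illustration only: textbook RSA with tiny primes is NOT secure.
# Every name here is hypothetical and not part of any real avato or SGX API.

def enclave_keypair():
    """Return (public_key, private_key) for a toy RSA key held by the enclave."""
    p, q = 61, 53                        # tiny primes, for illustration only
    n = p * q                            # modulus shared by both keys
    e = 17                               # public exponent
    d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)
    return (n, e), (n, d)

def encrypt_for_enclave(value, public_key):
    """Run by the user: encrypt a small integer with the enclave's public key."""
    n, e = public_key
    if not 0 <= value < n:
        raise ValueError("value must be smaller than the modulus")
    return pow(value, e, n)

def decrypt_inside_enclave(ciphertext, private_key):
    """Run inside the enclave: only the private key can undo the encryption."""
    n, d = private_key
    return pow(ciphertext, d, n)

# The user only ever sees the public key (in the post: via remote attestation).
public_key, private_key = enclave_keypair()
ciphertext = encrypt_for_enclave(42, public_key)
recovered = decrypt_inside_enclave(ciphertext, private_key)
```

In a real system the payload would be a whole dataset rather than a single small number, and a vetted cryptographic library would be used instead of hand-rolled arithmetic.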
Guarantee 2 - Only the intended result leaves the enclave; in other words, none
of the datasets can be inspected by any third party.
This guarantee means that nobody can look at the data when it is “in” the
enclave. It is achieved by keeping enclave memory encrypted during the
computations as well. Put simply, not even the operating system is able to read
or tamper with the data. On top of that, we provide you with a way to validate that
only your code is running in the enclave by giving you the possibility to
reconstruct an identical enclave from open-source material. Hence, we expose
ourselves to public scrutiny and guarantee that only the expected code is
running and nothing else.
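The "rebuild and compare" check can be sketched roughly as follows. This is a simplified illustration under our own assumptions: a plain SHA-256 over a byte string stands in for the real enclave measurement, and the function names are hypothetical. It only shows the principle that a user computes the expected fingerprint from the open-source build and compares it with what the remote enclave reports.

```python
import hashlib
import hmac

def measure(enclave_build: bytes) -> str:
    """Hypothetical stand-in for an enclave measurement: a SHA-256 digest."""
    return hashlib.sha256(enclave_build).hexdigest()

def runs_expected_code(reported_measurement: str, rebuilt_enclave: bytes) -> bool:
    """Compare the measurement reported by the remote enclave with the one
    computed from an enclave the user rebuilt locally from open source."""
    expected = measure(rebuilt_enclave)
    # compare_digest performs a constant-time comparison of the two strings
    return hmac.compare_digest(reported_measurement, expected)

# Placeholder byte strings standing in for actual enclave builds.
open_source_build = b"enclave binary built from the published source"
tampered_build = b"enclave binary with an extra backdoor"

# An honest enclave reports the measurement of the published build ...
accepted = runs_expected_code(measure(open_source_build), open_source_build)
# ... while a modified enclave yields a different measurement and is refused.
rejected = runs_expected_code(measure(tampered_build), open_source_build)
```

If the two values differ, the user simply does not send any data.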
Guarantee 3 - Only pre-specified users can decrypt the result.
This is easily achieved by having the enclave create one encrypted result per
pre-specified user, each encrypted with this user’s key.
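As a rough sketch of "one encrypted result per user", assume each pre-specified user has registered a public key with the enclave (the post only says "this user's key", so the public-key setup is our assumption). The example again uses toy textbook RSA with tiny primes, which is not secure, and every name in it is hypothetical.

```python
# Toy illustration only: textbook RSA with tiny primes is NOT secure.
# All names are hypothetical; the per-user public keys are an assumption.

def make_toy_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes and a public exponent."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (needs Python 3.8+)
    return (n, e), (n, d)

def encrypt_result_per_user(result, user_public_keys):
    """Run inside the enclave: produce one ciphertext per pre-specified user."""
    encrypted = {}
    for user, (n, e) in user_public_keys.items():
        if not 0 <= result < n:
            raise ValueError("result must be smaller than the user's modulus")
        encrypted[user] = pow(result, e, n)
    return encrypted

def decrypt_result(ciphertext, private_key):
    """Run by a user: recover the result with their own private key."""
    n, d = private_key
    return pow(ciphertext, d, n)

alice_public, alice_private = make_toy_keypair(61, 53, 17)
bob_public, bob_private = make_toy_keypair(89, 97, 5)

encrypted = encrypt_result_per_user(
    1234, {"alice": alice_public, "bob": bob_public}
)
alice_view = decrypt_result(encrypted["alice"], alice_private)
bob_view = decrypt_result(encrypted["bob"], bob_private)
```

Each user receives only their own ciphertext, so the result never leaves the enclave in a form that someone outside the pre-specified list could read.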
Wrap up - The road to sensitive data analytics
In this blog post we took a shot at explaining a fairly complicated technology in
a straightforward way. However, our mission here at decentriq is pretty simple.
We want to make sensitive data analysis simple and safe. Intel SGX is an
extremely useful technology that allows us to offer our clients capabilities
that were impossible before, at a scale and speed that fits directly into
We hope that this blog gives you a better idea of what we are working with. In
the next posts of this series we are going to expand on two different aspects:
a deeper technical dive into SGX, and securing data privacy with SGX while
running Machine Learning models.
In the meantime, if you prefer listening, Intel recently featured us in their
podcast series [https://soundcloud.com/intelchipchat/data-encrypted-enclave].
For any questions please don’t hesitate to contact us!