Fully Homomorphic Encryption: Introduction and Use-Cases

by Nicolas Gama and Sandra Guasch. Posted on Nov 18, 2023

This blog post is an introduction to FHE. Rather than diving into the mathematical details (which are complex but also interesting!), we aim to give the reader a higher-level overview of what FHE can be used for and the different scenarios or setups that leverage FHE. In a future blog post, we will go into more detail about the types of FHE (which influence the kind of computations we can perform) and about the compilers that can translate our programs into operations computable with FHE.

When one thinks about encryption, the first usages that come to mind are encryption at rest and encryption in transit. The former allows data to be stored on encrypted hard drives, removable devices, or even cloud databases, with the guarantee that only the legitimate owner can see or edit the plaintext content. Encryption in transit guarantees that data transmitted over the Internet is only accessible to its designated recipients, even if it transits through public routers or channels. Both scenarios rely on encryption, with the additional integrity guarantee that the data has not been tampered with by a malicious attacker along the way. This is known as authenticated encryption: once data is encrypted, nobody in the chain can infer a single bit of it (confidentiality), and nobody can alter the ciphertext without detection (integrity/authenticity).

Some collaborative use cases require non-trivial processing even on ciphertexts. This is the domain of privacy-preserving techniques, or encryption in use, of which fully homomorphic encryption (FHE) is one. One example is electronic voting in the cloud: voters encrypt their ballots, some entity in the middle aggregates all the ballots to count the votes, and only the final result is published. Unfortunately, with authenticated encryption, the entity in the middle would need to decrypt all the ballots to perform such a computation, and would see the individual votes in the clear, which defeats vote privacy. We could, in theory, shuffle the ballots (some e-voting protocols actually rely on this), but, unlike for paper ballots, the traditional cryptographic mechanisms that guarantee integrity also make it much harder to unlink an encrypted ballot from the identity of its sender. Alternatively, one could add a hardware wall around the entity that counts the votes. This is, for instance, the objective of trusted execution enclaves, which make it much harder for an attacker to interact with the counting entity. But then a failure in the hardware could leak the decryption key, and, unlike software errors, hardware design vulnerabilities cannot be patched easily.

To address this and other similar use cases, we can make use of Fully Homomorphic Encryption (FHE). FHE is a special form of encryption that allows one to compute a function over the ciphertexts without decrypting them, and to obtain an encryption of the function’s output directly.

Most of the time, the function $f$ to evaluate is public, so the sequence of processing steps to convert an encryption of $x$ into an encryption of $f(x)$ is public knowledge, and can be carried out in the cloud without involving any secret.
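To make this concrete, here is a toy example of homomorphic computation in Python. It uses the Paillier cryptosystem, which is only additively homomorphic (a partial scheme in the taxonomy discussed at the end of this post), but it shows the core idea: operating on ciphertexts yields an encryption of the function's output. The parameters are deliberately tiny and insecure; this is an illustration, not a real implementation.

```python
import math
import random

# Toy Paillier cryptosystem: additively homomorphic.
# The primes are tiny for readability -- NOT secure.
p, q = 10007, 10009
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)      # Carmichael's lambda(n)
mu = pow(lam, -1, n)              # with g = n+1, mu = lam^{-1} mod n

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    # c = (1+n)^m * r^n mod n^2
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    x = pow(c, lam, n2)
    return (((x - 1) // n) * mu) % n

def add(c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % n2

votes = [1, 0, 1, 1, 0]           # five yes/no ballots
cts = [encrypt(v) for v in votes]
tally = cts[0]
for c in cts[1:]:
    tally = add(tally, c)
print(decrypt(tally))             # 3 -- computed without opening any ballot
```

Note how the tally is obtained by anyone holding only ciphertexts; only the decryption key owner can open the final result.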

This figure depicts the 3 scenarios for e-voting: in the leftmost image, a trusted entity shuffles and decrypts the individual votes before publishing their addition. We must trust the entity doing the computation so that voter privacy is preserved and votes are correctly counted. In the center image a Trusted Enclave, trusted to provide integrity and privacy guarantees, is used to perform the same computation. In the right image, homomorphic encryption is used: the encrypted votes can be added (in public) before the result is decrypted. E( ) means the encryption operation, while D( ) denotes decryption.

FHE is also compact, which means that the bit-size of the output ciphertext and the effort needed to decrypt it depend only on the number of bits in the plaintext result, not on the chain of computations that produced it. This excludes trivial non-compact cryptosystems that would simply concatenate the input ciphertext of $x$ with the source code of $f$, and let the recipient do all the work by decrypting $x$ and then applying $f$.

FHE outsourcing is often presented as an alternative to secure enclaves, relying on the hardness of a mathematical problem rather than on practical barriers. FHE is therefore completely invulnerable to passive side-channel attacks or other corruptions of the cloud host. Imagine that someone needs to outsource some computation, but the data is really sensitive. That person would probably be reluctant to use a cloud VM if someone else can be root on the machine. They would probably also hesitate to run it in an enclave like SGX, knowing that the CPU and memory of cloud hosts are constantly monitored for load, power, and temperature: maybe some information can be extracted from these measurements. That person may be reassured by the promise of FHE that extracting any single bit of information requires breaking a post-quantum mathematical problem, independently of any kind of measurement an attacker could gather.

If the confidentiality provided by the scheme prevents an attacker from reading any single bit of information without the secret key, the universal malleability of FHE allows, on the other hand, an attacker to flip any bit that is computed: in a circuit, this would be the equivalent of active side-channel attacks, where the attacker is granted a magical laser beam that can target any bit. It may sound very scary at first, but in FHE these attacks correspond to dishonest executions of the homomorphic operations. They can be avoided by adding replication or redundancies in the computation.

FHE is usually presented as a public-key cryptosystem with three kinds of keys:

  • A decryption key: This is the master key of the cryptosystem, from which all other keys can be derived. The decryption key is typically generated locally and never transmitted; the only person who can decrypt an FHE ciphertext is the decryption key owner.

  • An encryption key: In public-key mode, when the party that provides the initial ciphertext is not the decryption key owner, this medium-sized key usually consists of random encryptions of zero. Since FHE supports affine functions, this is sufficient to encrypt any message.

  • An evaluation key: This key is given to any party who evaluates a function on ciphertexts. Like any public key, it is safe to publish or transmit. The size of the evaluation key ranges from empty to very large, depending on whether the function to evaluate is linear or arbitrary.
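To sketch why encryptions of zero suffice as an encryption key, here is the idea instantiated with a toy Paillier scheme (tiny, insecure parameters, for illustration only): since the scheme supports affine functions on plaintexts, anyone holding a fresh encryption of 0 can shift its hidden plaintext by an arbitrary message.

```python
import math
import random

# Toy Paillier instance (tiny parameters -- NOT secure).
p, q = 10007, 10009
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

def dec(c):
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

def enc_zero():
    # The public "encryption key" material: a random encryption of 0,
    # i.e. (1+n)^0 * r^n = r^n mod n^2.
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(r, n, n2)

def enc_with_zero(m):
    # Affine shift: multiplying by (1+n)^m adds m to the plaintext,
    # turning an encryption of 0 into a fresh encryption of m.
    return (enc_zero() * pow(1 + n, m, n2)) % n2

print(dec(enc_with_zero(31337)))    # 31337
```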

The decryption key owner holds the most sensitive secret of the cryptosystem. This person is responsible for making sure that the chain of homomorphic operations that were executed is legitimate and that the final ciphertext is safe to decrypt, before decrypting the result. If malicious operations are introduced into the chain, the decryption key could potentially be leaked at decryption time. Luckily, homomorphic operations can be performed in public and verified.

FHE Scenarios

In this section, we describe a few scenarios in which FHE can be used, as well as some pros and cons of each setup.

Outsourcing Mode

In this figure, the orange key symbolizes the decryption key (and its owner). FHE ciphertexts are represented by locks of the same color as the decryption key. Parties that contribute with private data are represented by a cylinder: here, only Alice contributes with private data. On Bob’s side, the evaluation function and the evaluation key are public, and the computation, depicted by a green box, can be made deterministically. Everyone can retrace the computation and detect if the claimed output ciphertext is incorrect.

This is historically the first use case published for FHE. It aims at turning cloud computing into private computing, analogous to a secure enclave, but based on cryptographic security instead of hardware security. In such a setting, Alice owns some private data but has limited computing capabilities. Bob mimics a cloud instance with much larger computing power; Bob does not contribute any additional private data. Alice can outsource a computation by encrypting the inputs; Bob then evaluates the desired (public) function homomorphically and sends the encrypted result back to Alice.

With current hardware capabilities, the outsourcing mode is still too slow to be used in full generality in practice – typical figures are a running-time overhead factor of one million and a memory overhead of one thousand on non-linear use cases. However, FHE hardware is being developed to close the gap, for instance in the DARPA DPRIVE project or CryptoLight.

Currently, the outsourcing mode is used in practice for PIR (Private Information Retrieval) use cases, where a server (Bob) has a large public database, a client (Alice) emits a query, and the index that is queried should remain private. Such PIR schemes benefit a lot from the linearity and compactness of homomorphic operations, while the small multiplicative depth of the circuits keeps the computation costs within reasonable bounds.
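The structure of such a linear PIR scheme can be sketched as follows, again with a toy additively homomorphic (Paillier-style) scheme with deliberately insecure parameters: the client sends an encrypted one-hot selection vector, and the server homomorphically computes its dot product with the database.

```python
import math
import random

# Toy Paillier instance (tiny parameters -- NOT secure).
p, q = 10007, 10009
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)

def enc(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

# --- Client side: query index 2 as an encrypted one-hot vector ---
db = [42, 17, 99, 5]                 # server's public database
index = 2
query = [enc(1 if i == index else 0) for i in range(len(db))]

# --- Server side: homomorphic dot product <query, db> ---
# Raising a ciphertext to db_i scales its plaintext by db_i;
# multiplying ciphertexts adds the plaintexts.
reply = 1
for c, item in zip(query, db):
    reply = (reply * pow(c, item, n2)) % n2

print(dec(reply))                    # 99 -- the server never learns the index
```

Note that the server only performs linear operations, which is why PIR is practical today even though general-purpose FHE outsourcing is not.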

This table summarizes the pros and cons of the outsourcing mode.

| PROS | CONS |
| --- | --- |
| 1. The overhead of running a circuit (i.e. a constant-time program) homomorphically vs. in plaintext is constant in time and memory. | 1. The constant is large: $1000\times$ memory, $10^6\times$ time. There is an additional slowdown due to mapping non-constant-time algorithms to an equivalent circuit. |
| 2. Bob doesn’t learn anything about the data received from Alice. | 2. Bob can report wrong data as the result of the computation; however, this can be detected (see PRO 3). |
| 3. Bob’s computation can be publicly rerun, hence any dishonest execution can be detected. | 3. The verification of Bob’s computations requires a trusted third party. |

Two Party Computing Mode

This figure uses the same color coding as previously. This time, Bob contributes to the computation with some private data. The computation on Bob’s side is no longer publicly verifiable (symbolized by a red box), so this mode should be restricted to honest-but-curious use cases.

In this new setup, the only difference is that Bob contributes some private data to the computation. In this case, FHE is a good two-party computation solution with minimal communication, and it provides strong guarantees on the querier side: Bob learns nothing about Alice’s data, and Alice learns the result of the computation.

A potential application for this scenario would be the millionaire’s problem, where Alice and Bob are two millionaires who want to know who is richer without revealing their wealth to the other party. Solutions for this problem are used in e-commerce applications.

| PROS | CONS |
| --- | --- |
| 1. It requires much less communication than other methods for 2-party computation, like garbled circuits or oblivious transfer. | 1. In general it requires more local computation than other 2-party methods. |
| 2. Bob doesn’t learn Alice’s data from the received information. | 2. Roles are not 100% symmetric: Alice learns the output of the computation, which could potentially uncover Bob’s secret information. |
| 3. Suitable for honest-but-curious scenarios, where all parties have a mutual incentive to follow the protocol honestly (while at the same time trying to learn as much information as possible). | 3. Bob’s computation is no longer publicly verifiable. Not suitable for a scenario where Bob may be malicious and return a wrong computation. |

Aggregation Mode

This is a refinement of the outsourcing mode, where data from many participants can be aggregated in a compact (the output does not grow with the number of participants) and publicly verifiable manner. Typical usages include federated learning and e-voting.

| PROS | CONS |
| --- | --- |
| 1. The data from many participants can be aggregated to obtain a constant-size result. | 1. Additional measures need to be taken to prevent participants from sending corrupt data, which can lead to model poisoning in the case of federated learning, or invalid results in the case of e-voting. |
| 2. Neither the aggregator B nor the decryptor C learns the data of a single participant. | 2. The decrypted aggregate may still leak information about participants’ data. Also, the aggregator B and the decryptor C are assumed not to collude; otherwise C could decrypt the data of a single participant. |
| 3. The computation is compatible with a map-reduce infrastructure, and the computation on each node is publicly verifiable. | 3. Regarding PRO 3, some additional replication is needed to actually check that the computation is correct. |

Client-Server Mode

This setup is a refinement of the two-party-compute mode, where the compute party now provides a secure compute service to multiple clients, each with its own secret key. FHE can for instance be used as a private model prediction service (e.g. an ML service with a private model and private inputs): the server holds the model (private but in plaintext), and each client has its own data and would like to run a prediction. As a result, each client retrieves its own encrypted prediction, under its own secret key.

| PROS | CONS |
| --- | --- |
| 1. The server doesn’t learn the clients’ data. | 1. The result obtained by the client may leak the server’s data. The server may need to apply additional measures (such as differential privacy) to protect it. |
| 2. Each client gets only the result of its own query. | 2. The homomorphic computation is not publicly verifiable: the server must be trusted, or have an incentive to follow the protocol. |

How does homomorphic encryption ensure that the function computed is legitimate?

It is always easier to use FHE in collaborative scenarios where each participant has an incentive to follow the protocol honestly. FHE can for instance be used to compute statistics between two legal entities of the same group in two different countries: regulations such as GDPR allow some statistics to be published, but generally prevent gathering all individual data in the same place. In this case, using FHE is possible and all participants have an incentive to follow the protocol honestly. In a non-collaborative scenario, the easiest way to ensure that the relevant function has been computed is to introduce redundancy. For instance, in the outsourcing and aggregation scenarios, homomorphic computations are entirely public and can be made deterministic. As long as two or more independent entities end up with the exact same output ciphertext, the computation is correct and the result is safe to decrypt. The more redundancy, the higher the confidence we can have in the result, which of course is a trade-off with efficiency.
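A minimal sketch of this redundancy check, with placeholder byte strings standing in for real FHE ciphertexts: since the homomorphic evaluation is deterministic, independent evaluators must produce bit-identical output ciphertexts, which can be compared by hash before decryption is allowed.

```python
import hashlib

# Sketch: verify an outsourced FHE computation by redundancy. The byte
# strings below are hypothetical placeholders for the output ciphertexts
# produced by two independent evaluators rerunning the same deterministic
# homomorphic circuit on the same input ciphertexts.

def fingerprint(ciphertext: bytes) -> str:
    return hashlib.sha256(ciphertext).hexdigest()

output_evaluator_a = bytes.fromhex("0102030405")   # hypothetical ciphertext
output_evaluator_b = bytes.fromhex("0102030405")   # independent rerun

# The decryption key owner only accepts the result if both match exactly.
safe_to_decrypt = fingerprint(output_evaluator_a) == fingerprint(output_evaluator_b)
print(safe_to_decrypt)   # True
```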

Also, when the compute party vouches for an FHE result by digitally signing the input and output ciphertexts, everyone can retrace the same FHE computation and check whether the claim is valid. Any attempt at cheating by the FHE computing party can be publicly caught and associated with a publicly verifiable certificate that exposes the cheat and the cheater – a model we call the strong covert security model.

Fully homomorphic signatures are another way of verifying the correctness of a computation without the need for an independent verifier, but they generally require far more resources.

How does FHE ensure that the final recipient decrypts only the final result and not the intermediate variables?

The easiest way of doing this is to ensure that the decryption key owner does not have access to any intermediate ciphertext.

In the two-party and client-server scenarios, Alice encrypts the input, and Bob performs the computation over ciphertexts and transmits the encrypted output back to Alice. Alice can thus only decrypt the final result: she never has access to any intermediate variable.

In a cloud aggregation scenario, like e-voting where many participants send encrypted ballots to a common cloud, another technique is used: the decryption key is in general not given to a single recipient but instead secret-shared between different members of a decryption authority. In this case, decryption of a particular ciphertext can only be triggered by executing a multiparty computation that involves online communication between the members of the authority. If one member refuses to decrypt a ciphertext, decryption is impossible. This ensures that only the ciphertexts agreed upon by all the authority members can be decrypted.
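The key-splitting step can be sketched with additive secret sharing. The modulus and key value below are illustrative assumptions; in a real threshold-FHE deployment, each member applies its share to the ciphertext inside an interactive protocol, and the key itself is never reconstructed in one place.

```python
import random

# Sketch: additively secret-share a decryption key among 3 authority members.
P = 2**61 - 1                        # public prime modulus (assumption)
secret_key = 123456789               # illustrative key value

def share(secret, parties, modulus=P):
    # The first parties-1 shares are uniformly random; the last is chosen
    # so that all shares sum to the secret modulo the modulus. Any subset
    # of fewer than all parties sees only uniformly random values.
    shares = [random.randrange(modulus) for _ in range(parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares

shares = share(secret_key, 3)
print(len(shares), sum(shares) % P == secret_key)   # 3 True
```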

The Building Blocks of Homomorphic Encryption

There are three kinds of homomorphic encryption: partial homomorphic encryption (PHE), leveled homomorphic encryption (LHE), and fully homomorphic encryption (FHE). Partial homomorphic encryption only allows us to compute a restricted set of functions (e.g. only a sum, only linear functions, only bilinear functions), whereas leveled and fully homomorphic encryption can evaluate arbitrary circuits or, equivalently, functions whose control flow is data-independent. For LHE, the cryptographic parameters depend on the function and grow with the complexity of the circuit, which in turn results in larger ciphertexts and keys. FHE schemes allow us, for a given set of parameters (and thus for a given key and ciphertext size), to evaluate any function that can be represented as a circuit of arithmetic or binary gates. That is, in contrast to LHE, even as the circuit to evaluate grows, the scheme parameters (and keys and ciphertexts) do not get larger.

In other words, when asked whether a given plaintext circuit can be run homomorphically, and at what cost (in time and memory overhead), PHE may answer no. LHE answers yes, but may set an arbitrarily high cost for complex circuits. FHE also answers yes and, in addition, provides the keys, the encryption and decryption algorithms, and the way to homomorphically evaluate a universal set of gates before the plaintext circuit is even specified. FHE is therefore the only mode that guarantees that the memory and running time of the homomorphic evaluation remain proportional to those of the original plaintext circuit. To achieve this, all the FHE schemes known today deal with noisy ciphertexts that get noisier as computations proceed. To prevent the noise from overwhelming the computation and causing decryption errors, these schemes regularly perform a quite costly operation called bootstrapping, which reduces the noise back to a manageable level. More on the specifics of each kind of scheme, on bootstrapping, and on how to minimise noise and other costs with FHE compilers, in the second blog post of this series!
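To illustrate the interplay between noise growth and bootstrapping, here is a toy simulation. The growth rates and thresholds are made-up constants for illustration, not those of any real cryptosystem.

```python
# Toy model of ciphertext noise in an FHE scheme (illustrative numbers only).
FRESH_NOISE = 5        # noise (in bits) of a freshly encrypted ciphertext
NOISE_BUDGET = 60      # decryption fails once noise exceeds this level

def mul_noise(a, b):
    # Multiplying two ciphertexts roughly adds their noise levels.
    return a + b + 2

def bootstrap(noise):
    # Costly operation that resets the noise close to the fresh level.
    return FRESH_NOISE + 3

noise = FRESH_NOISE
bootstraps = 0
for _ in range(10):                      # evaluate 10 squarings in a row
    if mul_noise(noise, noise) > NOISE_BUDGET:
        noise = bootstrap(noise)         # refresh before the budget overflows
        bootstraps += 1
    noise = mul_noise(noise, noise)

print(noise <= NOISE_BUDGET, bootstraps)   # True 4
```

Without the bootstrapping branch, the noise would exceed the budget after only a few multiplications and decryption would fail; with it, arbitrarily deep circuits stay decryptable at the cost of periodic refreshes.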