If Enc(2) + Enc(3) = Enc(5) and Enc(1) + Enc(4) = Enc(5), does Enc(5) represent the same ciphertext in both cases? I'm asking because, if so, shouldn't it be trivial to uncover the plaintexts if you can perform any math op on the ciphertexts?
It represents the same plaintext, but the ciphertexts are different, and both are computationally indistinguishable from random noise (unless you know the key).
As KenoFischer says, they are not the same ciphertext, even if we consider a non-homomorphic encryption system. Enc is basically a randomized algorithm, and we need it to return different ciphertexts for the same plaintext, otherwise it would be easy to break: if I knew Enc(1) and the scheme were additive, then I'd know Enc(n) for all n...
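To make this concrete, here's a toy sketch of a randomized, additively homomorphic scheme in the style of Paillier, with deliberately tiny parameters (nothing here is remotely secure). Note how two different routes to "Enc(5)" produce different ciphertexts that decrypt to the same plaintext:

```python
import random
from math import gcd

# Toy Paillier cryptosystem: additively homomorphic and randomized.
p, q = 1009, 1013                # toy primes; real keys use ~1024-bit primes
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def encrypt(m):
    r = random.randrange(1, n)   # fresh randomness on every call
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    # with generator g = n + 1, g^m = 1 + m*n (mod n^2)
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(c):
    u = pow(c, lam, n2)
    L = (u - 1) // n             # the "L function": L(x) = (x-1)/n
    return L * pow(lam, -1, n) % n

# Multiplying ciphertexts adds the plaintexts.
c5a = encrypt(2) * encrypt(3) % n2   # "Enc(2) + Enc(3)"
c5b = encrypt(1) * encrypt(4) % n2   # "Enc(1) + Enc(4)"
assert c5a != c5b                    # different ciphertexts...
assert decrypt(c5a) == decrypt(c5b) == 5   # ...same plaintext
```

Because each encryption draws fresh randomness r, even `encrypt(5)` called twice yields two unrelated-looking ciphertexts, so knowing one homomorphic relation doesn't let you enumerate Enc(n) for all n.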
Are there any existing FHE algorithms with that property, or is it just a theoretical goal for the field?
Every time I've heard FHE mentioned, I've had the same "this sounds like it has all the problems of ECB mode plus some new ones" reaction. This article (like all of the ones I've read) doesn't seem to cover how what you're describing would be achieved.
What is the input to the algorithm that makes two identical cleartexts encrypt to different ciphertexts? In a traditional block cipher, it would be an IV or a "confounder", but IVs are included with the ciphertext, so I'm assuming it's more like a "confounder".
If an FHE algorithm that exists today has this property, how does essentially randomizing the ciphertext not break the ability to perform calculations on it? It seems like whatever does the randomizing would need to be known to all parties in order to take it into account, and so anyone could factor it out in some way to get back to ciphertexts that are identical for identical cleartexts.
Yes, all existing FHE schemes have this property (called semantic security). The encryption algorithm is a randomized algorithm, which takes the plaintext and a random value as input (just like an IV). Note that we're talking about public-key crypto here, which is a different primitive from the symmetric crypto you're thinking of. Each key is actually a key pair consisting of a secret key and a public key. Such cryptosystems are based on some mathematical trapdoor: only with the secret key are you able to "undo" the randomization and learn the plaintext. It therefore doesn't matter if you want to undo the randomization on a direct encryption of a plaintext, or whether the ciphertext is the sum of several ciphertexts.
If you want to see how this works at a slightly more technical level, look at the ElGamal cryptosystem [1]. It is in fact partially homomorphic (the standard scheme lets you multiply ciphertexts but not add them; the "exponential ElGamal" variant supports addition instead), and it's probably the easiest-to-understand system with this property.
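Here's a toy sketch of standard (multiplicatively homomorphic) ElGamal with a deliberately tiny group, just to show how fresh randomness per encryption coexists with homomorphic operations: the secret key undoes the randomization regardless of whether the ciphertext came from one encryption or from a product of several.

```python
import random

# Toy ElGamal over Z_P^* (multiplicatively homomorphic). Tiny, insecure
# parameters purely for illustration; real use needs large groups.
P = 467          # small prime
G = 2            # generator of Z_P^*

x = random.randrange(2, P - 1)      # secret key
h = pow(G, x, P)                    # public key

def encrypt(m):
    r = random.randrange(2, P - 1)  # fresh randomness every time
    return (pow(G, r, P), m * pow(h, r, P) % P)

def decrypt(c1, c2):
    # c1^(-x) strips the randomization h^r = g^(x*r)
    return c2 * pow(c1, P - 1 - x, P) % P

def mul(ca, cb):
    # componentwise product multiplies the underlying plaintexts
    return (ca[0] * cb[0] % P, ca[1] * cb[1] % P)

assert decrypt(*mul(encrypt(3), encrypt(5))) == 15
```

The randomization doesn't need to be "factored out" by anyone: it lives in the first ciphertext component, and only the holder of x can cancel it.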
This is the webpage of TFHE, a recent and quite fast FHE scheme - https://tfhe.github.io/tfhe/ . They have a (surely incomplete) list of applications. I work in a somewhat related field, and I know that current FHE schemes can be used for things like voting and computing basic statistics when the data size is smallish.
Because of the sheer performance challenges, the availability of SGX as an alternative, and the competitiveness of MPC, I think most use cases struggle to justify homomorphic encryption as the best choice.
As someone involved in related fields, though not FHE directly, it seems to me that practical FHE is probably 15 or more years away, even for niche use cases.
Intel SGX - lets you run your code on someone else's hardware while being assured that the owner can get at neither your code nor your data.
MPC - Multi-Party Computation. To protect your data and algorithms, you split data and code between multiple parties in a special way that prevents any of them from knowing what exactly was computed.
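The simplest MPC building block is additive secret sharing; a toy sketch of the idea (assuming a public modulus and honest-but-curious parties):

```python
import random

# Toy additive secret sharing: a value is split into random shares that
# individually reveal nothing, yet parties can add shares of two secrets
# locally and jointly reconstruct only the sum.

Q = 2**31 - 1    # public modulus all parties agree on

def share(secret, n_parties=3):
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)   # shares sum to the secret
    return shares

def reconstruct(shares):
    return sum(shares) % Q

a_shares = share(20)
b_shares = share(22)
# each party adds its own shares of a and b; no single party learns a or b
sum_shares = [(x + y) % Q for x, y in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 42
```

Multiplication and general circuits need more machinery (e.g. Beaver triples), which is where real MPC protocols get involved.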
There are a few companies doing some of this commercially - some have mentioned TFHE, sponsored by a commercial entity called Inpher (https://www.inpher.io/), and you have others like Enveil (https://enveil.com), who are building some cool use cases for Private Information Retrieval.
Ultimately, these end up being very niche use cases that are part of a broader security strategy; we are very far away from this being practical enough for general use cases. Deployments are also difficult because they require client-side changes to adapt to the underlying cryptographic protocols.
At my place of employment, we believe the right approach is a combination of locked-down execution environments (see: Keystone Enclave) + WebAssembly, exposing what is effectively a compiler that chooses the right cryptographic computation paths based on a query planner (similar to the technologies that power database query engines). It's not one-size-fits-all, but there are ways to optimize down to a fully homomorphic operation _for a particular computational path_. If this stuff is interesting to you, we are hiring :)!
Honestly nothing in this article is even that bad, just some elementary group theory. If you wanted to, you could probably pick it up in a few hours. The math for the HE schemes themselves does rely on some intermediate level number theory though (e.g. Chinese Remainder Theorem in polynomial rings) and of course there's a whole formal language for the security proofs also.
I'm always confused by people complaining about knowing 'advanced' mathematical notation. The thing that requires effort to understand is the actual mathematics, not the notation.
I got most of it, but... KeyGen(1^lambda), what does the 1^lambda part mean? They don't really explain that, and I don't remember encountering that notation in any of the mathematics courses I took at university
It's lambda encoded as a unary vector, i.e. a vector (1, ..., 1) of lambda ones. It means KeyGen is parametrized by a natural number lambda, like KeyGen(lambda).
The notation is from computer science, or complexity theory I think. The reason it's written like this is that the complexity of algorithms is described as a function of the length of the inputs, rather than the inputs themselves.
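A tiny sketch of the convention (the function name and key-generation body are made up for illustration): passing the security parameter in unary means "polynomial time in the input length" coincides with "polynomial time in lambda".

```python
import secrets

def keygen(unary_security_param: str):
    # KeyGen(1^lambda): the parameter arrives as a string of lambda ones,
    # so an algorithm polynomial in its input length is polynomial in lambda.
    lam = len(unary_security_param)        # recover lambda
    key = secrets.randbits(lam)            # e.g. generate a lam-bit key
    return lam, key

lam, key = keygen("1" * 128)
assert lam == 128
```

Had lambda been passed as an ordinary integer, its encoding would only be log2(lambda) bits long, and "polynomial in the input length" would mean the much stricter "polynomial in log(lambda)".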
The business-card version, as I understand it: split the plaintext into bits, encrypt each bit, pass these 'bits' through a logical circuit equivalent to the desired computation (e.g. a full adder [0]), decrypt the 'bits', and reassemble them into the transformed plaintext. The advanced math comes in when encrypting: the values are chosen so that they follow a lattice [1] (maybe), so that evaluating the circuit doesn't trash the encrypted bits.
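The bit-by-bit picture can be sketched with a toy integer-based scheme in the spirit of DGHV (grossly insecure parameters, purely illustrative): ciphertext addition acts as XOR and multiplication as AND, as long as the accumulated noise stays below p/2, so a full adder really can run on encrypted bits.

```python
import random

# Toy "somewhat homomorphic" scheme over the integers: a ciphertext of a
# bit m is c = p*q + 2*r + m for a secret odd p, a large random q, and a
# small random noise r. Then (c mod p) mod 2 recovers m.

p = random.randrange(2**19, 2**20) | 1      # secret key: a random odd integer

def enc(m):
    q = random.randrange(2**30, 2**31)      # large random multiplier
    r = random.randrange(0, 4)              # small noise
    return p * q + 2 * r + m

def dec(c):
    return (c % p) % 2

def full_adder(a, b, cin):
    # on ciphertexts, + acts as XOR and * acts as AND
    axb = a + b                             # a ^ b
    s = axb + cin                           # sum   = a ^ b ^ cin
    cout = a * b + cin * axb                # carry = (a & b) ^ (cin & (a ^ b))
    return s, cout

s, cout = full_adder(enc(1), enc(1), enc(1))
assert (dec(s), dec(cout)) == (1, 1)        # 1 + 1 + 1 = 0b11
```

Every homomorphic operation grows the noise (multiplication especially), which is why evaluating deep circuits eventually "trashes the bits" and why Gentry's bootstrapping trick, refreshing the noise homomorphically, was needed to get from somewhat to fully homomorphic.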
ELI5 level isn't what I'd want, personally. Cryptography is of interest to many programmers, and the mental gymnastics needed for mathematics and programming are quite similar. This topic could be explained to a much broader audience without the mathematical notation, albeit not nearly as concisely, and it'd likely lack the exacting precision afforded by the notation.
This sounds like people are just scared of the maths. I find it very hard to imagine there are many people who know what modular arithmetic, polynomials, or groups are, but don't know the sigma notation for summing, the forall sign, or the general "prop1, prop2 : conclusion" for 'given prop1 and prop2 then conclusion'.
https://en.wikipedia.org/wiki/ElGamal_encryption
FHE is interesting but very early.
https://en.wikipedia.org/wiki/Software_Guard_Extensions
https://en.wikipedia.org/wiki/Secure_multi-party_computation
https://www.darkreading.com/threat-intelligence/major-brazil...
Of course, it's all relative: grade-school mathematical notation is advanced notation to someone who doesn't know it.
[0] https://en.wikipedia.org/wiki/Adder_(electronics)#Full_adder
[1] https://en.wikipedia.org/wiki/Lattice_(group)