Mathematicians Seal Loophole to Breaking RSA Encryption

(quantamagazine.org)

137 points | by chmaynard 8 days ago

8 comments

  • wbhart 8 days ago

    The title and the article are severely misleading. One can easily try the method described in the article and find that empirically, it doesn't work for large numbers. That implies that no one is seriously taking such a naive approach. You don't need a mathematical proof either way to know this.

    But even so, the paper doesn't show that all polynomials are irreducible. It doesn't even show most are, just ones of a very special form (with 0 and 1 coefficients, randomly chosen in a certain way). Therefore, the paper doesn't rule out other kinds of polynomials, even if this were a serious approach.

    Furthermore, even if the paper showed that all kinds of polynomials were almost always irreducible, that wouldn't rule out the existence of a very clever algorithm for writing the integers as values of polynomials that aren't irreducible.

    Consider the following analogy: there are exceedingly many possible prime factors of 6000 digit numbers; far too many to practically test them all one after the other. I can produce a mathematical proof that shows that I'll never test primality of a 6000 digit number by trying all possible prime factors in order. But that doesn't prevent the existence of algorithms that can quite quickly tell you whether the number is prime or composite (such algorithms actually exist). Just because something looks hard or it can be proven that some naive approach is computationally infeasible, doesn't mean there isn't a very clever algorithm that makes it easy.

    The authors of the actual math paper really asked for any unwanted attention they get from this news article, though. They actually give the integer factorisation problem as a motivation in their paper!

    • segfaultbuserr 8 days ago

      This is the actual paper.

      https://arxiv.org/abs/1810.13360

      It's worth noticing that Riemann Hypothesis is assumed in the proof, as I layman, I think it means the proof merely gives us more confidence about the difficulty of integer factorization. But integer factorization itself still remains an open question.

      > Abstract. We consider random polynomials with independent identically distributed coefficients with a fixed law. Assuming the Riemann hypothesis for Dedekind zeta functions, we prove that such polynomials are irreducible and their Galois groups contain the alternating group with high probability as the degree goes to infinity. This settles a conjecture of Odlyzko and Poonen conditionally on RH for Dedekind zeta functions.

      Page 2, Section 1.1, Motivation has a summary.

      > Beyond its intrinsic interest, the problem of irreducibility of random polynomials of high degree is motivated by some other problems, which we now briefly discuss. It is believed to be computationally difficult to determine the prime factorization of integers. On the other hand, polynomial time algorithms are known for computing the factorization of polynomials in Z[x]. Given an integer N ∈ Z_{>0}, we can write it as N=P(2) for a unique polynomial P with 0,1-coefficients. By computing the factorization of P in Z[x] and evaluating the factors at 2, we can obtain a factorization of N.

      > The only weakness of this approach is that the polynomial P may be irreducible and thus the factorization of N obtained may be trivial. The problem we study in this paper thus asks for the probability that this procedure returns only a trivial factorization. Therefore, it is desirable to have results, such as those of this paper, proving that this probability converges to 1 very fast. We will discuss our method in Section 1.3. The method links the problem of irreducibility of random polynomials with mixing times of certain Markov chains, which are "mod p" analogues of the Bernoulli convolutions we had studied in earlier work.

      > In this paper, we use results available for the Markov chains to study random polynomials, but this can be reversed. In particular, in a forthcoming paper, we will use the results of this paper to obtain new results about the Markov chains. Our results on irreducibility assume the Riemann hypothesis for Dedekind zeta functions, or at least some information on the zeros. In our last theorem, Theorem 7, we show that conversely irreducibility of random polynomials has (modest) implications about the zeros of Dedekind zeta functions.

      • throwawaymath 8 days ago

        There are a huge number of proofs published every year which assume the Riemann Hypothesis. So while you're technically correct that this only gives us confidence, it does significantly reduce our uncertainty and compartmentalize it to the Riemann Hypothesis.

        That's really what assuming the Riemann Hypothesis is useful for in modern mathematics. You can obtain new results for a wide variety of things using the following routine:

        1. Assume the Riemann Hypothesis is true and attempt to prove your theorem.

        2. Now assume the Riemann Hypothesis is false and attempt to prove the same theorem.

        If you were only able to prove your theorem under one of those conditions, you have a nice result. If you were able to prove the theorem under one condition and disprove it under the other condition, you've got a great result (see the list of things which are equivalent to Riemann). And of course, you might find that you've proven or disproven the theorem for both conditions, which is still good.

      • openasocket 8 days ago

        I noticed that this proof technically doesn't assume RH, but rather that the Dedikind zeta functions over K (where K is an algebraic extension of a root of a polynomial with binary coefficients) satisfy the riemann hypothesis. From my minimal understanding of number theory, this seems to be a stronger statement than the regular Riemann Hypothesis, but weaker than the Extended Riemann Hypothesis. I'm curious where this assumption falls on the hierarchy of different variations on the Riemann Hypothesis.

      • jcranberry 8 days ago

        > The mathematicians Emmanuel Breuillard and Péter Varjú of the University of Cambridge proved that as polynomials with only 0 and 1 as coefficients get longer, they’re less and less likely to be factorable at all.

        ...

        > Breuillard and Varjú proved that it’s nearly impossible to find polynomials of that length that can be factored.

        Can someone provide a link to the paper? I'm assuming that what they proved is that (my math is rusty, so what I'm saying may be incorrect) solving polynomials of a single variable with all coefficients either 1 or 0 becomes prohibitively difficult as the the degree increases, which is what I'm getting from the second comment, not that the frequency of irreducible polynomials increases, which is what I get from the first quote.

      • jmount 8 days ago

        The idea of writing A in binary and forming the 0/1 polynomial such that p(2) = A is nifty. Then if you factor p(x) it into two polynomials with integer coefficients u(x)v(x) you have u(2) and v(2) are divisors of A, so if none of them are equal to +-1 you have found something. Assuming it is easy to factor the polynomials.

        Factoring polynomials is somewhat easy, but you have to specify factoring into what.

        The terminology needs a little clean-up. By the fundamental theorem of algebra we know every polynomial over the reals factors into a product of polynomials of degree no more than 2. So they probably don't mean factoring/primality in general but a definition of factoring in to square-free polynomials. Factoring into square-free polynomials is easy as if q(x)*q(x) divides into p(x) then it also divides into p'(x) (the derivative of p(x) with respect to x) and therefore divides into gcd(p(x), p'(x)). And the gcd() will have integer coefficients and be degree less than p(x) (so not equal to p(x) and degree >= q(x), so if q(x) is non-trivial it forces something nice to happen; even if you don't know what q(x) is).

        • WhiteSage 8 days ago

          You want the coefficients of the factors to be integers. That is why you can very seldomly factor them.

        • smartbit 8 days ago

          Can we now safely use RSA keys of length 2048? Or is it still advised to use only use 4096bit long keys?

          This notwithstanding that ED25519 is preferred above RSA, eg as OpenSSH supports Curve25519 for 5 years and is the default when both the client and server support it https://www.openssh.com/txt/release-6.5

          • wbhart 8 days ago

            No. This proof says nothing about the security of RSA. I don't know anyone seriously trying to break RSA by factoring polynomials. It's pretty easy to test a bunch of random polynomials and pretty quickly find that this just isn't going to work. You don't need a proof to know this empirically, which is why it isn't a serious attack.

            Even if most polynomials were irreducible, which this work doesn't show, that wouldn't rule out having a method that cleverly, but efficiently writes the numbers as polynomials that aren't irreducible.

            Also, according to the article, the result is only for polynomials with 0 or 1 coefficients, which aren't the only polynomials, obviously!

          • throwawaymath 8 days ago

            I dislike that Quanta used the term "Back Door" in their headline. That has a pretty specific meaning in information security generally and cryptography particularly. This was not a back door.

            The tl;dr: it was conjectured that a more feasible method of factoring large numbers might exist by finding their unique polynomial representation. Polynomials are more quickly factored than integers, and to every integer there corresponds a unique polynomial per the fundamental theorem of algebra.

            But that method doesn't actually work, because as polynomials get larger they become commensurately more difficult to factor. In other words, it is as or more difficult to find the polynomial you'd want than it is to just find the prime factors for any given RSA semiprime.

            This is a neat result, but nothing is changing for RSA.

            • pbsd 8 days ago

              The first step in the number field sieve is to represent the number to be factored as a polynomial (two, in fact), and it is assumed in the following steps that these polynomials do not split. If these polynomials did split with reasonable probability, the number field sieve would be a lot shorter.

              The usual reference for this is Brilhart et al. [1], which deals with polynomial representations in arbitrary bases, as opposed to strictly base 2, as is the case here.

              [1] https://doi.org/10.4153/CJM-1981-080-0

              • segfaultbuserr 8 days ago

                To people with some infosec knowledge, the poor choice of word "backdoor" sounds as if that the article is about importing an asymmetric backdoor to RSA encryption, which is not the case. If the author wants to have a casual style of writing, I think "loophole" could be used.

                • throwawaymath 8 days ago

                  Agreed. "Loophole" would have been a particularly better choice of terminology. RSA has never had a "back door" in the sense of, e.g., DUAL_EC_DRBG.

                  • dang 8 days ago

                    Ok, we've put a loophole in the title above.

                  • chrismeller 8 days ago

                    I would go a step further and say that “sealed” implies they somehow changed something that directly impacts all RSA encryption. At least that’s the way I read it. Almost like taking down a botnet by taking over the domain or IP it’s C&C server uses.

                    The reality was far less movie plot worthy.

                    • segfaultbuserr 8 days ago

                      > “sealed” implies they somehow changed something that directly impacts all RSA encryption.

                      So we are talking about holes in RSA, some readers may find the following interesting, that in fact, RSA does have a problematic aspect (well, many) that can be called a "loophole": the padding. You are encrypting a short message (the symmetric key) by using a large public key, so the message have to be padded. From now the real fun started, intuitive solutions are prone to various forms of mathematical and cryptographic tricks, you get chosen ciphertext attack, and gives an attacker a straightforward "padding oracle". If you can manipulate the ciphertext and asks the server if the ciphertext is valid (or not), often, the attacker can deduce knowledge of the message or even the private key. Bleichenbacher attack and Coppersmith attack are known in the 90s, yet we still see exploitations in 2015. Proper padding is tricky to design and implement. As a response to this problem, OAEP, a padding algorithm with rigorous security proof, has finally been designed, and it should put an end to these attacks eventually. Yet, recently there have been some theoretical doubts about the soundness of the proof.

                      But it turns out, you don't have to encrypt the session key as a message and padding it securely to begin with. Cryptographer Victor Shoup purposed RSA-KEM. The basic idea is, we can just keep generating a stream of meaningless random numbers using a CSPRNG, until it's long enough to be encrypted by the public key. Then, put these random numbers to a hash function (or any other secure KDF), the output will be the session key. This scheme is known as RSA-KEM, offers strong security guarantee, yet it only uses a simple approach that is hard to get wrong.

                      https://en.wikipedia.org/wiki/Key_encapsulation

                    • Animats 8 days ago

                      That's Quanta, which is a crap science mag. That article doesn't even seem to link to the actual paper. The actual paper is linked above.

                      It's good to know that this approach to factoring is less vulnerable than suspected. We still don't have a proven lower bound on factoring, though. There's always the possibility that someone will make a breakthrough.

                      • 8 days ago
                        [deleted]
                        • hopler 8 days ago

                          Quanta does that intentionally, as clickbait to generate popular interest in their mathematical aricles that don't actually have applications. It's a scientific embarassment. Nature's magazine/blog does it to.

                          • hackinthebochs 8 days ago

                            > It's a scientific embarassment.

                            A little over the top, don't you think? A little clickbait isn't a disservice to the science. Quanta is a layman magazine. Baiting layman interest with more familiar terms even if less technically accurate is ultimately a service to science.

                        • TheRealPomax 8 days ago

                          This is textbook scientific clickbait. Catchy title, then an article that says nothing other than affirming the status quo, while then making a claim that demonstrates the title was a clickbait title.

                          "mathematicians show that the longer the input, the less likely it is that input can be factored" is the literal reason we use longer and longer keys as time goes on.

                          Shame on this author. These mathematicians did some nice work, and yet the author felt the need to sell snake oil based on it.

                          • smadge 8 days ago

                            I don’t understand your anger. The conclusion is that it was previously empirically believed to be the case but logically unproven that these polynomial representations of integers were unlikely to be factorable. If I understand this article correctly, it is now logically proven, shutting the door on that particular attack.

                            • TheRealPomax 7 days ago

                              But it _doesn't_ shut the door: it leaves the door exactly where it already was. It was the case that to make it harder to crack, you make the number bigger. The proof shows that, yes, even if you try the polynomial trick, you're still bound to that limitation: the longer the polynomial, the less likely you'll find an factorisation.

                              But that proof doesn't change anything. It's a good proof, and the author abuses it to make a claim that cannot be made. No doors have been shut, the status is still quo. Except the author got some ad revenue for writing a clickbaity article that made it onto hacker news.

                          • dannykwells 8 days ago

                            Can anyone provide a deeper (Algebraic? Number theoretical?) explanation of the polynomial/prime connection? I'd never heard of it before and it's interesting.

                            • throwawaymath 8 days ago

                              This is a half hour 3Blue1Brown video: https://youtu.be/NaL_Cb42WyY

                              If you're willing to watch at least the first half of it, I think it'll give you a very intuitive grounding for 1) why integers have unique polynomial representations, and 2) why that's related to (semi)prime factorization.

                              • openasocket 8 days ago

                                Essentially they are both integral domains: https://en.wikipedia.org/wiki/Integral_domain

                                The space of integers and polynomials have a lot of things in common. They have well-defined addition and multiplication operations, they have a zero value and a one value (for integers it is literally 0 and 1, for polynomials it's the constant function returning 0 and 1, respectively) that obey the usual laws (x + 0 = 0, x1 = x, x0=0, etc). They also both have no zero divisors, i.e. if a*b = 0, then either a=0 or b=0. From just those basic notions you can define divisibility and factorization in a way that applies to both polynomials and integers.

                                In a very deep way, any property or concept you define on the integers can (often, but of course there are exceptions) be extended to apply to polynomials. There might be some category-theoretic way to formally state that relationship, I'm not sure.

                                • pdpi 8 days ago

                                  The way positional number systems work, each position corresponds to a power of the base — e.g. 415 = 4 * 10^2 + 1 * 10^1 + 5 * 10^0. A slightly more complicated way to say this is 415 = 4 * x^2 + 1 * x^1 + 5, where x = 10. Ignore the x = 10 part for a bit, this is your polynomial. You can factor this polynomial normally without changing the fact that substituting x = 10 yields the original number. Because of this, it follows that the factors of the polynomial must also be divisors of the original number when you substitute the variable for the appropriate value!

                                  • amelius 8 days ago

                                    But why is it useful to turn the base of the number system into a variable? Why go through polynomials when you can factor integers directly?

                                    • pdpi 8 days ago

                                      Because factoring integers is notoriously hard (which is why RSA is safe), but factoring polynomials is easier, and factoring a polynomial results in finding a divisor for the corresponding integer. Crucially, this relationship is not symmetrical — compound polynomial guarantees compound number, but compound number doesn't guarantee compound polynomial — so the open question was: how often can you use this trick to make breaking RSA easier. What this result proves is that the attack is highly unlikely to work.

                                  • convivialdingo 8 days ago

                                    Here's my layman's take - basically it's similar to how computers convert binary integer representations to integers.

                                    For instance, a binary number of 0b1111 could be represented as 2^3 + 2^2 + 2^1 + 2^0 = 15, or 8+4+2+1=15. You now have a typical polynomial which can also be factored algebraically and quickly using something like Mathmatica.

                                    Not all of these binary conversion type polynomials can be factored however as they're irreducible. There's a lot of shifting around in big numbers when you compare the normal binary conversion and what is algebraically viable to be factored.

                                    In my conjecture - it's not clear to me the exact mathematical connection between the normal binary conversion and the perfect polynomial factors we're searching for. I believe it still may be possible to find these unknown polynomial factors by migrating some of the polynomial terms through redistribution of higher value terms and/or performing some sort of hard search. So instead of 2^4+2^1, you can rewrite as 2^3+2^3+2^1, for example.

                                    The factors are exponentially smaller and the search space is reasonable considering that it's basically possible to reduce the polynomial to a large matrix.

                                    I gave up on this path a year ago, so I may be a bit off in my recollection. I was using SymPy (a computer algebra system) but the speed was an issue.

                                    • pdpi 8 days ago

                                      > I believe it still may be possible to find these unknown polynomial factors by migrating some of the polynomial terms through redistribution of higher value terms and/or performing some sort of hard search.

                                      You start with a fairly obvious representation of the integer as a polynomial. If that doesn't give you some sort of factorisation, then any information that allows you a better guess at a factorable polynomial is also information that helps you outright factor the original number.

                                      • 8 days ago
                                        [deleted]
                                    • ljcn 8 days ago

                                      Think about units, tens, hundreds, ..., and stare at step 3 in the article.

                                      • daveFNbuck 8 days ago

                                        In their example of 1111, this binary representation means 2^3 + 2^2 + 2^1 + 2^0. So plugging in 2 to the polynomial x^3 + x^2 + x + 1 gives you the original value.

                                        If the polynomial is factored, plugging 2 in still gives you the same value, but now you have at least 2 integers that multiply to give you the original value, so you've found a factor.