After completing this semester of studies at my university, I found myself continuing to ponder the material discussed in my Number Theory course, particularly regarding the unit of cryptography, which dove into various public-key cryptosystems, like Diffie-Hellman, ElGamal, and RSA. I remember one lecture in particular where we briefly discussed the Discrete Logarithm Problem and how it single-handedly forms the backbone of several cryptosystems. At a high level, most of the widely-used encryption schemes today rely on the computational difficulty of solving certain mathematical problems, like factoring large integers or computing discrete logarithms, as the problem name suggests. This piqued my interest, and I decided to explore further the computational efficiency of these systems and how they compare to one another. This post may be a bit more math-heavy than my usual fare, but my goal is to make the ideas concrete and intuitive, especially where performance and design tradeoffs matter.
RSA Overview
We begin with a brief overview of the RSA cryptosystem. Suppose you want to send me a message, encoded as an integer, over a public channel, securely. How can we go about this? We could, in theory, send the message directly or use a simple cipher, but an eavesdropper could easily intercept and decode it. So, the problem becomes: how can we ensure that only you can read the message, even if someone intercepts it? This is the essence of the study of public-key cryptography, and where RSA comes into play. In short, we need to provide just enough information to allow you to encrypt/decrypt messages, but not enough for an eavesdropper to do the same. To do so, we perform the following procedure. Suppose you wish to send me a message over a public channel:
I choose two large primes and , and compute .
I compute . This value helps us generate an exponent , chosen such that .
I compute using the Extended Euclidean Algorithm. This value acts as my private key.
I publish the public key .
You encode your message as and send me the ciphertext .
I decrypt the message by computing
This procedure and variants of it are widely used in modern encryption systems. But, can we improve upon this at all? While RSA is elegant and robust, modular exponentiation on a large scale can be computationally expensive. Although encryption is typically fast since the exponent is usually chosen to be small, commonly being , decryption requires computing where is a large integer of size comparable to . Even though some strategies like binary exponentiation reduce the number of operations, the cost of decryption grows with the number of bits in , resulting in many large-integer multiplications and modular reductions.
This raises a natural question: can we design a public-key cryptosystem with similar security guarantees, but with a cheaper decryption step? One promising direction comes from Rabin-style cryptosystems, which replace exponentiation with squaring: an operation that is significantly faster in practice. The tradeoff, however, is that decryption in the textbook Rabin cryptosystem leads to four possible plaintext results since, for , calculating and yields two solutions each, much like how a positive number has two square roots over the integers. These results then combine to create four distinct square roots modulo (Chinese Remainder Theorem).
The Rabin-p Cryptosystem
This limitation is one of the primary reasons Rabin-style systems are rarely used. However, while exploring this design space, I came across a refinement known as the Rabin-p cryptosystem that resolves this ambiguity while preserving the efficiency benefits of squaring-based decryption. I will again step through the associated algorithms for encoding and decoding messages, following the Rabin-p cryptosystem as presented in Asbullah and Ariffin (2016). Suppose you wish to send me a message over a public channel:
I choose two large primes, and , satisfying , , as well as for some integer . I then compute . This is the public key that I publish.
You encode your message as , ensuring before doing so. You send me the ciphertext .
To decrypt the ciphertext, I first reduce it modulo by computing .
Since , I can efficiently compute a square root of modulo via . This is a result of Euler's Criterion that gives a value satisfying .
At this point, is a square root modulo , but not necessarily modulo . To quantify exactly how far is from when the modulus is raised from to , I compute
To lift the square root from modulo to modulo , I consider a small correction of the form and expand . Matching the resulting expression to modulo forces the congruence , which I solve by computing .
Using the correction term, I construct a square root modulo via . This value is the unique lift of modulo that satisfies .
As with any square root modulo an odd modulus, exactly two roots modulo exist, differing by a sign. This ambiguity is combated by enforcing a message size constraint at encryption time. Exactly one of or lies in the valid range for as defined above. The decrypted message is if , and otherwise.
A full proof of correctness of the decryption algorithm can be found in the paper cited above.
Complexity Comparison
In terms of complexity, however, how does Rabin-p compare to RSA? At a high level, both schemes spend most of their time doing arithmetic on large integers. But how many multiplications and modular reductions do we need to perform encryption and decryption in both schemes? First, we notice that encryption is the simplest operation in both schemes, being modular multiplications for RSA but only a single modular squaring in Rabin-p.
For decryption, the naive view is that RSA performs a single modular exponentiation , with the exponent comparable in bit length to . On the other hand, Rabin-p performs a single exponentiation modulo , followed by a constant number of operations to perform the lift and extract the correct plaintext. Instead of doing work modulo , Rabin-p does work modulo .
However, this comparison is somewhat misleading. In practice, RSA decryption does not need to compute directly. Instead, using the Chinese Remainder Theorem, one computes and , then evaluates and separately, and combines the results via CRT to recover . Since each exponentiation now operates on numbers with roughly half the bit length of , each one is significantly cheaper. The tradeoff is that two exponentiations are required instead of one, but the reduction in operand size more than compensates in practice.
With this optimization in mind, the gap between RSA and Rabin-p narrows considerably. Both schemes ultimately perform their heavy arithmetic modulo factors of rather than modulo itself. Rabin-p still avoids full modular exponentiation in favor of a single exponentiation modulo plus a constant-cost Hensel lift, but RSA with CRT-based decryption is far more competitive than the naive comparison suggests. Each system makes a distinct tradeoff between efficiency, generality, and implementation complexity, and understanding these nuances is what makes the comparison worthwhile.