The Barrett-Domb modular multiplication is a low-complexity, hardware-friendly modular multiplication algorithm based on the novel Barrett modular reduction scheme.
In this work, we introduce a multi-precision, GPU/CPU friendly version of the Barrett-Domb algorithm. We show that our scheme is competitive with the widely used Montgomery modular multiplication, potentially retiring the need for the combersome Montgomery conversions.
Read this paper on in its original form on our Github
A brief overview explanation
Modular multiplication of big integers is at the heart of every SNARK computation. It’s the most expensive basic building block. The one that often determines the complexity of the ENTIRE protocol.
Improving mod-mul, even by a little bit, can result in significant acceleration…
...So obviously, we at Ingonyama are obsessed with this problem.
In July of last year, we had our first stab with the introduction of the Barrett-Domb modular multiplication:
Feedback was great, and we implemented the algorithm into our hardware. Other teams implemented it as part of Zprize.
Now we are releasing Part Two, a multi-precision version that can support GPU / CPU, and in some cases can even replace the famous Montgomery multiplication!
Take NTT, for example. Due to a large number of conversions into/out of Montgomery form, our algorithm achieves a lower multiplications count.
Paper is here: https://github.com/ingonyama-zk/papers/blob/main/multi_precision_fast_mod_mul.pdf
Code is here: https://github.com/ingonyama-zk/modular_multiplication
PRs are welcome 🧙♂️