[PowerPC] Optimize Poly1305 based on radix 2^44 with fat build support
This patch optimizes Poly1305 for the powerpc64 architecture by using the POWER9-specific instruction vmsumudm
for full 64-bit multiplication, processing four blocks in parallel in radix 2^44.
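
To make the radix 2^44 representation concrete, here is a minimal scalar C sketch. It is not the assembly from this patch: the function names are hypothetical, it handles one block at a time rather than four, and it assumes the `unsigned __int128` extension of GCC/Clang. Each 128-bit sum of 64x64-bit products below is the kind of multiply-accumulate that vmsumudm performs in one instruction (two doubleword products added to a 128-bit addend).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension (assumption) */

#define MASK44 ((UINT64_C(1) << 44) - 1)
#define MASK42 ((UINT64_C(1) << 42) - 1)

/* Read 8 bytes as a little-endian 64-bit integer. */
static uint64_t load_le64(const uint8_t *p)
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Split a full 16-byte block into limbs of 44, 44 and 42 bits and set
   the 2^128 padding bit (bit 40 of the top limb, since 88 + 40 = 128). */
static void block_to_radix44(const uint8_t block[16], uint64_t limb[3])
{
  assert(block != NULL && limb != NULL);
  uint64_t lo = load_le64(block);
  uint64_t hi = load_le64(block + 8);
  limb[0] = lo & MASK44;
  limb[1] = ((lo >> 44) | (hi << 20)) & MASK44;
  limb[2] = (hi >> 24) | (UINT64_C(1) << 40);
}

/* h = h * r mod (2^130 - 5), left partially reduced.
   Terms of weight 2^132 wrap around as 4 * 2^130 = 4 * 5 = 20. */
static void mul_radix44(uint64_t h[3], const uint64_t r[3])
{
  uint64_t s1 = r[1] * 20, s2 = r[2] * 20;
  u128 d0 = (u128)h[0] * r[0] + (u128)h[1] * s2   + (u128)h[2] * s1;
  u128 d1 = (u128)h[0] * r[1] + (u128)h[1] * r[0] + (u128)h[2] * s2;
  u128 d2 = (u128)h[0] * r[2] + (u128)h[1] * r[1] + (u128)h[2] * r[0];
  uint64_t c;

  /* Propagate carries limb by limb. */
  d1 += (uint64_t)(d0 >> 44);  h[0] = (uint64_t)d0 & MASK44;
  d2 += (uint64_t)(d1 >> 44);  h[1] = (uint64_t)d1 & MASK44;
  c = (uint64_t)(d2 >> 42);    h[2] = (uint64_t)d2 & MASK42;

  /* Overflow above 2^130 folds back into the low limb times 5. */
  h[0] += c * 5;
  c = h[0] >> 44;              h[0] &= MASK44;
  h[1] += c;
}
```

With three limbs, one field multiplication needs nine 64x64-bit products, which is why a fused multiply-sum on 64-bit operands pays off compared with smaller radices that need more limbs.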
The patch also adds a new configure option, --enable-power9,
to compile the Power ISA v3.0 code.
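
For readers unfamiliar with how a build can choose between a compile-time option and fat (run-time) selection, the C sketch below shows one possible check for Power ISA v3.0 support. It is an illustration only, not the detection code in this patch: the function name is hypothetical, and it assumes Linux with glibc's `getauxval` and a compiler that predefines `_ARCH_PWR9` when targeting POWER9, as GCC does.

```c
#include <stdio.h>

#if defined(__linux__) && defined(__powerpc64__)
# include <sys/auxv.h>
# ifndef PPC_FEATURE2_ARCH_3_00
#  define PPC_FEATURE2_ARCH_3_00 0x00800000   /* value from the Linux uapi headers */
# endif
#endif

/* Hypothetical helper: returns 1 if Power ISA v3.0 code may be used. */
static int cpu_has_isa_3_00(void)
{
#if defined(_ARCH_PWR9)
  /* Compiled for POWER9, so ISA v3.0 is assumed unconditionally. */
  return 1;
#elif defined(__linux__) && defined(__powerpc64__)
  /* Fat-style check: ask the kernel at run time. */
  return (getauxval(AT_HWCAP2) & PPC_FEATURE2_ARCH_3_00) != 0;
#else
  /* Not a powerpc64 Linux target: fall back to the generic code. */
  return 0;
#endif
}
```

A caller would use the result once at start-up to pick either the vmsumudm-based routine or the generic C implementation.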
The testsuite passes all tests with this patch applied.
Benchmark of poly1305 update using nettle-benchmark on POWER9:
C | This patch |
---|---|
472.63 Mbyte/s | 2140.15 Mbyte/s |
Edited by Maamoun TK