[x86_64] Implement Poly1305 based on 2^26 using AVX2 (!46) · Merge requests · Nettle / nettle

Maamoun TK requested to merge mamonet/nettle:poly_avx2 into master Apr 30, 2022

This patch adds an optimized version of Poly1305 based on radix 2^26, using AVX2 instructions and YMM registers. It interleaves four blocks horizontally in each loop iteration.
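To illustrate the arithmetic behind this, here is a minimal plain-integer sketch in Python. It is not the patch's AVX2 code, and all function names are hypothetical. It assumes the usual Poly1305 definitions (prime 2^130 - 5, 16-byte little-endian blocks with the 2^128 bit set) and shows two things: a value split into five 26-bit limbs, multiplied using 2^130 ≡ 5 (mod p), and four blocks folded into the accumulator per iteration using precomputed powers of r. How the patch actually lays out limbs across YMM lanes is not shown here.

```python
P = (1 << 130) - 5            # Poly1305 prime
MASK26 = (1 << 26) - 1

def to_limbs(x):
    """Split a value below 2^130 into five 26-bit limbs, least significant first."""
    return [(x >> (26 * i)) & MASK26 for i in range(5)]

def from_limbs(limbs):
    """Recombine limbs (possibly wider than 26 bits) into one integer."""
    return sum(l << (26 * i) for i, l in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook limb product; terms at 2^130 and above wrap around times 5."""
    d = [0] * 5
    for i in range(5):
        for j in range(5):
            k = i + j
            if k < 5:
                d[k] += a[i] * b[j]
            else:
                d[k - 5] += 5 * a[i] * b[j]   # 2^130 == 5 (mod P)
    return d                                   # no carry propagation yet

def block_value(block):
    """16-byte block as a little-endian integer with the 2^128 bit set."""
    return int.from_bytes(block, "little") + (1 << 128)

def update_serial(h, r, blocks):
    """Reference: one block per step, h = (h + m) * r mod P."""
    for b in blocks:
        h = (h + block_value(b)) * r % P
    return h

def update_four_way(h, r, blocks):
    """Four blocks per iteration; len(blocks) is assumed to be a multiple of 4."""
    r2 = r * r % P
    r3 = r2 * r % P
    r4 = r3 * r % P
    for i in range(0, len(blocks), 4):
        m0, m1, m2, m3 = (block_value(b) for b in blocks[i:i + 4])
        # The four products are independent, so they can run in parallel lanes.
        h = ((h + m0) * r4 + m1 * r3 + m2 * r2 + m3 * r) % P
    return h
```

The four-way form gives the same accumulator as the serial loop because expanding four serial steps yields exactly (h + m0)·r^4 + m1·r^3 + m2·r^2 + m3·r. The 26-bit limbs leave headroom in 64-bit products, which is what makes this radix convenient for SIMD multipliers.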

The patch adds a new configure option, --enable-x86-avx2, to compile the AVX2 files.

The testsuite passes all tests with this patch.

Benchmark of poly1305 update on an Intel Core i5-10300H CPU:

Upstream (Standard based on radix 2^64)	This patch (AVX2 based on radix 2^26)
3900.75 Mbyte/s (1.136 cpb)	6490.70 Mbyte/s (0.691 cpb)
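As a quick check of what these figures imply, the short Python sketch below computes the relative speedup from the two rows. The variable names are illustrative; the numbers are copied from the table above.

```python
# Figures copied from the benchmark table.
upstream_mb_s, upstream_cpb = 3900.75, 1.136
patch_mb_s, patch_cpb = 6490.70, 0.691

# Speedup measured by throughput and by cycles per byte (cpb).
throughput_speedup = patch_mb_s / upstream_mb_s
cycles_ratio = upstream_cpb / patch_cpb

print(f"throughput: {throughput_speedup:.2f}x, cycles per byte: {cycles_ratio:.2f}x")
```

Both ratios come out at roughly 1.6x to 1.7x; the small difference between them is consistent with rounding and slight clock variation between runs.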
