
[S390x] Optimize SHA3 permute using vector facility

This patch optimizes the SHA3 permute function by using the s390x vector facility. Vectorizing the permute is a better fit for s390x than the SHA3 hardware accelerator, because it implements only the permute procedure itself, whereas the accelerator also executes extra steps that are already handled by other functions in the nettle library. Applying the SHA3 hardware accelerator in a previous patch yielded a 12% performance boost, while this patch gives a ~105% performance increase for the SHA3 functions. The optimized core follows the same optimization procedure that is used in the SHA3 permute implementation for the x86_64 architecture.
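
For readers unfamiliar with the permute, here is a minimal scalar C sketch of just the theta step of Keccak-f[1600]. It is not the code from this patch: the names `rotl64` and `keccak_theta` are hypothetical, and the other steps (rho, pi, chi, iota) and the 24-round loop are omitted. It only illustrates the kind of lane-wise XOR and rotate work that a vector implementation can spread across 128-bit registers, each of which holds two 64-bit lanes.

```c
#include <stdint.h>

/* Rotate a 64-bit lane left by n bits (valid for 0 < n < 64). */
static uint64_t rotl64(uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Theta step on a 5x5 state of 64-bit lanes; lane (x, y) is A[x + 5*y].
   Illustrative scalar reference only, not the vectorized s390x code. */
void keccak_theta(uint64_t A[25])
{
  uint64_t C[5], D[5];
  int x, i;

  /* Column parities: XOR of the five lanes in each column. */
  for (x = 0; x < 5; x++)
    C[x] = A[x] ^ A[x + 5] ^ A[x + 10] ^ A[x + 15] ^ A[x + 20];

  /* Combine the left neighbour's parity with the rotated right neighbour's. */
  for (x = 0; x < 5; x++)
    D[x] = C[(x + 4) % 5] ^ rotl64(C[(x + 1) % 5], 1);

  /* Fold D back into every lane of the matching column. */
  for (i = 0; i < 25; i++)
    A[i] ^= D[i % 5];
}
```

Every loop body above is independent across lanes, which is what makes packing pairs of lanes into vector registers attractive.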

| Algorithm | C (Mbyte/s) | Vectorized (Mbyte/s) |
|-----------|-------------|----------------------|
| sha3_224  | 235.08      | 483.41               |
| sha3_256  | 226.15      | 460.68               |
| sha3_384  | 172.90      | 357.15               |
| sha3_512  | 120.46      | 243.96               |
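
The ~105% figure can be checked against the benchmark numbers above with a small helper. This is a sketch only; `percent_increase` is a hypothetical name and is not part of nettle or its benchmark tool.

```c
#include <assert.h>

/* Percentage increase of the optimized throughput over the baseline.
   Both arguments are throughputs in the same unit (here Mbyte/s). */
double percent_increase(double baseline, double optimized)
{
  assert(baseline > 0.0);
  return (optimized / baseline - 1.0) * 100.0;
}
```

For example, `percent_increase(226.15, 460.68)` for sha3_256 comes to roughly 104%, and the four algorithms fall between about 102% and 107%, consistent with the ~105% stated in the description.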
Edited by Maamoun TK
