[S390x] Optimize SHA3 permute using vector facility (!36) · Merge requests · Nettle / nettle

Maamoun TK requested to merge mamonet/nettle:s390x-sha1 into master Oct 21, 2021

This patch optimizes the SHA3 permute function using the s390x vector facility. Vectorizing the permute function is a better fit for nettle than using the s390x SHA3 hardware accelerator, because the vectorized code implements only the permutation itself, whereas the accelerator also performs extra steps that other functions in the nettle library already handle. The previous patch, which used the SHA3 hardware accelerator, yielded a 12% performance boost, while this patch gives a ~105% performance increase for the SHA3 functions. The optimized core follows the same optimization approach as the SHA3 permute implementation for the x86_64 architecture.
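For readers unfamiliar with what the permute function computes, here is a minimal portable C sketch of the Keccak-f[1600] permutation as specified in FIPS 202. It is not the assembly from this patch and not nettle's C code; the names `keccak_f1600_sketch` and `keccak_f1600_sketch_check` are made up for illustration. The state is 25 independent 64-bit lanes, and steps such as the theta column parities apply the same operation to several lanes at once, which is the kind of work that can map onto vector registers.

```c
#include <stdint.h>

/* Rotate a 64-bit lane left by n bits (n is always in 1..63 here). */
#define ROTL64(x, n) (((x) << (n)) | ((x) >> (64 - (n))))

/* Iota round constants from FIPS 202. */
static const uint64_t round_constants[24] = {
  0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808AULL,
  0x8000000080008000ULL, 0x000000000000808BULL, 0x0000000080000001ULL,
  0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008AULL,
  0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000AULL,
  0x000000008000808BULL, 0x800000000000008BULL, 0x8000000000008089ULL,
  0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL,
  0x000000000000800AULL, 0x800000008000000AULL, 0x8000000080008081ULL,
  0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL
};

/* Rho rotation amounts, listed in the order the pi step visits lanes. */
static const int rho_offsets[24] = {
  1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14,
  27, 41, 56, 8, 25, 43, 62, 18, 39, 61, 20, 44
};

/* Destination lane index for each step of the combined rho/pi walk. */
static const int pi_lanes[24] = {
  10, 7, 11, 17, 18, 3, 5, 16, 8, 21, 24, 4,
  15, 23, 19, 13, 12, 2, 20, 14, 22, 9, 6, 1
};

/* Hypothetical reference routine: permutes the 25-lane state in place. */
void keccak_f1600_sketch(uint64_t st[25])
{
  uint64_t c[5], t, tmp;
  int r, i, j;

  for (r = 0; r < 24; r++) {
    /* Theta: compute five column parities, then mix them into every lane. */
    for (i = 0; i < 5; i++)
      c[i] = st[i] ^ st[i + 5] ^ st[i + 10] ^ st[i + 15] ^ st[i + 20];
    for (i = 0; i < 5; i++) {
      t = c[(i + 4) % 5] ^ ROTL64(c[(i + 1) % 5], 1);
      for (j = 0; j < 25; j += 5)
        st[j + i] ^= t;
    }

    /* Rho and pi: rotate each lane and move it to its new position. */
    t = st[1];
    for (i = 0; i < 24; i++) {
      j = pi_lanes[i];
      tmp = st[j];
      st[j] = ROTL64(t, rho_offsets[i]);
      t = tmp;
    }

    /* Chi: nonlinear step applied independently to each row of five lanes. */
    for (j = 0; j < 25; j += 5) {
      for (i = 0; i < 5; i++)
        c[i] = st[j + i];
      for (i = 0; i < 5; i++)
        st[j + i] ^= ~c[(i + 1) % 5] & c[(i + 2) % 5];
    }

    /* Iota: break symmetry with the round constant. */
    st[0] ^= round_constants[r];
  }
}

/* Returns 1 if permuting the all-zero state gives the known first lane. */
int keccak_f1600_sketch_check(void)
{
  uint64_t st[25] = {0};
  keccak_f1600_sketch(st);
  return st[0] == 0xF1258F7940E1DDE7ULL;
}
```

The sketch covers the permutation only. Padding, absorbing input and squeezing output are left to the caller, which matches the division of labour described above.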

Algorithm	C (Mbyte/s)	Vectorized (Mbyte/s)
sha3_224	235.08	483.41
sha3_256	226.15	460.68
sha3_384	172.90	357.15
sha3_512	120.46	243.96
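The ~105% figure can be checked against the table with a few lines of C. This is only an arithmetic sketch; the function name `percent_increase` is hypothetical and the inputs are the throughput numbers listed above.

```c
#include <stdio.h>

/* Percentage increase of the optimized throughput over the baseline. */
double percent_increase(double baseline, double optimized)
{
  return (optimized / baseline - 1.0) * 100.0;
}

/* Prints the increase for each row of the benchmark table. */
void print_sha3_speedups(void)
{
  static const char *names[4] = { "sha3_224", "sha3_256", "sha3_384", "sha3_512" };
  static const double c_mbs[4] = { 235.08, 226.15, 172.90, 120.46 };
  static const double vec_mbs[4] = { 483.41, 460.68, 357.15, 243.96 };
  int i;

  for (i = 0; i < 4; i++)
    printf("%s: +%.1f%%\n", names[i], percent_increase(c_mbs[i], vec_mbs[i]));
}
```

Calling `print_sha3_speedups()` from a `main` function shows increases between roughly 102% and 107% across the four variants, consistent with the ~105% stated in the description.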

Edited Oct 24, 2021 by Maamoun TK

