[S390x] Optimize SHA256 and SHA512 compress functions
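This patch optimizes the SHA256 and SHA512 compress functions for the s390x architecture. The testsuite passes. Benchmark on Z15, where the hardware-accelerated versions run about 3.6x (SHA256) and 4.2x (SHA512) faster than the C implementation: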
Algorithm | C | Hardware-accelerated |
---|---|---|
SHA256 | 242.76 Mbyte/s | 869.00 Mbyte/s |
SHA512 | 373.18 Mbyte/s | 1555.21 Mbyte/s |