[AArch64] Optimize AES with fat build support
This patch optimizes the AES encrypt and decrypt functions by giving each key size its own implementation. Each implementation loads the expanded key once in the function prologue instead of reloading it for every block, which yields a considerable performance increase. The patch also adds fat build support for the AES functions.
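The idea can be sketched in portable C. This is only an illustration under stated assumptions, not the patch's code: the real implementation is AArch64 assembly, and every name here (`toy_round`, `toy_encrypt_generic`, `toy_encrypt_128`, `toy_fat_init`, the `have_fast` flag) is hypothetical. The round function is a toy stand-in and is NOT AES. The sketch shows a baseline that re-reads the round keys for every block, a per-key-size variant that copies them once before the block loop, and a function pointer chosen at init time in the way a fat build selects an implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16
#define ROUNDS_128 10   /* round count assumed for the 128-bit variant */

/* Toy round (NOT AES): XOR with the round key, then rotate left by one byte. */
static void toy_round(uint8_t *b, const uint8_t *rk)
{
    uint8_t first = b[0] ^ rk[0];
    for (size_t i = 0; i + 1 < BLOCK_SIZE; i++)
        b[i] = b[i + 1] ^ rk[i + 1];
    b[BLOCK_SIZE - 1] = first;
}

/* Baseline: one function for all key sizes; round keys are read from the
   key schedule again for every block. */
static void toy_encrypt_generic(unsigned rounds, const uint8_t *keys,
                                size_t length, uint8_t *dst, const uint8_t *src)
{
    assert(length % BLOCK_SIZE == 0);
    for (; length >= BLOCK_SIZE;
         length -= BLOCK_SIZE, src += BLOCK_SIZE, dst += BLOCK_SIZE) {
        uint8_t b[BLOCK_SIZE];
        memcpy(b, src, BLOCK_SIZE);
        for (unsigned r = 0; r < rounds; r++)
            toy_round(b, keys + (size_t)r * BLOCK_SIZE);
        memcpy(dst, b, BLOCK_SIZE);
    }
}

/* Specialized for one key size: the round count is a constant, so the whole
   key schedule is copied once before the block loop (standing in for
   vector registers in an assembly implementation). */
static void toy_encrypt_128(const uint8_t *keys, size_t length,
                            uint8_t *dst, const uint8_t *src)
{
    uint8_t rk[ROUNDS_128][BLOCK_SIZE];
    assert(length % BLOCK_SIZE == 0);
    memcpy(rk, keys, sizeof rk);            /* prologue: single load */
    for (; length >= BLOCK_SIZE;
         length -= BLOCK_SIZE, src += BLOCK_SIZE, dst += BLOCK_SIZE) {
        uint8_t b[BLOCK_SIZE];
        memcpy(b, src, BLOCK_SIZE);
        for (unsigned r = 0; r < ROUNDS_128; r++)
            toy_round(b, rk[r]);
        memcpy(dst, b, BLOCK_SIZE);
    }
}

/* Fallback with the same signature, built on the generic code. */
static void toy_encrypt_128_c(const uint8_t *keys, size_t length,
                              uint8_t *dst, const uint8_t *src)
{
    toy_encrypt_generic(ROUNDS_128, keys, length, dst, src);
}

/* Fat-build style dispatch: callers go through a pointer set once at init. */
typedef void toy_encrypt_128_func(const uint8_t *, size_t,
                                  uint8_t *, const uint8_t *);
static toy_encrypt_128_func *toy_encrypt_128_vec = toy_encrypt_128_c;

/* have_fast is a placeholder for a real CPU feature check. */
static void toy_fat_init(int have_fast)
{
    toy_encrypt_128_vec = have_fast ? toy_encrypt_128 : toy_encrypt_128_c;
}
```

Both variants compute the same result; only where the round keys are loaded differs. Fixing the round count per key size is what makes it possible to hoist the full load out of the loop.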
make check passes all tests. Benchmark of executing
|Algorithm||mode||C (Mbyte/s)||OpenSSL (Mbyte/s)||This patch (Mbyte/s)|