- Jan 20, 2021
-
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
- Jan 13, 2021
-
-
Switch ARM NEON assembler routines to endianness-agnostic loads and stores where possible, to avoid modifications to the rest of the code. This involves switching to vld1.32 for loading consecutive 32-bit words in host endianness, as well as vst1.8 for storing back to memory in little-endian order as required by the caller. Where necessary, r3 is used to store the precalculated offset into the source vector for the secondary load operations. vstm is kept for little-endian platforms because it is faster than vst1 on most ARM implementations.
vst1.x (at least on the Allwinner A20 Cortex-A7 implementation) seems to interfere with itself on subsequent calls, slowing it down further. So we reschedule some instructions to do stores as soon as results become available, so that other calculations or loads run before the next vst1.x. This reliably saves the two additional cycles per block on salsa20 and chacha that would otherwise be incurred.
vld1.x does not seem to suffer from this, or at least not to a level where two consecutive vld1.x run slower than an equivalent vldm. Rescheduling them similarly did not improve performance beyond that of vldm.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
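The little-endian output requirement described in this commit message can be illustrated in portable C. The following is only a loose analogue of the vst1.8 stores, not the assembler routine itself; the function names `store_le32_sketch` and `store_block_le32_sketch` are hypothetical and do not appear in the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one 32-bit word as four bytes, least
   significant byte first. Because each byte is written individually,
   the resulting memory layout is the same on little-endian and
   big-endian hosts. */
static void
store_le32_sketch(uint8_t *dst, uint32_t w)
{
  dst[0] = (uint8_t) (w & 0xff);
  dst[1] = (uint8_t) ((w >> 8) & 0xff);
  dst[2] = (uint8_t) ((w >> 16) & 0xff);
  dst[3] = (uint8_t) ((w >> 24) & 0xff);
}

/* Hypothetical helper: store n words held in host endianness to dst
   in little-endian byte order, as the caller of a cipher block
   function would expect. */
static void
store_block_le32_sketch(uint8_t *dst, const uint32_t *src, size_t n)
{
  size_t i;
  assert(dst != NULL && src != NULL);
  for (i = 0; i < n; i++)
    store_le32_sketch(dst + 4 * i, src[i]);
}
```

A plain `memcpy` of the word array would give this layout only on a little-endian host, which is why the byte-wise form is the portable choice.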
-
- Jan 10, 2021
-
-
Niels Möller authored
* fat-ppc.c: Don't use __GLIBC_PREREQ in the same preprocessor conditional as defined(__GLIBC_PREREQ), but move to a nested #if conditional. Fixes compile error on OpenBSD/powerpc64, reported by Jasper Lievisse Adriaanse.
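The nested conditional described above can be sketched as follows. When `__GLIBC_PREREQ` is undefined, the preprocessor still parses the whole expression of a combined `#if defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 16)`, where the undefined identifier is replaced by 0 and the remaining `0(2, 16)` is a syntax error. The macro `SKETCH_HAVE_GLIBC_2_16` and the function name are hypothetical, and the version 2.16 is used here only for illustration.

```c
#include <assert.h>
#include <stdio.h>  /* any libc header; on glibc it pulls in <features.h> */

/* The inner #if is only evaluated when the outer one is true, so a
   libc without __GLIBC_PREREQ never sees the function-like macro call. */
#if defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 16)
#  define SKETCH_HAVE_GLIBC_2_16 1
# endif
#endif

#ifndef SKETCH_HAVE_GLIBC_2_16
# define SKETCH_HAVE_GLIBC_2_16 0
#endif

/* Hypothetical accessor: returns 1 on glibc 2.16 or later, else 0. */
int
glibc_at_least_2_16_sketch(void)
{
  return SKETCH_HAVE_GLIBC_2_16;
}
```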
-
- Jan 04, 2021
-
- Jan 01, 2021
-
- Dec 28, 2020
-
-
Niels Möller authored
-
Niels Möller authored
-
- Dec 27, 2020
-
-
Niels Möller authored
-
- Dec 26, 2020
-
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
* configure.ac: Bump package version, to 3.7. (LIBNETTLE_MINOR): Bump minor number, to 8.1. (LIBHOGWEED_MINOR): Bump minor number, to 6.1.
-
- Dec 21, 2020
-
-
Niels Möller authored
Spotted by Michael Weiser
-
Niels Möller authored
-
Niels Möller authored
[PowerPC64] Skip using getauxval() when it is not available
See merge request nettle/nettle!16
-
Maamoun TK authored
-
- Dec 20, 2020
-
-
Maamoun TK authored
-
- Dec 19, 2020
-
-
Niels Möller authored
[PowerPC64] Use 32-bit offset to load data
See merge request nettle/nettle!14
-
- Dec 18, 2020
-
-
mamonet authored
-
- Dec 12, 2020
-
-
Niels Möller authored
-
- Dec 08, 2020
-
-
Niels Möller authored
-
- Dec 01, 2020
-
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
* powerpc64/p7/chacha-4core.asm (QR): Instruction level interleaving in the main loop, written by Torbjörn Granlund.
-
- Nov 30, 2020
-
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
* chacha-crypt.c (_nettle_chacha_crypt_4core) (_nettle_chacha_crypt32_4core): New functions.
* chacha-internal.h: Add prototypes for _nettle_chacha_4core and related functions.
* configure.ac (asm_nettle_optional_list): Add chacha-4core.asm.
* powerpc64/fat/chacha-4core.asm: New file.
* powerpc64/p7/chacha-4core.asm: New file.
* fat-ppc.c (fat_init): When altivec is available, use _nettle_chacha_crypt_4core and _nettle_chacha_crypt32_4core instead of _2core variants.
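The run-time selection mentioned for fat_init can be pictured as a function pointer that is set once from a detected CPU feature. This is a minimal sketch of that general pattern, not the code in fat-ppc.c: every name ending in `_sketch` is hypothetical, the `have_altivec` flag stands in for the real feature detection, and the XOR transform is a placeholder rather than ChaCha.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void crypt_func_sketch(uint8_t *dst, const uint8_t *src,
                               size_t length);

/* Records which variant ran last, for illustration only. */
static int last_variant_sketch;

/* Placeholder for a portable implementation. */
static void
crypt_generic_sketch(uint8_t *dst, const uint8_t *src, size_t length)
{
  size_t i;
  last_variant_sketch = 1;
  for (i = 0; i < length; i++)
    dst[i] = src[i] ^ 0x5a;
}

/* Placeholder for a 4-way variant; it must produce the same output. */
static void
crypt_4core_sketch(uint8_t *dst, const uint8_t *src, size_t length)
{
  size_t i;
  last_variant_sketch = 4;
  for (i = 0; i < length; i++)
    dst[i] = src[i] ^ 0x5a;
}

static crypt_func_sketch *crypt_impl_sketch = crypt_generic_sketch;

/* Choose the implementation once, based on a CPU feature flag. */
static void
fat_init_sketch(int have_altivec)
{
  crypt_impl_sketch = have_altivec ? crypt_4core_sketch
                                   : crypt_generic_sketch;
}

/* Public entry point: forwards to whichever variant was selected. */
static void
crypt_sketch(uint8_t *dst, const uint8_t *src, size_t length)
{
  assert(crypt_impl_sketch != NULL);
  crypt_impl_sketch(dst, src, length);
}
```

Callers only ever use the entry point, so switching from one variant to another, as this commit does, needs no change outside the initialization function.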
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-
Niels Möller authored
-