Brian Smith / nettle
Commit 39c03743, authored Feb 12, 2013 by Niels Möller
armv7: Optimized aligned case of memxor, using 3-way unrolling.
Parent: 610677e4
Changes: 2 files
ChangeLog
2013-02-12  Niels Möller  <nisse@lysator.liu.se>

	* armv7/memxor.asm (memxor): Optimized aligned case, using 3-way
	unrolling.

2013-02-06  Niels Möller  <nisse@lysator.liu.se>

	* armv7/memxor.asm (memxor, memxor3): Optimized aligned case, now
...
...
armv7/memxor.asm
...
...
@@ -18,6 +18,12 @@ C along with the nettle library; see the file COPYING.LIB. If not, write to
C the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
C MA 02111-1301, USA.

C Possible speedups:
C
C The ldm instruction can do load two registers per cycle,
C if the address is two-word aligned. Or three registers in two
C cycles, regardless of alignment.

C Register usage:

define(<DST>, <r0>)
...
...
@@ -131,38 +137,49 @@ PROLOGUE(memxor)
	b	.Lmemxor_bytes

.Lmemxor_same:
	tst	N, #4
	it	ne
	subne	N, #4
	bne	.Lmemxor_same_loop

	ldr	r3, [SRC], #+4
	ldr	r4, [DST]
	eor	r3, r4
	str	r3, [DST], #+4
	subs	N, #8
	bcc	.Lmemxor_same_end

.Lmemxor_same_loop:
	C 6 cycles per iteration, 0.75 cycles/byte
	ldr	r4, [SRC, #+4]
	ldr	r3, [SRC], #+8
	ldr	r6, [DST, #+4]
	ldr	r5, [DST]
	eor	r4, r6
	eor	r3, r5
	subs	N, #8
	str	r4, [DST, #+4]
	str	r3, [DST], #+8
	C 8 cycles per iteration, 0.67 cycles/byte
	ldmia	SRC!, {r3, r4, r5}
	ldmia	DST, {r6, r7, r12}
	subs	N, #12
	eor	r3, r6
	eor	r4, r7
	eor	r5, r12
	stmia	DST!, {r3, r4, r5}
	bcs	.Lmemxor_same_loop
.Lmemxor_same_end:
	adds	N, #8
	C We have 0-11 bytes left to do, and N holds number of bytes - 12.
	adds	N, #4
	bcc	.Lmemxor_same_lt_8
	C Do 8 bytes more, leftover is in N
	ldmia	SRC!, {r3, r4}
	ldmia	DST, {r6, r7}
	eor	r3, r6
	eor	r4, r7
	stmia	DST!, {r3, r4}
	beq	.Lmemxor_done
	b	.Lmemxor_bytes

.Lmemxor_same_lt_8:
	adds	N, #4
	bcc	.Lmemxor_same_lt_4

	ldr	r3, [SRC], #+4
	ldr	r4, [DST]
	eor	r3, r4
	str	r3, [DST], #+4
	beq	.Lmemxor_done
	b	.Lmemxor_bytes

.Lmemxor_same_lt_4:
	adds	N, #4
	beq	.Lmemxor_done
	b	.Lmemxor_bytes
EPILOGUE(memxor)

define(<DST>, <r0>)
...
...
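For readers who do not follow ARM assembly, the control flow that the hunk's comments and labels describe for the aligned case (12 bytes per loop iteration, then at most one 8-byte or one 4-byte step, then leftover bytes) can be sketched in C. This is only an illustration under stated assumptions, not the nettle implementation: the name `memxor_aligned_sketch` is hypothetical, both buffers are assumed to be word-aligned `uint32_t` storage, and the final byte loop merely stands in for `.Lmemxor_bytes`, whose body is not shown in this diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not part of nettle: XOR the first n bytes of src
   into dst. Assumes both buffers are uint32_t storage (word aligned). */
static void
memxor_aligned_sketch(uint32_t *dst, const uint32_t *src, size_t n)
{
  unsigned char *d;
  const unsigned char *s;

  /* Main loop, unrolled 3 ways: one 12-byte group per iteration,
     analogous to the ldmia / eor / stmia sequence on three registers. */
  while (n >= 12)
    {
      dst[0] ^= src[0];
      dst[1] ^= src[1];
      dst[2] ^= src[2];
      dst += 3; src += 3; n -= 12;
    }

  /* 0-11 bytes left: do either one 8-byte step or one 4-byte step. */
  if (n >= 8)
    {
      dst[0] ^= src[0];
      dst[1] ^= src[1];
      dst += 2; src += 2; n -= 8;
    }
  else if (n >= 4)
    {
      dst[0] ^= src[0];
      dst += 1; src += 1; n -= 4;
    }

  /* 0-3 bytes left: plain byte loop (stand-in for .Lmemxor_bytes). */
  d = (unsigned char *) dst;
  s = (const unsigned char *) src;
  while (n-- > 0)
    *d++ ^= *s++;
}
```

Calling `memxor_aligned_sketch(dst, src, 19)` would, for example, take one pass through the 12-byte loop, one 4-byte step, and three single-byte steps. The cycle figures in the assembly comments follow the same grouping: 6 cycles for 8 bytes is 0.75 cycles/byte, while 8 cycles for 12 bytes is about 0.67 cycles/byte.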