nettle commit 39c03743
Authored Feb 12, 2013 by Niels Möller
armv7: Optimized aligned case of memxor, using 3-way unrolling.
Parent: 610677e4
Showing 2 changed files with 45 additions and 23 deletions
ChangeLog +5 -0
armv7/memxor.asm +40 -23
ChangeLog
+2013-02-12  Niels Möller  <nisse@lysator.liu.se>
+        * armv7/memxor.asm (memxor): Optimized aligned case, using 3-way
+        unrolling.
 2013-02-06  Niels Möller  <nisse@lysator.liu.se>
         * armv7/memxor.asm (memxor, memxor3): Optimized aligned case, now
...
...
armv7/memxor.asm
...
...
@@ -18,6 +18,12 @@ C along with the nettle library; see the file COPYING.LIB. If not, write to
 C the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 C MA 02111-1301, USA.
+C Possible speedups:
+C
+C The ldm instruction can do load two registers per cycle,
+C if the address is two-word aligned. Or three registers in two
+C cycles, regardless of alignment.
 C Register usage:
 define(<DST>, <r0>)
...
...
@@ -131,38 +137,49 @@ PROLOGUE(memxor)
         b       .Lmemxor_bytes
 .Lmemxor_same:
-        tst     N, #4
-        it      ne
-        subne   N, #4
-        bne     .Lmemxor_same_loop
-        ldr     r3, [SRC], #+4
-        ldr     r4, [DST]
-        eor     r3, r4
-        str     r3, [DST], #+4
         subs    N, #8
         bcc     .Lmemxor_same_end
 .Lmemxor_same_loop:
-        C 6 cycles per iteration, 0.75 cycles/byte
-        ldr     r4, [SRC, #+4]
-        ldr     r3, [SRC], #+8
-        ldr     r6, [DST, #+4]
-        ldr     r5, [DST]
-        eor     r4, r6
-        eor     r3, r5
-        subs    N, #8
-        str     r4, [DST, #+4]
-        str     r3, [DST], #+8
+        C 8 cycles per iteration, 0.67 cycles/byte
+        ldmia   SRC!, {r3, r4, r5}
+        ldmia   DST, {r6, r7, r12}
+        subs    N, #12
+        eor     r3, r6
+        eor     r4, r7
+        eor     r5, r12
+        stmia   DST!, {r3, r4, r5}
         bcs     .Lmemxor_same_loop
 .Lmemxor_same_end:
-        adds    N, #8
+        C We have 0-11 bytes left to do, and N holds number of bytes -12.
+        adds    N, #4
+        bcc     .Lmemxor_same_lt_8
+        C Do 8 bytes more, leftover is in N
+        ldmia   SRC!, {r3, r4}
+        ldmia   DST, {r6, r7}
+        eor     r3, r6
+        eor     r4, r7
+        stmia   DST!, {r3, r4}
+        beq     .Lmemxor_done
+        b       .Lmemxor_bytes
+.Lmemxor_same_lt_8:
+        adds    N, #4
+        bcc     .Lmemxor_same_lt_4
+        ldr     r3, [SRC], #+4
+        ldr     r4, [DST]
+        eor     r3, r4
+        str     r3, [DST], #+4
+        beq     .Lmemxor_done
+        b       .Lmemxor_bytes
+.Lmemxor_same_lt_4:
+        adds    N, #4
         beq     .Lmemxor_done
         b       .Lmemxor_bytes
 EPILOGUE(memxor)
 define(<DST>, <r0>)
...
...
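For readers who do not read ARM assembly, the control flow of the new aligned path can be sketched in portable C. This is only an illustrative model, not code from nettle: the names `xor_word` and `memxor_aligned_model` are hypothetical, `n` is taken to be the number of bytes still to process when the aligned path is entered, and `memcpy` stands in for the word loads and stores so that the C version makes no alignment assumptions of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR one 32-bit word of src into dst.
   memcpy keeps this model free of alignment and aliasing assumptions;
   the assembly uses ldr/ldm directly because both pointers are aligned. */
static void xor_word(unsigned char *dst, const unsigned char *src)
{
    uint32_t d, s;
    memcpy(&d, dst, sizeof d);
    memcpy(&s, src, sizeof s);
    d ^= s;
    memcpy(dst, &d, sizeof d);
}

/* Hypothetical C model of the aligned path: dst[i] ^= src[i] for i < n. */
static void memxor_aligned_model(unsigned char *dst,
                                 const unsigned char *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    /* Main loop: three words (12 bytes) per iteration,
       mirroring the three-register ldmia/stmia loop. */
    while (n >= 12) {
        xor_word(dst, src);
        xor_word(dst + 4, src + 4);
        xor_word(dst + 8, src + 8);
        dst += 12; src += 12; n -= 12;
    }

    /* 0-11 bytes left: do 8 more if possible (two-register ldmia step),
       otherwise one more word (the .Lmemxor_same_lt_8 step). */
    if (n >= 8) {
        xor_word(dst, src);
        xor_word(dst + 4, src + 4);
        dst += 8; src += 8; n -= 8;
    } else if (n >= 4) {
        xor_word(dst, src);
        dst += 4; src += 4; n -= 4;
    }

    /* Remaining 0-3 bytes, standing in for the .Lmemxor_bytes loop. */
    while (n > 0) {
        *dst++ ^= *src++;
        n--;
    }
}
```

The cycle figures in the assembly comments follow from the bytes handled per iteration: the old loop took 6 cycles for 8 bytes (0.75 cycles/byte), while the new one takes 8 cycles for 12 bytes (about 0.67 cycles/byte).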