Nettle: commit cdde35bb
Authored Apr 12, 2013 by Niels Möller
ARM umac_nh: Use vmlal, 16% speedup.
Parent: 3be646d1
Changes: 2 files
ChangeLog
 2013-04-12  Niels Möller  <nisse@lysator.liu.se>
-	* armv7/umac-nh.asm: New file. 2.1 time speedup.
+	* armv7/umac-nh.asm: New file. 2.4 time speedup.
 	* armv7/machine.m4 (D0REG, D1REG): New macros.
 	* configure.ac (asm_replace_list): Added umac-nh.asm and
...
...
armv7/umac-nh.asm
...
...
@@ -30,7 +30,7 @@ define(<QB>, <q1>)
 define(<DM>, <d16>)
 define(<QLEFT>, <q9>)
 define(<QRIGHT>, <q10>)
-define(<QACC>, <q11>)
+define(<QY>, <q11>)
 define(<QT0>, <q12>)
 define(<QT1>, <q13>)
 define(<QK0>, <q14>)
...
...
@@ -59,7 +59,7 @@ PROLOGUE(_nettle_umac_nh)
 	vmov.i32	D0REG(QLEFT)[0], SHIFT
 	vmov.32		D1REG(QLEFT), D0REG(QLEFT)
-	vmov.i64	QACC, #0
+	vmov.i64	QY, #0
 	vshl.u64	DM, DM, D0REG(QRIGHT)
 .Loop:
...
...
@@ -78,14 +78,12 @@ PROLOGUE(_nettle_umac_nh)
 	vld1.i32	{QK0, QK1}, [KEY]!
 	vadd.i32	QA, QA, QK0
 	vadd.i32	QB, QB, QK1
-	vmull.u32	QT0, D0REG(QA), D0REG(QB)
-	vmull.u32	QT1, D1REG(QA), D1REG(QB)
 	subs		LENGTH, LENGTH, #32
-	vadd.i64	QACC, QACC, QT0
-	vadd.i64	QACC, QACC, QT1
+	vmlal.u32	QY, D0REG(QA), D0REG(QB)
+	vmlal.u32	QY, D1REG(QA), D1REG(QB)
 	bhi		.Loop
-	vadd.i64	D0REG(QACC), D0REG(QACC), D1REG(QACC)
-	vmov		r0, r1, D0REG(QACC)
+	vadd.i64	D0REG(QY), D0REG(QY), D1REG(QY)
+	vmov		r0, r1, D0REG(QY)
 	bx		lr
 EPILOGUE(_nettle_umac_nh)
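As a rough illustration of why replacing the separate multiply and add instructions with a multiply-accumulate does not change the result, the C sketch below models the two accumulation schemes on scalar data. It is not Nettle's C implementation: the function names `nh_mull_then_add` and `nh_mlal` are hypothetical, and the arrays `a` and `b` are assumed to hold the four 32-bit lanes of QA and QB per iteration, after the key words have already been added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old scheme (assumed model): widening multiplies into temporaries,
   as vmull.u32 does, followed by 64-bit adds into a two-lane
   accumulator, as vadd.i64 does. Lanes 0..1 model D0REG, lanes 2..3
   model D1REG. */
static uint64_t
nh_mull_then_add(const uint32_t *a, const uint32_t *b, size_t nblocks)
{
  uint64_t acc[2] = { 0, 0 };
  assert(nblocks > 0);  /* the assembly loop always runs at least once */
  for (size_t n = 0; n < nblocks; n++, a += 4, b += 4)
    {
      uint64_t t0[2], t1[2];
      t0[0] = (uint64_t) a[0] * b[0];
      t0[1] = (uint64_t) a[1] * b[1];
      t1[0] = (uint64_t) a[2] * b[2];
      t1[1] = (uint64_t) a[3] * b[3];
      acc[0] += t0[0];
      acc[1] += t0[1];
      acc[0] += t1[0];
      acc[1] += t1[1];
    }
  /* Final horizontal add of the two 64-bit lanes. */
  return acc[0] + acc[1];
}

/* New scheme (assumed model): each widening product is added straight
   into the accumulator lane, as vmlal.u32 does, so no temporaries are
   needed. */
static uint64_t
nh_mlal(const uint32_t *a, const uint32_t *b, size_t nblocks)
{
  uint64_t y[2] = { 0, 0 };
  assert(nblocks > 0);
  for (size_t n = 0; n < nblocks; n++, a += 4, b += 4)
    {
      y[0] += (uint64_t) a[0] * b[0];
      y[1] += (uint64_t) a[1] * b[1];
      y[0] += (uint64_t) a[2] * b[2];
      y[1] += (uint64_t) a[3] * b[3];
    }
  return y[0] + y[1];
}
```

Both functions perform the same additions modulo 2^64, only grouped differently, so they return the same value for any input. In this model the fused form simply needs fewer operations per iteration and no temporaries, which is consistent with the speedup reported in the commit message.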