Dmitry Baryshkov
nettle
Commit dd8652d4, authored Oct 03, 2011 by Niels Möller
Implemented sse2-loop. Configured at compile time, and currently
disabled. Rev: nettle/x86_64/memxor.asm:1.3
Parent: af1d0e1c
Changes: 1
x86_64/memxor.asm
...
...
@@ -28,7 +28,9 @@ define(<TMP2>, <%r9>)
define(<CNT>, <%rdi>)
define(<S0>, <%r11>)
define(<S1>, <%rdi>) C Overlaps with CNT

define(<USE_SSE2>, <no>)

	.file "memxor.asm"

	.text
...
...
@@ -78,6 +80,10 @@ PROLOGUE(memxor3)
	jnz	.Lalign_loop

.Laligned:
ifelse(USE_SSE2, yes, <
	cmp	$16, N
	jnc	.Lsse2_case
>)
	C Check for the case that AP and BP have the same alignment,
	C but different from DST.
	mov	AP, TMP
...
...
@@ -209,4 +215,40 @@ C jz .Ldone
.Ldone:
	ret

ifelse(USE_SSE2, yes, <

.Lsse2_case:
	lea	(DST, N), TMP
	test	$8, TMP
	jz	.Lsse2_next
	sub	$8, N
	mov	(AP, N), TMP
	xor	(BP, N), TMP
	mov	TMP, (DST, N)
	jmp	.Lsse2_next

	ALIGN(4)
.Lsse2_loop:
	movdqu	(AP, N), %xmm0
	movdqu	(BP, N), %xmm1
	pxor	%xmm0, %xmm1
	movdqa	%xmm1, (DST, N)
.Lsse2_next:
	sub	$16, N
	ja	.Lsse2_loop

	C FIXME: See if we can do a full word first, before the
	C byte-wise final loop.
	jnz	.Lfinal

	C Final operation is aligned
	movdqu	(AP), %xmm0
	movdqu	(BP), %xmm1
	pxor	%xmm0, %xmm1
	movdqa	%xmm1, (DST)
	ret
>)

EPILOGUE(memxor3)
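
For readers less used to the m4/assembly notation, the control flow of the new path can be sketched in portable C. This is only a rough model, not the author's code. The name `memxor3_model` and the helper `xor_block` are hypothetical, and `memcpy` stands in for the `mov`/`movdqu` loads and the `mov`/`movdqa` stores. The model assumes that the alignment loop before `.Laligned` works byte-wise until `DST + N` is 8-byte aligned, and that the tail after the 16-byte loop is handled byte-wise; neither of those pieces is shown in full in this diff.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: XOR `len` bytes (8 or 16 here) at offset `off`.
   memcpy models the unaligned loads and the (aligned) store. */
static void xor_block(unsigned char *dst, const unsigned char *a,
                      const unsigned char *b, size_t off, size_t len)
{
    unsigned char x[16], y[16];
    memcpy(x, a + off, len);
    memcpy(y, b + off, len);
    for (size_t i = 0; i < len; i++)
        x[i] ^= y[i];
    memcpy(dst + off, x, len);
}

/* Hypothetical model of the flow: dst[i] = a[i] ^ b[i] for i < n,
   working from the end of the buffers towards the start. */
void memxor3_model(unsigned char *dst, const unsigned char *a,
                   const unsigned char *b, size_t n)
{
    /* Assumed behaviour of the loop before .Laligned: single bytes
       until dst + n is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)(dst + n) & 7) != 0) {
        n--;
        dst[n] = a[n] ^ b[n];
    }

    if (n >= 16) {                      /* cmp $16, N ; jnc .Lsse2_case */
        /* .Lsse2_case: one 8-byte word if dst + n is not 16-byte aligned. */
        if (((uintptr_t)(dst + n) & 8) != 0) {
            n -= 8;
            xor_block(dst, a, b, n, 8);
        }
        /* .Lsse2_loop plus the "final operation is aligned" block:
           16 bytes at a time, destination 16-byte aligned. */
        while (n >= 16) {
            n -= 16;
            xor_block(dst, a, b, n, 16);
        }
    }

    /* Assumed tail: the remaining bytes one at a time. */
    while (n > 0) {
        n--;
        dst[n] = a[n] ^ b[n];
    }
}
```

In the assembly, the case where `sub $16, N` reaches exactly zero is split out after the loop because `ja` does not branch on a zero result; the `while (n >= 16)` loop in the model folds that last aligned block back in.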