summaryrefslogtreecommitdiff
path: root/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README
diff options
context:
space:
mode:
authorKirill Volinsky <mataes2007@gmail.com>2012-05-19 18:01:32 +0000
committerKirill Volinsky <mataes2007@gmail.com>2012-05-19 18:01:32 +0000
commitb1509f22892dc98057c750e7fae39ded5cea3b09 (patch)
tree6bdcc9379ae86339a67022b758575729d1304074 /plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README
parente7a776a6f5ab323cd9dd824e815846ef268fa7f1 (diff)
added MirOTR
git-svn-id: http://svn.miranda-ng.org/main/trunk@83 1316c22d-e87f-b044-9b9b-93d7a3e3ba9c
Diffstat (limited to 'plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README')
-rw-r--r--plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README84
1 files changed, 84 insertions, 0 deletions
diff --git a/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README b/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README
new file mode 100644
index 0000000000..5a2d5fd970
--- /dev/null
+++ b/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README
@@ -0,0 +1,84 @@
+This directory contains mpn functions for various HP PA-RISC chips. Code
+that runs faster on the PA7100 and later implementations, is in the pa7100
+directory.
+
+RELEVANT OPTIMIZATION ISSUES
+
+ Load and Store timing
+
+On the PA7000 no memory instructions can issue the two cycles after a store.
+For the PA7100, this is reduced to one cycle.
+
+The PA7100 has a lookup-free cache, so it helps to schedule loads and the
+dependent instruction really far from each other.
+
+STATUS
+
+1. mpn_mul_1 could be improved to 6.5 cycles/limb on the PA7100, using the
+ instructions bwlow (but some sw pipelining is needed to avoid the
+ xmpyu-fstds delay):
+
+ fldds s1_ptr
+
+ xmpyu
+ fstds N(%r30)
+ xmpyu
+ fstds N(%r30)
+
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+
+ addc
+ stws res_ptr
+ addc
+ stws res_ptr
+
+ addib Loop
+
+2. mpn_addmul_1 could be improved from the current 10 to 7.5 cycles/limb
+ (asymptotically) on the PA7100, using the instructions below. With proper
+ sw pipelining and the unrolling level below, the speed becomes 8
+ cycles/limb.
+
+ fldds s1_ptr
+ fldds s1_ptr
+
+ xmpyu
+ fstds N(%r30)
+ xmpyu
+ fstds N(%r30)
+ xmpyu
+ fstds N(%r30)
+ xmpyu
+ fstds N(%r30)
+
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ ldws N(%r30)
+ addc
+ addc
+ addc
+ addc
+ addc %r0,%r0,cy-limb
+
+ ldws res_ptr
+ ldws res_ptr
+ ldws res_ptr
+ ldws res_ptr
+ add
+ stws res_ptr
+ addc
+ stws res_ptr
+ addc
+ stws res_ptr
+ addc
+ stws res_ptr
+
+ addib