diff options
author | Kirill Volinsky <mataes2007@gmail.com> | 2012-05-19 18:01:32 +0000 |
---|---|---|
committer | Kirill Volinsky <mataes2007@gmail.com> | 2012-05-19 18:01:32 +0000 |
commit | b1509f22892dc98057c750e7fae39ded5cea3b09 (patch) | |
tree | 6bdcc9379ae86339a67022b758575729d1304074 /plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README | |
parent | e7a776a6f5ab323cd9dd824e815846ef268fa7f1 (diff) |
added MirOTR
git-svn-id: http://svn.miranda-ng.org/main/trunk@83 1316c22d-e87f-b044-9b9b-93d7a3e3ba9c
Diffstat (limited to 'plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README')
-rw-r--r-- | plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README | 84 |
1 files changed, 84 insertions, 0 deletions
diff --git a/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README b/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README new file mode 100644 index 0000000000..5a2d5fd970 --- /dev/null +++ b/plugins/MirOTR/libgcrypt-1.4.6/mpi/hppa/README @@ -0,0 +1,84 @@ +This directory contains mpn functions for various HP PA-RISC chips. Code +that runs faster on the PA7100 and later implementations, is in the pa7100 +directory. + +RELEVANT OPTIMIZATION ISSUES + + Load and Store timing + +On the PA7000 no memory instructions can issue the two cycles after a store. +For the PA7100, this is reduced to one cycle. + +The PA7100 has a lookup-free cache, so it helps to schedule loads and the +dependent instruction really far from each other. + +STATUS + +1. mpn_mul_1 could be improved to 6.5 cycles/limb on the PA7100, using the + instructions bwlow (but some sw pipelining is needed to avoid the + xmpyu-fstds delay): + + fldds s1_ptr + + xmpyu + fstds N(%r30) + xmpyu + fstds N(%r30) + + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + + addc + stws res_ptr + addc + stws res_ptr + + addib Loop + +2. mpn_addmul_1 could be improved from the current 10 to 7.5 cycles/limb + (asymptotically) on the PA7100, using the instructions below. With proper + sw pipelining and the unrolling level below, the speed becomes 8 + cycles/limb. + + fldds s1_ptr + fldds s1_ptr + + xmpyu + fstds N(%r30) + xmpyu + fstds N(%r30) + xmpyu + fstds N(%r30) + xmpyu + fstds N(%r30) + + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + ldws N(%r30) + addc + addc + addc + addc + addc %r0,%r0,cy-limb + + ldws res_ptr + ldws res_ptr + ldws res_ptr + ldws res_ptr + add + stws res_ptr + addc + stws res_ptr + addc + stws res_ptr + addc + stws res_ptr + + addib |