We implemented PSPI migration on the Sun Niagara2 by combining
coarse-grained and fine-grained parallelism.
We showed that running multiple threads per core yields a significant
performance uplift over a single-thread approach. Compared to strictly
fine-grained parallelism, we achieved a 60% uplift. Compared to a
coarse-grained approach, the improvement was 5X. Improved
floating-point/vector performance could lead to significant uplift for this algorithm.