Script started on Thu Dec 15 17:56:35 1994
conx [emulator] pram -t -c -r 8 rank
7 6 5 4 3 2 1 0
CLOCK: 1003
CLOCK: 2797
1 2 3 4 5 6 7 8
conx [emulator] exit
exit
script done on Thu Dec 15 17:56:59 1994

The above is the result of running rank on N = 8 numbers with N = 8 processors.
The following are the results for different values of N (N numbers and N
processors):

                 clock 1    clock 2    difference from previous clock 2
    N=1   --->     552
    N=2   --->     920        944
    N=4   --->     920       1595         651
    N=8   --->     920       2714        1119
    N=16  --->     920       4717        2003
    N=32  --->     920       8436        3719

Each run reports two clocks. The first clock is constant (920 for every
N >= 2). This is correct, because that phase needs only one operation per
processor.

The second clock is not what we expected. If the algorithm were O(log N), the
difference between successive rows would stay the same each time N doubles.
Instead the difference nearly doubles each time (651, 1119, 2003, 3719), so the
second phase grows roughly linearly in N.

We rewrote this program many times. Because of a restriction of the pm2
language, we cannot nest parallel operations, so the inner summation loop has to
run sequentially. The code is as follows:

    par a := 1 to n sync do
      // sum the value in array //
      for c := 1 to log(n) do
        limit := 1;
        for d := 1 to c do
          limit := 2*limit;
        end; // for
        for b := 1 to (n/limit) do    <------ if we could use par here, it would be log N
          ranki[a,b] := ranki[a,2*b-1] + ranki[a,2*b];
        end; // for
      end; // for
      rankf[a] := ranki[a,1];
    end; // par

Over all values of c, the loop over b performs n/2 + n/4 + ... + 1 = n - 1
additions in sequence on each processor, which accounts for the linear growth of
the second clock.

It may be possible to solve this problem in some other way. Alternatively, the
pm2 language might be modified so that specific processors can be assigned
different tasks.
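
The difference between a sequential and a parallel inner loop can be illustrated with a small simulation. This is only a sketch in Python, not pm2, and it does not model the emulator's clock. The function name tree_sum_costs and its counters are hypothetical, introduced here for illustration. It performs the same pairwise summation as the loop over b, counting every addition (what one processor pays when the loop is sequential) and every level of the tree (what the cost would be if all additions in a level ran at once).

```python
def tree_sum_costs(values):
    """Pairwise (tree) sum of a list whose length is a power of two.

    Returns (total, sequential_steps, parallel_rounds), where
    sequential_steps counts every addition, as when the inner loop
    runs on a single processor, and parallel_rounds counts one step
    per tree level, as if all additions in a level ran at once.
    """
    n = len(values)
    assert n >= 1 and n & (n - 1) == 0, "length must be a power of two"

    work = list(values)          # working copy, plays the role of one row of ranki
    sequential_steps = 0
    parallel_rounds = 0
    width = n
    while width > 1:
        width //= 2              # number of pair sums at this level
        for b in range(width):   # the loop that pm2 forces to be sequential
            work[b] = work[2 * b] + work[2 * b + 1]
            sequential_steps += 1
        parallel_rounds += 1     # one level of the tree
    return work[0], sequential_steps, parallel_rounds


if __name__ == "__main__":
    for n in (2, 4, 8, 16, 32):
        total, seq, par = tree_sum_costs([1] * n)
        print(n, total, seq, par)
```

For N = 2, 4, 8, 16, 32 the sequential count is N - 1 (1, 3, 7, 15, 31), so its row-to-row difference doubles with N, while the number of levels is log2 N (1, 2, 3, 4, 5) and grows by one per row. The doubling differences follow the same pattern as the measured second clock.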