BU/CLA CS-551

Parallel Computing: Models, Languages, and Architectures


SIMD matrix multiplication on the MasPar

We all know that to compute the product of two matrices, we do:

  C(i,j) = A(i,1)*B(1,j) + A(i,2)*B(2,j) + ... + A(i,N)*B(N,j)
To do this on the MasPar, storing one element per processor, is tricky because we have to get the appropriate elements to the appropriate processors. Doing so using general communication might prove to be overly expensive.

Another way of doing matrix multiplication is using Pipelining. Let (+) and (*) denote element-wise addition and multiplication of two arrays. To explain this technique, consider the following 3x3 matrix multiplication of A and B to yield C:

 /             \         /             \         /             \
| C11  C12  C13 |       | A11  A12  A13 |       | B11  B22  B33 |
| C21  C22  C23 |   =   | A22  A23  A21 |  (*)  | B21  B32  B13 |  (+)
| C31  C32  C33 |       | A33  A31  A32 |       | B31  B12  B23 |
 \             /         \             /         \             /


                         /             \         /             \
                        | A12  A13  A11 |       | B21  B32  B13 |
                        | A23  A21  A22 |  (*)  | B31  B12  B23 |  (+)
                        | A31  A32  A33 |       | B11  B22  B33 |
                         \             /         \             /


                         /             \         /             \
                        | A13  A11  A12 |       | B31  B12  B23 |
                        | A21  A22  A23 |  (*)  | B11  B22  B33 |
                        | A32  A33  A31 |       | B21  B32  B13 |
                         \             /         \             /
In the above computation, the elements of the 3 by 3 arrays A and B were initially arranged so that successive circular shifts (shifting up for A and left for B) resulted in the elements of A and B be aligned at each processing element (PE) so that the cumulative sum for the matrix multiplication can be computed in C, which remains static with respect to the PEs. Assuming that A and B are square matrices that conform to the physical dimensions of the MasPar, answer the following questions:
This document has been prepared by Professor Azer Bestavros <best@cs.bu.edu> as the WWW Home Page for CS-551, which is part of the NSF-funded undergraduate curriculum on parallel computing at BU.

Date of last update: November 22, 1994.