The input language for the register allocation process is MiniMIPS as presented in the class notes on MiniMIPS, but with three exceptions: (i) there is no use of load or store instructions, (ii) references to data may be either registers or variables, and (iii) the only data type available is integer (no float, arrays, or structs). Furthermore, we will consider only allocation for a single basic block. The technique here can be applied iteratively to each basic block in the program.
We will assume that all variables (including temporaries) are loaded into a symbol table, and have been allocated in the static data region, so that offsets are available for all variables. This is consistent with the assumptions under which you will perform register allocation in project four.
For example, we might have the following sequence of statements:
mult T1, A, B
sub T2, A, C
add T1, T1, T2
addi D, T1, 5
j 12
with offsets in static memory as follows:
Variable Offset
-------- ------
A 0
B 4
C 8
D 12
T1 16
T2 20
The goal is to produce a sequence of MiniMIPS instructions in which all references to data in arithmetic and branch instructions are through registers. However, in order to simplify our task, we will make no assumptions about the values held in registers between block. Thus, we must load (lw) all data used in the block from memory into suitable registers for arithmetic manipulation, and then, any variables which have been modified during the block (i.e., they are "dirty," to borrow terminology from virtual memory) must be stored (sw) out to memory at the end of the block.
For example, we might translate the previous basic block as follows:
lw $1, 0($0) # Load A into $1
lw $2, 4($0) # Load B into $2
mult $3, $1, $2 # Multiply and store in $3 instead of T1
lw $4, 8($0) # Load C into $4
sub $5, $1, $4 # Subtract and store in $5 instead of T2
add $3, $3, $5
addi $6, $3, 5 # Add and store in $6 instead of D
sw $6, 16($0) # Store value of D out to memory
j 12
Note how we load and store only when absolutely necessary; e.g.,
since the value of D upon entry to the block was not
needed, we did not load it into a register, and since A, B, and C were not
modified during the block, we did not need to store them out to memory at the
end. We have a one-to-one correspondence between variables and registers:
Variable Register
-------- ------
A $1
B $2
C $4
D $6
T1 $3
T2 $5
But the problem is complicated by the fact that we may have more variables than registers, and thus may need to store more than one variable in a register during the course of the block. This leads us to consider a kind of stack-based allocation: maintain a list of available registers, popping off a new register as needed, and pushing back a register when it has no "next use" in the block. In our example, we can recycle $2 as soon as it is used, and similarly for $4. If we do this systematically in our running example we need only 3 registers:
MiniMIPS Code Register Assignment after this instruction
------------- ------------------------------------------
lw $1, 0($0) $1 = A
lw $2, 4($0) $1 = A; $2 = B
mult $2, $1, $2 $1 = A; $2 = T1
lw $3, 8($0) $1 = A; $2 = T1; $3 = C
sub $1, $1, $3 $1 = T2; $2 = T1;
add $2, $2, $1 $2 = T1;
addi $1, $2, 5 $1 = D;
sw $1, 12($0) $1 = D;
j 12
Note that the timing of the push of a variable with no next use is delicate: you should
do it after registers have been allocated for both source registers, but before
allocating the target register; thus in the sub instruction, we could reuse $1 as a target, since
it is not needed after its value is obtained as a source operand.
All that is necessary to implement this method is to keep track at each instruction
which values are in which registers, and to record the next use of the current value
of a variable (a simple method for calculating such NextUse matrix will be given below).
Unfortunately, this is a little too naive, as it is possible that we simultaneously need to store more values than we have registers. For example, we are assuming that we have only 12 registers, but we may have 20 variables, each used multiple times in a long block. After loading only 12 of these values into registers, each may have a next-use in the block, but we need more registers to store the remaining 8 values! Thus, we need to consider a register allocation method such that, when all registers are in use and we need to allocate a register for a new value, we temporarily move it out to memory. This is called "spilling " a register. For example, if we suppose that we had only TWO registers available to us, then we could still translate our running example into MiniMIPS:
lw $1, 0($0)
lw $2, 4($0)
mult $2, $1, $2
sw $2, 16($0) # Need a new register to hold C; spill $2 out to T1
lw $2, 8($0) # Now use $2 to hold C
sub $1, $1, $2
lw $2, 16($0) # Whoops! Need T1 again, load it; $2 is available, so use it again
add $2, $2, $1 # Ok, back on track....
addi $1, $2, 5
sw $1, 16($0)
j 12
The overhead for this was two instructions to spill and reload the register.
In fact, it is easy to see that we can make due with only two registers for
any block (to hold the two
operands for each instruction), but this would result in a lot of spilled registers....
The chief thing we need to decide in such cases is: which register should be spilled? If we spill a register that is used in the next instruction, then this could set off a cascade of spilling; to avoid this, a sensible choice would be a register whose next-use is as far in the future as possible. (Compare the similar situation in virtual memory, where we would like to replace a page whose next access is as far in the future as possible; in that case, however, the next access can not be determined, and so least-recently-used (LRU) is used as a rough approximation.) This is another reason why next-use information is important.
In the remainder of this document, we flesh out this quick overview by giving algorithms for collecting next-use information, and for replacing variable references by registers in MiniMIPS code.
Infinity -- Next-use is indeterminate, since A is a non-temp which might be used after
leaving the block; (Infinity can be represented by Maxint);
k -- The next-use of A is in line k >= i;
-1 -- There is no next-use of the value in A at line i.
The calculation of these values is quite simple: we start with the assumption that
temps have no next-use outside the block, and non-temps have an indeterminate
use after the block; then we run through the statements in reverse, marking
uses of variables when they occur as source operands, and marking the value as -1
when a variable is used as a target (and hence the value will be overwritten).
In detail, suppose our basic block starts at location S and ends at location E
and suppose there are M scalar variables (i.e., not array or structure type)
in the symbol table. Then our NextUse matrix will have rows S through E+1 and
M columns, one for each possible use of a variable in an instruction, plus
a row at the end for initialization.
CalculateNextUse(int S, int E) {
for each variable D in symbol table:
if (D is a temporary)
NextUse(E+1,D) = -1;
else
NextUse(E+1,D) = Inf;
for(i = E; i >= S; i--) {
// Case 1: Suppose stmt i is: op A, B, C
// First carry up previous row
foreach variable D in symbol table:
NextUse(i,D) = NextUse(i+1,D);
// Next, overwrite with new values for A, B, and C
NextUse(i,A) = -1;
NextUse(i,B) = i;
NextUse(i,C) = i;
// Also do the expected thing for othercases: (beq B, C, next),
// (addi A, B, imm), (sll A, B, imm), etc.
// and for (j next) do nothing
}
}
For example, the NextUse matrix for our running example would be as follows, assuming that it starts at location 0, and we let F stand for Infinity:
Code NextUse
---- ----------------
A B C D T1 T2
----------------
0: mult T1, A, B 0 0 1 -1 -1 -1
1: sub T2, A, C 1 F 1 -1 2 -1
2: add T1, T1, T2 F F F -1 2 2
3: addi D, T1, 5 F F F -1 3 -1
4: j 12 F F F F -1 -1
5: F F F F -1 -1
Note that NextUse(2,T1) would first be set to 3 (the next row carried up), then
set to -1 (because T1 is a target), and then to 2 (because it is a source as well).
In some sense, the values listed show the status of the values in the variables
just after the source operands have been accessed, and before the target has been
assigned. This matrix tells us several interesting things about the values
that live in these variables, as we scan down a column:
IsLastUse(R,i) = 1 if NextUse(i,R) >=0 and NextUse(i+1,R) == -1;
0 otherwise;
Active(R) = 1 if exist i and j with NextUse(i,R) == -1 and NextUse(j,R) != -1;
0 otherwise;
FarthestUse(i) = an R currently assigned to a register (VarDes(R) != NULL) such that NextUse(i,R) is maximal in row i; note that infinity
is always maximal.
A note on implementation: NextUse matrices are typically stored in the symbol table, one
column per entry. Functions PutNextUse(R,i) and GetNextUse(R,i) can be
easily written that access these arrays so that the structure of the symbol
table is hidden from the NextUse functions.
There are two parts to the register allocation algorithm. The first is to find a register for a given reference, and the second is to actually go through the basic block and generate the instructions. Regarding the second, I'll assume that the final sequence of instructions will be put into a separate code array, say MIPSCode[] with a separate nextMIPS pointer (instead of code[] and nextQuad -- in the fourth project writeup, I assumed that you would do code generation in the same code[] array that you created during project three, but it seems simpler to produce a separate code sequence), using an instruction
Gen(op, R1, R2, R3) = Add an instruction "op R1, R2, R3" to the code
array at location nextMIPS and increment nextMIPS
This is completely analogous to the AddQuad routine we have used previously.
The first step is accomplished by a function GetReg(R) which takes a reference R (which for this project, must be an integer variable) and returns a register name $k for some k. It also generates loads and stores to manage the movement of data between registers and memory. In order to do this it must maintain a data structure that records which variable's values are held in which registers. This can be simply an array
ref regDes[16];
(where for us "ref" is the same as "char *" since we record references as strings).
This will also be used to represent the pool of available registers (i.e., those
assigned to NULL). Note that we only use registers $1 -- $12.
We assume we have several functions to access this data structure:
InitRegDes(): forall k, regDes[k] = NULL;
IsAvailableReg(): Return 0 if all registers $1 -- $12 are assigned, and 1 otherwise;
Assign(R): Find register $k in $1 -- $12 such that regDes[k] == NULL, set regDes[k] = R, and
return k; return NULL if no such exists; (this is like Pop() of
a pool of unassigned registers, except that it assigns the new
register to R);
Push(k): set regDes[k] = NULL;
VarDes(R): return $k if there exists k such that regDes[k] = R, or NULL if not found.
That is, regDes[] maps registers to variables, and VarDes() maps variables to registers.
This will be (in our naive method) a one-to-one mapping.
The following function is just used to store out a register during a spill:
Spill(int k) {
Gen("sw", $k, offset(regDes[k]) + "($0)") // Here + is string concatenation
}
We now present two functions to find registers to store references in.
We need two because for a source reference, we need to load the
reference into a register if it is not already there; for a destination register,
this is of course not necessary. Similarly, after spilling a register to make room for
the new value, we need to load a source register, but do NOT load a destination register (since
it is about to get a new value anyway). There are thus two extra lines in the first
function, notated by (***).
// Get source register for reference R in instruction at location i
ref GetSourceReg(ref R, int i) {
ref T, $k;
if (R is already a register)
return R;
else if (VarDes(R) != NULL) // A register is already assigned to R
return VarDes(R);
else if (IsAvailableRef()) {
$k = Assign(R); // get an available register
Gen("lw", $k, -, offset(R) + "($0)") // Here + is string concatenation (***)
return $k;
}
else { // Ack! All registers are in use, must spill a register
T = FarthestUse(i);
$k = VarDes(T);
Nextuse(T, i) = -1; // Invalidate this entry
Spill($k);
Gen("lw", $k, -, offset(R) + "($0)") // Here + is string concatenation (***)
regDes[$k] = R;
return $k;
}
}
// Get destination register for reference R in instruction at location i
ref GetDestReg(ref R, int i) {
ref T, $k;
if (R is already a register)
return R;
else if (VarDes(R) != NULL) // A register is already assigned to R
return VarDes(R);
else if (IsAvailableRef())
return Assign(R); // get an available register
else { // Ack! All registers are in use, must spill a register
T = FarthestUse(i);
$k = VarDes(T);
Nextuse(T, i) = -1; // Invalidate this entry
Spill($k);
regDes[$k] = R;
return $k;
}
}
The line with the comment "Invalidate this entry" deserves a word of explanation. This
function will be called possible several times during a single instruction, and we need to ensure
that we do not try to grab and spill the same register twice! Thus, we will set the NextUse entry so
that this can not happen; in fact this entry will never be used again, so it does not matter what
we do with it as long as we ensure that some other entry in that line is larger.
To use this register allocation algorithm during code generation, we must proceed through the basic block, finding a register for each operand in turn. The GetReg() function will take care of spilling registers as needed. The only subtlety is that we must spill out all the non-temporary variables that were active (i.e., received new values) during the block, and thus must be done right before the last statement (which is typically a jump or branch, so we can not spill after the last instruction!).
As with the calculation of nextuse information, there are a number of different cases for various kinds of instructions, and we will just show what to do for the reg-reg instructions; the others are very easy modification of this basic process.
CodeGenerate(int S, int E) { // Generate code for the basic block from S to E
ref $n, $m, $k
InitRegDes();
CalculateNextUse(S,E);
for(i=S; i<=E; i++) {
// Case 1: stmt i is op A, B, C
if(i == E) // Next to last instruction, must spill active non-temps
for(k=1; k<=12; k++)
if ( regDes[k] != NULL && NextUse(E+1,regDes[k]) == F && Active(regDes[k]) )
Spill(k);
$n = GetSourceReg(B,i);
$m = GetSourceReg(C,i);
if(IsLastUse(B,i))
Push($n);
if(IsLastUse(C,i))
Push($m);
$k = GetDestReg(A,i);
Gen(op, $k, $n, $m);
// Other cases: Do the expected thing for j, beq, sll, addi, etc.
}
}