Dynamic Scheduling Approach - Scoreboarding User's guide
Options
dlxsim -SCORE
[-al#] [-au#] [-dl#] [-du#] [-ml#] [-mu#]
[-SCORE] Execute the scoreboarding version of DLXsim.
[ -al# ] Select the latency for a floating point add (in clocks).
[ -au# ] Select the number of floating point add units.
[ -dl# ] Select the latency for a floating point divide (in clocks).
[ -du# ] Select the number of floating point divide units.
[ -ml# ] Select the latency for a floating point multiply (in clocks).
[ -mu# ] Select the number of floating point multiply units.
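As an illustration of how the option letters combine, the following Python sketch builds a command line from (latency, units) pairs. It assumes that the `#` in each option is replaced by the number with no space in between (a literal reading of the synopsis above); the helper name `dlxsim_args` and the sample values are hypothetical.

```python
def dlxsim_args(add=None, div=None, mul=None):
    """Build an argv list for the scoreboarding version of DLXsim.

    Each argument is an optional (latency, units) pair; None leaves the
    simulator's own default in place.
    Assumption: the number follows the option letters directly, e.g. -al2.
    """
    args = ["dlxsim", "-SCORE"]
    for prefix, pair in (("a", add), ("d", div), ("m", mul)):
        if pair is not None:
            latency, units = pair
            args += [f"-{prefix}l{latency}", f"-{prefix}u{units}"]
    return args


# Sample values only: one adder, two multipliers, one divider.
print(" ".join(dlxsim_args(add=(2, 1), mul=(10, 2), div=(40, 1))))
# prints: dlxsim -SCORE -al2 -au1 -dl40 -du1 -ml10 -mu2
```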
Commands
- go [ address ]
Start simulating the DLX machine. If address is given,
execution starts at that memory address. Otherwise, it continues from
wherever it left off previously. This command does not complete until
simulated execution stops. The return value is an information string
about why execution stopped and the current state of the machine.
- load file [ file file ...]
Read each of the given files. Treat them as DLX assembly
language files and load memory as indicated in the files. Code (text)
is normally loaded starting at address 0x100, but the codeStart
variable may be used to set a different starting address. Data is
normally loaded starting at address 0x1000, but a different starting
address may be specified in the dataStart variable. The return
value is either an empty string or an error message describing
problems in reading the files. A list of directives that the loader
understands is in a later section of this manual.
- put address number
Store number in the register or memory location given by
address. The return value is an empty string. To store floating
point numbers (single or double precision), use the fput
command.
- quit
Exit the simulator.
- stats [stalls] [opcount] [branch] [score] [hw] [all]
This command will dump various statistics collected by the simulator
on the DLX code that has been run so far. Any combination of options
may be selected. The options and their results are as follows:
- stalls
Show the number of structural hazard stalls.
- opcount
Show the number of each operation that has been executed.
- branch
Show the percentage of branches taken and not-taken.
- score
Show the components of the scoreboard.
- hw
Show the current hardware setup for the simulated machine.
- all
Equivalent to choosing all options to show statistics. This is the default.
- step [ address ]
If no address is given, the step command executes a single
instruction, continuing from wherever execution previously stopped.
If address is given, then the program counter is changed to
point to address, and a single instruction is executed from
there. In either case, the return value is an information string
about the state of the machine after the single instruction has been
executed.
Assembly file format
The assembler built into DLXsim, invoked using the load
command, accepts standard format DLX assembly language programs. The file is expected to contain lines of the following form:
- Labels are defined by a group of non-blank characters starting
with either a letter, an underscore, or a dollar sign, and followed
immediately by a colon. They are associated with the next address to
which code in the file will be stored. Labels can be accessed anywhere
else within that file, and in files loaded after that if the label is
declared as .global (see below).
- Comments are started with a semicolon, and continue to the end of the line.
- Constants can be entered either with or without a preceding number sign.
- The format of instructions and their operands is as shown in
the Computer Architecture book (Hennessy and Patterson).
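The label and comment rules above can be sketched in a few lines of Python. This is not the assembler's actual code: the names `LABEL_RE` and `split_line` are hypothetical, and the sketch assumes that a semicolon never appears inside a quoted string.

```python
import re

# A label: non-blank characters starting with a letter, '_' or '$',
# immediately followed by a colon.
LABEL_RE = re.compile(r"^\s*([A-Za-z_$][^\s:]*):")


def split_line(line):
    """Return (label, body) for one source line; label is None if absent.

    Assumption: everything from the first ';' onward is a comment, so a
    ';' inside a quoted string is not handled.
    """
    code = line.split(";", 1)[0]
    match = LABEL_RE.match(code)
    if match:
        return match.group(1), code[match.end():].strip()
    return None, code.strip()


print(split_line("loop: addi r1,r1,#4 ; bump the pointer"))
# prints: ('loop', 'addi r1,r1,#4')
print(split_line("      ld f2,0(r1)"))
# prints: (None, 'ld f2,0(r1)')
```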
While the assembler is processing an assembly file, the data and
instructions it assembles are placed in memory based on either a text
(code) or data pointer. Which pointer is used is selected not by the
type of information, but by whether the most recent directive was .data or .text. The program initially loads into the text
segment.
The assembler supports several directives which affect how it loads
the DLX's memory. These should be entered in the place where you
would normally place the instruction and its arguments. The
directives currently supported by DLXsim are:
- .align n
Cause the next data/code loaded to be at the next higher address with
the lower n bits zeroed (the next closest address greater than or
equal to the current address that is a multiple of 2^n).
- .ascii [ "string1", "string2", ...]
Store the strings listed on the line in memory as a list of
characters. The strings are not terminated by a 0 byte.
- .asciiz [ "string1", "string2", ...]
Similar to .ascii, except each string is followed by a 0 byte
(like C strings).
- .byte [ byte1, byte2, ...]
Store the bytes listed on the line sequentially in memory.
- .data [ address ]
Cause the following code and data to be stored in the data area. If
an address was supplied, the data will be loaded starting at
that address, otherwise, the last value for the data pointer will be
used. If we were just reading code based on the text (code) pointer,
store that address so that we can continue from there later (on a
.text directive).
- .double [ number1, number2,...]
Store the numbers listed on the line sequentially in memory as
double precision floating point numbers.
- .float [ number1, number2,...]
Store the numbers listed on the line sequentially in memory as
single precision floating point numbers.
- .global [ label ]
Make the label available for reference by code found in files
loaded after this file.
- .space [ size ]
Move the current storage pointer forward size bytes (to leave some
empty space in memory).
- .text [ address ]
Cause the following code and data to be stored in the text (code)
area. If an address was supplied, the data will be loaded
starting at that address, otherwise, the last value for the text
pointer will be used. If we were just reading data based on the data
pointer, store that address so that we can continue from there later
(on a .data directive).
- .word [ word1, word2,...]
Store the words listed on the line sequentially in memory.
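The interaction between the text and data pointers and the .text, .data, .align, and .space directives can be modelled with a small Python sketch. It is only an illustration of the behaviour described above, not the loader's code: the class `LoadPointers` and its method names are hypothetical, the default start addresses are the ones quoted for the load command, and instructions are assumed to occupy 4 bytes.

```python
class LoadPointers:
    """Toy model of the loader's text (code) and data pointers."""

    def __init__(self, code_start=0x100, data_start=0x1000):
        self.ptr = {"text": code_start, "data": data_start}
        self.current = "text"        # loading starts in the text segment

    def switch(self, segment, address=None):
        """Model .text / .data; the other pointer keeps its saved value."""
        self.current = segment
        if address is not None:
            self.ptr[segment] = address

    def align(self, n):
        """Model .align n: round up to the next multiple of 2**n."""
        mask = (1 << n) - 1
        self.ptr[self.current] = (self.ptr[self.current] + mask) & ~mask

    def space(self, size):
        """Model .space size: skip size bytes."""
        self.ptr[self.current] += size

    def emit(self, nbytes):
        """Reserve nbytes at the current pointer and return their address."""
        addr = self.ptr[self.current]
        self.ptr[self.current] += nbytes
        return addr


p = LoadPointers()
p.emit(4)              # one instruction at 0x100 (4 bytes assumed)
p.switch("data")       # .data
p.emit(3)              # .byte with three values at 0x1000
p.align(2)             # .align 2 moves the data pointer from 0x1003 to 0x1004
w = p.emit(4)          # .word lands at 0x1004
p.switch("text")       # .text resumes where the code left off
nxt = p.emit(4)
print(hex(w), hex(nxt))
# prints: 0x1004 0x104
```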
Components of scoreboard
SCOREBOARD 15th clock cycle

Instruction          Issue   Read opnds   Exe complete   Write Result
+============================================================================+
addi r1,r0,0x100     V       V            V              V
addi r2,r0,0x104     V       V            V              V
ld f6,0x0(r1)        V       V            V              V
ld f2,0x0(r2)        V       V            V              V
multf f0,f2,f4       V       V
subf f8,f6,f2        V       V
divf f10,f0,f6       V
+============================================================================+

FU no.  Name  Busy  Op      Fi   Fj   Fk   Qj   Qk   Rj   Rk
+=======================================================================+
1       int   NO    (null)
2       mul   YES   multf   f0   f2   f4   1         NO   NO
3       mul   NO    (null)
4       add   YES   subf    f8   f6   f2        1    NO   NO
5       div   YES   divf    f10  f0   f6   2         NO   YES
+=======================================================================+

    F0  F2  F4  F6  F8  F10 F12 F14 F16 F18 F20 F22 F24 F26 F28 F30
+-----------------------------------------------------------------------------+
FU  2               4   5
+=============================================================================+
Functional unit status - Indicates the state of the functional
unit (FU). There are nine fields for each functional unit:
Busy - Indicates whether the unit is busy or not
Op - Operation to perform in the unit (e.g., add or subtract)
Fi - Destination register
Fj, Fk - Source-register numbers
Qj, Qk - Numbers of the functional units producing source registers Fj, Fk
Rj, Rk - Flags indicating when Fj, Fk are ready; the fields are reset when
new values are read so that the scoreboard knows that the
source operand has been read (this is required to handle WAR
hazards)
Check Hazard
- Issue: checks for structural and WAW hazards
- Read operands: checks for RAW hazards
- Execute: no hazard check
- Write result: checks for WAR hazards
Algorithm for scoreboarding
--------------------------------------------------------------------------
Instruction status  Wait until              Bookkeeping
--------------------------------------------------------------------------
Issue               not Busy(FU) and        Busy(FU)<-YES; Op(FU)<-op;
                    not Result(D)           Fi(FU)<-D;
                                            Fj(FU)<-S1; Fk(FU)<-S2;
                                            Qj<-Result(S1); Qk<-Result(S2);
                                            Rj<-not Qj; Rk<-not Qk;
                                            Result(D)<-FU
--------------------------------------------------------------------------
Read operands       Rj and Rk               Rj<-NO; Rk<-NO
--------------------------------------------------------------------------
Exe complete        Functional unit done
--------------------------------------------------------------------------
Write result        for all f(              for all f(if Qj(f)==FU
                    (Fj(f)!=Fi(FU) or         then Rj(f)<-YES);
                     Rj(f)==NO) &&          for all f(if Qk(f)==FU
                    (Fk(f)!=Fi(FU) or         then Rk(f)<-YES);
                     Rk(f)==NO))            Result(Fi(FU))<-clear;
                                            Busy(FU)<-NO
--------------------------------------------------------------------------
Required checks and bookkeeping actions for each step in instruction execution
FU stands for the functional unit used by the instruction, D is the destination register, S1 and S2 are the source registers, and op is the operation to be done. To access the scoreboard entry named Fj for functional unit FU we use the notation Fj(FU). Result(D) is the value of the result register field for register D. The test in the write-result case prevents the write when there is a WAR hazard. For simplicity we assume that all of the bookkeeping operations are done in one clock cycle.
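The checks and bookkeeping in the table can be written out as a short Python sketch. It models only the wait conditions and field updates, not latencies or the simulator's internals. The names `FU` and `Scoreboard` and the method names are hypothetical. Two assumptions go beyond the table: only busy units are examined in the write-result step, and Qj/Qk are cleared once the producing unit has written, so that a later reuse of that unit cannot mark the operand ready a second time.

```python
from dataclasses import dataclass


@dataclass
class FU:
    busy: bool = False
    op: str = ""
    fi: str = ""
    fj: str = ""
    fk: str = ""
    qj: int = 0        # 0: no unit is producing this operand
    qk: int = 0
    rj: bool = False
    rk: bool = False


class Scoreboard:
    def __init__(self, n_units):
        self.fu = {n: FU() for n in range(1, n_units + 1)}
        self.result = {}   # register name -> unit that will write it

    def can_issue(self, u, d):
        # Structural hazard: unit busy. WAW hazard: D has a pending writer.
        return not self.fu[u].busy and d not in self.result

    def issue(self, u, op, d, s1, s2):
        qj = self.result.get(s1, 0)
        qk = self.result.get(s2, 0)
        self.fu[u] = FU(True, op, d, s1, s2, qj, qk, not qj, not qk)
        self.result[d] = u

    def can_read(self, u):
        # RAW hazard: wait until both source operands are ready.
        return self.fu[u].rj and self.fu[u].rk

    def read_operands(self, u):
        self.fu[u].rj = False
        self.fu[u].rk = False

    def can_write(self, u):
        # WAR hazard: no other busy unit may still need the old value of Fi.
        fi = self.fu[u].fi
        return all((f.fj != fi or not f.rj) and (f.fk != fi or not f.rk)
                   for n, f in self.fu.items() if n != u and f.busy)

    def write_result(self, u):
        for f in self.fu.values():
            if not f.busy:
                continue                  # assumption: skip idle units
            if f.qj == u:
                f.rj, f.qj = True, 0      # assumption: clear Q after the write
            if f.qk == u:
                f.rk, f.qk = True, 0
        del self.result[self.fu[u].fi]
        self.fu[u].busy = False


sb = Scoreboard(5)                        # 1 int, 2-3 mul, 4 add, 5 div
sb.issue(2, "multf", "f0", "f2", "f4")
sb.issue(5, "divf", "f10", "f0", "f6")
sb.issue(4, "addf", "f6", "f8", "f2")
print(sb.can_issue(3, "f0"))              # False: WAW hazard on f0
print(sb.can_read(5))                     # False: RAW hazard on f0
sb.read_operands(2)
sb.read_operands(4)
print(sb.can_write(4))                    # False: WAR, divf has not read f6
sb.write_result(2)
sb.read_operands(5)
print(sb.can_write(4))                    # True
```

The unit numbers follow the hardware setup in the sample scoreboard above; the addf instruction is added here only to provoke a WAR hazard.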
Example DLX code running on this simulator
Last updated: 1995.5.10