By Henk Robbers.
started 20 feb 1999
last 6 mar 1999
I learned the trade in 1968 on a 2 Mbuck mainframe, an ICL 1904.
This machine had rock-solid, memory-protected multitasking.
You were allowed to test while production was running, although
the operators wouldn't allow you to touch buttons, especially not
those of running tape decks. The console typewriter, though, was
yours. An operator looked over your shoulder.
The machine used a very simple and very effective method to
achieve this.
** A model that can very well be implemented on M68K+ machines. **
Here it is: read carefully and let a smile appear on your face.
The CPU has 3 hardware registers.
Datum register.
Limit register.
Status register.
The default status is "supervisor mode" (sm).
When the CPU is running in sm, the datum and limit registers are
not used. The CPU uses physical addresses: this is the address
space of the kernel. The kernel can touch everything.
This sounds familiar.
The kernel loads programs into physical memory in contiguous
memory locations, and keeps load and status information in a
table: the start address, end address, priority and suspension
state. A freshly loaded program is initially in the "suspended
awaiting operator action" state. (No scripting on this dinosaur.)
The operator has to type 'go ...'.
The kernel looks around in the table to see if there are programs
that are "not suspended". If so, it starts the program with the
highest priority by issuing a single privileged instruction that
can be described as:
"Enter user mode with table.datum,table.limit"
The CPU loads table.datum into the datum register and table.limit
into the limit register, loads the program's context, clears the
sm bit and continues executing instructions, now in user mode.
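The table lookup described above can be sketched in a few lines of
Python. This is only an illustration, not the 1900 kernel: the names
ProgramEntry, operator_go and pick_next are hypothetical, the table
rows hold datum and limit rather than start and end address, and it
is an assumption that a larger number means a higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProgramEntry:
    """One row of the kernel's program table (field names are hypothetical)."""
    name: str
    datum: int              # physical address of the program's logical address 0
    limit: int              # highest logical address + 1
    priority: int           # assumption: a larger number means higher priority
    suspended: bool = True  # freshly loaded: "suspended awaiting operator action"


def operator_go(table: List[ProgramEntry], name: str) -> None:
    """Model the operator typing 'go ...': clear the suspension state."""
    for entry in table:
        if entry.name == name:
            entry.suspended = False


def pick_next(table: List[ProgramEntry]) -> Optional[ProgramEntry]:
    """Return the non-suspended program with the highest priority, or None."""
    runnable = [entry for entry in table if not entry.suspended]
    if not runnable:
        return None
    return max(runnable, key=lambda entry: entry.priority)
```

The entry returned by pick_next carries exactly the two values the
privileged instruction needs: its datum and its limit.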
You must all have guessed by now that the datum and limit
registers describe a program's logical address space.
Datum contains the physical address of the program's address 0.
Limit contains its highest logical address + 1.
(Beware of the famous +1 bug!! :-)
This is called the "dense" model.
In user mode the datum and limit registers are *used* by the CPU.
On each memory access the (logical) address is compared with
limit. If it is not below limit, the program is "suspended
awaiting operator action" and a memory violation interrupt is
caused. When OK, datum is added to the address and execution
continues.
Needless to say, addresses are positive numbers.
This is hardwired. There is simply no mechanism to circumvent
this and poke around in other address spaces.
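As a minimal sketch, the datum/limit check can be written out in
Python. The names translate and MemoryViolation are made up for this
illustration. Since limit is defined as the highest logical address
+ 1, the sketch treats an address equal to limit as a violation too;
that is an assumption drawn from the definition, not a statement
about the actual wiring.

```python
class MemoryViolation(Exception):
    """Stands in for the memory violation interrupt (hypothetical name)."""


def translate(address: int, datum: int, limit: int,
              supervisor: bool = False) -> int:
    """Map a logical address to a physical one, dense model."""
    if supervisor:
        # In supervisor mode datum and limit are not used at all.
        return address
    # Addresses are non-negative; limit is the highest logical address + 1,
    # so every valid address is strictly below limit.
    if address < 0 or address >= limit:
        raise MemoryViolation(f"logical address {address} outside 0..{limit - 1}")
    return datum + address
```

For a program loaded at physical address 4096 with a limit of 128,
logical address 127 is the last one that passes the check.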
This dinosaur had neither a stack nor a stack mechanism.
This was a real drawback; in this sense it was indeed an old beast.
Well, not surprisingly, the context switching mechanism did *not*
need a stack!!
So where does the CPU find the context of a program?
Here the story becomes different.
This machine has a 24 bit word as its unit of addressing.
The context resides entirely in the program itself.
The first 128 words of the logical address space are reserved
for this.
0 to 7: The 8 normal registers, of which 3 could be used for
indirection.
8: The program counter.
9: Direct response for some very basic system calls.
10,11: Plenty of room for status information.
Remaining: At least 4 of the above for forks. Yes, complete
multitasking including forking.
Followed by plenty of reserve.
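The layout above can be written down as a handful of constants.
This is just a sketch: the constant and function names are made up
here, and truncating stored values to 24 bits is an assumption about
overflow behaviour that the text does not cover.

```python
WORD_BITS = 24
WORD_MASK = (1 << WORD_BITS) - 1   # largest value a 24 bit word can hold
CONTEXT_WORDS = 128                # reserved at the start of every program

REGISTER_WORDS = range(0, 8)       # words 0 to 7: the 8 normal registers
PC_WORD = 8                        # word 8: the program counter
SYSCALL_RESPONSE_WORD = 9          # word 9: response for basic system calls
STATUS_WORDS = (10, 11)            # words 10 and 11: status information


def new_image(size_words: int) -> list:
    """Create a zeroed program image; the context area must fit inside it."""
    if size_words < CONTEXT_WORDS:
        raise ValueError("image too small for the 128 word context area")
    return [0] * size_words


def store_word(image: list, address: int, value: int) -> None:
    """Store a value at a logical address, truncated to 24 bits (assumption)."""
    image[address] = value & WORD_MASK
```

Note that nothing distinguishes a register from any other word:
store_word(image, 3, value) writes register 3 exactly as it would
write word 300.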
Having your registers as actual locations in your address space
means that there is nothing special about them and you can simply
address them in operands. No special notation needed.
The instruction format (which we would nowadays call RISC) allows,
apart from the operand, 3 bits for the register number, just a
small address.
The 1904 has NO special wiring for registers; the same mechanism
as for operands is used (see above for datum and limit).
So their contents are permanently there.
** This meant that a context switch didn't take any time at all. **
Well, apart from the time needed for the interrupt mechanism
itself, which is really not more than the time needed for an
average simple instruction, a few clock cycles.
The program counter and status are only set when
an interrupt occurs,
otherwise kept in image store.
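To see why the switch is so cheap, here is a toy model in Python.
It is a sketch under assumptions, not a description of the real
hardware: the class Cpu and its method names are hypothetical, the
status words are left out, and the program counter is the only thing
that moves on an interrupt. The registers never move, because they
already live in words 0 to 7 of each program.

```python
PC_WORD = 8  # word 8 of every program holds its saved program counter


class Cpu:
    """Toy dense-model CPU; all names here are hypothetical."""

    def __init__(self, memory: list):
        self.memory = memory  # physical store, one list item per word
        self.datum = 0
        self.limit = 0
        self.pc = 0

    def interrupt(self) -> None:
        """Leaving user mode: save the PC into the program's own word 8."""
        self.memory[self.datum + PC_WORD] = self.pc

    def enter_user_mode(self, datum: int, limit: int) -> None:
        """Load datum and limit, then pick up the PC from the program itself."""
        self.datum, self.limit = datum, limit
        self.pc = self.memory[datum + PC_WORD]

    def register(self, number: int) -> int:
        """Registers are simply logical words 0 to 7 of the running program."""
        return self.memory[self.datum + number]
```

Switching from one program to another is an interrupt() followed by
enter_user_mode() with the other program's datum and limit; no
register is copied anywhere.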
Now you may say: the PC accessible in logical space, isn't that
dangerous?
Yes, it is! But what about hundreds of PCs on a stack? Which do
you prefer?
And besides: it is only held there when the program is *not*
running. :-)
And what about speed? Isn't this slow? Yes, it is, but don't
forget that programs on average spent 80% of their time waiting
while these old-fashioned tape decks and 2 cubic meter discs were
transferring data. All DMA, of course.
This machine really used idle CPU time for multitasking.
Later versions of this hardware implemented caches.
Because every instruction has a register field, the registers are
the most heavily used addresses and thus would reside in the cache
permanently between interrupts. By *not* doing anything special
the speed problem is solved.
Later a complete paging system was added by simply wiring it into
the datum/limit registers and changing them to a root pointer.
This is called the "sparse" model.
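For contrast with the dense model, a root pointer and page table
might look roughly like this. The text gives no details of the 1900
paging hardware, so everything here is an illustrative assumption:
the page size, the single-level table, and the names PAGE_SIZE,
PageFault and translate_sparse.

```python
PAGE_SIZE = 1024  # assumption, chosen only for illustration


class PageFault(Exception):
    """Raised when a logical page has no physical frame (hypothetical name)."""


def translate_sparse(address: int, page_table: dict) -> int:
    """Map a logical address through a page table, sparse model.

    page_table plays the role of whatever the root pointer refers to:
    it maps logical page numbers to physical frame numbers.
    """
    page, offset = divmod(address, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        raise PageFault(f"no frame for logical page {page}")
    return frame * PAGE_SIZE + offset
```

Unlike the dense model, the pages of one program need not be
contiguous in physical memory, and unused pages cost nothing.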
In fact the two models were implemented dynamically, and could
run concurrently.
"dense" and "sparse" became properties of binaries.
[description of heap allocation mechanism for dense programs]
A year ago I discovered to my delight that the easier dense model
can in fact be implemented on the MC68020+, which makes exercising
"logical address space" possible quite soon, while in the meantime
analysing and developing the sparse models.
Context switching, alas, will not be as beautiful as on the 1900
series.
** This all existed before integrated circuitry came out of the
laboratories. **