A short story on multitasking and memory protection.

    By Henk Robbers.                                            started 20 feb 1999
                                                                                  last  6 mar 1999

        I learned the trade in 1968 on a 2Mbuck mainframe, an ICL 1904.
    This machine had rock-solid, memory-protected multitasking.
    You were allowed to test while production was running, although
    the operators wouldn't allow you to touch buttons, especially not those
    of running tape decks. The console typewriter, though, was yours.
    An operator looked over your shoulder.

        The machine used a very simple and very effective method to
    achieve this.

     **   A model that can very well be implemented on M68K+ machines.  **

    Here it is:        read carefully and let a smile appear on your face.

        The CPU has 3 hardware registers.
    Datum register.
    Limit register.
    Status register.

        The default status is "supervisor mode" (sm).

    When the CPU is running in sm, the datum and limit registers
    are not used. The CPU uses physical addresses: this is
    the address space of the kernel. The kernel can touch everything.
    This sounds familiar.

        The kernel loads programs into contiguous physical
    memory locations, and keeps load and status information in a
    table: the start address, end address, priority and suspension state.
    A freshly loaded program is initially in the "suspended awaiting
    operator action" state. (No scripting on this dinosaur.)
    The operator has to type 'go ...'.

        The kernel looks around in the table to see if there are
    programs that are "not suspended". If so, it starts the program
    with the highest priority by issuing a single privileged instruction
    that can be described as:

        "Enter user mode with table.datum,table.limit"

    The CPU loads table.datum into the datum register and table.limit
    into the limit register, loads the program's context, clears the sm bit
    and continues executing instructions, now in user mode.
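
    The table lookup described above can be sketched in a few lines of
    Python. This is only an illustration, not the 1900 kernel: the names
    TableEntry, pick_next and operator_go are made up here, the table is
    assumed to hold datum and limit directly (derived from the start and
    end address), and a larger number is assumed to mean a higher priority.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableEntry:
    """One row of the kernel's program table (hypothetical layout)."""
    name: str
    datum: int              # physical address of the program's logical address 0
    limit: int              # highest logical address + 1
    priority: int           # assumption: larger number = higher priority
    suspended: bool = True  # freshly loaded: "suspended awaiting operator action"


def operator_go(table: List[TableEntry], name: str) -> None:
    """The operator types 'go <name>': lift the suspension of that program."""
    for entry in table:
        if entry.name == name:
            entry.suspended = False


def pick_next(table: List[TableEntry]) -> Optional[TableEntry]:
    """Return the non-suspended program with the highest priority, if any."""
    runnable = [entry for entry in table if not entry.suspended]
    if not runnable:
        return None
    return max(runnable, key=lambda entry: entry.priority)


def sample_table() -> List[TableEntry]:
    """Two freshly loaded programs, both still suspended."""
    return [
        TableEntry("production", datum=4096, limit=2048, priority=5),
        TableEntry("test", datum=8192, limit=1024, priority=2),
    ]
```

    With sample_table(), pick_next returns nothing until the operator has
    typed 'go' for at least one program; after that the kernel would enter
    user mode with the datum and limit of the entry it picked.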

    By now you will all have guessed that the datum and limit registers
    describe a program's logical address space.
    Datum contains the physical address of the program's address 0.
    Limit contains its highest logical address + 1.
    (Beware of the famous +1 bug!! :-)

    This is called the "dense" model.

        In user mode the datum and limit registers are *used* by the CPU.
    On each memory access the (logical) address is compared with
    limit, and if it is not smaller, a memory violation interrupt is caused
    and the program is "suspended awaiting operator action".
    When OK, datum is added to the address and execution continues.
    Needless to say, addresses are never negative.
        This is hardwired. There is simply no mechanism to circumvent
    this and poke around in other address spaces.
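
    The check-and-add step is small enough to write out. The sketch below
    is a software model of what the hardware does on every user mode
    access; the names translate and MemoryViolation are invented for the
    illustration. Since limit is defined as the highest logical
    address + 1, an address equal to limit is already out of bounds.

```python
class MemoryViolation(Exception):
    """Stands in for the memory violation interrupt (hypothetical name)."""


def translate(logical: int, datum: int, limit: int) -> int:
    """Map a logical address to a physical one under the dense model.

    limit is the highest logical address + 1, so the valid range is
    0 <= logical < limit.
    """
    if logical < 0 or logical >= limit:
        raise MemoryViolation(
            f"logical address {logical} outside 0..{limit - 1}"
        )
    return datum + logical
```

    For a program loaded at physical address 4096 with limit 1024,
    logical address 1023 maps to 5119, while logical address 1024
    raises the violation.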

    This dinosaur had neither a stack nor a stack mechanism.
    This was a real drawback; in this sense it was indeed an old beast.

    Well, not surprisingly, the context switching mechanism did *not*
    need a stack!!

    So where does the CPU find the context of a program?
    Here the story becomes different.

    This machine has a 24 bit word as its unit of addressing.

    The context resides entirely in the program itself.
    The first 128 words of the logical address space are reserved for this.

        Words 0 to 7:   The 8 normal registers, of which 3 could be
                        used for indirection.
        Word 8:         The program counter.
        Word 9:         Direct response for some very basic system calls.
        Words 10,11:    Plenty of room for status information.
        Remaining:      At least 4 of the above for forks. Yes, complete
                        multitasking including forking.
                        Followed by plenty of reserve.

    Having your registers as actual locations in your address space means that there
    is nothing special about them and you can simply address them in operands.
    No special notation needed.
    The instruction format (which we would nowadays call RISC) allows, apart
    from the operand, 3 bits for the register number, just a small address.

    The 1904 has NO special wiring for registers; the same mechanism as for
    operands is used (see above for datum and limit). So their contents are
    permanently there.

    ** This meant that a context switch didn't take any time at all. **
    Well, at least no more than the time needed for the interrupt mechanism itself,
    really not more than the time needed for an average simple instruction,
    a few clock cycles.
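
    Why nothing needs to be saved or restored can be shown with a toy
    model. The Machine class below is an assumption made for this sketch
    and not a description of the real hardware: it keeps one flat list
    of 24 bit words, treats register n as logical word n, and switches
    programs by doing nothing more than loading a new datum and limit.

```python
WORD_MASK = 0xFFFFFF  # the machine's words are 24 bits wide


class MemoryViolation(Exception):
    """Stands in for the memory violation interrupt (hypothetical name)."""


class Machine:
    """Toy model: registers are simply logical words 0..7 of the program."""

    def __init__(self, size: int) -> None:
        self.memory = [0] * size  # physical store, one entry per word
        self.datum = 0
        self.limit = 0

    def enter_user_mode(self, datum: int, limit: int) -> None:
        # The whole "context switch": two registers change, no copying.
        self.datum = datum
        self.limit = limit

    def _physical(self, logical: int) -> int:
        if logical < 0 or logical >= self.limit:
            raise MemoryViolation(f"logical address {logical}")
        return self.datum + logical

    def read(self, logical: int) -> int:
        return self.memory[self._physical(logical)]

    def write(self, logical: int, value: int) -> None:
        self.memory[self._physical(logical)] = value & WORD_MASK


def demo() -> tuple:
    """Two programs each use 'register 3'; neither disturbs the other."""
    m = Machine(1024)
    m.enter_user_mode(datum=256, limit=128)   # program A
    m.write(3, 42)                            # A sets its register 3
    m.enter_user_mode(datum=512, limit=128)   # switch to program B
    m.write(3, 7)                             # B sets its register 3
    b_value = m.read(3)
    m.enter_user_mode(datum=256, limit=128)   # back to A
    return m.read(3), b_value
```

    Calling demo() gives A's value back unchanged after B has run,
    because A's register 3 never left physical word 259.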

    The program counter and status are only set when an interrupt occurs,
    otherwise kept in image store.

    Now you may say: the PC accessible in logical space, isn't that dangerous?
    Yes, it is, but what about hundreds of PCs on a stack? What do you prefer?
    And besides: it is only held there when the program is *not* running. :-)

    And what about speed? Isn't this slow? Yes it is, but don't forget that programs
    on average spent 80 % of their time waiting while these old-fashioned
    tape decks and 2 cubic meter discs were transferring data. All DMA of course.

    This machine really used idle CPU time for multitasking.

    Later versions of this hardware implemented caches.
    Because every instruction has a register field, the registers are the most
    heavily used addresses and thus would reside in the cache permanently between
    interrupts. By *not* doing anything special the speed problem is solved.

    Later a complete paging system was added by simply wiring it into the
    datum/limit registers and changing them to a root pointer.
    This is called the "sparse" model.

    In fact the two models were implemented dynamically, and could run concurrently.
    "dense" and "sparse" became properties of binaries.

    [description of heap allocation mechanism for dense programs]

    A year ago I discovered to my delight that the easier dense model can
    in fact be implemented on the MC68020+, which makes exercising
    "logical address space" possible quite soon, while in the meantime analysing
    and developing the sparse models.

    Context switching, alas, will not be as beautiful as on the 1900 series.

    ** This all existed before integrated circuitry came out of the laboratories. **