ATARI CHARACTER REPRESENTATION (ACR)

"READ THIS! We need your help!"

By Frankenstein


What's ACR?
-----------
Atari Character Representation (ACR) is an attempt to get a better character set for the Atari 8-bitter.

What's the problem?
-------------------
Did you ever have the feeling that our Atari ASCII (ATASCII) set didn't have the characters you wanted?
I've encountered this problem many times (think about diagrams etc.).
The main problem is that people don't use the special ATASCII characters because those characters cannot be sent to the printer. On the other hand, they don't use IBM characters (which are supported by most printers) because they're not in the ATASCII set!

Wouldn't it be nice if we could just use all IBM characters?
Yes, this would be very nice. But unfortunately our Atari can only display 128 different characters (the IBM set has 256). This is because Atari decided to use the highest bit to get inverted characters.

How to solve this?
------------------
When we take a close look at the ATASCII set we'll notice that some characters are rarely used.
We could redefine these characters in a custom FONT.
The useful ATASCII control characters can be converted to an IBM equivalent.

Let's define the ACR set
------------------------
Most of the characters in the ATASCII set are the same as in ASCII. It's important to know this since we don't have to convert these characters.
All characters in the range $20 to $5F (32 to 95) are exactly the same.
Here they are (Hex Char):

$20      $30 0    $40 @    $50 P
$21 !    $31 1    $41 A    $51 Q
$22 "    $32 2    $42 B    $52 R
$23 #    $33 3    $43 C    $53 S
$24 $    $34 4    $44 D    $54 T
$25 %    $35 5    $45 E    $55 U
$26 &    $36 6    $46 F    $56 V
$27 '    $37 7    $47 G    $57 W
$28 (    $38 8    $48 H    $58 X
$29 )    $39 9    $49 I    $59 Y
$2A *    $3A :    $4A J    $5A Z
$2B +    $3B ;    $4B K    $5B [
$2C ,    $3C <    $4C L    $5C \
$2D -    $3D =    $4D M    $5D ]
$2E .    $3E >    $4E N    $5E ^
$2F /    $3F ?    $4F O    $5F _

NOTE: $20 = the space character

There is another sequence of ATASCII characters which are equal to ASCII.
This ranges from $61 to $7A:

         $70 p
$61 a    $71 q
$62 b    $72 r
$63 c    $73 s
$64 d    $74 t
$65 e    $75 u
$66 f    $76 v
$67 g    $77 w
$68 h    $78 x
$69 i    $79 y
$6A j    $7A z
$6B k
$6C l
$6D m
$6E n
$6F o

As you see, only a few characters are missing: $60 and $7B to $7F.
The characters $7D, $7E and $7F have a special function for the screen (E:) handler (e.g. $7D = clear screen).
The other characters can be redefined in a custom FONT.

So now characters $20 to $7F are configured. We only have 32 characters left, in the range $00 to $1F.
In ATASCII this range is used for control characters or international characters.
The international character set can be activated by setting CHBAS ($02F4) to $CC (204). When this set is active, we can't use the control characters anymore, since their codes now represent international characters.
Some control characters are very useful for creating diagrams, tables and such.
These characters also appear in the IBM set. The IBM set also has double lines and combinations of single and double lines. All of these can be converted to single lines.
Here's the conversion table:

  +---------------------------+
  |   ACR  IBM  IBM  IBM  IBM |
  +-+-------------------------+
  |+| $01  $C3  $C6  $C7  $CC |
  +-+-------------------------+
  |+| $03  $D9  $BE  $BD  $BC |
  +-+-------------------------+
  |+| $04  $B4  $B5  $B6  $B9 |
  +-+-------------------------+
  |+| $05  $BF  $B8  $B7  $BB |
  +-+-------------------------+
  |+| $11  $DA  $D5  $D6  $C9 |
  +-+-------------------------+
  |+| $13  $C5  $D8  $D7  $CE |
  +-+-------------------------+
  |+| $17  $C2  $D1  $D2  $CB |
  +-+-------------------------+
  |+| $18  $C1  $CF  $D0  $CA |
  +-+-------------------------+
  |+| $1A  $C0  $D4  $D3  $C8 |
  +-+-------------------------+
  |-| $12  $C4  $CD  $C4  $CD |
  +-+-------------------------+
  ||| $7C  $B3  $B3  $BA  $BA |
  +-+-------------------------+
  |?| $15  $DC                |
  +-+-------------------------+
  |?| $19  $DD                |
  +-+-------------------------+


Note: As you see, all IBM line characters can be converted to ACR. However, if we want to convert ACR to IBM, we have to choose one of the four columns. The first IBM column is the best choice in most cases because it represents the same line characters as the ACR/ATASCII lines.
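
For those who want to experiment on another machine, the table could be written down as data roughly like this. It is only a sketch in Python, not part of the ACR definition; the names ACR_LINE_ROWS, IBM_LINES_TO_ACR and ACR_LINES_TO_IBM are made up for illustration.

```python
# ACR code -> the IBM codes that convert to it (single/double line variants).
ACR_LINE_ROWS = {
    0x01: (0xC3, 0xC6, 0xC7, 0xCC),
    0x03: (0xD9, 0xBE, 0xBD, 0xBC),
    0x04: (0xB4, 0xB5, 0xB6, 0xB9),
    0x05: (0xBF, 0xB8, 0xB7, 0xBB),
    # $11 is listed as a line character in the complete ACR set; the IBM
    # codes here are the code page 437 upper-left corners.
    0x11: (0xDA, 0xD5, 0xD6, 0xC9),
    0x13: (0xC5, 0xD8, 0xD7, 0xCE),
    0x17: (0xC2, 0xD1, 0xD2, 0xCB),
    0x18: (0xC1, 0xCF, 0xD0, 0xCA),
    0x1A: (0xC0, 0xD4, 0xD3, 0xC8),
    0x12: (0xC4, 0xCD, 0xC4, 0xCD),
    0x7C: (0xB3, 0xB3, 0xBA, 0xBA),
    0x15: (0xDC,),
    0x19: (0xDD,),
}

# IBM -> ACR: every IBM code in a row maps to that row's ACR code.
IBM_LINES_TO_ACR = {
    ibm: acr for acr, row in ACR_LINE_ROWS.items() for ibm in row
}

# ACR -> IBM: take the first IBM column, as the note recommends.
ACR_LINES_TO_IBM = {acr: row[0] for acr, row in ACR_LINE_ROWS.items()}
```

Going from IBM to ACR and back again turns every double line into a single line, which is exactly the loss the note describes.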


Inverse characters
------------------
Some characters from the IBM set can be represented by inverse ATASCII characters:

  +---------------------------+
  |   ACR  IBM  IBM  IBM  IBM |
  +-+-------------------------+
  |?| $80  $DB  $B0  $B1  $B2 |
  +-+-------------------------+
  |?| $95  $DF                |
  +-+-------------------------+
  |?| $99  $DE                |
  +-+-------------------------+
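
The link between a character and its inverse is simply the highest bit of the code. A small Python sketch of that idea (the function names are illustrative only and do not come from any Atari library):

```python
INVERSE_BIT = 0x80  # the highest bit selects the inverted character


def to_inverse(code: int) -> int:
    """Return the inverse-video version of a character code."""
    return code | INVERSE_BIT


def to_normal(code: int) -> int:
    """Strip the inverse bit from a character code."""
    return code & 0x7F


def is_inverse(code: int) -> bool:
    """True if the code has its highest bit set."""
    return bool(code & INVERSE_BIT)


# The two half blocks in the table are the inverses of $15 and $19.
assert to_inverse(0x15) == 0x95
assert to_inverse(0x19) == 0x99
```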


More special codes
------------------
As I said earlier, $7D, $7E and $7F represent special functions for the screen handler. It's therefore not very smart to redefine these characters.
Why not? Well, try to send a redefined 'clear screen' code to the E: handler and you will see that the handler still clears the screen instead of showing the redefined character.
In the $00 to $1F range there are five more of those characters:

$1B Escape
$1C Arrow up
$1D Arrow down
$1E Arrow left
$1F Arrow right


New line, RETURN!
-----------------
One last thing we have to take care of is the RETURN mark.
IBM mostly uses the CR/LF combination ($0D,$0A). Some texts (sources) only use CR ($0D).
Atari uses $9B (155 decimal) for a hard RETURN (which is the inverted representation of the ESC ($1B) code).
An IBM text to ACR text converter should therefore convert every ($0D) or ($0D,$0A) into one single $9B.
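
The rule above might be sketched like this in Python. The function name ibm_newlines_to_acr is hypothetical, and leaving a lone $0A untouched is an assumption of the sketch, since the rule only mentions $0D and $0D,$0A.

```python
ACR_RETURN = 0x9B  # Atari RETURN code


def ibm_newlines_to_acr(data: bytes) -> bytes:
    """Turn every $0D,$0A pair or lone $0D into one single $9B."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x0D:
            out.append(ACR_RETURN)
            # Swallow the $0A of a $0D,$0A pair.
            if i + 1 < len(data) and data[i + 1] == 0x0A:
                i += 1
        else:
            # Everything else (including a lone $0A) is copied unchanged.
            out.append(data[i])
        i += 1
    return bytes(out)
```

The pair has to be checked before the single code, otherwise a $0D,$0A would end up as a $9B followed by a stray $0A.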


Finally, curly brackets!
------------------------
I chose characters $08 and $0A to represent the curly brackets.

  +---------------------------+
  |   ACR  IBM                |
  +-+-------------------------+
  |{| $08  $7B  bracket open  |
  +-+-------------------------+
  |}| $0A  $7D  bracket close |
  +-+-------------------------+


The ACR set
-----------
Here's the complete ACR set:

$00 ?    $20      $40 @    $60 `
$01 +    $21 !    $41 A    $61 a
$02 ?    $22 "    $42 B    $62 b
$03 +    $23 #    $43 C    $63 c
$04 +    $24 $    $44 D    $64 d
$05 +    $25 %    $45 E    $65 e
$06 ?    $26 &    $46 F    $66 f
$07 ?    $27 '    $47 G    $67 g
$08 {    $28 (    $48 H    $68 h
$09 ?    $29 )    $49 I    $69 i
$0A }    $2A *    $4A J    $6A j
$0B ?    $2B +    $4B K    $6B k
$0C ?    $2C ,    $4C L    $6C l
$0D ?    $2D -    $4D M    $6D m
$0E ?    $2E .    $4E N    $6E n
$0F ?    $2F /    $4F O    $6F o
$10 ?    $30 0    $50 P    $70 p
$11 +    $31 1    $51 Q    $71 q
$12 -    $32 2    $52 R    $72 r
$13 +    $33 3    $53 S    $73 s
$14 ?    $34 4    $54 T    $74 t
$15 ?    $35 5    $55 U    $75 u
$16 ?    $36 6    $56 V    $76 v
$17 +    $37 7    $57 W    $77 w
$18 +    $38 8    $58 X    $78 x
$19 ?    $39 9    $59 Y    $79 y
$1A +    $3A :    $5A Z    $7A z
$1B ESC  $3B ;    $5B [    $7B ~
$1C UP   $3C <    $5C \    $7C |
$1D DWN  $3D =    $5D ]    $7D CLR
$1E LFT  $3E >    $5E ^    $7E DEL
$1F RGT  $3F ?    $5F _    $7F TAB

$80 ?
$95 ?
$99 ?
$9B RETURN

Note: the '?' chars are unused, except $15, $19, $80, $95 and $99, which are the block characters from the tables above.


Conversion tables
-----------------
A fast way of converting data can be achieved by using conversion tables.
With this method an INPUT value is used as an offset pointer for a lookup table. So the pointer points to the OUTPUT value in the table.
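
A minimal Python sketch of the lookup method, for illustration only. The helper names make_table and convert are made up, and starting from a table that maps every code to itself is an assumption of the sketch; a real IBM to ACR table would also have to decide what to do with the codes above $7F.

```python
def make_table(overrides: dict) -> bytes:
    """Build a 256-entry table: every code maps to itself unless overridden."""
    table = bytearray(range(256))
    for source, target in overrides.items():
        table[source] = target
    return bytes(table)


def convert(data: bytes, table: bytes) -> bytes:
    """Use each INPUT value as an offset into the table to get the OUTPUT."""
    return bytes(table[value] for value in data)


# Example: the curly brackets, IBM $7B/$7D -> ACR $08/$0A.
ibm_to_acr = make_table({0x7B: 0x08, 0x7D: 0x0A})
acr_text = convert(b"{x}", ibm_to_acr)
```

In Python the built-in data.translate(table) does the same job as convert; on the Atari itself the same idea is a table of 256 bytes indexed by the input value.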

The following conversion tables can be very handy for the ACR format:

1. IBM to ACR
2. ACR to ASCII
3. ACR to IBM

The first one is handy because it can be used to convert IBM text (MS-DOS) to ACR. The second one can be used if we want to output an ACR text to a normal ASCII printer. The third table can be used to convert ACR to IBM text, or to print on a printer with an IBM character mode.
You may wonder why there's no ASCII to ACR table. The reason is that converting ASCII to ACR causes loss of data. In other words, the ASCII set doesn't include all the characters from the ACR set.
The data loss problem can also arise when we convert IBM to ACR. The solution: always keep a copy of the original text after converting.
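
To show what such a loss looks like, here is a rough Python sketch of an ACR to ASCII table. Everything in it is an assumption made for illustration: the name make_acr_to_ascii, the '?' fallback for codes without an ASCII equivalent, the use of '+' and '-' for line characters, and the choice of LF ($0A) for RETURN (a printer may well need CR/LF instead).

```python
# ACR corner, tee and cross characters; all of them become '+'.
LINE_JOINTS = (0x01, 0x03, 0x04, 0x05, 0x11, 0x13, 0x17, 0x18, 0x1A)


def make_acr_to_ascii(fallback: int = ord("?")) -> bytes:
    """Build a 256-entry ACR -> ASCII table (lossy, illustrative only)."""
    table = bytearray(range(256))  # $20-$7C are the same in both sets
    # Codes with no ASCII equivalent get the fallback character.
    for code in list(range(0x00, 0x20)) + list(range(0x80, 0x100)):
        table[code] = fallback
    for code in LINE_JOINTS:
        table[code] = ord("+")
    table[0x12] = ord("-")   # horizontal line
    table[0x08] = ord("{")   # curly bracket open
    table[0x0A] = ord("}")   # curly bracket close
    table[0x7B] = ord("~")   # as listed in the complete ACR set
    table[0x7D] = fallback   # clear screen has no printable equivalent
    table[0x7E] = fallback   # delete has no printable equivalent
    table[0x7F] = 0x09       # Atari TAB -> ASCII TAB
    table[0x9B] = 0x0A       # RETURN -> LF (assumption, see above)
    return bytes(table)
```

With this table an ACR box corner, a horizontal line and another corner ($11,$12,$05) come out as "+-+". The different corners can no longer be told apart afterwards, which is why the original text should be kept.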


It's up to you!
---------------
We didn't use every character in the $00-$1F range.
These codes can be used for international characters (or any other purpose if you can name one?).
If you know a character which you would like to see included in the ACR set, please write to Mega Magazine.


Mega Magazine
[P.O.BOX REMOVED]
[ADDRESS REMOVED]
Holland