
Cyrillic alphabets in Terminal-BASIC output

A project log for Terminal-BASIC

Language interpreter for uC-based systems

Andrey Skvortsov, 08/23/2017 at 11:13

My friend Dylan Brophy inspired me to investigate how Cyrillic text can be output on the different output devices supported by Terminal-BASIC.

1. Using symbols other than capital Latin letters in identifiers is not possible, and not necessary.

2. USART output.

The job is done by the terminal (or terminal emulator), as long as you use a 1-byte Cyrillic encoding (CP-866, KOI8-R, KOI8-U, Windows-1251, ...).
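To illustrate the 1-byte requirement from the host side, here is a minimal Python sketch. The sample word and the variable names are illustrative assumptions; the codec names are the ones shipped with Python's standard library. It shows that each of the listed encodings stores one Cyrillic letter per byte, while UTF-8 needs two, which is why a plain byte-oriented serial link works with the former without any special handling.

```python
# Sample text is an arbitrary choice for illustration.
text = "Привет"

# Standard-library codec names for the encodings mentioned above.
CODECS = ("cp866", "koi8_r", "koi8_u", "cp1251")

# Encode the same text with every single-byte codec.
encoded = {codec: text.encode(codec) for codec in CODECS}

for codec, data in encoded.items():
    # One byte per letter: len(data) equals len(text).
    print(f"{codec:8} {data.hex(' ')}  ({len(data)} bytes)")

# UTF-8 uses two bytes for each of these letters.
print(f"utf-8    {len(text.encode('utf-8'))} bytes")
```

The terminal must be set to the same encoding that was used to produce the bytes, otherwise the letters will be displayed as the wrong glyphs.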

3. Output using the TVoutEx and UTFT libraries and devices such as NGT20.

The character set, whether raster or vector, has to be extended. Memory is the main restriction in this case. I think the encodings mentioned above are not suitable here, because each of them defines the full Cyrillic letter set for only one or two languages and duplicates the letters that look the same as Latin ones. There was an interesting encoding, Код УПП (punch card device code), used on БЭСМ-6 (BESM-6) mainframes, which defines a single Latin-Cyrillic character set. But it has only capital letters and is incompatible with ASCII.

So I decided to define an encoding that contains all Cyrillic symbols without duplicating the Latin look-alikes.

I gathered Cyrillic letters from 8 alphabets (6 modern and 2 dead). The upper half of the 8-bit code table (codes 128 and above, after ASCII) is filled with the symbols that differ from Latin ones, ordered by descending "points" of each letter in the following rating, where a letter scores one point for every alphabet that uses it. This makes it possible to reduce the number of supported symbols by cutting the tail of the table.


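| # | Symbol | Early Cyrillic | Modern Russian | Russian before 1917 | Belarusian | Ukrainian | Serbian | Bulgarian | Macedonian | All |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | А | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 2 | Б | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 3 | В | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 4 | Г | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 5 | Ѓ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 6 | Ґ | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| 7 | Д | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 8 | Ђ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 9 | Е | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 7 |
| 10 | Ё | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2 |
| 11 | Є | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 |
| 12 | Ж | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 13 | З | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 14 | И | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 7 |
| 15 | Ї | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 |
| 16 | Й | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 5 |
| 17 | К | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 18 | Ќ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 19 | Л | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 20 | Љ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 2 |
| 21 | М | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 22 | Н | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 23 | Њ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 2 |
| 24 | О | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 25 | П | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 26 | Р | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 27 | С | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 28 | Т | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 29 | Ћ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 |
| 30 | У | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 31 | Ў | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 |
| 32 | Ф | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 33 | Х | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 34 | Ц | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 35 | Ч | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 36 | Џ | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 2 |
| 37 | Ш | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 8 |
| 38 | Щ | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 5 |
| 39 | Ъ | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 4 |
| 40 | Ы | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 4 |
| 41 | Ь | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 6 |
| 42 | Ѣ | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
| 43 | Э | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 4 |
| 44 | Ю | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 5 |
| 45 | Я | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 5 |
| 46 | I | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 4 |
| 47 | J | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 2 |
| 48 | S | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 |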
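The ranking step can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used to build the table: `RATING` holds a handful of rows copied from the rating above, and `LATIN_LOOKALIKES`, `rank_letters` and the `limit` parameter are hypothetical names introduced here. Letters that look like Latin ones are dropped, the rest are sorted by descending points, and the tail can be cut to fit a smaller font.

```python
# A few rows from the rating table: letter -> membership flags for
# (Early Cyrillic, Modern Russian, Russian before 1917, Belarusian,
#  Ukrainian, Serbian, Bulgarian, Macedonian).
RATING = {
    "\u0410": (1, 1, 1, 1, 1, 1, 1, 1),  # Cyrillic А, looks like Latin A
    "Б": (1, 1, 1, 1, 1, 1, 1, 1),
    "Ґ": (0, 0, 0, 0, 1, 0, 0, 0),
    "Ё": (0, 1, 0, 1, 0, 0, 0, 0),
    "И": (1, 1, 1, 0, 1, 1, 1, 1),
    "Й": (0, 1, 1, 1, 1, 0, 1, 0),
    "Ь": (1, 1, 1, 1, 1, 0, 1, 0),
}

# Assumption: the set of letters treated as Latin look-alikes.
LATIN_LOOKALIKES = {"\u0410"}


def rank_letters(rating, lookalikes, limit=None):
    """Return non-look-alike letters ordered by descending points.

    Points are the number of alphabets that use the letter. The sort is
    stable, so letters with equal points keep their table order.
    """
    candidates = [ch for ch in rating if ch not in lookalikes]
    ranked = sorted(candidates, key=lambda ch: -sum(rating[ch]))
    # Cutting the tail drops the letters used by the fewest alphabets.
    return ranked if limit is None else ranked[:limit]


print(rank_letters(RATING, LATIN_LOOKALIKES))
print(rank_letters(RATING, LATIN_LOOKALIKES, limit=4))
```

With the sample rows, the full ranking is Б, И, Ь, Й, Ё, Ґ, and a limit of 4 keeps only Б, И, Ь, Й.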

After adding the Macedonian alphabet, the encoding variant looks like this:

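| dec | hex | ASCII | КОЭ-13 (CUE-13, Cyrillic United Economical) |
|---|---|---|---|
| 0 | 0 | NUL | |
| 1 | 1 | SOH | |
| 2 | 2 | STX | |
| 3 | 3 | ETX | |
| 4 | 4 | EOT | |
| 5 | 5 | ENQ | |
| 6 | 6 | ACK | |
| 7 | 7 | BEL | |
| 8 | 8 | BS | |
| 9 | 9 | TAB | |
| 10 | A | LF | |
| 11 | B | VT | |
| 12 | C | FF | |
| 13 | D | CR | |
| 14 | E | SO | |
| 15 | F | SI | |
| 16 | 10 | DLE | |
| 17 | 11 | DC1 | |
| 18 | 12 | DC2 | |
| 19 | 13 | DC3 | |
| 20 | 14 | DC4 | |
| 21 | 15 | NAK | |
| 22 | 16 | SYN | |
| 23 | 17 | ETB | |
| 24 | 18 | CAN | |
| 25 | 19 | EM | |
| 26 | 1A | SUB | |
| 27 | 1B | ESC | |
| 28 | 1C | FS | |
| 29 | 1D | GS | |
| 30 | 1E | RS | |
| 31 | 1F | US | |
| 32 | 20 | SPACE | |
| 33 | 21 | ! | |
| 34 | 22 | " | |
| 35 | 23 | # | |
| 36 | 24 | $ | |
| 37 | 25 | % | |
| 38 | 26 | & | |
| 39 | 27 | ' | |
| 40 | 28 | ( | |
| 41 | 29 | ) | |
| 42 | 2A | * | |
| 43 | 2B | + | |
| 44 | 2C | , | |
| 45 | 2D | - | |
| 46 | 2E | . | |
| 47 | 2F | / | |
| 48 | 30 | 0 | |
| 49 | 31 | 1 | |
| 50 | 32 | 2 | |
| 51 | 33 | 3 | |
| 52 | 34 | 4 | |
| 53 | 35 | 5 | |
| 54 | 36 | 6 | |
| 55 | 37 | 7 | |
| 56 | 38 | 8 | |
| 57 | 39 | 9 | |
| 58 | 3A | : | |
| 59 | 3B | ; | |
| 60 | 3C | < | |
| 61 | 3D | = | |
| 62 | 3E | > | |
| 63 | 3F | ? | |
| 64 | 40 | @ | |
| 65 | 41 | A | |
| 66 | 42 | B | |
| 67 | 43 | C | |
| 68 | 44 | D | |
| 69 | 45 | E | |
| 70 | 46 | F | |
| 71 | 47 | G | |
| 72 | 48 | H | |
| 73 | 49 | I | |
| 74 | 4A | J | |
| 75 | 4B | K | |
| 76 | 4C | L | |
| 77 | 4D | M | |
| 78 | 4E | N | |
| 79 | 4F | O | |
| 80 | 50 | P | |
| 81 | 51 | Q | |
| 82 | 52 | R | |
| 83 | 53 | S | |
| 84 | 54 | T | |
| 85 | 55 | U | |
| 86 | 56 | V | |
| 87 | 57 | W | |
| 88 | 58 | X | |
| 89 | 59 | Y | |
| 90 | 5A | Z | |
| 91 | 5B | [ | |
| 92 | 5C | \ | |
| 93 | 5D | ] | |
| 94 | 5E | ^ | |
| 95 | 5F | _ | |
| 96 | 60 | ` | |
| 97 | 61 | a | |
| 98 | 62 | b | |
| 99 | 63 | c | |
| 100 | 64 | d | |
| 101 | 65 | e | |
| 102 | 66 | f | |
| 103 | 67 | g | |
| 104 | 68 | h | |
| 105 | 69 | i | |
| 106 | 6A | j | |
| 107 | 6B | k | |
| 108 | 6C | l | |
| 109 | 6D | m | |
| 110 | 6E | n | |
| 111 | 6F | o | |
| 112 | 70 | p | |
| 113 | 71 | q | |
| 114 | 72 | r | |
| 115 | 73 | s | |
| 116 | 74 | t | |
| 117 | 75 | u | |
| 118 | 76 | v | |
| 119 | 77 | w | |
| 120 | 78 | x | |
| 121 | 79 | y | |
| 122 | 7A | z | |
| 123 | 7B | { | |
| 124 | 7C | \| | |
| 125 | 7D | } | |
| 126 | 7E | ~ | |
| 127 | 7F | DEL | |
| 128 | 80 | Ç | Б |
| 129 | 81 | ű | Г |
| 130 | 82 | é | Д |
| 131 | 83 | â | Ж |
| 132 | 84 | ä | З |
| 133 | 85 | à | Л |
| 134 | 86 | å | П |
| 135 | 87 | | У |
| 136 | 88 | | Ф |
| 137 | 89 | | Ц |
| 138 | 8A | è | Ч |
| 139 | 8B | ї | Ш |
| 140 | 8C | î | И |
| 141 | 8D | ì | Ь |
| 142 | 8E | | Й |
| 143 | 8F | | Щ |
| 144 | 90 | | Ю |
| 145 | 91 | | Я |
| 146 | 92 | | Ъ |
| 147 | 93 | | Ы |
| 148 | 94 | | Э |
| 149 | 95 | | Є |
| 150 | 96 | | Ё |
| 151 | 97 | | Ї |
| 152 | 98 | | Љ |
| 153 | 99 | | Њ |
| 154 | 9A | | Џ |
| 155 | 9B | | Ѣ |
| 156 | 9C | | Ґ |
| 157 | 9D | | Ѓ |
| 158 | 9E | | Ђ |
| 159 | 9F | | Ќ |
| 160 | A0 | | Ћ |
| 161 | A1 | | Ў |
| 162 | A2 | | б |
| 163 | A3 | | в |
| 164 | A4 | | г |
| 165 | A5 | | д |
| 166 | A6 | | ж |
| 167 | A7 | | з |
| 168 | A8 | | л |
| 169 | A9 | | п |
| 170 | AA | | т |
| 171 | AB | | ф |
| 172 | AC | | ц |
| 173 | AD | | ч |
| 174 | AE | | ш |
| 175 | AF | | и |
| 176 | B0 | | ь |
| 177 | B1 | | й |
| 178 | B2 | | щ |
| 179 | B3 | | ю |
| 180 | B4 | | я |
| 181 | B5 | | ъ |
| 182 | B6 | | ы |
| 183 | B7 | | э |
| 184 | B8 | | є |
| 185 | B9 | | ё |
| 186 | BA | | ї |
| 187 | BB | | љ |
| 188 | BC | | њ |
| 189 | BD | | џ |
| 190 | BE | | ѣ |
| 191 | BF | | ґ |
| 192 | C0 | | ѓ |
| 193 | C1 | | ђ |
| 194 | C2 | | ќ |
| 195 | C3 | | ћ |
| 196 | C4 | | ў |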
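As a rough illustration of how text could be converted to this encoding on the host side, here is a Python sketch. The `UPPER` and `LOWER` strings follow the code table above (0x80 to 0xA1 and 0xA2 to 0xC4). Everything else is an assumption made for the example: the function name `encode_koe13`, and in particular the `LOOKALIKES` mapping of look-alike Cyrillic letters to Latin codes, which the table implies but does not spell out. Lowercase к, м and н have no entry in the table and are left unmapped here.

```python
import unicodedata

# Upper half of the table: capitals start at 0x80, lowercase at 0xA2.
UPPER = "БГДЖЗЛПУФЦЧШИЬЙЩЮЯЪЫЭЄЁЇЉЊЏѢҐЃЂЌЋЎ"
LOWER = "бвгджзлптфцчшиьйщюяъыэєёїљњџѣґѓђќћў"

TABLE = {ch: 0x80 + i for i, ch in enumerate(UPPER)}
TABLE.update({ch: 0xA2 + i for i, ch in enumerate(LOWER)})

# Assumption: Cyrillic letters that reuse the code of a Latin look-alike.
LOOKALIKES = {
    "\u0410": "A", "\u0412": "B", "\u0415": "E", "\u041a": "K",
    "\u041c": "M", "\u041d": "H", "\u041e": "O", "\u0420": "P",
    "\u0421": "C", "\u0422": "T", "\u0425": "X",
    "\u0406": "I", "\u0408": "J", "\u0405": "S",
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0443": "y", "\u0445": "x",
    "\u0456": "i", "\u0458": "j", "\u0455": "s",
}


def encode_koe13(text):
    """Encode a Unicode string into one byte per character."""
    out = bytearray()
    # NFC makes sure letters such as Й arrive as single code points.
    for ch in unicodedata.normalize("NFC", text):
        if ord(ch) < 0x80:
            out.append(ord(ch))              # plain ASCII is unchanged
        elif ch in TABLE:
            out.append(TABLE[ch])            # letter from the upper half
        elif ch in LOOKALIKES:
            out.append(ord(LOOKALIKES[ch]))  # shares a Latin glyph
        else:
            raise ValueError(f"no code for {ch!r}")
    return bytes(out)


print(encode_koe13("ПРИВЕТ").hex())
```

For the word ПРИВЕТ this prints `86508c424554`: П and И come from the upper half of the table, while Р, В, Е and Т are sent as the Latin letters P, B, E and T.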
