This charset is available in recode under the name ASCII. In fact, its true name is ANSI_X3.4-1968 as per RFC 1345, accepted aliases being ANSI_X3.4-1986, ASCII, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, cp367, iso-ir-6 and us. The shortest way of specifying it in recode is us.
This documentation used to include ASCII tables. They have been removed since recode can now recreate these (and a lot of others) easily:
    recode -lf us   for commented ASCII
    recode -ld us   for concise decimal table
    recode -lo us   for concise octal table
    recode -lh us   for concise hexadecimal table
This charset is available in recode under the name Latin-1. In fact, its true name is ISO_8859-1:1987 as per RFC 1345, accepted aliases being CP819, IBM819, ISO-8859-1, ISO_8859-1, iso-ir-100, l1 and Latin-1. The shortest way of specifying it in recode is l1.
This charset corresponds to the ISO Latin Alphabet 1. It is an eight-bit code which coincides with ASCII for the lower half.
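The claim that Latin-1 coincides with ASCII for the lower half can be checked with a short Python sketch. It relies on Python's built-in `ascii` and `latin-1` codecs rather than on recode itself, and the helper names `lower_half_matches` and `decode_latin1_byte` are hypothetical, introduced only for this illustration.

```python
def lower_half_matches():
    """Return True if bytes 0..127 decode identically as ASCII and Latin-1."""
    data = bytes(range(128))
    return data.decode("ascii") == data.decode("latin-1")


def decode_latin1_byte(value):
    """Decode a single byte value (0..255) as a Latin-1 character."""
    return bytes([value]).decode("latin-1")


if __name__ == "__main__":
    print(lower_half_matches())
    # Octal 351 lies in the upper half, so only Latin-1 assigns it a character.
    print(repr(decode_latin1_byte(0o351)))
```

Bytes in the upper half, such as octal 351, have no meaning in seven-bit ASCII, which is why Python's `ascii` codec refuses to decode them.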
This documentation used to include Latin-1 tables. They have been removed since recode can now recreate these (and a lot of others) easily:
    recode -lf l1   for commented ISO Latin-1
    recode -ld l1   for concise decimal table
    recode -lo l1   for concise octal table
    recode -lh l1   for concise hexadecimal table
The following is from `lasko@video.dec.com' (Tim Lasko), with no date.
ISO Latin-1, or more completely ISO Latin Alphabet No 1, is now an international standard as of February 1987 (IS 8859, Part 1). For those American USEnet'rs that care, the 8-bit ASCII standard, which is essentially the same code, is going through the final administrative processes prior to publication.
ISO Latin-1 (IS 8859/1) is actually one of an entire family of eight-bit one-byte character sets, all having ASCII on the left hand side, and with varying repertoires on the right hand side:
This charset is available in recode under the name ASCII-BS, with BS as an acceptable alias.
The file is straight ASCII, seven bits only. According to the definition of ASCII, diacritics are applied by a sequence of three characters: the letter, one BS, the diacritic mark. We deviate slightly from this by exchanging the diacritic mark and the letter so that, on a screen device, the diacritic will disappear and leave the letter alone. At recognition time, both orders are acceptable.
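The three-character convention can be sketched in Python as follows. This is only an illustration of the idea, not recode's implementation: the table `TABLE` and the functions `to_ascii_bs` and `from_ascii_bs` are hypothetical, and only the entry for e with an acute accent (apostrophe, BS, e) is backed by the od dump in this section; the other entries are assumptions.

```python
BS = "\b"

# Hypothetical table: Latin-1 character -> (diacritic mark, base letter).
# Only the e-acute entry is confirmed by this section; the rest are assumed.
TABLE = {
    "\u00e9": ("'", "e"),  # e with acute accent
    "\u00e8": ("`", "e"),  # e with grave accent (assumed mark)
    "\u00ea": ("^", "e"),  # e with circumflex (assumed mark)
}


def to_ascii_bs(text):
    """Write each accented letter as mark, BS, letter (the deviating order)."""
    out = []
    for ch in text:
        if ch in TABLE:
            mark, letter = TABLE[ch]
            out.append(mark + BS + letter)
        else:
            out.append(ch)
    return "".join(out)


def from_ascii_bs(text):
    """Recognize both orders: mark BS letter and letter BS mark."""
    reverse = {}
    for ch, (mark, letter) in TABLE.items():
        reverse[mark + BS + letter] = ch
        reverse[letter + BS + mark] = ch
    out, i = [], 0
    while i < len(text):
        triple = text[i:i + 3]
        if triple in reverse:
            out.append(reverse[triple])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    encoded = to_ascii_bs("caf\u00e9")
    print([oct(b) for b in encoded.encode("ascii")])
    print(from_ascii_bs(encoded) == "caf\u00e9")
```

Writing the mark first means that a screen which honours BS overwrites the mark with the letter, so the reader still sees plain, unaccented text.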
The French quotes are coded by the sequences: < BS " or " BS < for the opening quote and > BS " or " BS > for the closing quote. This artificial convention was inherited in straight ASCII-BS from habits around Bang-Bang entry, and is not well known. But we decided to stick to it so that the ASCII-BS charset will not lose French quotes.
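As a rough sketch of how the four French quote sequences could be recognized, the following Python snippet maps each of them to the Latin-1 guillemets (U+00AB and U+00BB). The names `FRENCH_QUOTES` and `decode_french_quotes` are hypothetical, and the text does not say which of the two orders recode emits, so only decoding is shown.

```python
BS = "\b"

# The four sequences listed in the text, in both accepted orders.
FRENCH_QUOTES = {
    "<" + BS + '"': "\u00ab",  # opening quote
    '"' + BS + "<": "\u00ab",  # opening quote, other order
    ">" + BS + '"': "\u00bb",  # closing quote
    '"' + BS + ">": "\u00bb",  # closing quote, other order
}


def decode_french_quotes(text):
    """Replace every ASCII-BS French quote sequence with its guillemet."""
    for sequence, quote in FRENCH_QUOTES.items():
        text = text.replace(sequence, quote)
    return text


if __name__ == "__main__":
    sample = "<" + BS + '"' + "Bonjour" + ">" + BS + '"'
    print(decode_french_quotes(sample) == "\u00abBonjour\u00bb")
```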
The ASCII-BS charset is independent of ASCII, and different. The following examples demonstrate this, knowing in advance that `!2' is the Bang-Bang way of representing an e with an acute accent. Compare:
    % echo \!2 | recode -v bang:us | od -bc
    Bang-Bang -> ISO_8859-1:1987 -> RFC 1345 -> ANSI_X3.4-1968 (many to one)
    Simplified to: Bang-Bang -> ISO_8859-1:1987 -> ANSI_X3.4-1968 (many to one)
    0000000 351 012
            351  \n
    0000002
with:
    % echo \!2 | recode -v bang:bs | od -bc
    Bang-Bang -> ISO_8859-1:1987 -> ASCII-BS (many to many)
    0000000 047 010 145 012
              '  \b   e  \n
    0000004
In the first case, the e with an acute accent is merely transmitted by the Latin-1:ASCII mapping, which has no special recoding rule for it. In the Latin-1:ASCII-BS case, the acute accent is applied over the e with a backspace: diacriticized characters have special rules. For the ASCII-BS charset, reversibility is still possible, but there might be difficult cases.
This charset is available in recode under the name flat.
This code is ASCII expunged of all diacritics and underlines, as long as they are applied using three-character sequences, with BS in the middle. Also, though slightly unrelated, each control character is represented by a sequence of two or three graphic characters. The newline character, however, keeps its functionality and is not represented this way.
Note that charset flat is a terminal charset. We can convert to flat, but not from it.
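A minimal Python sketch may help show why flat is terminal. It strips three-character overstrike sequences and keeps only the letter. The function `flatten` and the set `MARKS` are hypothetical: the text names only diacritics and underlines, so the exact list of marks is an assumption, and the control-character representation is left out because the text does not spell it out.

```python
BS = "\b"

# Assumed set of overstrike marks (diacritics plus underline); the text
# does not enumerate them, so this list is illustrative only.
MARKS = set("'`^\"~,_")


def flatten(text):
    """Drop the mark and the BS from 'mark BS char' or 'char BS mark'."""
    out, i = [], 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BS:
            first, second = text[i], text[i + 2]
            if first in MARKS:
                out.append(second)
                i += 3
                continue
            if second in MARKS:
                out.append(first)
                i += 3
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    # An accented e and a plain e flatten to the same output.
    print(flatten("'" + BS + "e") == flatten("e"))
```

Since an accented letter and its plain counterpart both flatten to the same character, the original text cannot be recovered from the flat output, which is consistent with conversion being possible to flat but not from it.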