
Charsets from RFC 1345

In the GNU recode distribution, there is a copy of RFC 1345:

"Character Mnemonics & Character Sets", K. Simonsen, Request for Comments no. 1345, Network Working Group, June 1992.

This document is also available by anonymous ftp at `nic.ddn.mil' in directory `rfc' as file `rfc1345.txt'. This report defines many character mnemonics and character sets.

GNU recode implements most of RFC 1345, with the following exceptions:

  1. It does not recognize 16-bit charsets: GB_2312-80, JIS_C6226-1978, JIS_C6226-1983, JIS_X0212-1990 and KS_C_5601-1987.

  2. It does not recognize those charsets which combine two characters for representing a third: ANSI_X3.110-1983, ISO_6937-2-add, T.101-G2, T.61-8bit, iso-ir-90 and videotex-suppl.

  3. It interprets the charset isoir91 as NATS-DANO (alias iso-ir-9-1), not as JIS_C6229-1984-a (alias iso-ir-91). It is safer to avoid both of these alias names.

  4. It interprets the charset isoir92 as NATS-DANO-ADD (alias iso-ir-9-2), not as JIS_C6229-1984-b (alias iso-ir-92). It is safer to avoid both of these alias names.

  5. It ignores everything related to code overloading, but still correctly processes the remainder of dk-us and us-dk.
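
The clash described in items 3 and 4 can be illustrated with a short Python sketch. It assumes that charset names are compared after folding case and dropping every non-alphanumeric character; the spelling isoir91 suggests this, but this section does not state it, and the normalize function below is a hypothetical helper, not part of recode.

```python
def normalize(name: str) -> str:
    """Fold case and drop every non-alphanumeric character (assumed rule)."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

# RFC 1345 alias pairs mentioned in items 3 and 4.
pairs = [("iso-ir-9-1", "iso-ir-91"), ("iso-ir-9-2", "iso-ir-92")]

# Keep only the pairs whose two names become identical once normalized.
collisions = {
    normalize(a): (a, b) for a, b in pairs if normalize(a) == normalize(b)
}

for key, (a, b) in collisions.items():
    print(f"{a} and {b} both normalize to {key}")
```

Under that assumption, each pair collapses to a single name, so only one of the two charsets can be reached through it.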

Keld Simonsen `keld@dkuug.dk' did most of RFC 1345 himself, with some funding from Danish Standards and the Nordic standards (INSTA) project. He also did the character set design work, with substantial input from Olle Jaernefors. Keld typed in almost all of the tables; some were contributed by others. A number of people have checked the tables in various ways, and the RFC lists those who helped.

Internally, RFC 1345 associates with each character an unambiguous mnemonic of (usually) one or two characters, taken from a minimal set of 83 ISO 646 characters. The charset made up by these mnemonics is available in recode under the name RFC 1345, with . being accepted as a short alias.

Although each mnemonic is unambiguous when taken separately, strings made up by concatenating mnemonics are ambiguous and cannot be safely interpreted. So recode only allows converting to RFC 1345, never from it. However, special machinery in the program allows for converting through RFC 1345, when RFC 1345 is neither the initial nor the final charset of the conversion sequence.
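
The ambiguity is easy to see with a three-entry excerpt of the mnemonic table. The Python sketch below is illustrative only: the table is a hand-picked subset, and parses is a hypothetical helper that enumerates every way of splitting a string into known one- or two-character mnemonics.

```python
# Tiny hand-picked subset of RFC 1345 mnemonics (not the full table).
MNEMONICS = {
    "a": "LATIN SMALL LETTER A",
    ":": "COLON",
    "a:": "LATIN SMALL LETTER A WITH DIAERESIS",
}

def parses(text: str) -> list[list[str]]:
    """Return every way to split `text` into known mnemonics."""
    if not text:
        return [[]]  # one way to split the empty string: no mnemonics
    results = []
    for size in (1, 2):
        head = text[:size]
        # Skip a two-character attempt when only one character remains.
        if len(head) == size and head in MNEMONICS:
            for rest in parses(text[size:]):
                results.append([head] + rest)
    return results

for split in parses("a:"):
    print(" + ".join(MNEMONICS[m] for m in split))
```

The string a: yields two readings, the letter a followed by a colon or a single a with diaeresis, so a decoder has no way to choose between them.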

The main purpose of recoding directly to . is to let the user examine foreign charsets. Little can be done mechanically with the result. For increased readability, as a matter of convenience, SP is left as a single space and LF becomes a newline.
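
As a rough illustration of that readability rule, the following Python sketch renders a few ISO 8859-1 bytes as mnemonics. It is not recode's implementation: the table is a small hypothetical excerpt, to_mnemonics is an invented name, and the sketch simply concatenates mnemonics, which may differ from what recode actually prints.

```python
# Hypothetical excerpt of an ISO 8859-1 code -> RFC 1345 mnemonic table.
LATIN1_TO_MNEMONIC = {
    0x0A: "LF",
    0x20: "SP",
    0x61: "a",
    0x63: "c",
    0x66: "f",
    0xE9: "e'",  # LATIN SMALL LETTER E WITH ACUTE
}

def to_mnemonics(data: bytes) -> str:
    """Render bytes as mnemonics, keeping SP and LF readable."""
    out = []
    for byte in data:
        mnemonic = LATIN1_TO_MNEMONIC[byte]  # KeyError outside the excerpt
        if mnemonic == "SP":
            out.append(" ")   # SP is left as a single space
        elif mnemonic == "LF":
            out.append("\n")  # LF becomes a newline
        else:
            out.append(mnemonic)
    return "".join(out)

print(to_mnemonics(b"caf\xe9 a\n"), end="")
```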

ANSI_X3.4-1968
ANSI_X3.4-1986, ASCII, IBM367, ISO646-US, ISO_646.irv:1991, US-ASCII, cp367, iso-ir-6 and us are aliases for this charset. source: ECMA registry

ASMO_449
ISO_9036, arabic7 and iso-ir-89 are aliases for this charset. source: ECMA registry

BS_4730
ISO646-GB, gb, iso-ir-4 and uk are aliases for this charset. source: ECMA registry

BS_viewdata
iso-ir-47 is an alias for this charset. source: ECMA registry

CSA_Z243.4-1985-1
ISO646-CA, ca, csa7-1 and iso-ir-121 are aliases for this charset. source: ECMA registry

CSA_Z243.4-1985-2
ISO646-CA2, csa7-2 and iso-ir-122 are aliases for this charset. source: ECMA registry

CSA_Z243.4-1985-gr
iso-ir-123 is an alias for this charset. source: ECMA registry

CSN_369103
iso-ir-139 is an alias for this charset. source: ECMA registry

DEC-MCS
dec is an alias for this charset. source: VAX/VMS User's Manual, Order Number: AI-Y517A-TE, April 1986.

DIN_66003
ISO646-DE, de and iso-ir-21 are aliases for this charset. source: ECMA registry

DS_2089
DS2089, ISO646-DK and dk are aliases for this charset. source: Danish Standard, DS 2089, February 1974

EBCDIC-AT-DE
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-AT-DE-A
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-CA-FR
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-DK-NO
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-DK-NO-A
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-ES
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-ES-A
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-ES-S
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-FI-SE
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-FI-SE-A
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-FR
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-IT
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-PT
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-UK
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

EBCDIC-US
source: IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987

ECMA-cyrillic
iso-ir-111 is an alias for this charset. source: ECMA registry

ES
ISO646-ES and iso-ir-17 are aliases for this charset. source: ECMA registry

ES2
ISO646-ES2 and iso-ir-85 are aliases for this charset. source: ECMA registry

GB_1988-80
ISO646-CN, cn and iso-ir-57 are aliases for this charset. source: ECMA registry

GOST_19768-74
ST_SEV_358-88 and iso-ir-153 are aliases for this charset. source: ECMA registry

IBM037
cp037, ebcdic-cp-ca, ebcdic-cp-nl, ebcdic-cp-us and ebcdic-cp-wt are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM038
EBCDIC-INT and cp038 are aliases for this charset. source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990

IBM1026
CP1026 is an alias for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM273
CP273 is an alias for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM274
CP274 and EBCDIC-BE are aliases for this charset. source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990

IBM275
EBCDIC-BR and cp275 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM277
EBCDIC-CP-DK and EBCDIC-CP-NO are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM278
CP278, ebcdic-cp-fi and ebcdic-cp-se are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM280
CP280 and ebcdic-cp-it are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM281
EBCDIC-JP-E and cp281 are aliases for this charset. source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990

IBM284
CP284 and ebcdic-cp-es are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM285
CP285 and ebcdic-cp-gb are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM290
EBCDIC-JP-kana and cp290 are aliases for this charset. source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990

IBM297
cp297 and ebcdic-cp-fr are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM420
cp420 and ebcdic-cp-ar1 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990; IBM NLS RM p 11-11

IBM423
cp423 and ebcdic-cp-gr are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM424
cp424 and ebcdic-cp-he are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM437
437 and cp437 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM500
CP500, ebcdic-cp-be and ebcdic-cp-ch are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM850
850 and cp850 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM851
851 and cp851 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM852
852 and cp852 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM855
855 and cp855 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM857
857 and cp857 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM860
860 and cp860 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM861
861, cp-is and cp861 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM862
862 and cp862 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM863
863 and cp863 are aliases for this charset. source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991

IBM864
cp864 is an alias for this charset. source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991

IBM865
865 and cp865 are aliases for this charset. source: IBM DOS 3.3 Ref (Abridged), 94X9575 (Feb 1987)

IBM868
CP868 and cp-ar are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM869
869, cp-gr and cp869 are aliases for this charset. source: IBM Keyboard layouts and code pages, PN 07G4586 June 1991

IBM870
CP870, ebcdic-cp-roece and ebcdic-cp-yu are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM871
CP871 and ebcdic-cp-is are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM880
EBCDIC-Cyrillic and cp880 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM891
cp891 is an alias for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM903
cp903 is an alias for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM904
904 and cp904 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IBM905
CP905 and ebcdic-cp-tr are aliases for this charset. source: IBM 3174 Character Set Ref, GA27-3831-02, March 1990

IBM918
CP918 and ebcdic-cp-ar2 are aliases for this charset. source: IBM NLS RM Vol2 SE09-8002-01, March 1990

IEC_P27-1
iso-ir-143 is an alias for this charset. source: ECMA registry

INIS
iso-ir-49 is an alias for this charset. source: ECMA registry

INIS-8
iso-ir-50 is an alias for this charset. source: ECMA registry

INIS-cyrillic
iso-ir-51 is an alias for this charset. source: ECMA registry

INVARIANT

ISO_10367-box
iso-ir-155 is an alias for this charset. source: ECMA registry

ISO_2033-1983
e13b and iso-ir-98 are aliases for this charset. source: ECMA registry

ISO_5427
iso-ir-37 is an alias for this charset. source: ECMA registry

ISO_5427:1981
iso-ir-54 is an alias for this charset. source: ECMA registry

ISO_5428:1980
iso-ir-55 is an alias for this charset. source: ECMA registry

ISO_646.basic:1983
ref is an alias for this charset. source: ECMA registry

ISO_646.irv:1983
irv and iso-ir-2 are aliases for this charset. source: ECMA registry

ISO_6937-2-25
iso-ir-152 is an alias for this charset. source: ECMA registry

ISO_8859-1:1987
CP819, IBM819, ISO-8859-1, ISO_8859-1, iso-ir-100, l1 and latin1 are aliases for this charset. source: ECMA registry

ISO_8859-2:1987
ISO-8859-2, ISO_8859-2, iso-ir-101, l2 and latin2 are aliases for this charset. source: ECMA registry

ISO_8859-3:1988
ISO-8859-3, ISO_8859-3, iso-ir-109, l3 and latin3 are aliases for this charset. source: ECMA registry

ISO_8859-4:1988
ISO-8859-4, ISO_8859-4, iso-ir-110, l4 and latin4 are aliases for this charset. source: ECMA registry

ISO_8859-5:1988
ISO-8859-5, ISO_8859-5, cyrillic and iso-ir-144 are aliases for this charset. source: ECMA registry

ISO_8859-6:1987
ASMO-708, ECMA-114, ISO-8859-6, ISO_8859-6, arabic and iso-ir-127 are aliases for this charset. source: ECMA registry

ISO_8859-7:1987
ECMA-118, ELOT_928, ISO-8859-7, ISO_8859-7, greek, greek8 and iso-ir-126 are aliases for this charset. source: ECMA registry

ISO_8859-8:1988
ISO-8859-8, ISO_8859-8, hebrew and iso-ir-138 are aliases for this charset. source: ECMA registry

ISO_8859-9:1989
ISO-8859-9, ISO_8859-9, iso-ir-148, l5 and latin5 are aliases for this charset. source: ECMA registry

ISO_8859-supp
iso-ir-154 and latin1-2-5 are aliases for this charset. source: ECMA registry

IT
ISO646-IT and iso-ir-15 are aliases for this charset. source: ECMA registry

JIS_C6220-1969-jp
JIS_C6220-1969, iso-ir-13, katakana and x0201-7 are aliases for this charset. source: ECMA registry

JIS_C6220-1969-ro
ISO646-JP, iso-ir-14 and jp are aliases for this charset. source: ECMA registry

JIS_C6229-1984-a
jp-ocr-a is an alias for this charset. source: ECMA registry

JIS_C6229-1984-b
ISO646-JP-OCR-B and jp-ocr-b are aliases for this charset. source: ECMA registry

JIS_C6229-1984-b-add
iso-ir-93 and jp-ocr-b-add are aliases for this charset. source: ECMA registry

JIS_C6229-1984-hand
iso-ir-94 and jp-ocr-hand are aliases for this charset. source: ECMA registry

JIS_C6229-1984-hand-add
iso-ir-95 and jp-ocr-hand-add are aliases for this charset. source: ECMA registry

JIS_C6229-1984-kana
iso-ir-96 is an alias for this charset. source: ECMA registry

JIS_X0201
X0201 is an alias for this charset.

JUS_I.B1.002
ISO646-YU, iso-ir-141, js and yu are aliases for this charset. source: ECMA registry

JUS_I.B1.003-mac
iso-ir-147 and macedonian are aliases for this charset. source: ECMA registry

JUS_I.B1.003-serb
iso-ir-146 and serbian are aliases for this charset. source: ECMA registry

KSC5636
ISO646-KR is an alias for this charset.

Latin-greek-1
iso-ir-27 is an alias for this charset. source: ECMA registry

MSZ_7795.3
ISO646-HU, hu and iso-ir-86 are aliases for this charset. source: ECMA registry

NATS-DANO
iso-ir-9-1 is an alias for this charset. source: ECMA registry

NATS-DANO-ADD
iso-ir-9-2 is an alias for this charset. source: ECMA registry

NATS-SEFI
iso-ir-8-1 is an alias for this charset. source: ECMA registry

NATS-SEFI-ADD
iso-ir-8-2 is an alias for this charset. source: ECMA registry

NC_NC00-10:81
ISO646-CU, cuba and iso-ir-151 are aliases for this charset. source: ECMA registry

NF_Z_62-010
ISO646-FR, fr and iso-ir-69 are aliases for this charset. source: ECMA registry

NF_Z_62-010_(1973)
ISO646-FR1 and iso-ir-25 are aliases for this charset. source: ECMA registry

NS_4551-1
ISO646-NO, iso-ir-60 and no are aliases for this charset. source: ECMA registry

NS_4551-2
ISO646-NO2, iso-ir-61 and no2 are aliases for this charset. source: ECMA registry

PT
ISO646-PT and iso-ir-16 are aliases for this charset. source: ECMA registry

PT2
ISO646-PT2 and iso-ir-84 are aliases for this charset. source: ECMA registry

SEN_850200_B
FI, ISO646-FI, ISO646-SE, iso-ir-10 and se are aliases for this charset. source: ECMA registry

SEN_850200_C
ISO646-SE2, iso-ir-11 and se2 are aliases for this charset. source: ECMA registry

T.61-7bit
iso-ir-102 is an alias for this charset. source: ECMA registry

dk-us

greek-ccitt
iso-ir-150 is an alias for this charset. source: ECMA registry

greek7
iso-ir-88 is an alias for this charset. source: ECMA registry

greek7-old
iso-ir-18 is an alias for this charset. source: ECMA registry

hp-roman8
r8 and roman8 are aliases for this charset. source: LaserJet IIP Printer User's Manual, HP part no 33471-90901, Hewlett-Packard, June 1989.

latin-greek
iso-ir-19 is an alias for this charset. source: ECMA registry

latin-lap
iso-ir-158 and lap are aliases for this charset. source: ECMA registry

latin6
iso-ir-157 and l6 are aliases for this charset. source: ECMA registry

macintosh
mac is an alias for this charset. source: The Unicode Standard ver1.0, ISBN 0-201-56788-1, Oct 1991

us-dk
for compatibility with ASCII
