
12. How to make other programs work with non-ASCII chars

In the bad old days this used to be quite a hassle. Every program had to be convinced individually to leave your bits alone. Things are still not entirely easy, but many GNU utilities have recently learned to react to LC_CTYPE=iso_8859_1 or LC_CTYPE=iso-8859-1. Try this first, and if it doesn't help, look at the hints below.
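A quick way to try this is to set the variable and then push a single 8-bit character through the program in question. The following is only a sketch for a Bourne-style shell: cat is a placeholder for the filter being tested, and the variable name result is made up for the illustration.

```shell
# Set the locale variable mentioned above for this shell and its children.
# (A shell that does not know this locale name may print a warning.)
LC_CTYPE=iso_8859_1
export LC_CTYPE

# Send e-acute (octal 351 in ISO 8859-1) plus a newline through a filter
# and dump the resulting bytes in octal. "cat" is only a placeholder;
# substitute the program under test.
result=$(printf '\351\n' | cat | od -An -b | tr -d ' \n')
echo "$result"
```

If the eighth bit survives, the dump shows 351 followed by 012 (the newline). A program that strips the high bit would turn 351 into 151, the letter i.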

First of all, the eighth bit should survive the kernel input processing, so make sure that stty cs8 -istrip -parenb is in effect.
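As a sketch, the stty settings can be applied from a shell startup file. The guard and the variable tty_status are illustrative additions, not part of the original advice; the guard is there because stty acts on standard input and fails when that is not a terminal.

```shell
# stty acts on standard input, so only run it when that is a terminal.
tty_status="unchanged"
if [ -t 0 ]; then
    # cs8: 8-bit characters; -istrip: do not strip the eighth bit on input;
    # -parenb: no parity bit
    stty cs8 -istrip -parenb && tty_status="8-bit"
fi
echo "terminal settings: $tty_status"
```

Running stty -a afterwards lists the current settings, so you can check that cs8, -istrip and -parenb are shown.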

A. For emacs, put the lines

        (standard-display-european t)
        (set-input-mode nil nil 1)
        (require 'iso-syntax)
and perhaps also
        (load-file "iso-insert.el")
        (define-key global-map [?\C-.] 8859-1-map)
into your $HOME/.emacs. (The last of these lines will not work under xterm if you use emacs -nw, but in that case you can put
        XTerm*VT100.Translations:       #override\n\
        Ctrl <KeyPress> . : string("\0308")
in your .Xresources.)

B. For less, put LESSCHARSET=latin1 in the environment.

C. For ls, give the option -N. (Probably you want to make an alias.)

D. For bash (version 1.13.*), put

        set meta-flag on
        set convert-meta off
and, according to the Danish HOWTO,
        set output-meta on
into your $HOME/.inputrc.

E. For tcsh, use

        setenv LANG     US_en
        setenv LC_CTYPE iso_8859_1
If your system has NLS, the corresponding routines are used. Otherwise tcsh will assume iso_8859_1, regardless of the values given to LANG and LC_CTYPE. See the section NATIVE LANGUAGE SYSTEM in tcsh(1). (The Danish HOWTO says: setenv LC_CTYPE ISO-8859-1; stty pass8)

F. For flex, give the option -8 if the parser it generates must be able to handle 8-bit input. (Of course it must.)

G. For elm, set displaycharset to ISO-8859-1. (Danish HOWTO: LANG=C and LC_CTYPE=ISO-8859-1)

H. For programs using curses (such as lynx), David Sibley reports: The regular curses package uses the high-order bit for reverse video mode (see flag _STANDOUT defined in /usr/include/curses.h). However, ncurses seems to be 8-bit clean and does display ISO 8859-1 (Latin-1) correctly.

I. For programs using groff (such as man), make sure to use -Tlatin1 instead of -Tascii. Old versions of man also use col, in which case the next point applies as well.

J. For col, make sure 1) that it has been fixed to call setlocale(LC_CTYPE,""); and 2) that LC_CTYPE=ISO-8859-1 is in the environment.

K. For rlogin, use option -8.

L. For joe, sunsite.unc.edu:/pub/Linux/apps/editors/joe-1.0.8-linux.tar.gz is said to work after editing the configuration file. Someone else said: for joe, put the -asis option in /usr/lib/joerc, starting in the first column.

M. For LaTeX: \documentstyle[isolatin]{article}. For LaTeX2e: \documentclass{article}\usepackage{isolatin} where isolatin.sty is available from ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit.
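Hints B and C above lend themselves to being set once in a shell startup file. A minimal sketch, assuming a Bourne-style shell such as bash or sh; which startup file to use (for instance $HOME/.profile) is an assumption and depends on your setup.

```shell
# Hint B: tell less that the terminal displays ISO 8859-1 (Latin-1).
LESSCHARSET=latin1
export LESSCHARSET

# Hint C: always pass -N to ls.
alias ls='ls -N'
```

Typing alias ls at the prompt afterwards shows the definition currently in effect.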

A nice discussion on the topic of ISO-8859-1 and how to manage 8-bit characters is contained in the file grasp.insa-lyon.fr:/pub/faq/fr/accents (in French). Another fine discussion (in English) can be found in rtfm.mit.edu:pub/usenet-by-group/comp.answers/character-sets/iso-8859-1-faq. And another(?), in ftp.vlsivie.tuwien.ac.at:/pub/8bit/FAQ-ISO-8859-1.

