
Installing gawk

This chapter provides instructions for installing gawk on the various platforms that are supported by the developers. The primary developers support Unix (and one day, GNU), while the other ports were contributed. The file `ACKNOWLEDGMENT' in the gawk distribution lists the electronic mail addresses of the people who did the respective ports.

The gawk Distribution

This section first describes how to get and extract the gawk distribution, and then discusses what is in the various files and subdirectories.

Getting the gawk Distribution

gawk is distributed as a tar file compressed with the GNU Zip program, gzip. You can get it via anonymous ftp to the Internet host prep.ai.mit.edu. Like all GNU software, it will be archived at other well known systems, from which it will be possible to use some sort of anonymous uucp to obtain the distribution as well. You can also order gawk on tape or CD-ROM directly from the Free Software Foundation. (The address is on the copyright page.) Doing so directly contributes to the support of the foundation and to the production of more free software.

Once you have the distribution (for example, `gawk-2.15.0.tar.z'), first use gzip to expand the file, and then use tar to extract it. You can use the following pipeline to produce the gawk distribution:

# Under System V, add 'o' to the tar flags
gzip -d -c gawk-2.15.0.tar.z | tar -xvpf -

This will create a directory named `gawk-2.15' in the current directory.

The distribution file name is of the form `gawk-2.15.n.tar.z'. The n represents a patchlevel, meaning that minor bugs have been fixed in the major release. The current patchlevel is 0, but when retrieving distributions, you should get the version with the highest patchlevel.
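When several patchlevels are available, choosing the newest one can be automated. The following is a minimal sketch, assuming file names of the form described above; the function name `highest_patchlevel' is hypothetical and is not part of the gawk distribution.

```shell
# highest_patchlevel FILE...
# Print the name with the largest patchlevel, assuming every name has
# the form gawk-2.15.N.tar.z (N is the third dot-separated field).
highest_patchlevel() {
    printf '%s\n' "$@" | sort -t. -k3,3n | tail -n 1
}
```

For example, `highest_patchlevel gawk-2.15.*.tar.z' would print the one file to pass to the gzip and tar pipeline shown earlier.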

If you are not on a Unix system, you will need to make other arrangements for getting and extracting the gawk distribution. You should consult a local expert.

Contents of the gawk Distribution

The gawk distribution contains a number of C source files, documentation files, subdirectories and files related to the configuration process (see section Compiling and Installing gawk on Unix), and several subdirectories related to different, non-Unix, operating systems.

various `.c', `.y', and `.h' files

The C and YACC source files are the actual gawk source code.

Descriptive files: `README' for gawk under Unix, and the rest for the various hardware and software combinations.

A list of systems to which gawk has been ported, and which have successfully run the test suite.

A list of the people who contributed major parts of the code or documentation.

A list of changes to gawk since the last release or patch.

The GNU General Public License.

A brief list of features and/or changes being contemplated for future releases, with some indication of the time frame for the feature, based on its difficulty.

A list of those factors that limit gawk's performance. Most of these depend on the hardware or operating system software, and are not limits in gawk itself.

A file describing known problems with the current release.

The troff source for a manual page describing gawk.

The texinfo source file for this manual. It should be processed with TeX to produce a printed manual, and with makeinfo to produce the Info file.

These files and subdirectories are used when configuring gawk for various Unix systems. They are explained in detail in section Compiling and Installing gawk on Unix.

`atari'

Files needed for building gawk on an Atari ST. See section Installing gawk on the Atari ST, for details.

`pc'

Files needed for building gawk under MS-DOS. See section Installing gawk on MS-DOS, for details.

`vms'

Files needed for building gawk under VMS. See section Compiling, Installing, and Running gawk on VMS, for details.

Many interesting awk programs, provided as a test suite for gawk. You can use `make test' from the top level gawk directory to run your version of gawk against the test suite. If gawk successfully passes `make test' then you can be confident of a successful port.

Compiling and Installing gawk on Unix

Often, you can compile and install gawk by typing only two commands. However, if you do not use a supported system, you may need to configure gawk for your system yourself.

Compiling gawk for a Supported Unix Version

After you have extracted the gawk distribution, cd to `gawk-2.15'. Look in the `config' subdirectory for a file that matches your hardware/software combination. In general, only the software is relevant; for example sunos41 is used for SunOS 4.1, on both Sun 3 and Sun 4 hardware.

If you find such a file, run the command:

# assume you have SunOS 4.1
./configure sunos41

This produces a `Makefile' and `config.h' tailored to your system. You may wish to edit the `Makefile' to use a different C compiler, such as gcc, the GNU C compiler, if you have it. You may also wish to change the CFLAGS variable, which controls the command line options that are passed to the C compiler (such as optimization levels, or compiling for debugging).

After you have configured `Makefile' and `config.h', type:

make

and shortly thereafter, you should have an executable version of gawk. That's all there is to it!

The Configuration Process

(This section is of interest only if you know something about using the C language and the Unix operating system.)

The source code for gawk generally attempts to adhere to industry standards wherever possible. This means that gawk uses library routines that are specified by the ANSI C standard and by the POSIX operating system interface standard. When using an ANSI C compiler, function prototypes are provided to help improve the compile-time checking.

Many older Unix systems do not support all of either the ANSI or the POSIX standards. The `missing' subdirectory in the gawk distribution contains replacement versions of those subroutines that are most likely to be missing.

The `config.h' file that is created by the configure program contains definitions that describe features of the particular operating system where you are attempting to compile gawk. For the most part, it lists which standard subroutines are not available. For example, if your system lacks the `getopt' routine, then `GETOPT_MISSING' would be defined.

`config.h' also defines constants that describe facts about your variant of Unix. For example, there may not be an `st_blksize' element in the stat structure. In this case `BLKSIZE_MISSING' would be defined.

Based on the list in `config.h' of standard subroutines that are missing, `missing.c' will do a `#include' of the appropriate file(s) from the `missing' subdirectory.

Conditionally compiled code in the other source files relies on the other definitions in the `config.h' file.
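To see at a glance which routines a generated `config.h' declares as missing, something like the following can help. It is only a sketch: it assumes that each definition sits on its own `#define' line and that the macro name ends in `_MISSING', as in the examples above. The function name `list_missing' is hypothetical.

```shell
# list_missing: read a config.h on standard input and print the name of
# every macro that is defined and whose name ends in _MISSING.
# Lines that are commented out do not start with "#define", so they
# are skipped.
list_missing() {
    awk '$1 == "#define" && $2 ~ /_MISSING$/ { print $2 }'
}
```

A typical use would be `list_missing < config.h'.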

Besides creating `config.h', configure produces a `Makefile' from `Makefile.in'. There are a number of lines in `Makefile.in' that are system or feature specific. For example, there is a line that begins with `##MAKE_ALLOCA_C##'. This is normally a comment line, since it starts with `#'. If a configuration file has `MAKE_ALLOCA_C' in it, then configure will delete the `##MAKE_ALLOCA_C##' from the beginning of the line. This will enable the rules in the `Makefile' that use a C version of `alloca'. There are several similar features that work in this fashion.
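The marker-stripping step can be pictured with a one-line sed filter. This is an illustration of the idea only, not the code used by the configure script; the function name `enable_feature' is hypothetical.

```shell
# enable_feature NAME: read a Makefile template on standard input and
# delete a leading `##NAME##' marker, which turns the commented-out
# line into a live one. All other lines pass through unchanged.
enable_feature() {
    sed "s/^##${1}##//"
}
```

For instance, `enable_feature MAKE_ALLOCA_C < Makefile.in' would uncomment only the lines that carry that marker and leave lines with other markers as comments.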

Configuring gawk for a New System

(This section is of interest only if you know something about using the C language and the Unix operating system, and if you have to install gawk on a system that is not supported by the gawk distribution. If you are a C or Unix novice, get help from a local expert.)

If you need to configure gawk for a Unix system that is not supported in the distribution, first see section The Configuration Process. Then, copy `config.in' to `config.h', and copy `Makefile.in' to `Makefile'.

Next, edit both files. Both files are liberally commented, and the necessary changes should be straightforward.

While editing `config.h', you need to determine what library routines you do or do not have by consulting your system documentation, or by perusing your actual libraries using the ar or nm utilities. In the worst case, simply do not define any of the macros for missing subroutines. When you compile gawk, the final link-editing step will fail. The link editor will provide you with a list of unresolved external references--these are the missing subroutines. Edit `config.h' again and recompile, and you should be set.
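Checking a library for a routine with nm can be scripted. The sketch below assumes the common nm output layout, in which the last field of each line is the symbol name and the field before it is a one-letter type, with `U' marking an undefined reference. Some systems prefix C symbols with an underscore, so both spellings are accepted. The function name `defines_symbol' is hypothetical.

```shell
# defines_symbol NAME: read nm output on standard input and succeed
# (exit status 0) if NAME appears as a defined symbol, that is, with
# any type other than U.
defines_symbol() {
    awk -v sym="$1" '
        NF >= 2 && $(NF-1) != "U" && ($NF == sym || $NF == "_" sym) { found = 1 }
        END { exit !found }
    '
}
```

A possible use is `nm /lib/libc.a | defines_symbol getopt || echo "getopt is missing"', where the library path is an assumption that varies from system to system.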

Editing the `Makefile' should also be straightforward. Enable or disable the lines that begin with `##MAKE_whatever##', as appropriate. Select the correct C compiler and CFLAGS for it. Then run make.

Getting a correct configuration is likely to be an iterative process. Do not be discouraged if it takes you several tries. If you have no luck whatsoever, please report your system type, and the steps you took. Once you do have a working configuration, please send it to the maintainers so that support for your system can be added to the official release.

See section Reporting Problems and Bugs, for information on how to report problems in configuring gawk. You may also use the same mechanisms for sending in new configurations.

Compiling, Installing, and Running gawk on VMS

This section describes how to compile and install gawk under VMS.

Compiling gawk under VMS

To compile gawk under VMS, there is a DCL command procedure, `vmsbuild.com', that will issue all the necessary CC and LINK commands, and there is also a `Makefile', `descrip.mms', for use with the MMS utility. From the source directory, use either one.

Depending upon which C compiler you are using, follow one of the sets of instructions in this table:

VAX C V3.x
Use either `vmsbuild.com' or `descrip.mms' as is. These use CC/OPTIMIZE=NOLINE, which is essential for Version 3.0.

VAX C V2.x
You must have Version 2.3 or 2.4; older ones won't work. Edit either `vmsbuild.com' or `descrip.mms' according to the comments in them. For `vmsbuild.com', this just entails removing two `!' delimiters. Also edit `config.h' (which is a copy of file `[.config]vms-conf.h') and comment out or delete the two lines `#define __STDC__ 0' and `#define VAXC_BUILTINS' near the end.

Edit `vmsbuild.com' or `descrip.mms'; the changes are different from those for VAX C V2.x, but equally straightforward. No changes to `config.h' should be needed.

Edit `vmsbuild.com' or `descrip.mms' according to their comments. No changes to `config.h' should be needed.

gawk 2.15 has been tested under VAX/VMS 5.5-1 using VAX C V3.2, GNU C 1.40 and 2.3. It should work without modifications for VMS V4.6 and up.

Installing gawk on VMS

To install gawk, all you need is a "foreign" command, which is a DCL symbol whose value begins with a dollar sign.

$ GAWK :== $device:[directory]GAWK

(Substitute the actual location of gawk.exe for `device:[directory]'.) The symbol should be placed in the `login.com' of any user who wishes to run gawk, so that it will be defined every time the user logs on. Alternatively, the symbol may be placed in the system-wide `sylogin.com' procedure, which will allow all users to run gawk.

Optionally, the help entry (`vms/gawk.hlp') can be loaded into a VMS help library with the LIBRARY utility. (You may want to use a site-specific help library rather than the standard VMS library `HELPLIB'.) After loading the help text,

$ HELP GAWK

will provide information about both the gawk implementation and the awk programming language.

The logical name `AWK_LIBRARY' can designate a default location for awk program files. For the `-f' option, if the specified filename has no device or directory path information in it, gawk will look in the current directory first, then in the directory specified by the translation of `AWK_LIBRARY' if the file was not found. If after searching in both directories, the file still is not found, then gawk appends the suffix `.awk' to the filename and the file search will be re-tried. If `AWK_LIBRARY' is not defined, that portion of the file search will fail benignly.
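The search order just described can be modelled in a few lines of shell. This is only an illustration of the order of the lookups, written with Unix-style directories rather than VMS file specifications; the function name `find_awk_prog' and its arguments are hypothetical.

```shell
# find_awk_prog NAME CURDIR LIBDIR
# Try NAME in CURDIR, then in LIBDIR; if neither has it, repeat both
# lookups with ".awk" appended. Print the first match and succeed, or
# fail quietly. An empty LIBDIR stands for an undefined AWK_LIBRARY.
find_awk_prog() {
    name=$1 curdir=$2 libdir=$3
    for candidate in "$name" "$name.awk"; do
        for dir in "$curdir" "$libdir"; do
            [ -n "$dir" ] || continue
            if [ -f "$dir/$candidate" ]; then
                printf '%s\n' "$dir/$candidate"
                return 0
            fi
        done
    done
    return 1
}
```

Note that the suffix is tried only after both directories have been searched for the name as given, so an exact match in the library directory wins over a `.awk' match in the current directory.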

Running gawk on VMS

Command line parsing and quoting conventions are significantly different on VMS, so examples in this manual or from other sources often need minor changes. They are minor though, and all awk programs should run correctly.

Here are a couple of trivial tests:

$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version     ! could also be -"W version" or "-W version"

Note that upper-case and mixed-case text must be quoted.

The VMS port of gawk includes a DCL-style interface in addition to the original shell-style interface (see the help entry for details). One side-effect of dual command line parsing is that if there is only a single parameter (as in the quoted string program above), the command becomes ambiguous. To work around this, the normally optional `--' flag is required to force Unix style rather than DCL parsing. If any other dash-type options (or multiple parameters such as data files to be processed) are present, there is no ambiguity and `--' can be omitted.

The default search path when looking for awk program files specified by the `-f' option is "SYS$DISK:[],AWK_LIBRARY:". The logical name `AWKPATH' can be used to override this default. The format of `AWKPATH' is a comma-separated list of directory specifications. When defining it, the value should be quoted so that it retains a single translation, and not a multi-translation RMS searchlist.

Building and using gawk under VMS POSIX

Ignore the instructions above, although `vms/gawk.hlp' should still be made available in a help library. Make sure that the two scripts, `configure' and `mungeconf', are executable; use `chmod +x' on them if necessary. Then execute the following commands:

psx> configure vms-posix
psx> make awktab.c gawk

The first command will construct files `config.h' and `Makefile' out of templates. The second command will compile and link gawk. Due to a make bug in VMS POSIX V1.0 and V1.1, the file `awktab.c' must be given as an explicit target or it will not be built and the final link step will fail. Ignore the warning `"Could not find lib m in lib list"'; it is harmless, caused by the explicit use of `-lm' as a linker option which is not needed under VMS POSIX. Under V1.1 (but not V1.0) a problem with the yacc skeleton `/etc/yyparse.c' will cause a compiler warning for `awktab.c', followed by a linker warning about compilation warnings in the resulting object module. These warnings can be ignored.

Once built, gawk will work like any other shell utility. Unlike the normal VMS port of gawk, no special command line manipulation is needed in the VMS POSIX environment.

Installing gawk on MS-DOS

The first step is to get all the files in the gawk distribution onto your PC. Move all the files from the `pc' directory into the main directory where the other files are. Edit the file `make.bat' so that it will be an acceptable MS-DOS batch file. This means making sure that all lines are terminated with the ASCII carriage return and line feed characters.

gawk has only been compiled with version 5.1 of the Microsoft C compiler. The file `make.bat' from the `pc' directory assumes that you have this compiler.

Copy the file `setargv.obj' from the library directory where it resides to the gawk source code directory.

Run `make.bat'. This will compile gawk for you, and link it. That's all there is to it!

Installing gawk on the Atari ST

This section assumes that you are running TOS. It applies to other Atari models (STe, TT) as well.

In order to use gawk, you need to have a shell, either text or graphics, that does not map all the characters of a command line to upper case. Maintaining case distinction in option flags is very important (see section Invoking awk). Popular shells like gulam or gemini will work, as will newer versions of desktop. Support for I/O redirection is necessary to make it easy to import awk programs from other environments. Pipes are nice to have, but not vital.

If you have received an executable version of gawk, place it, as usual, anywhere in your PATH where your shell will find it.

While executing, gawk creates a number of temporary files. gawk looks for either of the environment variables TEMP or TMPDIR, in that order. If either one is found, its value is assumed to be a directory for temporary files. This directory must exist, and if you can spare the memory, it is a good idea to put it on a RAM drive. If neither TEMP nor TMPDIR is found, then gawk uses the current directory for its temporary files.
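The lookup order for the temporary directory can be written out as a short shell function. This is a sketch of the rule stated above, not gawk's own code; it assumes that a variable set to an empty string counts as not found, and the function name `pick_tmpdir' is hypothetical.

```shell
# pick_tmpdir: print the directory that would be used for temporary
# files: TEMP if set, otherwise TMPDIR, otherwise the current directory.
pick_tmpdir() {
    if [ -n "${TEMP:-}" ]; then
        printf '%s\n' "$TEMP"
    elif [ -n "${TMPDIR:-}" ]; then
        printf '%s\n' "$TMPDIR"
    else
        printf '%s\n' .
    fi
}
```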

The ST version of gawk searches for its program files as described in section The AWKPATH Environment Variable. On the ST, the default value for the AWKPATH variable is ".,c:\lib\awk,c:\gnu\lib\awk". The search path can be modified by explicitly setting AWKPATH to whatever you wish. Note that colons cannot be used on the ST to separate elements in the AWKPATH variable, since they have another, reserved, meaning. Instead, you must use a comma to separate elements in the path. If you are recompiling gawk on the ST, then you can choose a new default search path, by setting the value of `DEFPATH' in the file `...\config\atari'. You may choose a different separator character by setting the value of `ENVSEP' in the same file. The new values will be used when creating the header file `config.h'.
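To check how a comma-separated search path breaks into directories, a small filter such as the following can be used. It is a sketch that assumes the comma separator described above; the function name `split_awkpath' is hypothetical.

```shell
# split_awkpath PATH: print each element of a comma-separated search
# path on its own line, in search order.
split_awkpath() {
    printf '%s\n' "$1" | tr ',' '\n'
}
```

Applied to the default value, `split_awkpath '.,c:\lib\awk,c:\gnu\lib\awk'' lists the current directory first, followed by the two library directories.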

Although awk allows great flexibility in doing I/O redirections from within a program, this facility should be used with care on the ST. In some circumstances the OS routines for file handle pool processing lose track of certain events, causing the computer to crash, and requiring a reboot. Often a warm reboot is sufficient. Fortunately, this happens infrequently, and in rather esoteric situations. In particular, avoid having one part of an awk program using print statements explicitly redirected to "/dev/stdout", while other print statements use the default standard output, and a calling shell has redirected standard output to a file.

When gawk is compiled with the ST version of gcc and its usual libraries, it will accept both `/' and `\' as path separators. While this is convenient, it should be remembered that this removes one, technically legal, character (`/') from your file names, and that it may create problems for external programs, called via the system() function, which may not support this convention. Whenever it is possible that a file created by gawk will be used by some other program, use only backslashes. Also remember that in awk, backslashes in strings have to be doubled in order to get literal backslashes.
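The doubling of backslashes is easy to get wrong, so here is a short illustration. The file name is made up for the example, and the awk program is wrapped in a hypothetical shell function, `show_dos_path', with single quotes so that the shell passes the backslashes to awk untouched.

```shell
# show_dos_path: print a backslash-separated file name from awk.
# Inside the awk string constant each literal backslash is written
# twice; the printed name contains single backslashes.
show_dos_path() {
    awk 'BEGIN { print "c:\\tmp\\out.txt" }'
}
```

Running `show_dos_path' prints the name with single backslashes, which is the form an external program would expect.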

The initial port of gawk to the ST was done with gcc. If you wish to recompile gawk from scratch, you will need to use a compiler that accepts ANSI standard C (such as gcc, Turbo C, or Prospero C). If sizeof(int) != sizeof(int *), the correctness of the generated code depends heavily on the fact that all function calls have function prototypes in the current scope. If your compiler does not accept function prototypes, you will probably have to add a number of casts to the code.

If you are using gcc, make sure that you have up-to-date libraries. Older versions have problems with some library functions (atan2(), strftime(), the `%g' conversion in sprintf()) which may affect the operation of gawk.

In the `atari' subdirectory of the gawk distribution is a version of the system() function that has been tested with gulam and msh; it should work with other shells as well. With gulam, it passes the string to be executed without spawning an extra copy of a shell. It is possible to replace this version of system() with a similar function from a library or from some other source if that version would be a better choice for the shell you prefer.

The files needed to recompile gawk on the ST can be found in the `atari' directory. The provided files and instructions below assume that you have the GNU C compiler (gcc), the gulam shell, and an ST version of sed. The `Makefile' is set up to use `byacc' as a `yacc' replacement. With a different set of tools some adjustments and/or editing will be needed.

cd to the `atari' directory. Copy `Makefile.st' to `makefile' in the source (parent) directory. Possibly adjust `../config/atari' to suit your system. Execute the script `mkconf.g' which will create the header file `../config.h'. Go back to the source directory. If you are not using gcc, check the file `missing.c'. It may be necessary to change forward slashes in the references to files from the `atari' subdirectory into backslashes. Type make and enjoy.

Compilation with gcc of some of the bigger modules, like `awk_tab.c', may require a full four megabytes of memory. On smaller machines you would need to cut down on optimizations, or you would have to switch to another, less memory hungry, compiler.
