8 SWIG library

To help build extension modules, SWIG is packaged with a library of support files that you can include in your own interfaces. These files often define new SWIG directives or provide utility functions that can be used to access parts of the standard C and C++ libraries. This chapter provides a reference to the current set of supported library files.

Compatibility note: Older versions of SWIG included a number of library files for manipulating pointers, arrays, and other structures. Most these files are now deprecated and have been removed from the distribution. Alternative libraries provide similar functionality. Please read this chapter carefully if you used the old libraries.

8.1 The %include directive and library search path

Library files are included using the %include directive. When searching for files, directories are searched in the following order:

Within each directory, SWIG first looks for a subdirectory corresponding to a target language (e.g., python, tcl, etc.). If found, SWIG will search the language specific directory first. This allows for language-specific implementations of library files.

You can override the location of the SWIG library by setting the SWIG_LIB environment variable.

8.2 C Arrays and Pointers

This section describes library modules for manipulating low-level C arrays and pointers. The primary use of these modules is in supporting C declarations that manipulate bare pointers such as int *, double *, or void *. The modules can be used to allocate memory, manufacture pointers, dereference memory, and wrap pointers as class-like objects. Since these functions provide direct access to memory, their use is potentially unsafe and you should exercise caution.

8.2.1 cpointer.i

The cpointer.i module defines macros that can be used to used to generate wrappers around simple C pointers. The primary use of this module is in generating pointers to primitive datatypes such as int and double.

%pointer_functions(type,name)

Generates a collection of four functions for manipulating a pointer type *:

type *new_name()

Creates a new object of type type and returns a pointer to it. In C, the object is created using calloc(). In C++, new is used.

type *copy_name(type value)

Creates a new object of type type and returns a pointer to it. An initial value is set by copying it from value. In C, the object is created using calloc(). In C++, new is used.

type *delete_name(type *obj)

Deletes an object type type.

void name_assign(type *obj, type value)

Assigns *obj = value.

type name_value(type *obj)

Returns the value of *obj.

When using this macro, type may be any type and name must be a legal identifier in the target language. name should not correspond to any other name used in the interface file.

Here is a simple example of using %pointer_functions():

%module example
%include "cpointer.i"

/* Create some functions for working with "int *" */
%pointer_functions(int, intp);

/* A function that uses an "int *" */
void add(int x, int y, int *result);

Now, in Python:

>>> import example
>>> c = example.new_intp()     # Create an "int" for storing result
>>> example.add(3,4,c)         # Call function   
>>> example.intp_value(c)      # Dereference
7
>>> example.delete_intp(c)     # Delete

%pointer_class(type,name)

Wraps a pointer of type * inside a class-based interface. This interface is as follows:

struct name {
   name();                            // Create pointer object
  ~name();                            // Delete pointer object
   void assign(type value);           // Assign value
   type value();                      // Get value
   type *cast();                      // Cast the pointer to original type
   static name *frompointer(type *);  // Create class wrapper from existing
                                      // pointer
};

When using this macro, type is restricted to a simple type name like int, float, or Foo. Pointers and other complicated types are not allowed. name must be a valid identifier not already in use. When a pointer is wrapped as a class, the "class" may be transparently passed to any function that expects the pointer.

If the target language does not support proxy classes, the use of this macro will produce the example same functions as %pointer_functions() macro.

It should be noted that the class interface does introduce a new object or wrap a pointer inside a special structure. Instead, the raw pointer is used directly.

Here is the same example using a class instead:

%module example
%include "cpointer.i"

/* Wrap a class interface around an "int *" */
%pointer_class(int, intp);

/* A function that uses an "int *" */
void add(int x, int y, int *result);

Now, in Python (using proxy classes)

>>> import example
>>> c = example.intp()         # Create an "int" for storing result
>>> example.add(3,4,c)         # Call function   
>>> c.value()                  # Dereference
7

Of the two macros, %pointer_class is probably the most convenient when working with simple pointers. This is because the pointers are access like objects and they can be easily garbage collected (destruction of the pointer object destroys the underlying object).

%pointer_cast(type1, type2, name)

Creates a casting function that converts type1 to type2. The name of the function is name. For example:

%pointer_cast(int *, unsigned int *, int_to_uint);

In this example, the function int_to_uint() would be used to cast types in the target language.

Note: None of these macros can be used to safely work with strings (char * or char **).

Note: When working with simple pointers, typemaps can often be used to provide more seamless operation.

8.2.2 carrays.i

This module defines macros that assist in wrapping ordinary C pointers as arrays. The module does not provide any safety or an extra layer of wrapping--it merely provides functionality for creating, destroying, and modifying the contents of raw C array data.

%array_functions(type,name)

Creates four functions.

type *new_name(int nelements)

Creates a new array of objects of type type. In C, the array is allocated using calloc(). In C++, new [] is used.

type *delete_name(type *ary)

Deletes an array. In C, free() is used. In C++, delete [] is used.

type name_getitem(type *ary, int index)

Returns the value ary[index].

void name_setitem(type *ary, int index, type value)

Assigns ary[index] = value.

When using this macro, type may be any type and name must be a legal identifier in the target language. name should not correspond to any other name used in the interface file.

Here is an example of %array_functions(). Suppose you had a function like this:

void print_array(double x[10]) {
   int i;
   for (i = 0; i < 10; i++) {
      printf("[%d] = %g\n", i, x[i]);
   }
}

To wrap it, you might write this:

%module example

%include "carrays.i"
%array_functions(double, doubleArray);

void print_array(double x[10]);

Now, in a scripting language, you might write this:

a = new_doubleArray(10)           # Create an array
for i in range(0,10):
    doubleArray_setitem(a,i,2*i)  # Set a value
print_array(a)                    # Pass to C
delete_doubleArray(a)             # Destroy array
%array_class(type,name)

Wraps a pointer of type * inside a class-based interface. This interface is as follows:

struct name {
   name(int nelements);                  // Create an array
  ~name();                               // Delete array
   type getitem(int index);              // Return item
   void setitem(int index, type value);  // Set item
   type *cast();                         // Cast to original type
   static name *frompointer(type *);     // Create class wrapper from
                                         // existing pointer
};

When using this macro, type is restricted to a simple type name like int or float. Pointers and other complicated types are not allowed. name must be a valid identifier not already in use. When a pointer is wrapped as a class, it can be transparently passed to any function that expects the pointer.

When combined with proxy classes, the %array_class() macro can be especially useful. For example:

%module example
%include "carrays.i"
%array_class(double, doubleArray);

void print_array(double x[10]);

Allows you to do this:

import example
c = example.doubleArray(10)  # Create double[10]
for i in range(0,10):
    c[i] = 2*i               # Assign values
example.print_array(c)       # Pass to C

Note: These macros do not encapsulate C arrays inside a special data structure or proxy. There is no bounds checking or safety of any kind. If you want this, you should consider using a special array object rather than a bare pointer.

Note: %array_functions() and %array_class() should not be used with types of char or char *.

8.2.3 cmalloc.i

This module defines macros for wrapping the low-level C memory allocation functions malloc(), calloc(), realloc(), and free().

%malloc(type [,name=type])

Creates a wrapper around malloc() with the following prototype:

type *malloc_name(int nbytes = sizeof(type));

If type is void, then the size parameter nbytes is required. The name parameter only needs to be specified when wrapping a type that is not a valid identifier (e.g., "int *", "double **", etc.).

%calloc(type [,name=type])

Creates a wrapper around calloc() with the following prototype:

type *calloc_name(int nobj =1, int sz = sizeof(type));

If type is void, then the size parameter sz is required.

%realloc(type [,name=type])

Creates a wrapper around realloc() with the following prototype:

type *realloc_name(type *ptr, int nitems);

Note: unlike the C realloc(), the wrapper generated by this macro implicitly includes the size of the corresponding type. For example, realloc_int(p, 100) reallocates p so that it holds 100 integers.

%free(type [,name=type])

Creates a wrapper around free() with the following prototype:

void free_name(type *ptr);

%sizeof(type [,name=type])

Creates the constant:

%constant int sizeof_name = sizeof(type);

%allocators(type [,name=type])

Generates wrappers for all five of the above operations.

Here is a simple example that illustrates the use of these macros:

// SWIG interface
%module example
%include "cmalloc.i"

%malloc(int);
%free(int);

%malloc(int *, intp);
%free(int *, intp);

%allocators(double);

Now, in a script:

>>> from example import *
>>> a = malloc_int()
>>> a
'_000efa70_p_int'
>>> free_int(a)    
>>> b = malloc_intp()
>>> b
'_000efb20_p_p_int'
>>> free_intp(b)
>>> c = calloc_double(50)
>>> c
'_000fab98_p_double'
>>> c = realloc_double(100000)
>>> free_double(c)
>>> print sizeof_double
8
>>>

8.2.4 cdata.i

The cdata.i module defines functions for converting raw C data to and from strings in the target language. The primary applications of this module would be packing/unpacking of binary data structures---for instance, if you needed to extract data from a buffer. The target language must support strings with embedded binary data in order for this to work.

char *cdata(void *ptr, int nbytes)

Converts nbytes of data at ptr into a string. ptr can be any pointer.

void memmove(void *ptr, char *s)

Copies all of the string data in s into the memory pointed to by ptr. The string may contain embedded NULL bytes. The length of the string is implicitly determined in the underlying wrapper code.

One use of these functions is packing and unpacking data from memory. Here is a short example:

// SWIG interface
%module example
%include "carrays.i"
%include "cdata.i"

%array_class(int, intArray);

Python example:

>>> a = intArray(10)
>>> for i in range(0,10):
...    a[i] = i
>>> b = cdata(a,40)
>>> b
'\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04
\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t'
>>> c = intArray(10)
>>> memmove(c,b)
>>> print c[4]
4
>>> 

Since the size of data is not always known, the following macro is also defined:

%cdata(type [,name=type])

Generates the following function for extracting C data for a given type.

char *cdata_name(type* ptr, int nitems)

nitems is the number of items of the given type to extract.

Note: These functions provide direct access to memory and can be used to overwrite data. Clearly they are unsafe.

8.3 C String Handling

A common problem when working with C programs is dealing with functions that manipulate raw character data using char *. In part, problems arise because there are different interpretations of char *---it could be a NULL-terminated string or it could point to binary data. Moreover, functions that manipulate raw strings may mutate data, perform implicit memory allocations, or utilize fixed-sized buffers.

The problems (and perils) of using char * are well-known. However, SWIG is not in the business of enforcing morality. The modules in this section provide basic functionality for manipulating raw C strings.

8.3.1 Default string handling

Suppose you have a C function with this prototype:

char *foo(char *s);

The default wrapping behavior for this function is to set s to a raw char * that refers to the internal string data in the target language. In other words, if you were using a language like Tcl, and you wrote this,

% foo Hello

then s would point to the representation of "Hello" inside the Tcl interpreter. When returning a char *, SWIG assumes that it is a NULL-terminated string and makes a copy of it. This gives the target language its own copy of the result.

There are obvious problems with the default behavior. First, since a char * argument points to data inside the target language, it is NOT safe for a function to modify this data (doing so may corrupt the interpreter and lead to a crash). Furthermore, the default behavior does not work well with binary data. Instead, strings are assumed to be NULL-terminated.

8.3.2 Passing binary data

If you have a function that expects binary data,

int parity(char *str, int len, int initial);

you can wrap the parameters (char *str, int len) as a single argument using a typemap. Just do this:

%apply (char *STRING, int LENGTH) { (char *str, int len) };
...
int parity(char *str, int len, int initial);

Now, in the target language, you can use binary string data like this:

>>> s = "H\x00\x15eg\x09\x20"
>>> parity(s,0)

In the wrapper function, the passed string will be expanded to a pointer and length parameter.

8.3.3 Using %newobject to release memory

If you have a function that allocates memory like this,

char *foo() {
   char *result = (char *) malloc(...);
   ...
   return result;
}

then the SWIG generated wrappers will have a memory leak--the returned data will be copied into a string object and the old contents ignored.

To fix the memory leak, use the %newobject directive.

%newobject foo;
...
char *foo();

This will release the result.

8.3.4 cstring.i

The cstring.i library file provides a collection of macros for dealing with functions that either mutate string arguments or which try to output string data through their arguments. An example of such a function might be this rather questionable implementation:

void get_path(char *s) {
    // Potential buffer overflow---uh, oh.
    sprintf(s,"%s/%s", base_directory, sub_directory);
}
...
// Somewhere else in the C program
{
    char path[1024];
    ...
    get_path(path);
    ...
}

(Off topic rant: If your program really has functions like this, you would be well-advised to replace them with safer alternatives involving bounds checking).

The macros defined in this module all expand to various combinations of typemaps. Therefore, the same pattern matching rules and ideas apply.

%cstring_bounded_output(parm, maxsize)

Turns parameter parm into an output value. The output string is assumed to be NULL-terminated and smaller than maxsize characters. Here is an example:

%cstring_bounded_output(char *path, 1024);
...
void get_path(char *path);

In the target language:

>>> get_path()
/home/beazley/packages/Foo/Bar
>>>

Internally, the wrapper function allocates a small buffer (on the stack) of the requested size and passes it as the pointer value. Data stored in the buffer is then returned as a function return value. If the function already returns a value, then the return value and the output string are returned together (multiple return values). If more than maxsize bytes are written, your program will crash with a buffer overflow!

%cstring_chunk_output(parm, chunksize)

Turns parameter parm into an output value. The output string is always chunksize and may contain binary data. Here is an example:

%cstring_chunk_output(char *packet, PACKETSIZE);
...
void get_packet(char *packet);

In the target language:

>>> get_packet()
'\xa9Y:\xf6\xd7\xe1\x87\xdbH;y\x97\x7f"\xd3\x99\x14V\xec\x06\xea\xa2\x88'
>>>

This macro is essentially identical to %cstring_bounded_output. The only difference is that the result is always chunksize characters. Furthermore, the result can contain binary data. If more than maxsize bytes are written, your program will crash with a buffer overflow!

%cstring_bounded_mutable(parm, maxsize)

Turns parameter parm into a mutable string argument. The input string is assumed to be NULL-terminated and smaller than maxsize characters. The output string is also assumed to be NULL-terminated and less than maxsize characters.

%cstring_bounded_mutable(char *ustr, 1024);
...
void make_upper(char *ustr);

In the target language:

>>> make_upper("hello world")
'HELLO WORLD'
>>>

Internally, this macro is almost exactly the same as %cstring_bounded_output. The only difference is that the parameter accepts an input value that is used to initialize the internal buffer. It is important to emphasize that this function does not mutate the string value passed---instead it makes a copy of the input value, mutates it, and returns it as a result. If more than maxsize bytes are written, your program will crash with a buffer overflow!

%cstring_mutable(parm [, expansion])

Turns parameter parm into a mutable string argument. The input string is assumed to be NULL-terminated. An optional parameter expansion specifies the number of extra characters by which the string might grow when it is modified. The output string is assumed to be NULL-terminated and less than the size of the input string plus any expansion characters.

%cstring_mutable(char *ustr);
...
void make_upper(char *ustr);

%cstring_mutable(char *hstr, HEADER_SIZE);
...
void attach_header(char *hstr);

In the target language:

>>> make_upper("hello world")
'HELLO WORLD'
>>> attach_header("Hello world")
'header: Hello world'
>>>

This macro differs from %cstring_bounded_mutable() in that a buffer is dynamically allocated (on the heap using malloc/new). This buffer is always large enough to store a copy of the input value plus any expansion bytes that might have been requested. It is important to emphasize that this function does not directly mutate the string value passed---instead it makes a copy of the input value, mutates it, and returns it as a result. If the function expands the result by more than expansion extra bytes, then the program will crash with a buffer overflow!

%cstring_output_maxsize(parm, maxparm)

This macro is used to handle bounded character output functions where both a char * and a maximum length parameter are provided. As input, a user simply supplies the maximum length. The return value is assumed to be a NULL-terminated string.

%cstring_output_maxsize(char *path, int maxpath);
...
void get_path(char *path, int maxpath);

In the target language:

>>> get_path(1024)
'/home/beazley/Packages/Foo/Bar'
>>>

This macro provides a safer alternative for functions that need to write string data into a buffer. User supplied buffer size is used to dynamically allocate memory on heap. Results are placed into that buffer and returned as a string object.

%cstring_output_withsize(parm, maxparm)

This macro is used to handle bounded character output functions where both a char * and a pointer int * are passed. Initially, the int * parameter points to a value containing the maximum size. On return, this value is assumed to contain the actual number of bytes. As input, a user simply supplies the maximum length. The output value is a string that may contain binary data.

%cstring_output_withsize(char *data, int *maxdata);
...
void get_data(char *data, int *maxdata);

In the target language:

>>> get_data(1024)
'x627388912'
>>> get_data(1024)
'xyzzy'
>>>

This macro is a somewhat more powerful version of %cstring_output_chunk(). Memory is dynamically allocated and can be arbitrary large. Furthermore, a function can control how much data is actually returned by changing the value of the maxparm argument.

%cstring_output_allocate(parm, release)

This macro is used to return strings that are allocated within the program and returned in a parameter of type char **. For example:

void foo(char **s) {
    *s = (char *) malloc(64);
    sprintf(*s, "Hello world\n");
}

The returned string is assumed to be NULL-terminated. release specifies how the allocated memory is to be released (if applicable). Here is an example:

%cstring_output_allocate(char **s, free(*$1));
...
void foo(char **s);

In the target language:

>>> foo()
'Hello world\n'
>>>

%cstring_output_allocate_size(parm, szparm, release)

This macro is used to return strings that are allocated within the program and returned in two parameters of type char ** and int *. For example:

void foo(char **s, int *sz) {
    *s = (char *) malloc(64);
    *sz = 64;
    // Write some binary data
    ...
}

The returned string may contain binary data. release specifies how the allocated memory is to be released (if applicable). Here is an example:

%cstring_output_allocate_size(char **s, int *slen, free(*$1));
...
void foo(char **s, int *slen);

In the target language:

>>> foo()
'\xa9Y:\xf6\xd7\xe1\x87\xdbH;y\x97\x7f"\xd3\x99\x14V\xec\x06\xea\xa2\x88'
>>>

This is the safest and most reliable way to return binary string data in SWIG. If you have functions that conform to another prototype, you might consider wrapping them with a helper function. For example, if you had this:

char  *get_data(int *len);

You could wrap it with a function like this:

void my_get_data(char **result, int *len) {
   *result = get_data(len);
}

Comments:

8.4 C++ Library

The library modules in this section provide access to parts of the standard C++ library. All of these modules are new in SWIG-1.3.12 and are only the beginning phase of more complete C++ library support including support for the STL.

8.4.1 std_string.i

The std_string.i library provides typemaps for converting C++ std::string objects to and from strings in the target scripting language. For example:

%module example
%include "std_string.i"

std::string foo();
void        bar(const std::string &x);

In the target language:

x = foo();                # Returns a string object
bar("Hello World");       # Pass string as std::string

This module only supports types std::string and const std::string &. Pointers and non-const references are left unmodified and returned as SWIG pointers.

This library file is fully aware of C++ namespaces. If you export std::string or rename it with a typedef, make sure you include those declarations in your interface. For example:

%module example
%include "std_string.i"

using namespace std;
typedef std::string String;
...
void foo(string s, const String &t);     // std_string typemaps still applied

Note: The std_string library is incompatible with Perl on some platforms. We're looking into it.

8.4.2 std_vector.i

The std_vector.i library provides support for the C++ vector class in the STL. Using this library involves the use of the %template directive. All you need to do is to instantiate different versions of vector for the types that you want to use. For example:

%module example
%include "std_vector.i"

namespace std {
   %template(vectori) vector<int>;
   %template(vectord) vector<double>;
};

When a template vector<X> is instantiated a number of things happen:

To illustrate the use of this library, consider the following functions:

/* File : example.h */

#include <vector>
#include <algorithm>
#include <functional>
#include <numeric>

double average(std::vector<int> v) {
    return std::accumulate(v.begin(),v.end(),0.0)/v.size();
}

std::vector<double> half(const std::vector<double>& v) {
    std::vector<double> w(v);
    for (unsigned int i=0; i<w.size(); i++)
        w[i] /= 2.0;
    return w;
}

void halve_in_place(std::vector<double>& v) {
    std::transform(v.begin(),v.end(),v.begin(),
                   std::bind2nd(std::divides<double>(),2.0));
}

To wrap with SWIG, you might write the following:

%module example
%{
#include "example.h"
%}

%include "std_vector.i"
// Instantiate templates used by example
namespace std {
   %template(IntVector) vector<int>;
   %template(DoubleVector) vector<double>;
}

// Include the header file with above prototypes
%include "example.h"

Now, to illustrate the behavior in the scripting interpreter, consider this Python example:

>>> from example import *
>>> iv = IntVector(4)         # Create an vector<int>
>>> for i in range(0,4):
...      iv[i] = i
>>> average(iv)               # Call method
1.5
>>> average([0,1,2,3])        # Call with list
1.5
>>> half([1,2,3])             # Half a list
(0.5,1.0,1.5)
>>> halve_in_place([1,2,3])   # Oops
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Type error. Expected _p_std__vectorTdouble_t
>>> dv = DoubleVector(4)
>>> for i in range(0,4):
...       dv[i] = i
>>> halve_in_place(dv)       # Ok
>>> for i in dv:
...       print i
...
0.0
0.5
1.0
1.5
>>> dv[20] = 4.5
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "example.py", line 81, in __setitem__
    def __setitem__(*args): return apply(examplec.DoubleVector___setitem__,args)
IndexError: vector index out of range
>>>

This library module is fully aware of C++ namespaces. If you use vectors with other names, make sure you include the appropriate using or typedef directives. For example:

%include "std_vector.i"

namespace std {
    %template(IntVector) vector<int>;
}

using namespace std;
typedef std::vector Vector;

void foo(vector<int> *x, const Vector &x);

Note: This module makes use of several advanced SWIG features including templatized typemaps and template partial specialization. If you are tring to wrap other C++ code with templates, you might look at the code contained in std_vector.i. Alternatively, you can show them the code if you want to make their head explode.

Note: This module is defined for all SWIG target languages. However argument conversion details and the public API exposed to the interpreter vary.

Note: std_vector.i was written by Luigi "The Amazing" Ballabio.

8.5 Utility Libraries

8.5.1 exception.i

The exception.i library provides a language-independent function for raising a run-time exception in the target language.

SWIG_exception(int code, const char *message)

Raises an exception in the target language. code is one of the following symbolic constants:

SWIG_MemoryError
SWIG_IOError
SWIG_RuntimeError
SWIG_IndexError
SWIG_TypeError
SWIG_DivisionByZero
SWIG_OverflowError
SWIG_SyntaxError
SWIG_ValueError
SWIG_SystemError

message is a string indicating more information about the problem.

The primary use of this module is in writing language-independent exception handlers. For example:

%include "exception.i"
%exception std::vector::getitem {
    try {
        $action
    } catch (std::out_of_range& e) {
        SWIG_exception(SWIG_IndexError,const_cast<char*>(e.what()));
    }
}