Having armed ourselves to the teeth with information, and having hand-built a few extensions, we are now ready to exploit SWIG and XS to their hilts. In this section, we’ll first look at the type of code produced by XS. As it happens, SWIG produces almost identical code, so the explanation should suffice for both tools. Then we will write typemaps and snippets of code to help XS deal with C structures, to wrap C structures with Perl objects, and, finally, to interface with C++ objects. Most of this discussion is relevant to SWIG also, which is why we need study only one SWIG example. That said, take note that the specific XS typemap examples described in the following pages are solved simply and elegantly using SWIG, without the need for user-defined typemaps.
To understand XS typemaps, and the effect of keywords such as CODE and PPCODE, it pays to take a good look at the glue code generated by xsubpp. Consider the following XS declaration of a module, Test, containing a function that takes two arguments and returns an integer:
MODULE = Test  PACKAGE = Test

int
func_2_args(a, b)
    int    a
    char*  b
xsubpp translates it to the following (comments in italic have been added):
XS(XS_Test_func_2_args) /* Mangled function name, with package name */
/* added to make it unique */
{
dXSARGS; /* declare "items", and init it with */
if (items != 2) /* the number of items on the stack */
croak("Usage: Test::func_2_args(a, b)");
{ /* Start a fresh block to allow variable declarations */
/* Built-in typemaps translate the stack to C variables */
int a = (int)SvIV(ST(0));
char* b = (char *)SvPV(ST(1),na);
/* RETVAL's type matches the function return */
int RETVAL;
RETVAL = func_2_args(a, b);
ST(0) = sv_newmortal();
/* Outgoing typemap to translate C var. to stack */
sv_setiv(ST(0), (IV)RETVAL);
}
XSRETURN(1); /* Let Perl know one return param has been put back */
}
This is practically identical to the code we studied in Section 20.4.2. Notice how the arguments on the stack are translated into the two arguments a and b. The XS function then calls the real C function, func_2_args, gets its return value, and packages the result back onto the argument stack.
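To make the flow of the glue code concrete, here is a toy model in plain C. It is not the real Perl API: ToyScalar, glue_func_2_args, the -1 error return, and the body of func_2_args are all assumptions made for illustration. Only the overall shape (check the argument count, convert the inputs, call the function, overwrite slot 0, report one result) mirrors what xsubpp generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a Perl scalar; NOT the real SV type. */
typedef struct {
    long        iv;   /* integer slot */
    const char *pv;   /* string slot (may be NULL) */
} ToyScalar;

/* Hypothetical C function being wrapped. */
int func_2_args(int a, const char *b)
{
    return a + (int)strlen(b);
}

/* Toy glue: returns the number of results left on the stack,
 * or -1 to model croak() on a usage error. */
int glue_func_2_args(ToyScalar *stack, int items)
{
    if (items != 2)
        return -1;                      /* wrong argument count */

    int         a = (int)stack[0].iv;   /* input conversion for int */
    const char *b = stack[1].pv;        /* input conversion for char* */
    assert(b != NULL);

    int retval = func_2_args(a, b);     /* call the real function */

    stack[0].iv = retval;               /* output conversion: the result */
    stack[0].pv = NULL;                 /* replaces the first argument */
    return 1;                           /* one value handed back */
}
```

The result reuses slot 0 of the same stack that carried the arguments, which is why the generated code assigns to ST(0) before calling XSRETURN(1).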
Let us now add some of the more common XS keywords to see how they are accommodated by xsubpp. The XS snippet
int
func_with_keywords(a, b)
    int    a
    char*  b
  PREINIT:
    double c;
  INIT:
    c = a * 20.3;
  CODE:
    if (c > 50) {
        RETVAL = test(a,b,c);
    }
  OUTPUT:
    RETVAL
gets translated to this:
XS(XS_Test_func_with_keywords)
{
dXSARGS;
if (items != 2)
croak("Usage: Test::func_with_keywords(a, b)");
{
int a = (int) SvIV(ST(0));
char* b = (char *)SvPV(ST(1),na);
double c; /* PREINIT section */
int RETVAL;
c = a * 20.3; /* INIT section */
if (c > 50) { /* CODE section */
RETVAL = test(a,b,c); /* Call any function */
}
ST(0) = sv_newmortal(); /* generated due to OUTPUT */
sv_setiv(ST(0), (IV)RETVAL);
}
XSRETURN(1);
}
As you can see, the code supplied in PREINIT goes right after the typemaps to ensure that all declarations are complete before the main code starts. The location is important for traditional C compilers, but would not be an issue for C++ compilers, which allow variable declarations anywhere in a block. The INIT section is inserted before the automatically generated call to the function or, in this case, before the CODE section starts. The CODE directive allows us the flexibility of inserting any piece of code; without it, xsubpp would have simply inserted a call to func_with_keywords(a,b), as we saw in the prior example.
The CODE keyword behaves like a typical C call: you can modify input parameters, and you can return at most one parameter. To deal with a variable number of input arguments or output results, you need the PPCODE keyword. To illustrate the implementation of PPCODE, consider a C function, permute, that takes a string, computes all its permutations, and returns a dynamically allocated array of strings (a null-terminated char**). Let’s say that we want to access it in Perl as follows:
@list = permute($str);
We use PPCODE here because the function expects to return a variable number of scalars. The following snippet of code shows the XS file:
void
permute(str)
    char *  str
  PPCODE:
    int i = 0;
    /* Call permute. It returns a null-terminated array of strings */
    char ** ret = permute (str);
    /* Copy these parameters to mortal scalars, and push them onto
     * the stack */
    char **p = ret;
    for (; *p; p++, i++) {
        XPUSHs (sv_2mortal(newSVpv(*p, 0)));
    }
    free(ret);   /* free the array itself, not the cursor p */
    XSRETURN(i);
This gets translated to the following:
XS(XS_Test_permute)
{
dXSARGS;
if (items != 1)
croak("Usage: Test::permute(str)");
/* PPCODE adjusts stack pointer (CODE does not do this) */
SP -= items;
{
char * str = (char *)SvPV(ST(0),na);
int i = 0;
/* Call permute. It returns a null-terminated array of strings */
char ** ret = permute (str);
/* Copy these parameters to mortal scalars, and push them onto
* the stack */
char **p = ret;
for ( ; *p ; p++, i++) {
XPUSHs (sv_2mortal(newSVpv(*p, 0)));
}
free(ret); /* free the array itself, not the cursor p */
XSRETURN(i);
PUTBACK; /* These two statements are redundant */
return; /* because XSRETURN does both */
}
}
The PPCODE directive differs from CODE in one small but significant way: it adjusts the stack pointer SP to point to the bottom of the Perl stack frame for this function call (that is, to ST(0)), to enable us to use the XPUSHs macro to extend the stack and push any number of values (recall our discussion in Section 20.4.2.2). We’ll shortly see why we cannot do this using typemaps.
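The text treats permute as an existing library routine. As a minimal sketch of what such a routine could look like, the following C code returns a null-terminated, heap-allocated array of heap-allocated strings. The implementation, the helper permute_rec, and the cleanup function free_permutations are assumptions for illustration; the source only specifies the return shape. Characters are treated as distinct positions, so repeated letters yield repeated strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Recursively emit every ordering of buf[k..n-1] into out[]. */
static void permute_rec(char *buf, size_t k, size_t n,
                        char **out, size_t *count)
{
    if (k == n) {                       /* one complete ordering */
        char *copy = malloc(n + 1);
        assert(copy != NULL);
        memcpy(copy, buf, n + 1);
        out[(*count)++] = copy;
        return;
    }
    for (size_t i = k; i < n; i++) {
        char t = buf[k]; buf[k] = buf[i]; buf[i] = t;   /* choose */
        permute_rec(buf, k + 1, n, out, count);
        t = buf[k]; buf[k] = buf[i]; buf[i] = t;        /* undo */
    }
}

/* Returns a NULL-terminated array of n! strings; caller frees it. */
char **permute(const char *str)
{
    size_t n = strlen(str);
    size_t total = 1;
    for (size_t i = 2; i <= n; i++)
        total *= i;                                     /* n! */

    char **out = malloc((total + 1) * sizeof *out);
    char  *buf = malloc(n + 1);
    assert(out != NULL && buf != NULL);
    memcpy(buf, str, n + 1);

    size_t count = 0;
    permute_rec(buf, 0, n, out, &count);
    out[count] = NULL;                                  /* terminator */
    free(buf);
    return out;
}

/* Release each string, then the array via its base pointer. */
void free_permutations(char **list)
{
    for (char **p = list; *p; p++)
        free(*p);
    free(list);
}
```

Whatever walks the array must keep the base pointer around: after the loop the cursor points at the terminator, and passing that to free is an error. Whether the individual strings also need freeing depends on how the real library allocates them; since newSVpv copies the string, under this sketch the glue code could call free_permutations once the values are pushed.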
A typemap is a snippet of code that translates a scalar value on the argument stack to a corresponding C scalar entity (int, double, pointer), or vice versa. A typemap applies only to one direction. It is important to stress here that both the input and the output for a typemap are scalars in their respective domains. You cannot have a typemap take a scalar value and return a C structure, for example; you can, however, have it return a pointer to the structure. This is the reason why the permute example in the preceding section cannot use a typemap to return a list. We could instead write a typemap to convert a char** to a reference to an array and then leave it to the script writer to dereference it. In SWIG, which doesn’t support a PPCODE equivalent, this is the only option.
Another constraint of typemaps is that they convert one argument at a time, with blinkers on: you cannot make a decision based on multiple input arguments, as we mentioned in Chapter 18 (“if argument 1 is `foo', then increase argument 2 by 10”). XS offers the CODE and PPCODE directives to help you out in this situation, while SWIG doesn’t. But recall from Section 18.5 that the two SWIG restrictions mentioned are easily and efficiently taken care of in script space.
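The kind of cross-argument logic that a typemap cannot express is easy to write as an ordinary wrapper, which is what the body of a CODE section amounts to. The sketch below is plain C; real_call and wrapped_call are hypothetical names, and the doubling performed by real_call is only there so the effect of the adjustment is visible.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical library function standing in for the real C routine. */
int real_call(const char *name, int amount)
{
    (void)name;
    return amount * 2;
}

/* What a CODE: body would do by hand: look at argument 1,
 * adjust argument 2, then call the wrapped function. */
int wrapped_call(const char *name, int amount)
{
    assert(name != NULL);
    if (strcmp(name, "foo") == 0)
        amount += 10;
    return real_call(name, amount);
}
```

A typemap sees only one of the two arguments at a time, so the test on name and the adjustment of amount have no single place to live there.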
While xsubpp is capable of supplying translations for ordinary C arguments, we have to write custom typemaps for all user-defined types. Assume that we have a C library with the following two functions:
Car* new_car();
void drive(Car *);
In Perl, we want to access it as
$car = Car::new_car;
Car::drive($car);
Let us first write the XS file for this problem:
/* Car.XS */
#include <EXTERN.h>
#include <perl.h>
#include <XSUB.h>
#include <Car.h>   /* Don't care what Car* looks like */

MODULE = Car  PACKAGE = Car

Car *
new_car ()

void
drive (car)
    Car *  car
As you can see, we need two typemaps: an output typemap for converting a Car* to $car and an input typemap for the reverse direction. We start off by editing a typemap file called typemap,[77] which contains three sections: TYPEMAP, INPUT, and OUTPUT, as follows:
TYPEMAP
Car *    CAR_OBJ

INPUT
CAR_OBJ
    $var = (Car *)SvIV($arg);

OUTPUT
CAR_OBJ
    sv_setiv($arg, (IV) $var);
The TYPEMAP section creates an easy-to-use alias (CAR_OBJ, in this case) for your potentially complex C type (Car *). The INPUT and OUTPUT sections in the typemap file can now refer to this alias and contain code to transform an object of the corresponding type to a Perl value, or vice versa. When a typemap is used for a particular problem, the marker $arg is replaced by the appropriate scalar on the argument stack, and $var is replaced by the corresponding C variable name. In this example, the output typemap stuffs a Car* into the integer slot of the scalar (recall the discussion in Section 20.3.1.3).
The advantage of the TYPEMAP section’s aliases is that multiple types can be mapped to the same alias. That is, a Car* and a Plane* can both be aliased to VEHICLE, and because the INPUT and OUTPUT sections use only the alias, both types end up sharing the same translation code. The Perl distribution comes with a typemap file that supplies all the basic typemaps (see lib/ExtUtils/typemap), and you can freely use one of the aliases defined in that file. For example, you can use the alias T_PTR (instead of CAR_OBJ) and thereby use the corresponding INPUT and OUTPUT sections for that alias. In other words, our typemap file need simply say:
TYPEMAP
Car *    T_PTR
It so happens that T_PTR’s INPUT and OUTPUT sections look identical to those shown above for CAR_OBJ.
Let us say we want to give the script writer the ability to write something like the following, without changing the C library in any way:
$car = Car::new_car();   # As before
$car->drive();
In other words, the OUTPUT section of our typemap needs to convert a Car* (returned by new_car) to a blessed scalar reference, as discussed in Section 20.3.1.3. The INPUT section contains the inverse transformation:
TYPEMAP
Car *    CAR_OBJ

OUTPUT
CAR_OBJ
    sv_setref_iv($arg, "Car", (IV) $var);

INPUT
CAR_OBJ
    $var = (Car *)SvIV((SV*)SvRV($arg));

sv_setref_iv stores an integer in a freshly allocated SV, converts its first argument into a reference pointing to the new scalar, and blesses it into the given package (refer to Table 20.1). In this example, we cast the pointer to an IV, Perl’s integer type that is wide enough to hold a pointer, and make the function think we are supplying an integer.
The typemap in the preceding example is restricted to objects of type Car only. We can use the TYPEMAP section’s aliasing capability to generalize this typemap and accommodate any object pointer. Consider the following typemap:
TYPEMAP
Car *    ANY_OBJECT

OUTPUT
ANY_OBJECT
    sv_setref_pv($arg, CLASS, (void*) $var);

INPUT
ANY_OBJECT
    $var = ($type) SvIV((SV*)SvRV($arg));
All we have done is generalize the alias, the cast, and the class name. $type is the type of the current C object (the left-hand side of the alias in the TYPEMAP section), so in this case it is Car*. Because we want to make the class name generic, we adopt the strategy used in Chapter 7 and ask the script user to use the arrow notation:
$c = Car->new_car();
This invocation supplies the name of the module as the first parameter, which we capture in the CLASS argument in the XS file:
Car *
new_car (CLASS)
    char *CLASS
The only thing remaining is that we would like the user to say Car->new instead of Car->new_car. Just because C doesn’t have polymorphism doesn’t mean the script user has to suffer. The CODE keyword achieves this simply:
Car *
new (CLASS)
    char *CLASS
  CODE:
    RETVAL = new_car();
  OUTPUT:
    RETVAL
The drive method doesn’t need any changes. Having generalized this alias, we can apply the ANY_OBJECT alias to other objects too, as long as they also follow the convention of declaring and initializing a CLASS variable in any method that returns a pointer to the type declared in the TYPEMAP section. In the preceding example, the initialization happened automatically because Perl supplies the name of the class as the first argument.
Suppose you have a C++ class called Car that supports a constructor and a method called drive. You can declare the corresponding interfaces in the XS file as follows:
Car *
Car::new ()

void
Car::drive()
xsubpp translates the new declaration to an equivalent constructor call, after translating all parameters (if any):
XS(XS_Car_new)
{
    dXSARGS;
    if (items != 1)
        croak("Usage: Car::new(CLASS)");
    {
        char * CLASS = (char *)SvPV(ST(0),na);
        Car * RETVAL;
        RETVAL = new Car();
        ST(0) = sv_newmortal();
        sv_setref_pv(ST(0), CLASS, (void*) RETVAL);
    }
    XSRETURN(1);
}
Unlike the previous example, xsubpp automatically supplies the CLASS variable. You still need the typemaps, however, to convert Car* to an equivalent Perl object reference. The drive interface declaration is translated as follows:
XS(XS_Car_drive)
{
    dXSARGS;
    if (items != 1)
        croak("Usage: Car::drive(THIS)");
    {
        Car * THIS;
        THIS = (Car *) SvIV((SV*)SvRV(ST(0)));
        THIS->drive();
    }
    XSRETURN_EMPTY;
}
xsubpp automatically generates the THIS variable to refer to the object. Both CLASS and THIS can be used in a CODE section.
Dean Roehrich’s XS Cookbooks [Section 20.8] provide several excellent examples of XS typemaps, so be sure to look them up before you start rolling your own.
We have conveniently ignored the issue of memory management so far. In the preceding sections, the new function allocates an object that is subsequently stuffed into a scalar value by the typemapping code. When the scalar goes out of scope or is assigned something else, Perl ignores this pointer if the scalar has not been blessed, which is not surprising, considering that it has been led to believe that the scalar contains just an integer value. This is most definitely a memory leak. But if the scalar is blessed, Perl calls its DESTROY routine when the scalar is cleared. If this routine is written in XS, as shown below, it gives us the opportunity to delete allocated memory:
void
DESTROY(car)
    Car *car
  CODE:
    delete_car(car);   /* deallocate that object */
The C++ interface is simpler:
void Car::DESTROY()
In this case, xsubpp automatically calls "delete THIS", where THIS represents the object, as we saw earlier.
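The leak described above is easy to observe on the C side. The following sketch pairs a constructor with a destructor and counts live objects; the Car layout, the live_cars counter, and live_car_count are assumptions added for illustration, and new_car and delete_car are given hypothetical bodies because the source does not show the library's own.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Car { int speed; } Car;   /* hypothetical layout */

static int live_cars = 0;                /* objects not yet freed */

/* Allocate a zero-initialized Car and record it as live. */
Car *new_car(void)
{
    Car *car = calloc(1, sizeof *car);
    assert(car != NULL);
    live_cars++;
    return car;
}

/* The routine a DESTROY method would call. */
void delete_car(Car *car)
{
    if (car == NULL)
        return;
    free(car);
    live_cars--;
}

/* Anything above zero at exit means a destructor was skipped. */
int live_car_count(void)
{
    return live_cars;
}
```

If the Perl scalar holding the pointer is not blessed, nothing ever calls delete_car, and the counter stays above zero for the life of the process.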
The Perl library provides a set of functions and macros to replace the conventional dynamic memory management routines (listed on the left-hand side of the table):
Instead of: | Use: |
---|---|
malloc | New |
calloc | Newz |
realloc | Renew |
free | Safefree |
memcpy | Copy |
memmove | Move |
memzero | Zero |
The Perl replacements use the version of malloc provided by Perl (by default), and optionally collect statistics on memory usage. It is recommended that you use these routines instead of the conventional memory management routines.
SWIG produces practically the same code as xsubpp. Consequently, you can expect its typemaps to be very similar (if not identical) to those of XS. Consider the permute function discussed earlier. We want a char** converted to a list, but since typemaps allow their input and output to be scalars only, the following typemap translates it to a list reference:
%typemap(perl5,out) char ** {
    // All functions returning char ** get this typemap
    // $source is of type char **
    // $target is of type RV (referring to an AV)
    AV *ret_av = newAV();
    int i = 0;
    char **p = $source;
    /* First extend the new AV to the right size */
    while (*p++)
        ;                        /* Incr. p until *p is null */
    av_extend(ret_av, p - $source);
    /* For each element in the array of strings, create a new
     * scalar, and stuff it into the above array. The scalars are
     * not mortal, because the array takes ownership of them. */
    for (i = 0, p = $source; *p; p++, i++) {
        av_store(ret_av, i, newSVpv(*p, 0));
    }
    /* Finally, create a reference to the array; the "target" of
     * this typemap. newRV_noinc leaves the array owned solely by
     * the reference. */
    $target = sv_2mortal(newRV_noinc((SV*)ret_av));
}
SWIG typemaps are specific to a language, hence the perl5 argument. out refers to function return parameters, and this typemap applies to all functions with a char** return value. $source and $target are variables of the appropriate types: for an in typemap, $source is a Perl type, and $target is the data type expected by the corresponding function parameter. Note that unlike XS’s $arg and $var, SWIG’s $source and $target switch meanings depending on the direction of the typemap.
If you don’t want this typemap applied to all functions returning char**’s, you can name exactly which parameter or function you want it applied to, like this:
%typemap(perl5,out) char ** permute {
...
}
Please refer to the SWIG documentation for a number of other typemap-related features.
[77] We choose this particular name because the h2xs-generated makefile recognizes it and feeds it to xsubpp. It also allows for multiple typemap files to be picked up from different directories.