A rock pile ceases to be a rock pile the moment a single man contemplates it, bearing within him the image of a cathedral.
Much as there are reasons for writing C extensions for Perl, there are any number of reasons to execute Perl scripts from within C/C++ applications; we refer to this activity as embedding the Perl interpreter. Embedding does not mean that we wish to conceal the interpreter; it just indicates that the application retains overall control and, when required, makes calls to the Perl internal API.
This chapter introduces a simple API for embedding the Perl interpreter in your C application. These functions are not standard (that is, they have been introduced in this book), and they shield you from having to know anything at all about Perl internals, reference counting, memory management, and calling conventions. Although these details will be discussed in the next chapter, you shouldn’t have to know them to get useful work done. The perlembed document written by Jon Orwant and Doug MacEachern [1] provides fine tutorial-style coverage of this subject, but expects you to be conversant with the internals.
A C application can make use of a scripting language in different ways:
Applications such as Emacs, Microsoft Office, and AutoCAD provide scripting language frontends. Although they work reasonably well on their own, their real power comes from the large community of developers writing scripted extensions. To paraphrase Brian Kernighan, a good tool is one that is used in ways its developers never thought of. The calc package in Emacs is capable of doing symbolic mathematics, for example. Who would have thought of putting this in a text editor?! [63]
Emacs is an excellent example of an application that implements its basic functionality in C, for speed and for access to operating system interfaces, and everything else in LISP (it has an embedded LISP interpreter), which provides the necessary glue for the C code. The editor won’t even start without some crucial LISP code.
I once had to work on a Unix-based application talking to a mainframe. The files coming off the mainframe were curiously formatted, and of course, wouldn’t match the specifications. Since munging files is so much easier in Perl than in C, I used Perl scripts and an embedded Perl interpreter to parse these files so that I could change the parsing strategy at will.
I could have chosen the easier option of spawning an external Perl script using system(3) or popen(3) and fetching its output from a temporary file or a pipe. This approach works very well for a large number of applications, as is evident from the success of CGI. There is much to be said for separating application functionality into two separately debuggable programs. But it wasn’t fast enough for my application. Additionally, the data flowing across the interface wasn’t simple enough, so I would have had to write a lot of code to format this data on one end and to parse it on the other.
Spawning external scripts has the additional problem that it doesn’t give you a persistent context. That is, every time you launch a Perl script, it doesn’t remember anything from the time it was last invoked, and it would have to reopen socket connections, database connections, restart transactions, and so on. The embedded interpreter approach is taken by the Apache web server [Section 19.5].
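For comparison, the spawning approach described above might look like the following minimal sketch. It assumes a POSIX system that provides popen(3); the helper name run_script is hypothetical and is not part of Perl or of the API introduced in this chapter.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for popen/pclose declarations */
#include <stdio.h>

/*
 * run_script (hypothetical helper): run `command` through the shell and
 * copy at most buflen-1 bytes of its standard output into buf, which is
 * always NUL-terminated. Returns the status reported by pclose(), or -1
 * if the pipe could not be opened.
 */
int run_script(const char *command, char *buf, size_t buflen)
{
    FILE *pipe;
    size_t used = 0, n;
    char scratch[256];

    if (buf == NULL || buflen == 0)
        return -1;
    buf[0] = '\0';

    pipe = popen(command, "r");   /* a new process is spawned on every call */
    if (pipe == NULL)
        return -1;

    /* Collect output until the buffer is full or the script finishes. */
    while (used < buflen - 1 &&
           (n = fread(buf + used, 1, buflen - 1 - used, pipe)) > 0)
        used += n;
    buf[used] = '\0';

    /* Discard any remaining output so the child can exit normally. */
    while (fread(scratch, 1, sizeof scratch, pipe) > 0)
        ;

    return pclose(pipe);
}
```

A caller would pass a command line such as perl followed by a script name and then parse whatever text arrives in buf. Each call starts a fresh process, so nothing the script opened or computed survives to the next call, which is the lack of persistent context noted above.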
Writing a scripting frontend forces you to simplify the interface functions to ease the integration with the scripting language. Happily enough, this also makes life easy for other C programmers using your libraries.
A scripting facility presents an opportunity to provide programmatic access to instrumentation probes embedded in the code (for monitoring performance, memory usage, dynamic assertions, etc.). For example, you can automatically set up an audit trail of all inbound user connections when the number of users exceeds 50.
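As a rough illustration of such a probe, the sketch below shows a counter that fires a registered action once its value exceeds a threshold. All of the names here (struct probe, probe_set_trigger, probe_update, enable_audit_trail) are hypothetical and invented for this example; in an application with an embedded interpreter, the action would typically be supplied by a script rather than written in C.

```c
#include <stddef.h>

/* Hypothetical probe interface, for illustration only. */
typedef void (*probe_action)(const char *name, long value, void *ctx);

struct probe {
    const char  *name;       /* what is being measured */
    long         value;      /* most recent reading */
    long         threshold;  /* the action fires when value exceeds this */
    probe_action action;     /* NULL means no trigger is installed */
    void        *ctx;        /* opaque data handed back to the action */
};

/* Install (or replace) the trigger on a probe. */
void probe_set_trigger(struct probe *p, long threshold,
                       probe_action action, void *ctx)
{
    p->threshold = threshold;
    p->action = action;
    p->ctx = ctx;
}

/* Record a new reading and fire the trigger if the threshold is exceeded. */
void probe_update(struct probe *p, long value)
{
    p->value = value;
    if (p->action != NULL && value > p->threshold)
        p->action(p->name, value, p->ctx);
}

/* Sample action: switch on an audit-trail flag passed in through ctx. */
void enable_audit_trail(const char *name, long value, void *ctx)
{
    (void)name;
    (void)value;
    *(int *)ctx = 1;
}
```

With a probe for inbound users and a threshold of 50, the application calls probe_update each time a user connects, and the audit flag is set on the first reading above 50.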
Applications may not be satisfied with simple configuration files (name-value properties, such as those provided by the Microsoft Windows Registry).