OK, I've got the first part of my monologue on macros ready. Let me begin by defining some terms:
A macro definition (or simply a macro) is a piece of code within a program
which describes a set of transformations to be performed on another set of
code prior to that code being translated (either compiled or interpreted).
A macro application is an instance of a macro being used to alter a section
of code.
The macro expansion of a given macro application is the resulting code that
is actually translated and executed.
To give a simple example using the familiar C pre-processor, the following
macro definition,
is illustrative of a typical textual macro as seen in C, C++ and many
assembly languages. It defines the name of the macro, a parameter, and a simple
set of string changes to make to the argument. One could apply it to a piece of
code, in this case a numerical literal,
and get the following macro expansion:
Now, it has to be understood that this simple type of macro, called a textual
macro, is simply performing a string interpolation on the argument; that is to
say, it is replacing the parameter with the text of the argument given to
the macro application, and inserting it verbatim into the body of the macro,
which then replaces the macro application. One could have written
and gotten the expansion
The result is clearly not the same as the previous application, even
though the two would appear on the surface to be semantically identical. This
highlights one of the flaws in naive macros of this sort, and introduces the
most basic issue of macro hygiene: avoiding duplicate evaluation.
The macros possible with the C preprocessor are extremely limited: they do not
compose effectively without heroic effort, they are generally restricted to a
single line of code (though they can span multiple lines by ending each line
with a backslash), and they are rather simplistic in how they are expanded. They also represent
something of an exceptional case for the C language, as they do not match the
syntactic structures of the language overall. Macros, and the C pre-processor
directives in general, form a separate mini-language independent from C itself,
and indeed the pre-processor can be run separately from the compiler in most
implementations.
Similarly, the m4 macro processor, frequently used in conjunction with the
Unix as assembler, is actually a stand-alone program in its own right, which
some assemblers can invoke on their input. It is a significantly more
sophisticated macro processor than CPP, but it still basically performs
textual macro expansion.
A key reason why these textual macros are limited is that they are separate
from the languages they pre-process. However, languages which possess the
property of homoiconicity (that is to say, code written in the language is
itself a data structure of the language and can be manipulated
programmatically within the code without needing to call out to a separate
translator) can support a more powerful form of macro, known as lexical
macros. The classic examples of lexical macros are to be found in the various
dialects of the Lisp family of languages.
In Common Lisp, a macro is a piece of Lisp code the same as any other, except
that the compiler recognizes that it is a macro when it scans it and calls the
macro expander to perform the transformation at compile time, prior to compiling
the code. To use the simple example from before,
As it happens, Common Lisp provides a simple means with which to check what the
expansion of a given macro application would be: the macroexpand-1 function.
So, if we write
at the Lisp listener, we get
as expected. Unfortunately, we still haven't solved the duplicate evaluation
problem, as shown here:
> (macroexpand-1 '(square (- 3 1)))
(* (- 3 1) (- 3 1)) ;
(The right angle bracket here is the listener prompt.) As it happens, this does
not have the consequences it did in C, thanks to a quirk of Lisp
syntax: because Lisp forms are always fully parenthesized, the specific
fault of mismatching the order of operations doesn't happen, though this
doesn't mean we're free of the duplicate evaluation problem.
If you look at the macro again, you should note the backquote and the commas;
these are relevant, as they indicate which parts of the code should be evaluated
at compile time and which at run time. To put it succinctly, a Lisp macro is a
Lisp function that is run at macro-expansion time, before the code is compiled,
with its expansion being the function's output. The backquote says that the
following part of the code should be passed through to the expansion unaltered,
except that it may contain elements which should be evaluated, which are
indicated by the commas.
Had we omitted these indicators,
it would result in the whole body of the macro being evaluated at compile time:
> (macroexpand-1 '(square2 3))
9 ;
> (macroexpand-1 '(square2 (- 3 1)))
*** - *: (- 3 1) is not a number
You can see how, in the first case, it can be a very useful thing to have the
expression evaluated at compile time, as the expression essentially becomes a
simple constant; but you can see in the second example the limitations of this:
the arguments passed to the macro are not evaluated before the macro body runs,
so the multiplication receives the list (- 3 1) rather than a number.
Because they have the full Common Lisp language available to them for
generating their expansion, Common Lisp macros do not have many of the
limitations that textual macros have. For example, one could write a macro that
does evaluate an arbitrary numeric argument to its square at compile time:
(defmacro square3 (x)
  (let ((y (eval x)))
    (if (numberp y)
        (* y y)
        nil)))
which does indeed expand as desired:
> (macroexpand-1 '(square3 (- 3 1)))
4 ;
Furthermore, because it evaluates x once and binds the result to y before
squaring it, it avoids re-evaluating the expression: no matter how complex the
expression is, so long as the value is a number, it will be evaluated exactly
once, at compile time. Of course, this still has its limitations: the
expression must be evaluable at compile time, so it can't depend on any values
entered by the user at run time, for example. Still, this can be a useful
property for certain applications.