Wednesday, October 3, 2012

CL "block" and code walking

This article is kind of a preparation to a series of more complicated and elaborate ones. Here we'll talk about a technical issue with the block/return-from macros from the CL package. This nuance can reveal itself when you do the code walking. So, let's start.

Block and return-from

The CL package contains a pair of Common Lisp-like macros named block and return-from. These are supposed to emulate lexical non-local exits, as in true Common Lisp.

This trick works because block expands into catch, and return-from expands into throw. See this:

(ux-mexp-all is an improved version of cl-macroexpand-all; see my previous post named "Let the hacking commence").

As you can see, block has been expanded into a catch establishing a special tag (named --cl-block-outer--, which is interned symbol, by the way), and return-from has become a throw function form throwing the specified return value from the corresponding catch.

So, lexically-scoped non-local exits are really absent in Elisp; Elisp has only dynamic non-local exits: they are catch and throw. Lexical things are emulated by choosing specific names for the catch tag, so that no other throw call can inadvertently bring us to the catch that a block was expanded into.

Optimization (catch elimination)

The CL package provides 1 (pretty trivial) optimization for block/return-from tandem. This is the elimination of enclosing catch in the case when no return-from matching the block are encountered within the scope of this block. We'll call the catch elimination.

For example, in the sample below block is completely unnecessary:

Let's see what it is expanded into:

So why has the catch not been eliminated as I just said ? And what is cl-block-wrapper ?

Look attentively, can you see any differences between the example of macroexpansion above and this one?

Yes, you're right: here we haven't bound cl-macroexpand-cmacs to t around the call to ux-mexp-all. Now let us try to bind it and repeat the experiment:

Ha ? It turns out that the optimization takes place only when cl-macroexpand-cmacs is t, that is, during the compiler-macro-expansion.

cl-block-wrapper and cl-block-throw

Let us expand the piece of code to the full extent but not perform compiler-macro-expansion. Here's what we have:

This resembles one of the previous samples. block is actually expanded into cl-block-wrapper, and return-from -- into cl-block-throw. Here are the definitions of the two:

block first checks whether its body is safe. If it is, then no return-froms are guaranteed to be present within the body. This trick is also a kind of optimization, though not important in our discussion. (I think they added such a trick for block to perform better under interpreter).

Otherwise, block is expanded into a call to cl-block-wrapper which wraps a catch form containing the body passed to the block initially. return-from is expanded into a call to cl-block-throw with the same quoted tag symbol.

Compiler macros for cl-block-wrapper and cl-block-throw are not so simple as their function definitions:

cl-active-block-names is an alist mapping catch tags (which directly correspond to block names) to boolean flags (t/nil). These flags show us whether the optimization is possible: initially all the flags are set to nil, and macroexpand-all is invoked. The latter function performs compiler-macro-expansions, and the compiler macro for cl-block-throw sets corresponding flags to t, thus showing us that a return-from targeting the respective block has been encountered and the catch elimination is impossible. So after the return from macroexpand-all, we know whether we can eliminate the enclosing catch or not.

What is wrong about all this

There's one little (potential) problem with this approach: it relies on an implicit assumption that cl-block-wrapper wraps a catch form that corresponds to a block. Everything is okay as long as you don't do advanced Emacs Lisp programming (such as code walking, for instance). But when you try to make certain manipulations on code, your meta-programming facilities treat cl-block-wrapper as a usual function (which it is) and don't know that they cannot lexically detach it from its wrapped catch form.

The above 2 pieces of code must be equivalent, but they are not. This code will fail when you try to compile it, because of how cl-block-wrapper deals with its argument. See the definition of the compiler macro for cl-block-wrapper above.

Amendments for CL

Make it correct

So, the first and the most basic correction will be to make the compiler macro for cl-block-wrapper check its argument explicitly. When it doesn't happen to look like a catch form, leave it alone; otherwise, do the usual optimizations.

Make it perform better

By this change we've made the optimization technique correct, because it is not done when the argument of cl-block-wrapper doesn't appear to be a catch form. But even in this case the optimization can still be done.

The fundamental flaw with this compiler-macro-optimization thing is that source-level optimizations should better take place at macroexpansion time, not at compile-macro-expansion time. Compiler macros are supplementary by their nature; they are advisory, optional. Such an optimization should better have been placed in macro definition.

So we're going to do exactly this -- move the optimization logic into the definition of block.

ux-alternative-block-expander is an alternative expander function for the block macro. It also uses cl-active-block-names alist for remembering return-froms which were encountered. But instead of calling macroexpand-all, we now rely upon our home-made and corrected ux-mexp-all expansion function (ux-mexp-implicit-progn is just a helper which calls ux-mexp-all).

ux-alternative-return-from-expander look also much like the CL compiler macro for return-from. cl-active-block-names is scanned looking for block-name; if it is found, we set the corresponding flag and expand into a usual throw. Otherwise, an error is signaled (return-from has not sense out of block).

Finally, ux-establish-alternative-block is a function that takes environment env and returns that same environment possibly augmented by alternative expanders for block and return-from. If alternative definitions are already established in env, we just return env unchanged; otherwise, 2 new conses are pushed onto env and returned.

How to use ux-establish-alternative-block.

When you need to expand a form to the full depth and then to make some further manipulations on the result, you most probably need to augment the environment you've captured with the alternative definitions for block and return-from. The augmented environment should be used in a subsequent call to ux-mexp-all. This will ensure all catch eliminations will take place during the macroexpansion, and what you get will be the final result.

That's all I wanted to tell about here. I perfectly realize that this post rather describes yet another kludge for Emacs Lisp and CL package; it doesn't look like something new, useful and valuable. Nevertheless, these alternative macro definitions of block and return-form will help us greatly in higher-level stuff which I hope to publish later.

Thank you for attention.