3 A Model of Expansion

7.4.0.4

3 A Model of Expansion

To better understand how macro expansion works in Racket, let’s approach it the same way as in a traditional programming-languages course: by building an interpreter for a simple Lisp.

3.1 An Interpreter

Here’s an initial interpeter:

lambda+let.rkt

Even in the case of implementing this interpreter, we can’t help using a domain-specific variant of Racket. The Plait language (as referenced by via #lang plait) is a statically typed language with polymorphism and type inference. The type system and semantics of Plait are very close to ML, but the surface syntax of Plait looks like Racket. That combination makes the language especially suitable for a programming-language course.

The language that the interpreter implements is the λ-calculus enriched with numbers, addition, multiplication, if0, and let. Let’s call this language Curly, since we use curly braces in place of parentheses to help distinguish the language:

‹exp›	::=	‹number›
	\|	‹id›
	\|	{ + ‹exp› ‹exp› }
	\|	{ * ‹exp› ‹exp› }
	\|	{ lambda { ‹id› } ‹exp› }
	\|	{ ‹exp› ‹exp› }
	\|	{ if0 ‹exp› ‹exp› ‹exp› }
	\|	{ let { [ ‹id› ‹exp› ] } ‹exp› }

Since Curly has no letrec form, a Curly programmer must use the Y combinator to implement a recursive function like fib:

{let {[Y {lambda {f}
           {{lambda {fX}
              {fX fX}}
            {lambda {fX}
              {f {lambda {x} {{fX fX} x}}}}}}]}
  {let {[fib
         {Y {lambda {fib}
              {lambda {n}
                {if0 n
                     1
                     {if0 {+ n -1}
                          1
                          {+ {fib {+ n -1}}
                             {fib {+ n -2}}}}}}}}]}
    {fib 30}}}

So, that kind of program is what the interpreter in lambda+let.rkt must evaluate.

Even if the Plait language is not familiar, you’ll recognize the structure of Plait code that interprets Curly programs. It starts with a datatype for representing Curly expressions:

(define-type Exp
  (numE [n : Number])
  (idE [s : Symbol])
  (plusE [l : Exp]
         [r : Exp])
  (multE [l : Exp]
         [r : Exp])
  (lamE [n : Symbol]
        [body : Exp])
  ....)

The value of a Curly expression can be either a number or a function, where a function is represented as closure of an expression and an environment:

(define-type Value
  (numV [n : Number])
  (closV [arg : Symbol]
         [body : Exp]
         [env : Env]))

The interp function takes a Curly expression and an environment, and it returns a value:

(define (interp [a : Exp] [env : Env]) : Value
  (type-case Exp a
    [(numE n) (numV n)]
    [(idE s) (lookup s env)]
    [(plusE l r) (num+ (interp l env) (interp r env))]
    [(multE l r) (num* (interp l env) (interp r env))]
    [(lamE n body) (closV n body env)]
    ....))

The one part of lambda+let.rkt that may be less familiar is the parse function, and we’ll look at that next.

3.2 A Parser

The job of the parse function in lambda+let.rkt is to take a Curly program in textual form (with curly braces) and produce its representation as an instance of the Exp datatype. Instead of actually parsing at the character level, however, the Curly parser takes advantage of Plait’s S-expressions. It also takes advantage of the fact that { and } can be used as synonyms for ( and ) when writing an S-expression.

An S-expression Plait is written with a backquote: `. After the backquote is a number, boolean, string, identifier, or parenthesized form. For example,

`10

is an S-expression that encapsulates the number 10, while

`apple

encapsulates the symbol 'apple. Although the first of those S-experssions has a number inside and the second has a symbol, both expressions have type S-Exp in Plait.

Th S-expression

`(lambda (x) 10)

which can be equivalently written

`{lambda {x} 10}

encapsulates a list. The list has three S-expressions inside: `lambda, `{x}, and `10.

Plait’s operations on S-expressions include a way to check whether an S-expression has a number, s-exp-number?, and a way to extract the number, s-exp->number:

> (s-exp-number? `apple)
- Boolean
#f
> (s-exp-number? `10)
- Boolean
#t
> (s-exp->number `10)
- Number
10

There are also s-exp-symbol?, s-exp->symbol, s-exp-list?, and s-exp->list:

> (s-exp-list? `{lambda {x} 10})
- Boolean
#t
> (first (s-exp->list `{lambda {x} 10}))
- S-Exp
`lambda

So, if we put a ` in front of the Curly implementation of fib shown earlier, then we have an S-expression that we can pass to parse. The parse function can use functions like s-exp-number?, s-exp-list?, and s-exp->list to recursively pull the S-expression apart and convert it to an Exp:

(define (parse [s : S-Exp]) : Exp
  (cond
    [(s-exp-number? s) (numE (s-exp->number s))]
    [(s-exp-symbol? s) (idE (s-exp->symbol s))]
    ; check for `{+ <exp> <exp>}:
    [(and (s-exp-list? s)
          (= 3 (length (s-exp->list s)))
          (s-exp-symbol? (first (s-exp->list s)))
          (symbol=? '+ (s-exp->symbol (first (s-exp->list s)))))
     (plusE (parse (second (s-exp->list s)))
            (parse (third (s-exp->list s))))]
    ....))

The check to recognize a `{+ ‹expr› ‹expr›} S-expression is especially tedious. Plait includes one more S-expression helper for parsing S-expressions: s-exp-match?. The s-exp-match? function takes two S-expressions, where the first one is used as a pattern and the second one is matched against the pattern. In the pattern S-expression, the special symbol 'ANY matches any S-expression, 'SYMBOL matches any S-expression, and so on, and non-special symbols must match literally. So,

(s-exp-match? `{+ ANY ANY} s)

is equivalent to

(and (s-exp-list? s)
     (= 3 (length (s-exp->list s)))
     (s-exp-symbol? (first (s-exp->list s)))
     (symbol=? '+ (s-exp->symbol (first (s-exp->list s)))))

Using s-exp-match? makes parse easier to understand:

(define (parse [s : S-Exp]) : Exp
  (cond
    [(s-exp-match? `NUMBER s) (numE (s-exp->number s))]
    [(s-exp-match? `SYMBOL s) (idE (s-exp->symbol s))]
    [(s-exp-match? `{+ ANY ANY} s)
     (plusE (parse (second (s-exp->list s)))
            (parse (third (s-exp->list s))))]
    [(s-exp-match? `{* ANY ANY} s)
     (multE (parse (second (s-exp->list s)))
            (parse (third (s-exp->list s))))]
    [(s-exp-match? `{lambda {SYMBOL} ANY} s)
     (lamE (s-exp->symbol (first (s-exp->list
                                  (second (s-exp->list s)))))
           (parse (third (s-exp->list s))))]
    ....))

Note that after recognizing the shape of a lambda form, pulling out the relevant parts is still a little tedious, so there’s room for an even better pattern-and-template language. But simple pattern matching is good enough for now.

The implementation of lambda+let.rkt is missing most of its tests, but it includes at least one test to show how parse works:

(test (parse `{let {[x {+ 1 2}]}
                y})
      (letE 'x (plusE (numE 1) (numE 2))
            (idE 'y)))

The tests for interp use parse for convenience:

(test (interp (parse `{+ {* 2 3} {+ 5 8}})
mt-env)
(numV 19))

3.3 Desugaring

The interpreter in lambda+let.rkt handles let as a core form, but a let form is trivially converted into a lambda and application form:

{let {[‹id› ‹rhs-exp›]}
‹body-exp›}
=
{{lambda {‹id›} ‹body-exp›}
‹rhs-exp›}

In other words, let is syntactic sugar that can be replaced with other forms that are already in the language.

Instead of having interp handle a letE form, we could instead have parse generate an appE plus lambdaE instead of letE. That way, only parse has to deal with let forms. In other words, parse can desugar source programs as it parses:

(define-type Exp ....) ; no letE case, anymore

(define (parse [s : S-Exp]) : Exp
  (cond
    ....
    [(s-exp-match? `{let {[SYMBOL ANY]} ANY} s)
     (let ([bs (s-exp->list (first
                             (s-exp->list (second
                                           (s-exp->list s)))))])
       (appE (lamE (s-exp->symbol (first bs))
                   (parse (third (s-exp->list s))))
             (parse (second bs))))]
    ....))

This desugaring approach is better than handling letE in the rest of the program, but it still duplicates information in the sense that it repeats the encoding of lambda and application in terms of appE and lambdaE. We could avoid that duplication by converting a let S-expression to an application and lambda S-expression and then recurring:

(define (parse [s : S-Exp]) : Exp
  (cond
    ....
    [(s-exp-match? `{let {[SYMBOL ANY]} ANY} s)
     (let ([bs (s-exp->list (first
                             (s-exp->list (second
                                           (s-exp->list s)))))])
       (parse (list->s-exp
               (list (list->s-exp
                      (list `lambda
                            (list->s-exp
                             (list (first bs)))
                            (third (s-exp->list s))))
                     (second bs)))))]
    ....))

This approach is conceptually nicer, but all the list constructions and list->s-exp coercions are painful.

3.4 Quasiquotation

Prefixing a parenthesized form with ` quotes it so that no evaluation takes place for sub-forms, even if they look like Plait expressions that could be evaluated.

> `(lambda (x) (+ 1 2))
- S-Exp
`(lambda (x) (+ 1 2))

If we want to evaluate a sub-form and treat its result as part of the enclosing S-expression, then we can escape back to Plait by using the unquote operator , as an escape. The result of an escaped expression must be an S-expression so that it can be part of the enclosing S-expression.

> `(lambda (x) ,(number->s-exp (+ 1 2)))
- S-Exp
`(lambda (x) 3)

Unquoting makes let desugaring much easier to implement and read, especially because nested parts extracted from an existing S-expression are already S-expressions:

(define (parse [s : S-Exp]) : Exp
  (cond
    ....
    [(s-exp-match? `{let {[SYMBOL ANY]} ANY} s)
     (let ([bs (s-exp->list (first
                             (s-exp->list (second
                                           (s-exp->list s)))))])
       (parse `{{lambda {,(first bs)} ,(third (s-exp->list s))}
                ,(second bs)}))]
    ....))

Here’s the updated interpeter:

lambda.rkt

Quasiquotation (i.e., quoting with unquoting to escape) is even more convenient in an untyped language like Racket where lists, symbols, and numbers can be treated directly as S-expressions without coercion. Here are some examples in Racket (not Plait):

> `(1 2 3 4)
'(1 2 3 4)
> `(1 ,(+ 2 3) 4)
'(1 5 4)
> (define (make-adder n)
`(lambda (x) (+ x ,n)))
> (make-adder 5)
'(lambda (x) (+ x 5))
> (make-adder `(current-seconds))
'(lambda (x) (+ x (current-seconds)))

Since plain Racket symbols and lists are S-expressions, Racket’s plain quoting operator ' (which does not support escapes) also creates S-expressions.

In addition to the unquote operator ,, both Racket and Plait support a splicing unquote operator ,@. A splicing unquote expects the escaped expression to produce a list, and it splices all of the individual elements of the list in place of the escape.

> `(1 ,@(list 2 3) 4)
'(1 2 3 4)
> `(1 ,(list 2 3) 4)
'(1 (2 3) 4)
> (define (make-adder* ns)
`(lambda (x) (+ x ,@ns)))
> (make-adder* `(5 (current-seconds)))
'(lambda (x) (+ x 5 (current-seconds)))

The ', `, ,, and ,@ operators are actually shorthands for quote, quasiquote, unquote, and unquote-splicing, respectively.

> '(quote apple)
''apple
> '(quasiquote apple)
'`apple
> '(unquote apple)
',apple
> '(unquote-splicing apple)
',@apple

3.5 Macros

We can build as much desugaring as we want into parse to create an ever richer Curly language. Or we can give programmers the tools to add those extensions themselves by adding macros to Curly. A macro is implemented by a macro transformer that takes an S-expression (such as the S-expression for a let form) and converts it to a different S-expression (such as an S-expression that uses application and lambda, instead). Instead of implementing that transformation in Plait as part of the Curly implementation, a macro transformer is implemented in Curly to be used on later Curly expressions.

Adding macros to Curly requires two sets of changes:

We must add list and symbol values and operations on them, so they can be used for S-expressions. Since Curly is untyped, we won’t have to distinguish S-expressions from lists, symbols, and numbers.
We must adjust parse to have an environment mapping names to macro transformers. It must also recognize a let-macro form, which maps a macro name to a transformer that is implemented in Curly; parse will need to directly call interp to evaluate the right-hand side of let-macro, and then parse can call itself again for the body of let-macro using an updated macro environment.

While quasiquoting is convenient, it’s not essential, so we’ll just add quote forms, which can be written either with quote or using the ' shorthand. Quoting provides a way to write symbol literals and the empty list, and it also supports more general literal lists and S-expressions.

‹exp›	::=	....
	\|	{ quote ‹s-exp› }
	\|	{ cons ‹exp› ‹exp› }
	\|	{ first ‹exp› }
	\|	{ rest ‹exp› }
	\|	{ empty? ‹exp› }
‹s-exp›	::=	‹number›
	\|	‹id›
	\|	{ ‹s-exp›* }

Each result value is now either a number, a closure, a symbol, or a list of values:

(define-type Value
  (numV [n : Number])
  (closV [arg : Symbol]
         [body : Exp]
         [env : Env])
  (symV [s : Symbol])
  (listV [l : (Listof Value)]))

For macro binding, we add let-macro to the grammar:

	‹exp›	::=	....
		\|	{ let-macro { [ ‹id› ‹exp› ] } ‹exp› }

The parse function will expect the first ‹exp› in let-macro to produce a function, but that constraint is not expressed in the grammar.

Putting these pieces together, here’s a Curly program that defines its own let macro, and then uses it to bind x:

{let-macro {[let {lambda {s}
                   {cons {cons 'lambda
                               {cons
                                {cons {first {first {first {rest s}}}}
                                      '{}}
                                {cons {first {rest {rest s}}}
                                      '{}}}}
                         {cons {first {rest {first {first {rest s}}}}}
                               '{}}}}]}
   {let {[x 5]}
     {+ x x}}}

One advantage of making Curly’s syntax a subset of Racket is that we can extract out just the transformer expression and try it as a plain Racket function:

> (let ([let-transformer
         ; same as above
         {lambda {s}
           {cons {cons 'lambda
                       {cons
                        {cons {first {first {first {rest s}}}}
                              '{}}
                        {cons {first {rest {rest s}}}
                              '{}}}}
                 {cons {first {rest {first {first {rest s}}}}}
                       '{}}}}])
    ; apply it to an S-expression
    (let-transformer
     '{let {[x 5]} {+ x x}}))
'((lambda (x) (+ x x)) 5)

The full implementation of Curly with macros is

let-macro.rkt

Here are the most relevant parts of parse:

(define (parse* [s : S-Exp] [env : Env]) : Exp
  (cond
    [(s-exp-match? `{let-macro {[SYMBOL ANY]} ANY} s)
     (let ([bs (s-exp->list (first
                             (s-exp->list (second
                                           (s-exp->list s)))))])
       (let ([proc (interp (parse (second bs))
                           mt-env)])
         (parse* (third (s-exp->list s))
                 (extend-env (bind (s-exp->symbol (first bs))
                                   proc)
                             env))))]
    ....
    [(s-exp-match? `{ANY ANY ...} s)
     (let ([rator (first (s-exp->list s))])
       ; try as macro...
       (try
        (parse* (apply-macro (lookup (s-exp->symbol rator) env)
                             s)
                env)
        ; ... fall back to application
        (lambda ()
          (if (= (length (s-exp->list s)) 2)
              (appE (parse* rator env)
                    (parse* (second (s-exp->list s)) env))
              (error 'parse "invalid input")))))]
     ....))

3.6 Exercises

Start with let-macro.rkt.

Fill in the let-macro-bound implementation of - in the following program so that it produces 3:
(interp
(parse
  `{let-macro {[- {lambda {s}
                    ....}]}
     {- 7 4}})
mt-env)
You can define - in terms of +, *, and -1. Note that this - cannot be defined as a function, since Curly supports only single-argument functions and application.

Having only cons instead of list makes the implementation of macros in Curly relatively difficult. Replace .... in the following test case with the implementation of a list macro so that the test passes:

(test (interp
       (parse
        `{let-macro {[let {lambda {s}
                            {let-macro {[list ....]}
                              {list
                               {list 'lambda
                                     {list {first {first {first {rest s}}}}}
                                     {first {rest {rest s}}}}
                               {first {rest {first {first {rest s}}}}}}}}]}
           {let {[x 5]}
             {+ x x}}})
       mt-env)
      (numV 10))

Hint: You can make a macro recursive by having one step of expansion produce an expression that refers to the macro again. For example, {list 1 2 3} could expand to {cons 1 {list 2 3}}.

Hint: Remember to use if0 to branch on the result of empty? in Curly.

Food for thought: Why is list defined inside the lambda for the implementation of let?

← prev up next →

1	Example Racket Languages
2	From Syntactic Abstraction to Language Construction
3	A Model of Expansion

3.1	An Interpreter
3.2	A Parser
3.3	Desugaring
3.4	Quasiquotation
3.5	Macros
3.6	Exercises