
6 Patterns and Templates

While quasisyntax is usually a lot better than list and datum->syntax to construct syntax objects, it doesn’t help at all for parsing the input syntax object to a transformer. The syntax-parse form from the syntax/parse library provides much better support for both parsing a syntax object and constructing a new one. It’s derived from pattern-based macro tools that started in Scheme.

6.1 Introducing syntax-parse

The basic form of a syntax-parse expression is

(syntax-parse stx-exp
  [pattern result-exp]
  ...)

In a pattern, identifiers generally stand for “any syntax object,” and the identifier is bound as a pattern variable in the result-exp. The value of a pattern variable is the part of the stx-exp syntax object that it matched. A pattern variable cannot be referenced directly; it can only be used in a template, which is written with #' or syntax.

The syntax form cooperates with forms like syntax-parse to support pattern variables. Until this point, we have used syntax in a more primitive way. The actual primitive form, which does not support pattern variables, is called quote-syntax.

> (syntax-parse #'(one two three)
    [(a b c) #'a])

#<syntax:eval:2:0 one>

> (syntax-parse #'(one two three)
    [(a b c) #'c])

#<syntax:eval:3:0 three>

> (syntax-parse #'(one (two zwei (dos)) three)
    [(a (b1 b2 (b3)) c) #'b3])

#<syntax:eval:4:0 dos>

A template doesn’t have to contain just a single pattern variable. It can refer to multiple pattern variables, and anything that isn’t a pattern variable is kept as-is.

> (syntax-parse #'(one two three)
    [(a b c) #'(alpha (a b) beta ((c) b) a gamma)])

#<syntax:eval:5:0 (alpha (one two) beta ((three) two) one gamma)>

When there are multiple patterns, syntax-parse looks for the first match.

> (syntax-parse #'(one two three)
    [(a b) #'a]
    [(a b c) #'b]
    [(a b c d) #'d])

#<syntax:eval:6:0 two>

The identifier _ is special, because it matches anything without binding a pattern variable.

> (syntax-parse #'(one two three)
    [(a _) #'a]
    [(_ b _) #'b]
    [(_ _ _ d) #'d])

#<syntax:eval:7:0 two>
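As a small sketch of first-match selection combined with _, syntax-parse can also be used in an ordinary run-time function by requiring syntax/parse without for-syntax. The helper name shape-of is hypothetical and exists only for illustration.

```racket
#lang racket/base
(require syntax/parse)

;; shape-of is a hypothetical helper: it classifies a syntax object by
;; the first clause whose pattern matches. Each _ matches any syntax
;; object without binding a pattern variable.
(define (shape-of stx)
  (syntax-parse stx
    [(_) 'one-element]
    [(_ _) 'two-elements]
    [(_ _ _) 'three-elements]
    [_ 'something-else]))   ; a bare _ matches anything, so this is a catch-all

(shape-of #'(one two three))
(shape-of #'lonely)
```

Because clauses are tried in order, the catch-all clause has to come last; placed first, it would shadow every other clause.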

Another special identifier is ~literal, which wraps an identifier within a pattern to make it match a literal identifier, instead of treating the identifier as a pattern variable.

> (syntax-parse #'(one two three)
    [(_ (~literal two) c) #'c]
    [(a (~literal zwei) _) #'a])

#<syntax:eval:8:0 three>

> (syntax-parse #'(ein zwei drei)
    [(_ (~literal two) c) #'c]
    [(a (~literal zwei) _) #'a])

#<syntax:eval:9:0 ein>
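The following sketch uses ~literal with identifiers that have bindings, here + and * from racket/base. The function name operator-name is hypothetical. It relies on the documented behavior that ~literal compares identifiers by binding rather than by spelling alone.

```racket
#lang racket/base
(require syntax/parse)

;; operator-name is a hypothetical helper that recognizes a binary
;; application of + or * by matching the operator position literally.
(define (operator-name stx)
  (syntax-parse stx
    [((~literal +) _ _) 'addition]
    [((~literal *) _ _) 'multiplication]
    [_ 'unknown]))

(operator-name #'(+ 1 2))
(operator-name #'(* 3 4))
(operator-name #'(plus 1 2))
```

Without ~literal, a pattern such as (+ _ _) would treat + as a pattern variable and match any three-element form.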

6.2 Using syntax-parse

Let’s revisit the time-it example, but with syntax-parse.

#lang racket/base
(require (for-syntax racket/base
                     syntax/parse))
 
(define (time-thunk thunk)
  (let* ([before (current-inexact-milliseconds)]
         [answer (thunk)]
         [after (current-inexact-milliseconds)])
   (printf "It took ~a ms to compute.\n" (- after before))
   answer))
 
(define-syntax (time-it stx)
  (syntax-parse stx
    [(_ expr)
     #'(time-thunk (lambda () expr))]))
 
(time-it (+ 1 2))

Besides being easier to read than our earlier implementations, this version of time-it reports a helpful error message if it is wrapped around more than one expression.
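To see the error reporting without triggering a compile-time failure, the same pattern can be run in an ordinary function. This is only a sketch: expand-time-it and try-expand are hypothetical helpers, and the exact wording of the message depends on the Racket version, so it is not reproduced here.

```racket
#lang racket/base
(require syntax/parse)

;; expand-time-it is a hypothetical run-time stand-in for the time-it
;; transformer: same pattern, same template.
(define (expand-time-it stx)
  (syntax-parse stx
    [(_ expr)
     #'(time-thunk (lambda () expr))]))

;; try-expand returns the expansion as a datum, or the error message
;; as a string when the pattern does not match.
(define (try-expand stx)
  (with-handlers ([exn:fail:syntax? exn-message])
    (syntax->datum (expand-time-it stx))))

(try-expand #'(time-it (+ 1 2)))          ; an expansion
(try-expand #'(time-it (+ 1 2) (+ 3 4)))  ; an error message string
```

A pattern failure in syntax-parse is raised as a syntax error, which is why the handler checks exn:fail:syntax?.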

6.3 Ellipses

A syntax-parse pattern can include ... to mean “any number of repetitions.” It applies to the sub-pattern that immediately precedes the ...:

> (syntax-parse #'(one two three)
    [(_ ...) 'ok])

'ok

> (syntax-parse #'(one (two) (three))
    [((_) ...) 'first]
    [(_ (_) ...) 'second])

'second

> (syntax-parse #'((one) (two zwei dos) (three drei tres troi))
    [((_ ...) ...) 'ok])

'ok

When a pattern variable is repeated by ..., then the pattern variable is bound to a sequence of matches, instead of just one match. When the pattern variable is referenced in a template, a ... must appear after the reference to consume the sequence of matches. At least one sequence-bound pattern variable must appear before ... in a template, and anything else in the sub-template before ... is duplicated for each element of the sequence.

> (syntax-parse #'(one (two) (three))
    [(_ (b) ...) #'(pre b ... post)])

#<syntax:eval:13:0 (pre two three post)>

> (syntax-parse #'(one (two) (three))
    [(_ (b) ...) #'((pre b post) ...)])

#<syntax:eval:14:0 ((pre two post) (pre three post))>
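A sub-pattern under ... can bind several pattern variables at once, and each can then be repeated separately in the template. The sketch below rewrites a let-like form into an immediately applied lambda; the names let->lambda and my-let are hypothetical and used only for illustration.

```racket
#lang racket/base
(require syntax/parse)

;; let->lambda is a hypothetical helper. The pattern [name val] ...
;; binds both name and val to sequences; the template then uses
;; name ... and val ... in different places.
(define (let->lambda stx)
  (syntax-parse stx
    [(_ ([name val] ...) body ...)
     #'((lambda (name ...) body ...) val ...)]))

(syntax->datum
 (let->lambda #'(my-let ([x 1] [y 2]) (+ x y))))
;; the resulting datum is ((lambda (x y) (+ x y)) 1 2)
```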

Since the run macro is supposed to allow any number of arguments, ellipses are needed for its pattern form:

(define-syntax (run stx)
  (syntax-parse stx
    [(_ proc arg ...)
     #`(run-program (symbol->string 'proc)
                    (symbol->string 'arg) ...)]))

While this variant of the macro is convenient to write, it defers symbol->string to run time. To perform the conversion at compile time, it makes sense to mix pattern matching and quasisyntax:

(define-syntax (run stx)
  (syntax-parse stx
    [(_ proc arg ...)
     #`(run-program #,(symbol->string (syntax-e #'proc))
                    #,@(map symbol->string
                            (map syntax-e
                                 (syntax->list #'(arg ...)))))]))

Another way to write run is to use syntax-parse’s #:with directive. A #:with is written within one pattern–result clause, and it is followed by two forms: a pattern and an expression. The pattern is matched against the result of the expression, and any pattern variables bound by the pattern can be used in the remainder of the clause.

(define-syntax (run stx)
  (syntax-parse stx
    [(_ proc arg ...)
     #:with proc-str (symbol->string (syntax-e #'proc))
     #:with (arg-str ...) (map symbol->string
                                (map syntax-e
                                     (syntax->list #'(arg ...))))
     #`(run-program 'proc-str 'arg-str ...)]))

Unlike quasisyntax, which uses something like #'here as the first argument to datum->syntax to coerce to a syntax object, #:with uses #f. It turns out that binding matters even for strings when they are in expression position, so the above variant of run compensates by using an explicit quote around each string.
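To inspect what the #:with variant produces, the same clause can be run as an ordinary function and its result converted to a datum. This is a sketch only; expand-run is a hypothetical helper, and run-program is never called here because the template is only constructed, not evaluated.

```racket
#lang racket/base
(require syntax/parse)

;; expand-run is a hypothetical run-time stand-in for the run
;; transformer that uses #:with.
(define (expand-run stx)
  (syntax-parse stx
    [(_ proc arg ...)
     ;; each #:with expression yields a plain string (or list of
     ;; strings), which #:with coerces to syntax before matching
     #:with proc-str (symbol->string (syntax-e #'proc))
     #:with (arg-str ...) (map symbol->string
                               (map syntax-e
                                    (syntax->list #'(arg ...))))
     #'(run-program 'proc-str 'arg-str ...)]))

(syntax->datum (expand-run #'(run ls -l)))
;; the datum is (run-program (quote "ls") (quote "-l"))
```

The strings already appear in the expansion, so the symbol->string conversion happens when the clause runs rather than when the generated code runs.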

6.4 Syntax Classes

Although our latest variants of run report a good error in some cases, such as when no program and arguments are provided, the error message is bad if a program or argument is anything other than an identifier. To make syntax-parse provide a good error message in those cases, we can annotate the pattern variables with the syntax class id, which matches only identifiers. Annotate a pattern variable with a syntax class by using : followed by the syntax class without spaces around :.

(define-syntax (run stx)
  (syntax-parse stx
    [(_ proc:id arg:id ...)
     #`(run-program #,(symbol->string (syntax-e #'proc))
                    #,@(map symbol->string
                            (map syntax-e
                                 (syntax->list #'(arg ...)))))]))

The id syntax class is predefined, but syntax-parse enables programmers to define new syntax classes with define-syntax-class. Syntax classes not only specify constraints on pattern matching, but they can also synthesize attributes; they work in general as composable parsers. We don’t have time to get into those details here, though.
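As a brief taste, and only a minimal sketch, a syntax class can bundle a sub-pattern under a name and expose its pattern variables as attributes. The names binding and binding-names below are hypothetical.

```racket
#lang racket/base
(require syntax/parse)

;; binding is a hypothetical syntax class that matches a two-element
;; form: an identifier followed by an expression.
(define-syntax-class binding
  #:description "binding pair"
  (pattern [name:id rhs:expr]))

;; Annotating b with the class makes its attributes available as
;; b.name and b.rhs inside templates.
(define (binding-names stx)
  (syntax-parse stx
    [(_ b:binding ...)
     (syntax->datum #'(b.name ...))]))

(binding-names #'(my-let [x 1] [y 2]))
;; the result is the list (x y)
```

The #:description string is what syntax-parse uses to describe the expected form when a term fails to match the class.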

6.5 Simple Macros

Many macros are simple like time-it: they have one pattern followed immediately by the result template. For those cases, syntax/parse/define provides define-simple-macro. With define-simple-macro, the pattern is incorporated into the definition header (so that the underlying syntax-object argument is not named), and no #' is needed before the result template.

#lang racket/base
(require syntax/parse/define)
 
(define (time-thunk thunk) ....)
 
(define-simple-macro (time-it expr)
  (time-thunk (lambda () expr)))
 
(time-it (+ 1 2))

Note that syntax/parse/define is required without for-syntax, since it binds define-simple-macro for run-time definition positions—even though define-simple-macro’s work is at compile time.
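Syntax-class annotations work in a define-simple-macro header as well. The sketch below assumes that syntax/parse/define makes the id syntax class available at compile time, which is also what the first exercise relies on; swap! is a hypothetical macro chosen for illustration.

```racket
#lang racket/base
(require syntax/parse/define)

;; swap! is a hypothetical macro: the :id annotations reject anything
;; other than identifiers, and the body is the result template.
(define-simple-macro (swap! a:id b:id)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

(define x 1)
(define y 2)
(swap! x y)
(list x y)   ; x and y have exchanged values
```

Macro hygiene keeps the tmp introduced by the template distinct from any tmp at the use site.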

6.6 Exercises

  1. Define define-five with run-time warnings as a pattern-based macro using define-simple-macro.

    Probably unlike your earlier define-five, this one should provide a good error message if the form after define-five is not an identifier.

  2. Define define-five with compile-time warnings as a pattern-based macro.

    Hint: You may not want define-simple-macro. Or you can use the #:do pattern directive; see Pattern Directives.

  3. Define a (result .... where ....) form that has local bindings written after an expression that uses them. For example,

    (result (+ x y)
            where
            (define x 1)
            (define y 2))

    should be equivalent to the following. (Using an empty let is an idiomatic Racket way of creating a nested definition context.)

    (let ()
      (define x 1)
      (define y 2)
      (+ x y))

    The result macro should require literally the word where after the result expression. The result form should also allow exactly one result expression but any number of definitions.