
Re: Document formatting in Scheme (fwd)



Hello!

I believe you may be interested in Oleg Kiselyov's comments on the subject.
I forwarded the initial message of the thread to him; please find his reply
below.

Best regards,
        Kirill.

---------- Forwarded message ----------
Date: Fri, 26 Oct 2001 19:34:29 -0700 (PDT)
From: oleg@pobox.com
To: lisovsky@acm.org
Subject: Re: Document formatting in Scheme


Hello!

I checked the digest of the PLT Scheme list. Alas, the
discussion thread about Document formatting isn't indexed
yet.  Still, I wrote a message about SXML, which, I
hope, is on the topic of the thread. If it really is, can you
forward/post the message? 

BTW, what Noel Welsh writes in the first part of his message has
already been implemented: it's Mole.

>>>> Comment on Noel's message
Document formatting in Scheme, or SXML-> (HTML XML LaTeX)

> (manual	(title "scmunit")
>     (introduction "scmunit is a unit-testing framework for...")
> 		(chapter 1 "blah blah"))

This *IS* an XML document -- in the abstract syntax, to be precise. In
the spirit of XML, you espouse logical markup and its transformations
to particular presentation formats. S-expressions, DOM trees and
syntax-heavy XML documents are three different realizations of a
hierarchy of containers made of strings and other containers
(Infoset). Unlike DOM trees, S-expressions and XML documents both have
an external representation.

One of the benefits of SXML is that you can convert the same document
into various presentation formats -- for example, HTML or LaTeX
[sic!]. In fact, the SXML specification itself was authored in SXML
and then converted to a web page, a LaTeX document and then to a
PostScript file.
	http://pobox.com/~oleg/ftp/Scheme/xml.html#SXML-spec
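
To make the "one document, several presentation formats" point concrete,
here is a minimal sketch in plain Scheme. It is not the converter used for
the SXML specification: the names 'render', 'sxml->markup' and 'sxml->latex'
are hypothetical, attributes are ignored, no character escaping is done, and
the LaTeX output assumes a macro exists for every tag.

```scheme
;; Hypothetical mini-converter, not the SSAX/SXML library API.
;; A node is a string, a number, or a list (tag child ...).
(define (render node open close)
  (cond
    ((string? node) node)
    ((number? node) (number->string node))
    (else
     (let ((tag (symbol->string (car node))))
       (string-append
        (open tag)
        (apply string-append
               (map (lambda (child) (render child open close))
                    (cdr node)))
        (close tag))))))

;; XML-like output: <tag>children</tag>
(define (sxml->markup doc)
  (render doc
          (lambda (tag) (string-append "<" tag ">"))
          (lambda (tag) (string-append "</" tag ">"))))

;; LaTeX-like output: \tag{children}
(define (sxml->latex doc)
  (render doc
          (lambda (tag) (string-append "\\" tag "{"))
          (lambda (tag) "}")))

(define doc '(manual (title "scmunit") (chapter "blah blah")))

;; (sxml->markup doc)
;;   => "<manual><title>scmunit</title><chapter>blah blah</chapter></manual>"
;; (sxml->latex doc)
;;   => "\\manual{\\title{scmunit}\\chapter{blah blah}}"
```

The same tree walk serves both targets; only the two small functions that
open and close an element differ.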

I do want to point out that SXML code is far more expressive than the
corresponding LaTeX code. For example, you can write in SXML:
 
        (verbatim "some code")
        (table
         ...
         (tr
          (td (@ (colspan "2") (align "center"))
            (br) (n_)
            (verbatim
             "    A long line of indented code"))))
 
This piece of code shows off a table cell that spans several rows and
several columns. Yet this piece of code will be correctly rendered in
HTML and in TeX. LaTeX users will appreciate the expressiveness,
because they cannot use 'verbatim' within a tabular
environment. Multi-row table cells also require some effort to appear
right. Yet in SXML you can use 'verbatim' anywhere you need to,
whether inside or outside of tabular environments. In SXML, even
complex tables can be "typeset" in a straightforward manner.

Of course, a better example of _logical_ markup is the following
excerpt from SXML.scm. It specifies the *TOP* element of SXML:

   (productions
     (production 1
       (nonterm "TOP")
       ((term-lit "*TOP*")
        (ebnf-opt (nonterm "namespaces"))
        (ebnf-* (nonterm "PI"))
        (ebnf-* (nonterm "comment"))
        (nonterm "Element"))))


When converted to HTML, the above fragment becomes

<table border=0 bgcolor="#f5dcb3">
<tr valign=top><td align=right><a name="prod-1">[1]</a>&nbsp;</td>
<td align=right><code>&lt;TOP&gt;</code></td>
<td align=center><code> ::= </code></td>
<td align=left><code><em>*TOP*</em> &lt;namespaces&gt;? &lt;PI&gt;* 
  &lt;comment&gt;* &lt;Element&gt;</code> </td></tr>
</table>

In LaTeX, it doesn't look much better:

\begin{tabular}{rrcp{2.8in}}
{[}1{]} & \texttt{<TOP>} &  $::=$ & \texttt{\textit{*TOP*} <namespaces>? <PI>* <comment>* <Element> } \\
\end{tabular}
\\

IMHO, the SXML form is the most lucid. It was also easier to type
(especially given suitable Emacs key bindings, similar to those
designed for LAML). 

Using SXML to express its grammar has another important advantage: I
can easily write a transformation or an SXPath query over the whole
SXML.scm to make sure that every 'nonterm' mentioned on the right-hand
side of some production appears on the left-hand side of exactly one
production. The SXML.scm code thus lends itself not only to flexible
presentation to a human but to formal reasoning about it as well.
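
The consistency check described above can be sketched without SXPath, in
plain Scheme. This is an illustration only: the helper names are
hypothetical, the production shape (production N (nonterm "X") rhs) is taken
from the excerpt above, and production 2 of the toy grammar is made up for
the example.

```scheme
;; Collect the names of all (nonterm "name") forms inside a tree.
(define (nonterm-names tree)
  (cond
    ((not (pair? tree)) '())
    ((eq? (car tree) 'nonterm) (list (cadr tree)))
    (else (apply append (map nonterm-names tree)))))

;; A production looks like (production N (nonterm "LHS") rhs).
(define (production-lhs p) (cadr (caddr p)))
(define (production-rhs p) (cadddr p))

;; How many times does name occur in the list names?
(define (count-of name names)
  (let loop ((ns names) (k 0))
    (cond ((null? ns) k)
          ((string=? name (car ns)) (loop (cdr ns) (+ k 1)))
          (else (loop (cdr ns) k)))))

;; Return the right-hand-side nonterminals that are NOT defined by
;; exactly one production, each listed once.
(define (ill-defined-nonterms grammar)
  (let* ((prods (cdr grammar))
         (lhs (map production-lhs prods))
         (rhs (apply append
                     (map (lambda (p) (nonterm-names (production-rhs p)))
                          prods))))
    (let loop ((names rhs) (bad '()))
      (cond ((null? names) (reverse bad))
            ((or (= 1 (count-of (car names) lhs))
                 (member (car names) bad))
             (loop (cdr names) bad))
            (else (loop (cdr names) (cons (car names) bad)))))))

;; Toy grammar: "PI" is used but never defined.
(define toy-grammar
  '(productions
     (production 1 (nonterm "TOP")
       ((term-lit "*TOP*") (ebnf-* (nonterm "PI")) (nonterm "Element")))
     (production 2 (nonterm "Element")
       ((term-lit "name") (ebnf-* (nonterm "Element"))))))

;; (ill-defined-nonterms toy-grammar)  =>  ("PI")
```

An empty result means every nonterminal used on a right-hand side is
defined by exactly one production.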

As a matter of fact, the ability to treat SXML as "code" or as
"data" is used during some SXML transformations. The expansion process
of some tag can re-scan the SXML document, with a different
stylesheet. That's how you can implement hierarchical tables of
contents. Unlike LaTeX, you don't need to write auxiliary files and
don't need to re-run the document processor.
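
Here is a rough sketch of that re-scanning idea, again in plain Scheme and
not the actual SXML transformation code. Everything in it is an assumption
made for illustration: a "stylesheet" is an association list from tag to
handler, the tags 'toc', 'chapter' and 'section' are hypothetical, and the
first child of a chapter or section is taken to be its title.

```scheme
;; A stylesheet maps a tag to (lambda (children doc) string).
;; Handlers receive the whole document, so they can re-scan it.
(define (transform node sheet doc)
  (if (string? node)
      (let ((h (assq '*text* sheet)))
        (if h ((cdr h) node doc) node))
      (let ((h (or (assq (car node) sheet) (assq '*default* sheet))))
        ((cdr h) (cdr node) doc))))

(define (transform-all nodes sheet doc)
  (apply string-append
         (map (lambda (n) (transform n sheet doc)) nodes)))

;; Stylesheet 1: keep only titles, indented by nesting level.
(define (toc-entry indent)
  (lambda (kids doc)
    (string-append indent (car kids) "\n"
                   (transform-all (cdr kids) toc-sheet doc))))

(define toc-sheet
  (list (cons 'chapter (toc-entry "  "))
        (cons 'section (toc-entry "    "))
        (cons '*text* (lambda (s doc) ""))
        (cons '*default*
              (lambda (kids doc) (transform-all kids toc-sheet doc)))))

;; Stylesheet 2: the body. Expanding (toc) re-scans the whole
;; document with toc-sheet, in the same pass.
(define (heading kids doc)
  (string-append (car kids) "\n"
                 (transform-all (cdr kids) body-sheet doc) "\n"))

(define body-sheet
  (list (cons 'toc
              (lambda (kids doc)
                (string-append "Contents\n"
                               (transform doc toc-sheet doc))))
        (cons 'chapter heading)
        (cons 'section heading)
        (cons '*default*
              (lambda (kids doc) (transform-all kids body-sheet doc)))))

(define doc
  '(manual (toc)
           (chapter "Intro" "Hello. ")
           (chapter "Usage" (section "Install" "Run it. "))))

;; (transform '(toc) body-sheet doc)
;;   => "Contents\n  Intro\n  Usage\n    Install\n"
;; (transform doc body-sheet doc) renders the contents followed by the body.
```

Because the document is ordinary data, the handler for 'toc' simply walks it
a second time; nothing is written to disk and nothing is run twice.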

Cheers,
     Oleg.