[plt-scheme] A Parameterizable version of Oleg's SXPath library



Oleg Kiselyov has mentioned on the SSAX-SXML list the possibility of 
creating a parameterizable version of his SXPath library which could support 
other representations of XML besides his SXML. (See 
http://www.geocrawler.com/archives/3/15235/2002/3/0/8107786/.)

I have implemented this idea using PLT Scheme's units (and modules), 
providing instantiations for SXML and my own "RS-XML" (based on structures 
rather than s-expressions). This implementation is based on Kirill Lisovsky's 
slightly extended "SXPathLib".

This new library is available at http://celtic.benderweb.net/sxpath/. It 
requires PLT Scheme v200 (and was tested using the v200alpha12 version).

In fact, I provide two versions. The first, "pure" version maintains the 
existing API quite closely.

The second version (nicknamed "ice") makes the following changes to the API:
- The attribute list and namespaces list are no longer "children" of an 
element. This means that (select-kids (node-typeof? '@)) will return an empty 
nodeset.
- New functions (node-attributes pred) and (node-namespaces pred) replace the 
practice of using (node-join (select-kids (node-typeof? '@)) (select-kids 
pred)) to match particular attributes (or namespace prefix bindings) of an 
element.
- Interestingly, despite this, the notation used by the sxpath function 
remains the same. (sxpath '(@ href)) will still return the href attribute of 
the elements in the nodeset being processed.
- Mostly for the convenience of RS-XML, I allow the use of a predicate to 
test the type of an element/attribute wherever a symbol can be used. For 
instance, (sxpath '(// http://www.w3.org/html:p)) can be replaced with 
(sxpath `(// ,h4:p?)), where h4:p? is a function which tests whether an 
element is the p element in the HTML namespace.
- Allowing predicates in this position requires a change in the sxpath 
notation: any other kind of procedure included in the path must now be 
wrapped as (! ,<some procedure>).
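
To make the first two changes concrete, here is a toy, self-contained sketch 
of the two conventions for an SXML-style element. It does not use the library 
itself: keep, tagged?, pure-kids, ice-kids and ice-attributes are hypothetical 
helpers written only for this illustration, and the example URL is made up. 
The real select-kids and node-attributes are curried and work on nodesets, so 
treat this only as a model of the behaviour described above.

```scheme
;; Toy model only: these helpers are assumptions for illustration,
;; not the definitions used in the SXPath distribution.

;; An SXML-style element: (name (@ (attr "value") ...) child ...)
(define link '(a (@ (href "http://example.org/")) "home"))

;; Build a predicate that is true for a pair whose head is the symbol tag.
(define (tagged? tag)
  (lambda (node) (and (pair? node) (eq? (car node) tag))))

;; Keep the items of lst that satisfy pred, preserving order.
(define (keep pred lst)
  (cond ((null? lst) '())
        ((pred (car lst)) (cons (car lst) (keep pred (cdr lst))))
        (else (keep pred (cdr lst)))))

;; "pure" convention: everything after the element name is a child,
;; including the (@ ...) attribute list.
(define (pure-kids pred elem)
  (keep pred (cdr elem)))

;; "ice" convention: the (@ ...) attribute list is not a child.
(define (ice-kids pred elem)
  (keep (lambda (n) (and (not ((tagged? '@) n)) (pred n)))
        (cdr elem)))

;; "ice" convention: attributes are reached through a separate accessor.
(define (ice-attributes pred elem)
  (let ((attr-lists (keep (tagged? '@) (cdr elem))))
    (if (null? attr-lists)
        '()
        (keep pred (cdr (car attr-lists))))))

;; Selecting the attribute list as a child finds it under "pure" ...
(define pure-result (pure-kids (tagged? '@) link))
;; ... but yields an empty nodeset under "ice".
(define ice-result (ice-kids (tagged? '@) link))
;; Under "ice" the href attribute is found through the accessor instead.
(define href-result (ice-attributes (tagged? 'href) link))
```

In this model pure-result holds the (@ ...) list, ice-result is empty, and 
href-result holds the (href ...) pair, which mirrors the difference between 
selecting '@ as a child and using a dedicated attribute accessor.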

For both versions (and both the SXML and RS-XML instantiations), a 
"regression test suite", closely following the one originally provided by 
Oleg, is included in the distribution. The tests also serve as illustrations 
of the various queries the library allows.

Although this code depends on PLT Scheme's units, it could easily be used as 
the basis for a port of the library to support MIT Scheme's XML library 
(based on records), or other representations of XML in Scheme.

The instantiation for RS-XML depends on my WebIt! collection (which includes 
the RS-XML libraries), available at http://celtic.benderweb.net/webit/.

Jim Bender