[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

MzScheme Special Form -> Native Numerical Code?



Inspired by Noel's suggestion of special forms for numerical code:

DrScheme, MrEd, and MzScheme are great as they are.
  http://www.cs.rice.edu/CS/PLT/packages/drscheme/index.html
But, it would sure be nice if they could generate native numerical code
to enable users to get quicker numerical results from numerical
simulations.

The philosophy of using special forms for numerical code which Noel
pointed out (as used in Gambit Scheme) is somewhat akin to the Python/C
philosophy. There the idea is use Python for glue and do the heavy
lifting in C called from Python. An example of this in supercomputing is
here:
  http://www.python.org/workshops/1997-10/proceedings/beazley.html
A MzScheme special form translator approach might instead use general
MzScheme for glue code (or numerical steering), and do the heavy
numerical lifting in special restricted forms of MzScheme functions
designed for intensive floating point computations. The advantage over
the Python approach is then one is doing all development in one language
-- MzScheme.

Going a bit further, with a simple cross-platform-ish assembler and call
approach in MzScheme, one would then getting near C speeds from the
special numerical forms without the need to depend on a C compiler. And,
for most numerical code, there is little value in optimizing more than
the inner computational loops. I'm happy if numerical performance is
within 2X-5X of C, which I think might be doable with this approach. 

Obviously, while it is easy to generate C code, what I'd really love
is MzScheme generating platform specific assembler/Forth/binaries for
the numerical operations which it can dynamically call. Only 68030+,
PPC, and 80486+ is needed to be useful enough for shipping most
Mac/PC/Linux apps... I understand the convenience of depending on a C
compiler, but I'd love to eliminate that so as to make an easily
installable system for people without one. But first I have to have a
sensible subset of Scheme numerical operators, perhaps with some more
easily translatable constructs [probably for looping or array indexing]
added as well. And it would help if there was a platform independent 
  (call-native-code someSortOfByteArray) 
function. I know this could be done indirectly through a loadable code
module (like a DLL) but I think there is a lot to be said for calling
code directly.

I have an article squirreled away somewhere on how to do
dynamic machine code translation with Delphi (translating simple parsed
code to native code which is jumped to) and it was quite neat and
relatively straightforward for mainly numerical code under the 80X86.
This process can be automated somewhat by generating unoptimized machine
code for simple known C expressions and then reverse engineering the
machine code.
(This is to an extent how the Squeak Just-In-Time compiler works...)

If someone can point out code that already does this (assembler,
call-native-code, other numerical related optimizations) which could be
easily ported to or integrated with MzScheme, I'd be much obliged. There
must be something out there with a compatible license. I'm not looking
for gcc -- I'd really like a primarily Scheme-only solution. 

A much easier approach would be to integrate MzScheme with a Forth with
good numerical performance and native code generation, like GForth, 
  http://www.complang.tuwien.ac.at/projects/forth.html
to do the numerical heavy lifting. (Forth implementations are generally
much smaller in footprint than gcc, and are generally easier to
understand). Unfortunately, GForth is under the GPL so there are some
licensing issues if the core is to be kept LGPL (maybe it could be
installed as a service? although that defeats my goal of tight
integration). Also, while I think it runs under the PPC, there isn't a
Mac version I'm aware of. However, in general, translation of a special
numerical form of MzScheme to Forth would be very easy. However, Forth
isn't usually optimized for numerical performance.

By the way, the educational motivation for generating and calling native
code from MzScheme is that then TeachScheme! can then be easily extended
into classes for advanced students on low level machine architecture.
Students could learn how a real processor works by poking at it through
MzScheme interface functions. Need to add a new language level then --
"crash and burn horribly". :-) But as I've heard it said, "If you can't
crash it, you're not the one doing the driving..."

And also keeping in line with the learning notion, to take TeachScheme! 
    http://www.cs.rice.edu/CS/PLT/Teaching/
to the next level, learning Scheme can become a gateway into a world of
modeling -- where students learn about subject materials by building
their own models. This is going beyond just teaching logical thinking --
to writing Scheme programs to aid in understanding specific real content
and specific real processes. 

The point is not to make great models -- just that the students learn a
lot by designing even poor models. Seymour Papert 
  http://papert.www.media.mit.edu/people/papert/
suggests this, as do others like David Ferguson (coauthor of "Learning
to Design, Designing to Learn: Using Technology to Transform the
Curriculum").
  http://www.cvc.sunysb.edu/CELT/profiledir.html
  http://www.spir.sunysb.edu/bio/ferguson_david.html
(David Ferguson was one of my professors in grad school, who along with
Prof. Thomas Liao, Prof. Lev Ginzburg, and others inspired me and my
wife to develop educational simulations.)

The reason I bring this up in this context is that if one goes down this
educational route, good numerical performance is a big plus in an
educationally-oriented Scheme, because many (not all) interesting models
use lots of floating point computation to simulate systems with emergent
properties (i.e. gas molecules colliding to cause pressure, water
percolation through soil layers, structural integrity of meshes, etc).
  
Here is an example of the sort of numerical code I would want to
translate from Scheme to numerical binaries (taken from our garden
simulator so it is currently in Delphi Pascal). (Anyone know of a Object
Pascal -> MzScheme translator?) This computes a simulated relative
humidity based on a triangular probability distribution involving lower
and upper limits and a mean.
(I picked it because it was short... Other code has loops.)
======================================
class function EP.RelativeHumidityForDay_frn(meanRelHumForDay_frn:
single): single;
  var
    upperLimit: single;
    lowerLimit: single;
  begin
  try
  upperLimit :=
EQ.RelHumTriangularDistUpperLimit_frn(meanRelHumForDay_frn);
  lowerLimit :=
EQ.RelHumTriangularDistLowerLimit_frn(meanRelHumForDay_frn);
  result := Utils_TriangularDistribution(lowerLimit,
meanRelHumForDay_frn, upperLimit);
  result := result * (safediv(meanRelHumForDay_frn, (upperLimit +
meanRelHumForDay_frn + lowerLimit) / 3.0));
  if (result >= 1.0) then result := 0.99;
  except on e: Exception do result := errorMessage('Exception in
EP.RelativeHumidityForDay_frn: ' + e.message); end;
  end;
====================================
Assuming the function it calls were also easily translatable, I would
think this could be converted fairly easily to reasonably quick
Scheme-generated machine code even without a lot of fancy optimizations
done by most C compilers.

-Paul Fernhout
Kurtz-Fernhout Software 
=========================================================
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator
http://www.kurtz-fernhout.com

Noel Welsh wrote:
> 
> Here are the links to the AOP research that describes the `loop
> collapsing':
> 
> http://www.stanford.edu/~amitp/Java/amgoyal/aop.html
> http://www.parc.xerox.com/csl/groups/sda/publications/papers/Kiczales-ECOOP97/for-web.pdf
> 
> There is a package called SERIES for Common Lisp that appears to do
> the same sort of thing:
> 
> http://series.sourceforge.net/
> 
> All of the above is Lisp so should be eminently stealable and
> hackable. It is possible that doing this kind of smart programming
> would get back the losses from a rather simple minded
> compiler. Anything that is better than Matlab will get my attention!
> 
> -------
> 
> Type hinting:
> 
> The way Gambit does type hinting (which I think is the same as Common
> Lisp) is with the declare special form.
> 
> http://www.iro.umontreal.ca/~gambit/doc/gambit-c.html
> 
> "special form: declare declaration...
> 
> This form introduces declarations to be used by the compiler(currently
> the interpreter ignores the declarations). This form can only appear
> where a define form is acceptable. Declarations are lexically scoped
> in the same way as macros. The following declarations are accepted by
> the compiler:
> 
> [snip]
> 
> (number-type primitive...)
>             Numeric arguments and result of the specified primitives
> are known to be of the given type (all primitives if none
> specified). number-type can be: `generic', `fixnum', or `flonum'."
> 
> -------
> 
> Re Paul's simple translation program:
> 
> Take a look at the pattern matching system in MzLib (match.ss). I find
> it makes this sort of code a lot easier.
> 
> E.g.
> 
> (define (print-infix-call-arg arg)
>   (if (list? arg)
>       (print-call arg)
>       (print arg)))
> 
> can be written as
> 
> (require-library "match.ss" "mzlib")
> 
> (define print-infix-call-arg
>   (match-lambda
>     [(arg . rest) (print-call arg)]
>     [arg (print arg)]))
> 
> Note square brackets [] are equivalent to parentheses ()
> 
> DrScheme comes with a parser system called Zodiac, described best in
> the DrScheme Extension Manual. I suspect its is over-kill if you only
> want to compile a limited subset of Scheme to C (but maybe the
> vocabularies will help?)
> 
> cya,
> Noel
> --
> Noel Welsh
> http://www.dai.ed.ac.uk/~noelw/   noelw@dai.ed.ac.uk