[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Hello from a new user interested in shipping applications



Noel-

Thanks for the great pointers and suggestions.

Certainly improving MzScheme numerical performance automatically by
analyzing the program would be fantastic. I'm not sure I'm personally up
to spending time doing this though (given limited time).

However, Squeak does translation of Smalltalk to C and I think the type
hinting it uses is not that bad. What follows is inspired in part by
that. 

One of the things that appeals to me in Scheme (over say Python or
Smalltalk) is that it is somewhat more straightforward to process Scheme
programs with hints or assumptions. 

For example, if I write this in MzScheme and save it in the file:

  (define (foo a b) (* a b))
  (define (bar a b) (/ a b))
  (define (baz a b) (+ (foo a b) (bar a b)))
  (define (main) (print (baz 10 20)))
  (main)

there is little to then stop me from easily parsing the input file as a
list and then running my own operations on it to generate C code with an
assumption that all arguments are floats unlest specifed otherwise, to
get:

  float foo(float a, float b) {return a * b;}
  float bar(float a, float b) {return a / b;}
  float baz(float a, float b) {return foo(a, b) + bar(a, b);}
  void main(void) {printf("%f", baz(10, 20));}

Then I can almost trivially have a list of variable names to treat
otherwise.
Ex. '((int p q r) (string s t u) (float-array l m n))
Obviously, you need to work in a subset of MzScheme to do this (no "car"
or "cdr" use for example) but in practice the core numerical parts of
something like a simulation are often mainly loops, tests, array
references, and numerical operations. [Note: I did this conversion by
hand.]

This is very different from compiling these functions to code that would
work generically with MzScheme, where a and b might be strings or lists
or objects.
To optimize that sort of code, you definitely would need a system for
inferring types.

Obvious limitations of this approach are that it won't neccesarily give
the same exact results under MzScheme -- for example, MzScheme prints
"401/2" as the answer to the above, whereas the C program prints
"200.500000". They are equivalent mathematically in this case, but might
not be in others.

The important point to make here is that because the algorithm is
encoded in MzScheme, there are all sorts of interesting things that are
easy to do with it (analyze interdependencies, variable useages, browse
it, generate assembler code directly, etc.). C is a much more difficult
informational container to extract algorithm related information from,
because you need to have a C parser, etc.

-Paul Fernhout
Kurtz-Fernhout Software 
=========================================================
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator
http://www.kurtz-fernhout.com

Noel Welsh wrote:
> 
> Paul Fernhout writes:
> > On Numerical performance. This does appear to be an issue for further
> > consideration. It looks like I would then need to improve this area
> > myself
> > (similar to the library for Gambit or NumPy for Python) if I was to use
> > MzScheme for numerical work. Or alternatively, I could generate C code
> > to do number crunching from Scheme with type hints. Either of these
> > might be doable with a reasonable effort (perhaps leveraging off the
> > approach done for say Python's NumPy) -- assuming the other parts of
> > the system work well enough for my purposes.
> 
> I've just looked at NumPy, and it appears that they've written some C
> code to do the basic matrix operations in reasonable time, and then
> interfaced to the Netlib (http://www.netlib.org/) libraries for all
> the hard algorithms. You could certainly follow this approach in
> MzScheme, and get respectable performance.
> 
> More interesting is the second approach you mention - generating C
> code with(out) type hints. I think almost all the basic research has
> been done to get great numerical performance in Scheme. The following
> pointers may help:
> 
> - Andrew Wright's soft scheme system for infering types (no need for
> type hints. See
> http://www.intertrust.com/star/wright/personal-index.html) This would
> be doubly useful if a distinction is made between fixnums and bignums
> 
> - Unboxing of floats (e.g: http://www-fp.dcs.st-and.ac.uk/~kh/papers/pseudoknot/subsubsection3_5_6_4.html)
> 
> - It should be possible to avoid most index out-of-bounds checks by
> inducing the range of index variables. I don't know if anyone has done
> this. I suppose it would require adding invariant properties to the
> type system, so that the compiler could deduce any variable in the
> range [0 (vector-length)) doesn't require an out-of-bounds check.
> 
> Finally, I read somewhere that researchers in Aspect-Orientated
> Programming (http://www.parc.xerox.com/csl/projects/aop/
> http://www.acl.lanl.gov/iscope97/papers/lamping.html
> http://www.diku.dk/research-groups/topps/activities/PartialEvaluation.html)
> developed a system that collapsed operations on arrays into a single
> loops. Eg, if one is doing something like Ax + b, the multiplication
> and addition are collapsed into one loop without the programmer having
> to write special purpose code. The good thing is, this work was
> originally done in Scheme. "All are academic projects, with promising
> results so far. The Scheme systems are the most mature."
> 
> cya,
> Noel
> --
> Noel Welsh
> http://www.dai.ed.ac.uk/~noelw/   noelw@dai.ed.ac.uk