Basic ML Programming

In this section, we set out to conquer the basics of ML syntax. Our exegetical strategy will be to make a bottom-up pass through the grammar. In other words, we will start with simple things (constants, variables) and proceed to consider how more complex expressions are constructed from simpler ones. By the end we will have covered most of the ways in which an ML program can be built. We will follow the close association between the form of a program and its type. ML programming emphasizes the consideration of types, as we shall see.



How ML processes a program

ML compilers, at least the ones we will be using, have several ways in which they can be used.

The read-eval-print loop allows one to interact with the ML interpreter, entering ML expressions (including programs), and getting answers back. This mode is commonly used in program development. Our examples will primarily be presented by using the interactive system.

The batch compiler is like that for many compiled languages, generating object code files which can be linked together to obtain stand-alone executables.

Whether it resides in a file, or is cut-and-pasted into an interactive session, a program is processed as follows:

Note. Since ML is an expression-based language, the distinction between expressions and programs is blurred. Therefore even very simple expressions, like the expression 42 for example, count as programs and are dealt with as just described.


Basic Types and Constants

In the following table, we present a few common types and constants that belong to them. Some types have only a few elements, like bool with only true and false while others, like int, have many.

Each type provided by the ML implementation tends to have an associated structure of the same name, wherein various useful functions on that type are collected.

                  Sample
     Type         Constants                Structure
    -------------------------------------------------

     bool        true, false                 Bool
     int         ... ~1, 0,1,0xF, ...        Int
     real        1.0, 0.1, 3.32E5, ...       Real
     char        #"a", #"b", #"c", ...       Char
     string      "", "foo\n", "\127", ...    String
     word8       0w255, 0wxFF, ...           Word8
     unit        ()
     instream    stdIn                       TextIO
     outstream   stdOut, stdErr              TextIO
     time        zeroTime                    Time
The types bool, int, real, char, and string are just the same as those found in a typical programming language. The type word8 represents 8 bit bytes. There is also another type word found in structure Word which represents the type of n bit words, where n depends on the word size of the machine.

The unit type has just one element: (). It is useful when programming with imperative constructs.

For performing input and output, there are types of input streams instream and outstream, along with the usual pre-defined streams. These are found in the structure TextIO. Binary versions of these may be found in BinIO, at least in MoscowML.

For measurements, there is a time type, with a constant representing a starting time.


Variables, Binding, and Scope

Variables are never declared in ML. Instead, the act of binding attaches names to values. One can think of a binding as an assignment that can only happen once. A binding is made by
    val pat = exp
where a pattern pat is an expression built out of variables and constants. If the structure of pat matches that of exp, i.e., can be made identical by substituting for the variables of pat, then the equalizing substitutions get added to the current environment. In the following, x can obviously be made identical to 2:
    - val x = 2 
    > val x = 2 : int

    - val y = x 
    > val y = 2 : int
After the second binding, y and x have the same value, but they do not point to the same storage location. The next bindings show further pattern matching behaviour.
    - val 2 = x;
    -

    - val 3 = x
    ! Uncaught exception: 
    ! Bind
The last binding attempt fails because 3 is not equal to 2.

Another kind of pattern is wildcards, which are represented by an underscore. No binding is made to a wildcard position in a pattern. One can think of a wildcard as a variable guaranteed not to occur anywhere else in the program.

    - val _ = 2;
Variables and wildcards are basic patterns.We will see more complex patterns in due course.

Scoped bindings

The keywords let and local allow scope control. After both of the following two expressions, the values of x and y are inaccessible. After this last expression is evaluated, the value of z alone has been added to the current environment.


Tuples and Records

Tuples and records are used to gather data together. The following two evaluations bind t to a tuple and r to a record. As can be seen, the types of tuples and records are automatically computed from the types of the components.
    - val t = ("Dave", 1, true);
    > val t = ("Dave", 1, true) : string * int * bool

    - val r = {name="Dave", rank = 1, inClub=true};
    > val r = {inClub = true, name = "Dave", rank = 1} 
       : {inClub : bool, name : string, rank : int}
Elements are accessed via indices (tuples) or field names (records)
    #1 t = "Dave" : string
    #2 t = 1 : int
    #3 t = true : bool

    #name r = "Dave"
    #rank r = 1 : int
    #inClub r = true : bool
The syntax of patterns has tuple and record patterns. These allow simultaneous binding.
    - val (a,_,c) = t;
    > val a = "Dave" : string
      val c = true : bool

    - val {name, inClub, ...} = r;
    > val name = "Dave" : string
      val inClub = true : bool


Applications and functions

The type of functions from type ty1 to type ty2 is ty1 -> ty2.

A function application (M N) is an expression that applies function M to argument N. For example, the expression Char.chr 50 is the application of the function Char.chr, which has type int -> char, to the number 50.

   - Char.chr 50;
   > val it = #"2" : char
In general terms, if M has type ty1->ty2, and N has type ty1, then (M N) is well-typed and has type ty2.

The evaluation order of ML is call-by-value. This means that a function appliction (M N) is evaluated by evaluating M to yield a function f and evaluating N to a value x. Then f is run on x.

Replicated application associates to the left: (M N O P) is understood as (((M N) O) P). For this to be well-typed, M must have the type ty1->ty2->ty3->ty4, where ty1,ty2,ty3 are the types of N,O,P, respectively. If M has a type of such a shape, it is said to be a Curried function (after Haskell Curry).

An uncurried function has a single argument, which is a tuple or a record. If M above were uncurried, its application would be M (N,O,P).


Infixes


Pre-existing functions on basic types

  Type            Functions                 Structure
 ------------------------------------------------------

  bool        andalso, orelse, not             Bool
  int         ~,+,-,*,div,mod,<,<=,>,>=        Int
  real        (same as int)                    Real
  char        chr, ord                         Char
  string      explode, implode, sub, ^         String
  word        andb, orb, xorb                  Word
  instream    openIn, closeIn, input           TextIO
  outstream   openOut, closeOut, output        TextIO
  time        now, toSeconds, toMilliseconds   Time


Function expressions


Function binding


Polymorphic types


Lists

Standard functions on lists


Options


Datatypes

Tree traversals


More pattern-matching examples


Polymorphic values

Example

We will use quicksort for demonstration purposes. It's especially nice since it has a very compact implementation.


Abstract types


Reference types


References and polymorphism

The value restriction


Equality types


Exceptions

Example: file handling

In this example, we want to open a file designated by the given string. An empty string or a string that denotes an existing file (checked by FileSys.access) is disallowed.
    exception ERROR of string;   (* declare our exception *)

    fun open_file st =
      if st = "" orelse FileSys.access(st,[])
         then raise ERROR ("Unable to open file: " ^ st)
         else TextIO.openOut st

    val ostrm = open_file "foo" 
                handle ERROR s => take some appropriate action

Example: map a function over a list

This is the same as map except that any raised exceptions are handled and ignored.
    fun mapfilter f [] = []
      | mapfilter f (h::t) = 
         let val t' = mapfilter f t
         in 
           f h :: t' handle _ => t'
         end

Example: iteration to failure

In this example, we repeatedly invoke a function f on an argument until the function fails.
    (* repeat : ('a -> 'a) -> 'a -> 'a *)

    fun repeat f =
      let exception LAST of 'a
          fun loop x = 
           let val y = (f x handle 
                           Interrupt => raise Interrupt
                         | otherwise => raise (LAST x))
           in loop y 
           end
      in fn x => loop x handle (LAST y) => y
      end
In the following example, we use repeat to find the maximum element of the int type.
    - Mosml.time (repeat inc) 0;
    > User: 144.730  System: 0.010  GC: 0.000  Real: 144.886
    > val it = 1073741823 : int
Caution. The above example will not terminate on implementations of ML that implement int with bignums.


Konrad Slind, slind@cs.utah.edu