Take Home Problem: Meetamath2000

Chris Alfeld and Chad Barb

Introduction

Insect Pets Incorporated, or IPI, has a problem. You see, it all comes down to meetings, lots of meetings. Meetings about this and meetings about that. Meetings day, and sometimes nights. Meetings in the morning and meetings in the afternoon.

Actually, it's not the meetings that are the problem. No, the problem is that the new line of cockroach care products involves a lot of math, and this math inevitably comes up at meetings. Every time a new mathematical equation is discussed, the meeting halts as everyone thinks and scribbles furiously to arrive at the answer. Only at that point does the meeting resume. This is the problem.

IPI has brought together a team of engineers to investigate and solve their problem. After much contemplation the team decide that the best solution would be to have a computer, well known to be good at math, automatically evaluate any math expression that arises during a meeting. Since a fast computer would provide an answer nearly instantly, meetings would not be delayed at all.

You have the task of writing the heart of the product. The expression evaluator. Your task is to evaluate a mathematical expression and produce a numerical answer. A speech to text converter will convert the spoken expressions into a string of words. Even better, recent research on cockroach physiometrics has lead to a breakthrough in tonal analysis, and a tonal inflections processor will insert parenthesis where appropriate, which makes your job much simpler.

The Problem

Your program will take mathematical expressions as strings of words, parse the strings, evaluate the expressions, and output numerical results. To simplify your task we will provided you with the complete grammar you are required to support.

Here are the details of how your expression evaluator should work:

The grammar has been divided into 6 levels. Level 0 is already implemented in the sample code provided. Your task is to add support for the other five levels. Be aware that some levels are worth more points than others (see "Scoring" below) and some levels are much harder than others. Each level is a superset of preceding levels, so if you implement levels 1 and 2 and then try to implement level 4 without level 3, you will fail the level 4 test cases.

Limitations

Be aware of the following limitations:

A Note On Grammars

The notation used to describe the grammar that your program must understand is quite simple. The first line defines the top level expression to be decoded. It is always of the form "expr ::= ... ", where ::= means "is defined to be".

Words following the ::= may be in either bold or normal type. Bold words are those which will appear (literally) in expressions to be evaluated. Normal type means that the word is expanded further in a following line.

Most expansions have multiple options separated by vertical bars. A bar means "or", so any one of the options is possible for that construct.

See the Level 0 Grammar section below for an example. The first line says "expr ::= number operator number", which means that an expression is a number followed by an operator followed by a number. Both number and operator are further defined in following lines. The next line says "number ::= zero | one | ...", which means that a number could be "zero" or "one" or ... The final line says that an operator could be either "plus" or "minus".

One aspect of the notation that is potentially confusing is used in Level 3. The definition of an expression becomes "expr ::= number | expr operator expr | ( expr )". In this case, an expression is described in terms of an expression, which is a recursive definition. This is how one defines a situation where one expression can be nested inside another. For example, the expression "two times ( one plus three )" contains "one plus three" and "two", both of which are expressions themselves.

Level 0: The Starting Point

We have provided a skeleton program that implements Level 0. This is the starting point from which you will begin. Level 0 handles basic expressions of the form "A plus B" or "A minus B" where A and B are single digit numbers.

L0 Grammar

expr ::= number operator number

number ::= zero | one | two | three | four | five | six | seven | eight | nine

operator ::= plus | minus

L0 Examples

For all examples, each input line is followed by the output (in bold) on the following line.

one plus two

3

three minus four

-1

eight plus seven

15

bye

Level 1: Number Parsing

Level 1 is all about parsing numbers. Numbers can now be up to three digits long. Multiplication and modulo (remainder) operators have also been added.

L1 Grammar

expr ::= number operator number

number ::= short_number | long_number | negative short_number|
negative long_number

short_number ::= zero | one_digit

long_number ::= two_digit | three_digit

one_digit ::= one | two | three | four | five | six | seven | eight | nine

two_digit ::= ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | 
seventeen | eighteen | nineteen | tens_word | tens_word one_digit

tens_word ::= twenty | thirty | forty | fifty | sixty | seventy | eighty | ninety

three_digit ::= one_digit hundred two_digit | one_digit hundred one_digit | 
one_digit hundred

operator ::= plus | minus | times | mod 

L1 Examples

five plus twenty three

28

four hundred eighty two mod eight

2

eight times nine

72

twenty minus five hundred

-480

eleven times negative two

-22

thirteen mod zero

DIVISION BY ZERO

bye

Level 2: Multi word operators

Level 2 simply adds a few operators that require multiple words.

L2 Grammar

All of L1 Grammar with the following change:
operator ::= plus | minus | times | mod | divided by | to the power of

L2 Examples

six to the power of two

36

nine divided by three

3

twelve divided by five

2

eight divided by zero

DIVISION BY ZERO

two to the power of four

16

bye

Level 3: Complex Expressions

Now it's getting harder. In this level we introduce sub-expressions. Instead of just a number as an operand you can have an entire sub-expression that may include parentheses. Note that a number with no operator is also a valid expression.

Recursion would be one way to deal with this, and a recursive solution would get you most of the points. For extra style points though you might try to find a more efficient non-recursive solution.

The other challenge introduced here is precedence of operations. Expressions like "four plus five times eight" are now possible. The precedence of operation is as follows:

()

to the power of

times, divided by, mod

plus, minus

Within a group, evaluation is from left to right (like C++).

You will never (at this point) have a case where there are multiple possible answers. There will always be enough parentheses to make the answer definite.

L3 Grammar

Exactly like level 2 except for the following change:

expr ::= number | expr operator expr | ( expr )

L3 Examples

one plus two times three

7

one plus two minus three

0

( four to the power of two ) divided by ( ten minus two )

2

two to the power of ( two to the power of ( four minus ( two to the power of two ) ) )

2

seventy two

72

bye

Level 4: Mathematical Functions

Level 4 adds a different form of operators. The brand new operators are logarithms and roots.

The new order of precedence is:

()

logs and square root

exponentials (to the power of, squares, and cubes)

times, divided by, mod

plus, minus

Partial credit will be given if you implement some but not all of the functions. Note that lg means log to the base 2 and that log with no specified base means the natural log.

L4 Grammar

Same as L3 Grammar with the following change:

expr ::= number | expr operator expr | ( expr ) | log base expr of expr | 
square root of expr | square of expr | expr squared | cube of expr | 
expr cubed | log expr | lg expr

L4 Examples

log base two of sixteen

4

square root of sixteen

4

( four plus three ) squared

49

lg eight

3

log base ( five times two ) of ( ten to the power of three )

3

bye

Level 5: The Really Hard Part

If you have finished all four levels this is your chance to be creative and clever. There is no new grammar for level 5. Rather you may now encounter tokens/words that are not part of the grammar. In addition there may be tokens/words missing. In many cases it will be impossible for you to determine a single answer. It is allowable to output a list of possible answers with the most likely answer first.

Scoring this section will be in large part subjective and will depend on how well thought out your technique is and how well it does in determining probable answers.

If you can not find any potential answers output "UNKNOWN". You should also output this if you find over 10 potential answers.

L5 Examples (answers may vary from your program)

four pls two

6

four two

6 2 8 16 0

what is the answer to five minus three

2

too to the power of to

4

the quick brown fox jumped over the vast lazy dog

UNKNOWN

ate

8

log base two four

2

eighteen is the answer to nine times two

18

bye

Hints and Suggestions

Don't panic. Yes Levels 3 and above are difficult. However, fully half of the test case points are for level one alone.

If you get to level five and have no idea how to proceed, try focusing on the cases where an expression is part of an input line (ex. "what is four plus two"). This is much more manageable problem and a very useful feature.

Scoring

Scoring will be on a 150 point range. 100 of these points will be from test cases, the other 50 are a more subjective evaluation by a judge as described below.

Test Case Points

Test case points are on an absolute scale. To get all the points you must pass every test case.

Other Points

Documentation is your README file (required!) and the comments in your source code. Every file should have a comment at the top describing the purpose and contents of the file. Every function should be commented with its purpose, arguments, return code, side effects, assumptions, and any caveats. Comments should appear in the code whenever they clarify (not to repeat what the code says).

Style is what your code looks like. This is proper indenting, good variable and function names, clear code, clear expressions, good modularity, etc.

Efficiency. These points are for how well your program runs in terms of speed and memory usage. The average will be 5 points with less going to inefficient algorithms and data structures and more going to those that took special care to tune their code. CAVEAT: very cryptic code rarely makes great gains in performance and is never worth it. Aim for the big picture first when tackling efficiency.

What to Turn In

Contact Info

Past history has indicated to us that our take-home problems aren't always bug-free. :-( We will keep you posted on this and any other announcements related to the contest via the contest web page.

If you have any questions about any of this, send email to hspc@cs.utah.edu. You could leave a message with Sandy Hiskey in the CS Office, 581-8224, and we'll get back to you as soon as possible.

We also have an email list of team advisors. If you would like to have additional people on that list, please let us know.