next up previous
Next: YOUR PROBLEMS Up: No Title Previous: THE STORY SO

DETAILS

You will be provided with a prototype of the program together with some sample binary images containing the string ``1234567890'' in several distinct fonts (see Figures 1-3). A binary image is basically a two-dimensional array containing only ones and zeros. Ones represent the black and zeros represent the white parts of the image. Note that the images of the harder fonts are not very smooth -- they have dots which shouldn't be there. These dots are called ``noise''. Noise is produced when a low resolution optical scanner (like a ZIP code reader) is used.

  
Figure 1: Easy fonts: CURIER.TXT, HELVICA.TXT, GENEVA.TXT

  
Figure 2: Harder Fonts: ATHENS.TXT, CHICAGO.TXT, VENICE.TXT

  
Figure 3: Really Hard Fonts: WINDSOR.TXT, SF.TXT, LONDON.TXT

The prototype program will be able to read a binary image file from the disk and separate each of the 10 digits into its own ``bounding box.'' You are to write code that will analyze the subimage in each bounding box and arrive at a conclusion as to which digit is depicted there. The current version of the program prints out several possibilities for some digits, since it is not capable of distinguishing all digits from one another. Your talk is to enhance the prototype so that it only prints out a single option.

You may use whatever means to arrive at your solution. However, we suggest a method called Likelihood Distribution. The idea is to start with all possible answers and find one answer which is more probable than the rest. So, in our ZIP code problem, let us first assume that a digit is equally likely to be a 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9. Let us now compute some ``properties'' of the image. Then, for each property we can decide which digits are more likely to have it and which digits are not. For instance, if we know that the image has a single hole in it, then digits 0, 4, 6 or 9 are more likely the right answer than digits 1, 2, 3, 5, 7 or 8. After we have processed all of our properties we will get a likelihood distribution of all possible answers. Then the answer simply becomes the one with the highest overall likelihood. Of course, if only a few properties are being used, you might not get a single right answer. So, using this method you must add more and more properties until you get a single (and hopefully right!) answer for each case.

The prototype program provided already contains two very useful properties; these properties are the number of ``holes'' in the digit (actually, one minus the number of holes, which is called an Euler Number) and whether or not the digit contains a vertical line. By adding additional properties you will make your program more and more ``intelligent.'' But of course, for this to work, your choice of properties has to be intelligent as well!

One useful property you may wish to try is counting the number of times a digit hits the left and the right side of the bounding box. You will find that this technique can differentiate between quite a few different digits.

Other things you might want to add to the program are preprocessing and postprocessing. Preprocessing is doing something to the image before you start computing your properties. The function ImproveImage is an example of preprocessing. Postprocessing is doing something after you have computed your guesses. For example, even after processing all of your properties, some digits may still not give a single answer. Here you may use the fact that all of the digits are in the same font and use the already recognized digits to help decide on the others.



next up previous
Next: YOUR PROBLEMS Up: No Title Previous: THE STORY SO



Alyosha Efros
Sat Oct 26 14:39:38 MDT 1996