Utah High School ACM Programming Contest

2004 Take-Home Problem

Crypto-Analysis System

Frequently Asked Questions


  1. I was wondering what SDK for Java will be used at the programming contest and what SDK should we use to program the take home project?
  2. ANS: You should use Sun's JDK 1.4. You should *not* use new Java language features that are present only in Java 1.5 beta. We will *not* be using a JDK 1.5 beta.

  3. Can I safely assume that an "I" in the middle of a sentence will be correctly capitalized. For example:
          Where am I?
    
    is allowable, and
          Where am i?
    
    is not?
  4. ANS: Simple answer: No.

    Complicated answer: Capitalization is sometimes good for clues, but it is not an iron-clad predictor of particular words. For example, consider the following sentence:

            The letter 'i' is the ninth letter of the English alphabet.
    
    Similarly, capitalization is sometimes used for emphasis:
            Suddendly Maria yelled, "LOOK OUT!  HE'S GOT A GUN!"
    
    Once encrypted with a rotation or substitution cipher, the word "A" could be represented by any capital letter. In this case, your program would be wrong to assume that all one-letter capitalized words within a sentence are encrypted versions of the word "I".

    In sum, your program should use rules that look at all the words in a sentence to judge whether or not a key is the right key. A good decryption program will inevitably need to deal with words that it doesn't recognize in any case; weird one-letter words are just a special case of those unrecognized words.

  5. Since the take-home has a max runtime limit, can you provide the specs for the machine, or will we be able to get ssh capability to a computer of equivalent speed and power so that we may test the speed of the program on it?
  6. ANS: I'm not sure which machines we'll use, but they'll be reasonably fast. The best way to avoid problems with the time limit is to implement a fast search strategy, by using lots of information about English words in order to find the most likely ciphers quickly. A well-planned search might need several seconds on a difficult puzzle, but in general, a "smart" program on a relatively modern machine should not run anywhere close to the time limit.

  7. If we find some really cool code on the web that solves this problem, can we copy it into our solution?
  8. ANS: No. Absolutely not! That would be plagiarism. You can search the web for ideas, and you may incorporate those ideas into your solution, but you must write your own code to do that.

  9. Will all substitution encryptions be 'well-formed'? I.E., A can not map to A, B can not map to B, etc.
  10. ANS: No; you cannot assume that no letter will map to itself.

    As described in the project details, we define that there are 26 unique rotation ciphers and 26! (26 factorial) unique substitution ciphers. These counts include ciphers in which some or all letters map to themselves.

  11. May we store the words in your dictionary in an external text file to be read in on program initialization?
  12. ANS: Yes, you may have your program read an external dictionary file. If you do this, remember the following:

    We would suggest that your external dictionary not be *very* large, for two reasons. First and most important, having a *very* large dictionary is likely not to be worth the cost: don't waste your time assembling a huge dictionary of uncommon words. Second, reading and processing a very large dictionary file will make your program start more slowly, which makes it more difficult for you test your program thoroughly, and leaves less time for your program to do a good job searching for the right cipher.

    In sum, we do not recommend that you use an external dictionary file, but you may if you wish. If you do, be sure to keep the above points in mind.

  13. Assuming I can input my dictionary from a file, would it be against the rules to OUTPUT to a file information concerning word frequencies to be used by subsequent runs?
  14. ANS: Yes, you may do this, but we *seriously* advise against it for three reasons:

    Our advice is that your program be predictable and as easy to understand as possible. Trying to "learn" words dynamically is a clever idea, but doesn't move your program toward those goals.

  15. Will 'partial' credit be given for a response that has some or most, but not all words correct?
  16. ANS: Yes.

  17. Will the actual test set be composed similar to the practice set with about half shorter and half longer problems? Will there be any specific subject area?
  18. ANS: It is hard to answer these questions precisely, but in sum: our "test set" will resemble the provided "practice set," but it will not be the same.

    As described in the project details, some of the scoring will be based on your program's performance on selected examples from the practice set. We will also evaluate your program on cryptograms that are not in the practice set. Some of them will be short and some of them will be long. Some of them will be about subject areas like those in the test set, and some of them will not.

    We will test your program on as many cryptograms as we can during the grading period. We cannot say exactly how many, but it won't be more than your program is required to be able to get through during the testing period (i.e., one puzzle every 30 seconds).

  19. In both the rotation and substition versions of 0212 in the sample set, my program (running in Windows) has problems with the cent character.
  20. ANS: Thank you for reporting this! It was a bug: the cryptograms should contain ASCII characters only, and the cent sign is not an ASCII character. This was fixed in the March 9 version of the test suite.


    Utah High School Programming Contest