CRA-W Distributed Mentor Project

Project Journal

Weeks 1 & 2

During the first two weeks, I got settled in and was introduced to my project and some new people. My fellow intern for the Distributed Mentor Project (DMP) is Rachel. There are some other undergrad interns in the lab, but they are here through a different program. They are Larissa, Nausheen, Sabina, and Scott. We all live in the same building on campus. Rachel and I are even in the same apartment. There are several grad students here in the graphics lab, but I knew most of them already because I was here at the University of Utah last summer.

Work started out with the usual administrative stuff: getting computer accounts set up, setting up a workstation, obtaining an access card for the lab, etc. Once that was all taken care of, I started on my project. It has to do with P vs. NP and phase transitions. I will soon post more information about it on my project page. As a starting point, I was assigned to read the paper "Where the Really Hard Problems Are" by Peter Cheeseman, Bob Kanefsky, and William M. Taylor.

Week 3

This week I read a lot of papers that built on the results found in "Where the Really Hard Problems Are." I also presented the paper to a group of graphics grad students and professors. Dr. Cohen helped me out with my presentation. She went over it with me beforehand and gave me tips about the look of my slides, what sort of information to include or leave out, and when to say it. It was really helpful.

In working on my project, I am trying to find as many pre-made components as possible. For example, I found a number of satisfiability (SAT) solvers readily available on the internet. Years of research have gone into creating fast, accurate SAT solvers, and the ones that are available are much better than what I could come up with in my time here. I am in the process of trying to find a Hamiltonian Circuit (HC) solver, but have had no luck so far. I am also looking for instance generators for the SAT and HC problems but expect that I may have to write them. It shouldn't be too hard, but I will have to figure out a way to make sure that I don't generate any duplicate test cases and mess up our results.

In the mean time, I have chosen a primary SAT solver, zChaff, to use on our project. It is one of the fastest SAT solvers out there, and it is complete. (A complete solver is one which will return a solution if one is possible, given enough time. Some sovers use heuristics to facillitate the search but occasionally conclude that no solution exists when one really does.) The zChaff solver only works on one problem at a time, so I had to write a script in order to run jobs in batch. It is the first script I have ever written, and I am kind of proud of it. The script takes as an argument the name of a directory of test instances. It then runs the solver on each problem instance in the directory and outputs all of the results to a result file. It does all sorts of error checking, like checking if the directory actually exists and if there are really any test cases in the directory. And if the result file already exists, it asks the user whether to overwrite it. It's kind of cool. (Okay, so it's kind of cool to me. Remember, this is my first script.)

Week 4

Well, I finally found a Hamiltonian Circuit (HC) solver. It took me forever. At one point, I didn't think I'd be able to find a real HC solver, and so I started looking for Travellng Salesman Problem (TSP) solvers. I actually found one that looked really good, but then it turned out it used a Linear Program (LP) solver that I don't have installed. Then I spend a while looking for an LP solver. Well, to make a long story short, I finally gave up on the TSP search, went back to the HC search, and finally found one. It actually includes a graph generator, but the graph generator has a few problems.

Now I need to write a graph generator. I started on it on Friday, but I didn't get very far. That will be my major project for next week.

Week 5

I spent all of this week coding up the graph generator. I managed to finish most of it. There are just a few command line options that I need to finish writing.

Week 6

I finally got my graph generator written. The program started out easily enough. But I got slowed down near the end of the week while working on file I/O. In any case, it is now done, and I am running a batch of tests over the weekend.

Of course, before I could run the tests, I had to generate the graphs. (That is what the graph generator program was for, after all). So I started making graphs, but it was taking a long time. I began to wonder, "If it is taking hours just to create these graphs, how long is it going to take to run them all?" I did some calculations and finally estimated that it would take 144 days! I had to drastically reduce the number of graps I was creating. That included deleting most of the graphs that I had already generated. In any case I was finally able to pare it down to about 4 days worth of tests.

Week 7

My tests did not finish running over the weekend. I am starting to get worried because they are still not done. I finally figured out that I had better put a time limit on my test cases when I saw that one of them had taken 23 hours to finish. So I put on a time limit of three hours on the remaining test cases. In any case, my workstation is working on the remaining cases nearly 24 hours a day, and I have started harnessing Rachel's workstation at night.

We are trying to secure some time on a cluster of machines to run the Satisfiability (SAT) problems. That would be good, because I can't spend a week waiting for all of those results to come in, too.

While my computer was busy running these tests, I was busy coding. This week I worked on a program to convert the results in the ouput files from the HC tests into a format that the volume vizualizer will understand. I finished most of it. I managed to consolidate all of the data from the separate result files into a single file. Now all that's left is to put the data into an 3-dimensional array, and output it to a binary file. I also coded up a SAT instance generator. It was a lot easier than I thought it would be, and only took two days.

Week 8

This week I finished the data conversion program for the Hamiltonian Circuit (HC) results. I may even get to use it if my tests ever finish running (stupid hard cases running on my stupid slow 195 MHz computer). When I finished that program, I started writing a data conversion program for the Sasisfiability (SAT) problem results. We haven't even started running the SAT tests yet, but I figure we're going to need the program eventually, so I might as well get it out of the way.

Speaking of the SAT tests, Bruce has been talking with someone and managed to get some time on the machine cluster upstairs. Yay! Mac, that's the someone Bruce talked to, and I have discussed the cluster issue a little bit, and we are working out how and when to run my tests. It looks like it will be sometime next week.

Week 9

Hooray!! My tests finally finished running! They finished on Monday, and I ran my data conversion programs on the result files. On Tuesday I took the data to Gordon, who knows all about the volume rendering software. We looked at one of the datasets, and it was not exactly what I expected. It looked sort of similar, but it was a lot more lumpy than I had pictured.

On Tuesday evening, I got an e-mail from Mac saying that a lot of machines, or nodes, in the cluster had suddenly freed up. So if I had something ready to run, now was the time to do it. Well, I had my SAT generator and the SAT solver ready, and I had scripts to run both of them separately. But I didn't have anything that coordinated the generator and the solver. The problem is that we can't generate all of the test cases at one time, because that would take somewhere between 100 and 200 GB! So we need to generate a few test cases, run the solver, save the results, delete the test cases, and generate the next batch.

As quickly as I could, I whipped up a script to coordinate the SAT generator and solver. Then I ran up to see Mac. He allocated 16 nodes for the project, edited my scripts a little, and did a whole bunch of other stuff to get the project started. We ran into problems along the way, and he ended up staying until 9:30pm. At that point, we figured we had it mostly done, so I stayed to finish getting a few bugs out of my stuff. As usual, I ran into more problems than expected. In particular, I had a problem with permissions on the nodes. Mac e-mailed me a command to use to fix the problem, but the node somenow kept "forgetting" that I had set the permission. So I figured I would just set it over and over in my script. At that point things finally worked, and I left the lab at 2:30am.

So, to make a long story a little less long (it's too late to make it short), the next day I got up to find an e-mail from Mac saying that by calling the permission command so frequently, and on 16 nodes at once, I had bogged down a file server. They had to reboot my nodes to kill my processes and free up the file server for other people. Oops. :-}