Update: I had a small bug where I was interpreting the camera's distance parameter differently from everybody else. This made my Buddha slightly smaller in the image, which artificially lowered my running times. I fixed the problem, made another pass of optimizations, and lowered the times some more.

The code for this assignment is available here.
 
This image was rendered with 25 samples per pixel.
 
| System | Compiler | File Read Time (sec) | BVH Build Time (sec) | Render Time (sec) | Total Run Time (sec) |
|---|---|---|---|---|---|
| sinner | gcc 2.96 | 2.97 | 3.13 | 17.17 | 23.27 |
| P4 2.4 GHz, 512 MB | MSVC .NET | 4.14 | 2.83 | 16.89 | 23.86 |
| lab4-2.eng.utah.edu | gcc 2.96 | 3.32 | 4.16 | 18.70 | 26.17 |
| Athlon 1.53 GHz, 384 MB | MSVC 6.0 | 4.63 | 4.33 | 18.19 | 27.15 |
| labnix14 | gcc 2.96 | 4.03 | 6.06 | 23.43 | 33.52 |
| faith | gcc 2.95.2 | 6.69 | 12.25 | 51.60 | 70.54 |
| war | gcc 2.96 | 12.78 | 15.66 | 70.00 | 98.44 |
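
The timings above split each run into file read, BVH build, and render phases, with the total being their sum. As a rough illustration only, here is one way such per-phase wall-clock times could be collected in C++. This is a sketch, not the code used for this assignment: `PhaseTimes` and `time_phase` are hypothetical names, and `std::chrono` is a modern stand-in for whatever timer the original program used.

```cpp
#include <chrono>
#include <utility>

// Hypothetical container for the three phases reported in the table.
// All values are wall-clock seconds.
struct PhaseTimes {
    double file_read = 0.0;
    double bvh_build = 0.0;
    double render = 0.0;

    // Total run time is the sum of the three phases.
    double total() const { return file_read + bvh_build + render; }
};

// Runs `phase` once and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it never jumps backwards.
template <typename Fn>
double time_phase(Fn&& phase) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(phase)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Each phase would be wrapped in a call such as `times.render = time_phase([&] { /* render the image */ });`, and `times.total()` then gives the last column. Because each phase is rounded separately for display, a printed total can differ from the sum of the printed phases by 0.01.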