James Bigler: CS6620 Homework 3




Implement a bounding volume hierarchy for your raytracer and render 1 million spheres.



All images are 500x500 and include shadows. Timing was done on an Origin 3k with R12k (400 MHz) processors. I timed scenes with 10^3, 10^4, 10^5, and 10^6 spheres to get an idea of how the algorithms scale. Most of these runs used 10 processors, with pthreads providing the parallelism. The work time is the total time used by all the processors added up. Parallel scaling is near linear, so these numbers should be close to single-processor times.

Best results are achieved by creating the smallest bounding volumes. Sorting the objects only once, along a single axis, would leave many thin slices of data in each bounding volume. Instead, I rotated the sort axis on each pass: X on the first pass, then Y, then Z, and then repeating the pattern. This heuristic proved to perform well.
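The axis-rotating build described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the timings: the `Sphere`, `Box`, and `Node` types, the function names, and the choice to split at the median with one sphere per leaf are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical types for illustration only.
struct Sphere { float c[3]; float r; };   // center and radius
struct Box    { float lo[3], hi[3]; };    // axis-aligned bounds

struct Node {
    Box box;
    std::unique_ptr<Node> left, right;
    const Sphere* leaf = nullptr;         // set only on leaf nodes
};

// Axis-aligned bounding box of s[first, last).
Box bound(const std::vector<Sphere>& s, std::size_t first, std::size_t last) {
    Box b{{1e30f, 1e30f, 1e30f}, {-1e30f, -1e30f, -1e30f}};
    for (std::size_t i = first; i < last; ++i) {
        for (int k = 0; k < 3; ++k) {
            b.lo[k] = std::min(b.lo[k], s[i].c[k] - s[i].r);
            b.hi[k] = std::max(b.hi[k], s[i].c[k] + s[i].r);
        }
    }
    return b;
}

// Builds a hierarchy over s[first, last); requires last > first.
// Sorts on X at depth 0, Y at depth 1, Z at depth 2, then repeats.
// Leaves point into s, so s must not be resized while the tree is in use.
std::unique_ptr<Node> build(std::vector<Sphere>& s, std::size_t first,
                            std::size_t last, int depth = 0) {
    auto node = std::make_unique<Node>();
    node->box = bound(s, first, last);
    if (last - first == 1) {
        node->leaf = &s[first];
        return node;
    }
    const int axis = depth % 3;
    std::sort(s.begin() + first, s.begin() + last,
              [axis](const Sphere& a, const Sphere& b) {
                  return a.c[axis] < b.c[axis];
              });
    const std::size_t mid = first + (last - first) / 2;
    node->left  = build(s, first, mid, depth + 1);
    node->right = build(s, mid, last, depth + 1);
    return node;
}

// Counts leaf nodes; handy as a sanity check on the build.
std::size_t countLeaves(const Node& n) {
    if (n.leaf) return 1;
    return countLeaves(*n.left) + countLeaves(*n.right);
}
```

Calling `build(spheres, 0, spheres.size())` on a non-empty vector returns the root; `countLeaves(*root)` should then equal the number of spheres.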

This produced decent rendering results, but I was troubled by the scene creation time, which was growing superlinearly.
num prims    Building time    Render time
1000         0.0227           9.45
10000        0.371            14.3
100000       6.5              37.8
1000000      110              123
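One way to see why the build time grows faster than linearly is a rough cost model. It assumes (this is not stated in the write-up) that every level of the hierarchy re-sorts its objects with an O(N log N) comparison sort and then splits them into two equal halves, with c an unspecified constant:

```latex
T(N) = 2\,T\!\left(\tfrac{N}{2}\right) + c\,N \log N
\quad\Longrightarrow\quad
T(N) = \Theta\!\left(N \log^{2} N\right)
```

Under this model, ten times as many spheres costs more than ten times the build time, which matches the direction of the trend in the measurements above.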


This was fairly slow, so I optimized a few things and got faster times, but still the same poor scaling.
num prims    Building time    Render time
1000         0.00572          4.89
10000        0.127            10.9
100000       3.0              26.5
1000000      54               71.0


I knew there must be a better way to do this. After reading another classmate's page about his qsplit, I came to the same conclusion: it doesn't matter whether all the elements in the array are sorted, only that the elements in the first half are smaller than those in the second half. That is when Brian Budge pointed me to the nth_element function in the STL. This function does exactly what I needed (elements in the first half no greater than elements in the second half), but runs in O(N) time on average. Thus I could get close to linear scaling in build time while keeping the same rendering performance. This is nice! Here are some timing results. I could use some additional optimizations, but the numbers scale. Also note that the MIPSpro compiler CC did a better job of optimization than g++.
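As a minimal sketch of the partitioning idea, the function below splits a range of spheres at the median along one axis using `std::nth_element` from `<algorithm>`. The `Sphere` type and the names `splitAtMedian` and `partitionAll` are hypothetical, chosen for this example; only `std::nth_element` itself is a real library call.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical primitive for illustration: center and radius.
struct Sphere { float c[3]; float r; };

// Rearranges s[first, last) so that every sphere before the returned
// index has a center coordinate on `axis` no greater than that of every
// sphere at or after it. Neither half ends up fully sorted.
std::size_t splitAtMedian(std::vector<Sphere>& s, std::size_t first,
                          std::size_t last, int axis) {
    const std::size_t mid = first + (last - first) / 2;
    auto byAxis = [axis](const Sphere& a, const Sphere& b) {
        return a.c[axis] < b.c[axis];
    };
    // Average-case linear time, versus O(N log N) for a full sort.
    std::nth_element(s.begin() + first, s.begin() + mid,
                     s.begin() + last, byAxis);
    return mid;
}

// Recursively partitions the whole range, rotating the axis
// (X, Y, Z, X, ...) with depth, as a hierarchy build would.
void partitionAll(std::vector<Sphere>& s, std::size_t first,
                  std::size_t last, int depth = 0) {
    if (last - first <= 1) return;
    const std::size_t mid = splitAtMedian(s, first, last, depth % 3);
    partitionAll(s, first, mid, depth + 1);
    partitionAll(s, mid, last, depth + 1);
}
```

Apart from ties, a median split done this way puts the same objects in each half as a full sort would, so the resulting bounding volumes should match; only the order inside each half differs.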

Using CC -Ofast stl::nth_element

num prims    Building time    Render time
1000         0.00457          3.72273
10000        0.057749         9.22949
100000       1.07958          23.9744
1000000      14.91            61.9529


Using CC -Ofast stl::sort

num prims    Building time    Render time
1000         0.006191         3.81053
10000        0.099931         9.2021
100000       2.97277          24.2589
1000000      54.1093          63.8467


Using g++ -O3 stl::nth_element

num prims    Building time    Render time
1000         0.007505         7.0683
10000        0.099532         17.0937
100000       1.49093          41.997
1000000      19.3613          97.7004


Using g++ -O3 stl::sort

num prims    Building time    Render time
1000         0.010329         7.05071
10000        0.156536         17.0274
100000       3.96311          41.9395
1000000      66.4503          97.4918




Here are some images where you can see the bounding boxes. Notice that they are the same for either sorting method.
stl::nth_element | stl::sort



All these images were created using 16 jittered samples per pixel.

1,000 Spheres

10,000 Spheres

100,000 Spheres

1,000,000 Spheres
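For readers unfamiliar with the jittered sampling mentioned above, here is a minimal sketch. It assumes the 16 samples are laid out as a 4x4 grid of cells with one uniformly random point per cell; the write-up does not say how the samples were arranged, and the `Sample` type and `jitteredSamples` function are hypothetical names for this example.

```cpp
#include <random>
#include <vector>

// Hypothetical sample position inside a pixel, each coordinate in [0, 1].
struct Sample { float x, y; };

// Returns n*n jittered samples: the pixel is divided into an n-by-n grid
// and one uniformly random point is placed inside each cell.
// Samples are stored row by row, so index k belongs to cell
// (column k % n, row k / n).
std::vector<Sample> jitteredSamples(int n, std::mt19937& rng) {
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    std::vector<Sample> out;
    out.reserve(static_cast<std::size_t>(n) * n);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            const float x = (i + u(rng)) / n;
            const float y = (j + u(rng)) / n;
            out.push_back(Sample{x, y});
        }
    }
    return out;
}
```

With `n = 4` this yields 16 samples per pixel; a renderer would typically trace one ray through each sample and average the resulting colors.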


