Project 1

Image Mosaicing

For ease and speed of implementation, I used my nrrd library for reading in the images, as well as the correspondence point data. Images are either two-dimensional (grayscale) or three-dimensional (color) arrays, so nrrd is a natural fit for them. Putting the correspondence point data into array form is also relatively straightforward, as will be discussed below. The nrrd library relies in turn on biff (for error reporting) and on air (basic utility functions). The source for these libraries is in /home/gk/usr/local/src; the #includes are in /home/gk/usr/local/include, and the libraries themselves are in /home/gk/usr/local/lib.

This description of the functionality will start with the lowest level and work its way up to the high-level calls which perform useful work. As described in the previous section, an SVD calculation is needed to determine the warp between two overlapping images. The SVD code comes from Numerical Recipes. Because its SVD call (svdcmp) uses two-dimensional arrays which are 1-based instead of 0-based, I had to write a wrapper function (mossSVD) which allocates and initializes the 1-based arrays based on the 0-based input data. Another wrapper around this (mossPseudoInverse) calculates the pseudo-inverse based on the SVD result. This is all in svdcmp.c, and some supporting code is in pythag.c and nrutil.c, both largely ripped off from the Numerical Recipes code base.

As described in the previous section, the SVD is used to convert information about correspondence points into a transformation matrix which maps between image planes. The software follows exactly the procedure described already; this functionality is contained in the function _mossCalcPTs of corresp.c.
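The previous section's exact setup isn't repeated here, but a common way to pose this problem (a sketch under my own assumptions, with my own names, not moss's code) is to fix the bottom-right entry of the 3x3 transform at 1 and let each correspondence pair contribute two rows to an over-determined linear system in the remaining eight unknowns:

```c
/* Build the 2N-by-8 least-squares system A h = b for the eight free
 * entries of a 3x3 projective transform H (with H[2][2] fixed at 1),
 * given N correspondences: (x,y) in the source image should map to
 * (X,Y) in the destination image.  Function and variable names here
 * are illustrative, not moss's. */
void buildSystem(int N, const float xy[][2], const float XY[][2],
                 float A[][8], float b[]) {
  for (int n = 0; n < N; n++) {
    float x = xy[n][0], y = xy[n][1];
    float X = XY[n][0], Y = XY[n][1];
    /* X = (h0 x + h1 y + h2) / (h6 x + h7 y + 1) */
    float rowX[8] = {x, y, 1, 0, 0, 0, -X*x, -X*y};
    /* Y = (h3 x + h4 y + h5) / (h6 x + h7 y + 1) */
    float rowY[8] = {0, 0, 0, x, y, 1, -Y*x, -Y*y};
    for (int c = 0; c < 8; c++) {
      A[2*n][c]   = rowX[c];
      A[2*n+1][c] = rowY[c];
    }
    b[2*n]   = X;
    b[2*n+1] = Y;
  }
}
```

With four or more pairs, multiplying b by the pseudo-inverse of A recovers the eight parameters, which is presumably where mossPseudoInverse comes in; extra points simply give a least-squares fit.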

The transformation itself is stored as a 3x3 array of floats, in a struct (mossCorresp) which stores all the correspondence points and the matrices together. This is defined in ally.h. All the pseudo-methods to create and destroy this and the other structs are in methods.c.

But this program operates on more than just image pairs. The user can specify an arbitrarily large set of images, and as long as every image is tied to at least one other image, they form a cohesive whole and the mosaic can be calculated. Still, one particular image is specified (by the user) to be the reference image, which defines the plane onto which all the other images are projected. Thus, in order to perform resampling, the mosaicing tool needs to be able to determine the transformation which maps from the reference image to any of the other images.

To organize the required information about mapping between images into a single place, I used the notion of a "matrix of transformations": if N images are being mosaiced, then the matrix of transformations is an N by N matrix, each element of which is a transformation (which, somewhat confusingly, is itself a matrix). The transformation at row i and column j records how to transform pixels in image j onto the plane of image i.
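As a sketch of what such a structure might look like (this is illustrative; moss's actual data lives in the mossCorresp struct and may be organized quite differently), the matrix of transformations can be held as a flat array of 3x3 matrices, with the diagonal initialized to the identity since every image trivially maps onto its own plane:

```c
#include <stdlib.h>

/* Hypothetical "matrix of transformations": entry [i][j] is the 3x3
 * transform taking pixels of image j onto the plane of image i, plus
 * a flag recording whether that entry has been determined yet. */
typedef struct {
  int N;                 /* number of images */
  float *xform;          /* N*N transforms, each 3x3 row-major */
  unsigned char *known;  /* N*N flags */
} XformMatrix;

static float *entry(XformMatrix *m, int i, int j) {
  return m->xform + 9*(i*m->N + j);
}

XformMatrix *xformMatrixNew(int N) {
  XformMatrix *m = malloc(sizeof(XformMatrix));
  m->N = N;
  m->xform = calloc(9*N*N, sizeof(float));
  m->known = calloc(N*N, 1);
  /* every image maps onto its own plane by the identity transform */
  for (int i = 0; i < N; i++) {
    float *T = entry(m, i, i);
    T[0] = T[4] = T[8] = 1.0f;
    m->known[i*N + i] = 1;
  }
  return m;
}
```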

This matrix is similar to an adjacency matrix used to represent a graph (nodes connected by edges), but instead of a binary value at each entry, we have a transformation. Like an adjacency matrix, we know that the images form a connected graph (mosaic) if the row for each node (image) contains more than just one element; this means that every image is tied to at least one other image. The user's correspondence information has to have this property, or else moss complains. Populating the matrix of transformations with the transformations derived directly from the user's correspondences is done by _mossLearnPoints() in corresp.c.

The next step is filling the whole matrix with all the intermediate transformations which relate two images that weren't explicitly tied together by the user. Assuming that the mosaic isn't over-constrained (that is, there is no loop in the adjacency graph), figuring out the intermediate transformations is essentially a graph problem: to find a way of transforming one image to the other, a path must be found through the adjacency graph. Between each pair of images along the path, the matrices associated with the image pairs are cumulatively composited to produce a new transformation, which tells how to transform between the images at the endpoints of the path.
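The compositing step along a path boils down to 3x3 matrix multiplication: if T_ij takes image j onto image i's plane, and T_jk takes image k onto image j's plane, then the product T_ij * T_jk takes image k directly onto image i. A minimal sketch (the function name is mine, not moss's):

```c
/* Multiply two 3x3 row-major matrices: out = A * B.  Composing the
 * transforms along a path through the adjacency graph this way yields
 * the transform between the path's endpoint images. */
void compose3x3(const float A[9], const float B[9], float out[9]) {
  for (int r = 0; r < 3; r++)
    for (int c = 0; c < 3; c++)
      out[3*r + c] = A[3*r + 0]*B[0*3 + c]
                   + A[3*r + 1]*B[1*3 + c]
                   + A[3*r + 2]*B[2*3 + c];
}
```

Note that the order matters: the right-hand factor is the transform applied first.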

Starting with the correspondences the user defines, the whole matrix of transformations can be filled out, even though only one whole row (or column) actually needs to be fleshed out. This process is done by various functions (_mossFindPathTo, _mossCalculateInter, mossSetMatrix) in corresp.c.

Now the process of creating the output image actually starts. Given the reference image, and the transformations which take all the other images into its plane, we can determine the bounding box of the final image and allocate space for it. Then comes the pixel-by-pixel creation of the mosaic: for every pixel in the output image, find the images whose projections cover that pixel, use the calculated transformations to find the pre-image of the target pixel in each source image, and use bilinear interpolation to find the pixel value. All the interpolated pixel values from the overlapping images are averaged according to the weighting scheme described in the previous section. This functionality is all in image.c.

The function which starts the whole process off is mossDoit in mosaic.c, which also contains functions to check the validity of all the user input parameters (such as making sure at least four points were specified to stitch two images together). The only job for main.c is to read in all the input data and call mossDoit.

As was said before, the nrrd library handles the reading and writing of PGM and PPM files. It also handles the reading in of the user-specified correspondence data. Using nrrd for this was simply a matter of programming convenience more than one of user convenience. The user records all the correspondence points by creating an ASCII nrrd file which defines a 3-dimensional array of floating point numbers. The first axis of this array always has only 2 elements, for the X and Y coordinates of the correspondence points. The second axis has as many elements as there are images to be stitched together, and the third axis has as many elements as there are correspondence points. Written as an ASCII file, there is the nrrd header, which is fairly straightforward, and then one line per correspondence point. Each correspondence point is recorded as a coordinate pair in two different images, and each image has its own column. Since nrrd can't deal with incomplete data, the user just fills in -1 for the coordinates in all the images not involved in a given correspondence point. An example may clarify things. In the file below, there are 4 different images and 18 correspondence points in total; from the first line of data we can see that the first point ties location (264, 105) in image #0 to (134, 10) in image #1.
NRRD00.01
dimension: 3
type: float
encoding: ascii
sizes: 2 4 18
#image 0   image 1   image 2   image 3
264 105    134 10     -1 -1     -1 -1
369 84     231 8      -1 -1     -1 -1
248 385    130 233    -1 -1     -1 -1
361 384    232 231    -1 -1     -1 -1
530 197    353 109    -1 -1     -1 -1
509 382    340 230    -1 -1     -1 -1
-1 -1      434 6      81 72     -1 -1
-1 -1      451 136    106 227   -1 -1
-1 -1      542 222    212 328   -1 -1
-1 -1      402 295    63 418    -1 -1
-1 -1      563 267    243 383   -1 -1
-1 -1      566 306    248 429   -1 -1
-1 -1      256 164    -1 -1     428 107
-1 -1      313 165    -1 -1     489 109
-1 -1      218 246    -1 -1     382 197
-1 -1      16 362     -1 -1     155 317
-1 -1      3 249      -1 -1     140 192
-1 -1      2 340      -1 -1     139 293

I found that a very convenient way of finding the correspondence points between a pair of images is to use two copies of xv running, one on each image. Middle-clicking inside the image causes coordinate and color information to be written at the top or bottom of the image, but it also causes said information to be copied into X's crude cut/paste buffer, so middle-clicking inside an emacs window pastes the same information in a line like:

346, 185 = 35, 35,143 #23238f (240 75 56 HSV) [ 0, 0]

Keyboard macros can then be used to massage the point data into the necessary format. Note that these coordinates need not be integers; floating point values are valid. Also note that negative coordinates are valid, if for some bizarre reason you know that the proper location for a correspondence point is outside the image. So using the coordinate (-1,-1) as a placeholder and sentinel meaning "no data for this point inside this image" is really a hack, but it sure simplified the programming.
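Given the (-1,-1) sentinel convention, recovering which two images a given line of the file actually ties together is a simple scan over the row. This is a sketch of the convention itself, not moss's actual parsing code:

```c
/* Given one row of the correspondence file (2*numImages floats, with
 * (-1,-1) marking "no data"), report the two images the point ties
 * together.  Returns 1 if the row names exactly two images, else 0. */
int rowImages(const float *row, int numImages, int *imgA, int *imgB) {
  int found = 0;
  for (int i = 0; i < numImages; i++) {
    if (row[2*i] != -1 || row[2*i+1] != -1) {
      if (found == 0)      *imgA = i;
      else if (found == 1) *imgB = i;
      found++;
    }
  }
  return found == 2;
}
```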

Although I did create a separate library (libmoss.a) for this project, there is only one stand-alone program to run which accomplishes the work, called moss. Using it is just a matter of setting up your correspondence point file correctly and making sure all the input images are either PGMs or PPMs. Here is the usage information it prints:

usage: moss <imgOut> <points> <which> <img0In> <img1In> ...

imgOut is the desired filename for the output image. points is the correspondence point info as a nrrd file. which is an integer (from 0 to N-1, for N images) which specifies which image is desired as the basis image. img0In, img1In, ... are the N images to be mosaiced.

The ordering of the images on the command-line is very important, as it must exactly match the ordering of the coordinates in the columns of the correspondence point file. Obviously, the number of images given to be stitched together must be compatible with the data in the points correspondence file.

As the program runs, it will spit out a lot of information about matrices being computed and intermediate transformations being determined. Until I learn to use a debugger, I'll keep to printf.