The World Wide Web has taken the world by storm. Now everyone is browsing the Web, and digital images are everywhere. One important issue about downloading an image over the Web is the size of the image file. The bigger it is, the longer it takes; and no one likes to wait very long for an image to download. That is where compression comes into play.
In order to display an image, it is usually stored in the computer with all the details of that image in memory. When the image is transmitted over the Web or stored on a disk, however, we create a file without all the details, but with enough information to allow the original image to be regenerated. This is called "compression".
After the compressed file is transmitted over the Web or retrieved from disk, the original image data is recreated so it can be displayed. This is called "decompression" or "uncompression".
So how might you go about compressing an image? One way is to generate a set of instructions for drawing the image. You could then send only those instructions, rather than sending the entire image. At the other end, someone would read the instructions and be able to draw the image. If the storage required for the instructions is less than that required for the original image, you have reduced the amount of information required to transmit (or store) it. Hence, the orginal image was compressed, transferred, then decompressed back to its original form.
Perhaps an example is in order. Say I have a checker board with 64 squares, and 16 red and black checkers as shown below:
+---+---+---+---+---+---+---+---+
| | R | | R | | R | | R |
+---+---+---+---+---+---+---+---+
| R | | R | | R | | R | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | | | | | | | |
+---+---+---+---+---+---+---+---+
| | B | | B | | B | | B |
+---+---+---+---+---+---+---+---+
| B | | B | | B | | B | |
+---+---+---+---+---+---+---+---+
I want to tell my friend over the Internet how to recreate the same
checker board I have on my table. One possibility would be to
send the entire sequence of characters shown in the figure above.
My friend could just print them out at the other end. This would
require that a total of 578 characters (or bytes) be transmitted.
Is there a way to send the same information, but with fewer characters? One possibility would be to send a message that tells him how to make his own picture. I might write down something like this:
Red checkers at row 1 column 2, row 1 column 4, r1c6, r1c8, r2c1,
r2c3, r2c5, and r2c7. Black checkers at r7c2, r7c4, r7c6, r7c8, r8c1,
r8c3, r8c5, and r8c7.
That describes the entire board pretty clearly (assuming my friend understands what a checkerboard looks like). And it only took 158 characters, which is only 27% of the actual picture (27% = 158/578). And they both mean the same thing. This is how compression works. Instead of sending the whole board, I just send an equivalent description. The same works for image files. Instead of sending the whole image, I just send a description of it that takes less space.
There are many ways of compressing images. An image is composed of many dots, called pixels, and each pixel has a color. When we see all the pixels together, it makes an image. Pictures contain some number of rows (R) and columns (C) of pixels. An image is stored in the computer as a set of values, each of which represents the color of a pixel. The total number of values in the image is R times C. To transmit an image, we could send all of these values. Alternatively, we could send a set of instructions that allow the picture to be redrawn when it is decompressed.
One relatively simple way to compress an image is called Run Length Encoding (RLE), which describes the image as a list of "runs", where a run is a sequence of horizontally adjacent pixels of the same color. For example, if I have 8 pixels in a row of a certain color, say green, I'd encode it as 8G. So lets see how this works. Why don't we try to make a smiley face? Here's a simple image.
......XXXXXXX.......
....XXX.....XXX.....
...XX.........XX....
.XX...XX...XX...XX..
.X...............X..
.XX......X......XX..
..XX..XX...XX..XX...
...XX..XXXXX..XX....
....XX.......XX.....
......XXXXXXX.......
This picure is 20 pixels across and 10 pixels high. So we say it is 20x10, and 200 characters are required to draw it. To generate an RLE description of this picture, we go across the rows - usually left to right, top to bottom - and write down all the runs we find of the same color. Starting at the top left, we find 6 .'s, followed by 7 X's. Then we find 7 .'s in the top row, and another 4 .'s at the beginning of the next row. So lets call that 11 .'s. If we continue to the end of the picture, we end up with something like this:
20x10=6.7X11.3X5.3X8.2X9.2X5.2X3.2X3.
2X3.2X3.1X15.1X3.2x6.1X6.2X4.2X2.2X3.
2X2.2X6.2X2.5X2.2X8.2X7.2X11.7X7.
Now we have an RLE description of our picture, and it's easy to rebuild the actual image when we decompress it. (Note that the first 6 characters of the RLE description indicate the size of the picture. This information is required to recreate the original picture.) The compression for this image with RLE was 46.5% (because the data was 53.5% of the original size - 107/200). And we can get the exact same image back when we decompress.
RLE is just one way to compress images. There are many other ways to do it. Run Length Encoding is a simple example of Lossless Compression, because we got the exact same image back after we decompressed. Lossless compression is what we use when we want to compress a text file and later be able to decompress it. But with images, we can play tricks that make the image look almost the same, even though it might not be exactly identical. This is called Lossy Compression. An example of lossy compression is the JPEG format commonly used for images on the Web. It is especially good for photographs, where there are many different colors that are almost exacly the same. (See the links below for more information on JPEG.)
A simpler way to do lossy compression is to pretend that colors that are almost identical are the same color. If I have two or three shades of yellow that are almost the same, I can just say they're all yellow, and it will be hard to tell the difference. We could call this method summarizing colors. This kind of lossy compression would lose a little bit of the color information.
To explain this a little better, we first need to discuss binary numbers. We mentioned that a byte is enough to store one character. Bytes are one of the basic units inside a computer. Each byte is made up of 8 bits, and a bit is a digit in the binary number system. Each bit represents either a 0 or a 1. With 8 bits, you can represent up to 256 different values (2 to the 8th power).
To encode characters using one byte, each different character is assigned a unique combination of the 8 bits (sometimes called a "code"). Computers usually use a code called ASCII, which you probably have heard of before. It assigns a binary value to each letter and number and symbol. For example, in ASCII a space has the value 00100000, which is a binary number that corresponds to decimal 32. (You might want to have your teacher review binary numbers for you.)
Now back to compression. A pixel in an image can be represented many ways. One common way is RGB (Red-Green-Blue). That means a pixel has three numbers, one for red, one for green, and one for blue, and by combining the three, you get a color. The numbers represent the quantity of red, green, and blue light that are required to produce the color of that pixel on the screen.
For "True Color" images, 24 bits (3 bytes) are used to represent each pixel's color. That is over 16 million possible colors (2 to the 24th power), so the images look very real. In the 24 bits, one byte is usually used for each of the three RGB values.
One way to "summarize" colors is to discard the least significant bit of each color value. If we were to list all the possible intensities of the color red, that would mean we would be throwing away every other one of those. The same would be true for green and blue. That would allow us to reduce 24-bit color to 21-bit color, which still leaves us up to 2 million different colors. This technique is "lossy", however, since we no longer know whether the discarded bit was 1 or 0.
Going from 8 bits to 7 bits for each of the three colors essentially combines groups of 8 very close colors into a single color. As you might guess, the human eye probably can't notice this, so the image quality is almost exactly the same, but the amount of data required to represent the image went down by 12.5% (21 bits for every 24 bits in the original).
The color ummarization process could be taken a step farther by reducing the numbers of bits used for each color to 5. That would allow you to represent each pixel using only 2 bytes. The more the colors are summarized, however, the more quality is lost.
We just talked about losing some of the color information to make our compressed image smaller. You could also choose to lose some of the actual pixels. Say I have a picture that is very detailed. If I want to compress it a lot, I could choose to keep only every other pixel in my compressed data. If I store a pixel, then skip a pixel, then store the next one, and go on like that, when I want to decompress it, I can guess the color of the pixel between each pair of pixels that are actually stored. Maybe I'll guess that it was just the color that is halfway between the two colors that I know. So if I had stored a black pixel followed by a white pixel, I would guess the color of the pixel between them is gray. This approach loses some of the detail and makes the picture look more blurred.
Sometimes we even combine different compression methods. Say I compress by summarizing colors like I did earlier, then I use Run Length Encoding to compress the summarized colors even further. I can write down runs of the 21-bit color values, save that as my compressed file, and get back the same summarized color image when I decompress. Note that this even combines a lossy compression with a lossless compression.
Another thing you might want to try is to reorder things. Since you're going to use every pixel's RGB values, you might try compressing all the red values first, then all the green values, then the blue. This might give better results with some pictures. For instance, say we're using RLE. In an image it is not always likely to find many pixels in a row with exactly the same RGB values. It is much more likely that the blue value will be the same from one pixel to the next, which creates longer runs and gives us better compression.
It is also very important to remember that some compression algorithms work better or worse depending on the actual data or image you're trying to compress. RLE in the example gave us about 50% compression, but if each pixel is different from the ones next to it, every run will be of length 1. In this situation, it would take up nearly twice as much space to describe a run as it would to simply record just the color. Hence, plain RLE would almost double the size of the file.
You may want to write several compression algorithms into your take-home problem and, depending on the characteristics of the image, choose a different one. Maybe even try several, and only output the best one. It's all up to you to decide which is the best way to solve this problem.
One of the most important things that we could tell you about compression is Be Creative! There are no right or wrong answers, and almost anything you can come up with is a perfectly reasonable way of compressing data! This project leaves you with a lot of flexibility, so here's your chance to show off. Start out with the basics, then make it bigger (or smaller, in this case), better, and funner!
There are places you can visit on the Web that have great information on data compression and image compression. There are also many books on it at your local libraries and bookstores. We encourage you to look around and find more examples of algorithms that might be useful.
You may use any pseudo-code algorithms you find, but it is not acceptable to use someone else's source code. That would be considered cheating, and your team will be disqualified.
Some topics you might want to search for include "data compression", "image compression", "data encoding", "encoding algorithms", "compression algorithms", "encoding", "compression", and the names of any particular algorithms or encodings that you might be interested in. Here are a few links we've gathered to get you started:
Your task is to write functions that tell the computer how to compress and decompress an image. We have provided a considerable amount of code that will help you do this. This includes three programs (hseval, hspack, and hsunpack) and three classes (Compressor, Image, and Pixel). The functions you need to write are part of the Compressor class, which is the only thing you are allowed to change.
The programs you will use are:
All of these programs use images that are 24-bit (True Color) .BMP files ("Bitmap" or "Windows Bitmap"). We provide several files for you to test with, and we will use similar images to test your Compressor on the day of the contest. The images will always be at least 10,000 pixels (100x100, if its square), and will not be more than 307,200 pixels (640x480, for example).
We have also created some C++ classes that will help you:
Each of the different color models can represent exactly the
same colors as all the others. One thing you might be able
to try is to investigate each the models, and figure out
which of its values affect the quality most when they are
summarized more or less. Perhaps the one of the values for a
particular model doesn't matter too much, and you can
summarize it from 8 bits to 4 bits without a noticeable
difference. Maybe when one particular value in a model is
very low, it doesn't matter what the other values are, and
for instance, you might be able summarize all colors with a
Hue value in HSV of 0 to the same color without changing the
appearance of the image.
A Pixel has several functions. First, there is a get<value> (where value is Red, Green, Hue, Purity, Yellow, etc.) for every value in every model. There are also set<model>Color functions for every model (RGB, HSV, YIQ, and CMY).
Note that all values associated with pixels are represented as doubles. You will need to convert these values to bytes in order to put them in files that can be transmitted over the Web. To convert a value with a range of 0.0-1.0 to a byte, simply multiply it by 255, add 0.5, and cast it to an unsigned char. (See the sample programs for examples of how to cast a value from one data type to another.)
The Compressor has declarations for the functions you will write:
Your task will be to write bodies for the compress and decompress functions. You may define variables in the private section of Compressor.h, and you may also add other functions in the private section that you may call from compress and decompress. (This is just like modifying a C++ program, except the function prototypes go in Compressor.h, and the implementations go in Compressor.cc)
The grading of your solution will be mostly based upon the quantity of compression that you achieve and the quality of the decompressed image. Part of your score, however, will be programming style and comments.
For any given image, hseval will give two numbers: compression score and quality score. The compression score will be from 0.0 to 1.0, and will be a ratio of the compressed file size to the original file size. So if your file was the same size as the original (meaning no compression), your compression score will be 1.0. If your compressed file is 10% of the size of the original size (90,000 bytes instead of 900,000, for example), your compression score would be 0.10.
The quality score will also be a number from 0.0 to 1.0, with 1.0 meaning that your decompressed image had the exact same quality as the original image. (More information will be provided later this week on the algorithm used to determine quality.)
The final score for a given image is its quality score minus the compression score. So if no compression occurs and there is no loss, quality would be 1.0, compression would be 1.0, and the final score would be 1.0-1.0=0. Note that scores can be no higher than 1.0, and may be negative (for example if you output an all black or random image, and score very low on quality, with a file that was not well compressed or that was bigger than the original).
The best take-home problem solutions will maintain reasonable quality while achieving good compression rates. Scores above 0 are a good starting point. For lossless compression, quality should come out to be 1.0, and compression will be the pure measure of how well you are doing. With lossy compression, you will get smaller files, but quality will go down too. So the key is to make the file size go down faster than the quality.
So that we can understand your solutions better, please include comments inside your code describing how your algorithm works. Also, make a README file explaining what you did and why, and anything else that you tried that you didn't use in the final solution. Explain what you learned while working on the take-home problem. Also, so that we can continue to improve the contest, please give us some feedback about what you thought about the problem: how difficult or easy it was, what were the hardest parts, what were the easiest parts, and other topics you would like to see in a take-home problem.
When you come to the contest, please bring on disk(s) the following:
We will recompile your executable from source if necessary. Please make sure that your source compiles with one of the following compilers: Microsoft Visual C++, gcc/g++ (free), or djgpp (free) for DOS/Win. Please ask us if you don't have any of these compilers and would like to use a different one. Be warned: If we can't get your source to compile, we won't be able to grade it!