The picture is divided into blocks, because the following stages are difficult and to get a better compression’s rate and fewer losses, small blocks with a size of 8 by 8 are used.
For example, for an image with a size of 24×16.
The blocks must be regrouped four by four like this : the red one, the blue one, the yellow one and the orange one.
Then the green one, a block composed with the last element of the previous block, the blue one and a block composed with the last element of the previous block.
We need to insert those blocks, because the number of blocks in the image must be a multiple of 4
For an image with a size of 16×24.
We regroup the red one, the blue one, the green and the yellow one.
And then the orange one, the blue one and two blocks composed with the last element of the previous block.
Usually, the pixels of a picture are represented by three components
: RGB (red, green and blue). But this representation is not the best onr for the JPEG compression, so we need to transform them into one luminance and two chrominances.
The first component is Y the luminance, then we have Cr or U the first chrominance and Cb or V the second chrominance.
To obtain the luminance and the chrominance, we have to do those operations :
This is a bijective transformation, so there is none or just a little loss (rounding errors) during this stage.
The DCT is a Fourier’s transfornation where the sine has been changed into a cosine. We can understand the DCT as a transformation that takes a signal composed in our case of 64 pixels and send at the end the signals amplitude.
So, after the DCT, the smaller frequencies are in the upper left corner. The element in the corner (0,0) is named the DC and is the average of all the elements in the matrix before the DCT. The other elements are named AC and represent the amplitude of the elements which was at the same place before the DCT with the DC.
This stage is quite critical, because it is lossy : it is a non conservative stage.
The principle of the quantization is to devide each elements of the DCT matrix by the right element in the quantization tables.
Its aim is to filter data coming from the cosine transformation and preserve high frenquencies, because the human eye is less sensible errors in the low frequencies.
Each matrix during the coding is visited with a Zig-Zag pattern. This way of visiting a matrix exposes a lot of zeros that help in the entropy coding process.
Huffman coding is a statistical method of compaction. Its principle is to replace a symbol by a continuation of bits. The larger the frequency of appearance of a symbol is, the more the lenght of the code of this symbol will be small. The smaller the frequency of appearance of a symbol is, the longer its code will be.
This way of coding makes it possible to code each symbol according to Huffman tables.
In JPEG there are 4 tables; one for luminance’s DC, one for chromiance’DC, one for luminance ‘AC and one for chromiance’AC.
To code the DC, we need first the size in number of bits of the DC and next its value. The size is coded with the right Huffman table and the value is translated in bits.
For the AC, we need to write on one byte the numbers of zeros before the AC and its size in number of bits. Then, this number is coded with the right Huffman table. The AC’s value is coded in the same way as the DC’s avlue.
Let me take an example to explain how to code a negative AC’s value
or DC’s value. If we have -11. We take 11 and convert it in bits :
Then we have to reverse the number. So 10001 become 01110.
The decoder will know that it is a negative number, because it begins with 0.
There are some rules to respect before writing a JPEG file (JFIF compliant). The file has to begin with FFD8 and must finish with FFD9.
The first marker is FFE0 and means that the JPEG header is following. Then we have FFDB for the quantization tables, FFC4 for the Huffman tables, FFC0 the start of frame marker and FFDA the start of scan marker.
Sometimes, there is an other marker FFFE. It is used to put some comments in the file.
The markers are explained at this adress : http://www.obrador.com/essentialjpeg/headerinfo.htm
To decode a picture with Huffman’s algorithm, we need to get the code that is after the marker FFDA and use the Huffman tables that are coded after the marker FFC4.
Before decoding, a 16×16 picture is made of 6 matrixes 8×8. The four first are the blocks of the 16×16 matrix of luminance and the two last one are the two chrominances.
The marker FFC4 helps us to get the four Huffman tables. There are one for DC’s luminance, one for DC’s chrominance, one other for AC’s luminance and one for AC’s chrominance.
During the decoding, the first element is a DC. First there is a size. To get its code, we have to look at the right Huffman table. This size indicates the number of bits on which is coded the DC. The DC’s value is the next one. To decode it, we have to translate it in integer. If the value is negative, it begins with 0 and we must reverse the 1 into 0 and the 0 into 1.
Then, there is a block of 63 AC. The first element to decode a AC is a byte where are coded on the four first bits the number of zeros and on the four last bits the size of the AC in number of bits. To get the code of this element, we have look at the right Huffman table. Then, there is the AC’s value which is decoded as the same as the DC’s value.
A block is finished, because we had 64 values or because we found a EOB (End Of Block). The EOB determines the end of block with a lot of zero. Its code is in the AC’s Huffman tables.
To revert the quantization, we have to multiply each element of the matrix with the element at the same place in the quantization’s matrix.
The matrixes are not the same for the luminance and the chorminance :
This Fourier’s transformation defines the inverse of the DCT. It is used to pass from amplitudes to the initial values of the matrix.
The JPEG file is coded in luminance and chromiance, but at the end of the decoding we need to get back to the red, green and blue, because they are required for a PPM file output
The operations are :
The blocks are grouped four by four in the JPEG file, and we would get back each blocks line of the image for our PPM file. So we need to reorganize them.
To write in a PPM file there are some rules to respect.
The initial bytes are "P6" meaning that this is a PPM file.
On the next line, we have to write the width and the lenght of the picture.
On an other line, we must write the color’s level that have to be between 0 and 65536.
Then, we can begin on an other line to write the pixels that are represented by three components (red, green and blue).
This picture illustrate the architecture of the xSTreamC program that code a JPEG image :
Reading and Writing of the PPM and JPEG files are done in the main and not during the pipeline execution.
A first pipeline allows to separate the image in blocks 8×8, then to gather them four by four. In order to code the image correctly, it is necessary to transform the components red, green and blue, into one luminance and two chrominances.
A split join then allows to separate teh luminance and the chroimances in order to work separately on these datas. The chrominances will be also separated by a split join, in order to treat separately at the beginning the first chrominance and the second chrominance.
At the beginning of the chrominance, it is necessary to make a
downsample, because this component is a less important than the
luminance and is minimized during coding.
Then, on the luminance and on the chrominance, we have to do the DCT and the quantification. We must also make the difference between each DC and visit each block in Zigzag.
Once finished, both chrominances are gathered so that the first part of coding begins. What follows is once again made as the same way on the chrominance and the luminance, but separately.
Per block of four, DC are gathered by putting in first their size in number of bits, then their value. Then we have four blocks of 63 AC. We put first, the number of zeros preceding the AC, then the size in number of bits of AC and finally the AC’s value. The following filter allows to gather on one byte, for the AC, the number of zeros and its size in number of bits.
The coding with the Huffman’s tables can finally start.
For DC, the size is coded according to these tables and the value of DC is translated in bits. For AC, the number of zeros and its size are coded using the Huffman tables and the value of AC as the same way as the DC.
We gather then the coded luminance and the coded chrominances preceded by the Huffman tables. The filters constituting the last pipeline, allow to register the file’s heading, then the quantization tables and datas numbering each tables.
The coder is now operational and allows to decode large images. Here is an example:
The following diagram describes the architecture of the program that decode a JPEG image.
In a new split join, we have to treat the luminance and the chrominances in the following way:
We start by doing the difference between the DC to recover the initial values. We dequantify each coef, then we apply the DCT on each matrix.
And only for the chrominance, we have to make an upsample.
To finish, the pixels should be color-space converted. We need a component coming from the luminance, and two others coming from the chrominances. Then we transform these components into red, green and blue.