Friday, 22 July 2016

image processing - JPEG DCT padding


Since the JPEG DCT block used is 8x8, how does the method deal with images with dimensions that are not multiples of 8? What kind of padding does it use? How are the 8x8 blocks of the image chosen?



Answer



Padding is added on the right (a row with three valid samples becomes $[1\,,1\,,3\,,x_1\,,x_2\,,x_3\,,x_4\,,x_5]$) or at the bottom (a column becomes $[1\,,1\,,3\,,y_1\,,y_2\,,y_3\,,y_4\,,y_5]^T$), row by row or column by column. As far as I know, the fill values $x_i$ and $y_i$ are not fixed by the standard; they depend on the encoder.


Remember that blocks are formed on the luminance/chrominance planes, after the color space transformation (RGB to YCbCr, often loosely called YUV) and chroma subsampling. The blocks are visited in raster-scan order: left to right, top to bottom.
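As a rough illustration of how the blocks are chosen, the sketch below lists the top-left corner of every 8x8 block of a single component plane in raster order. The function name `block_origins` and the 20x10 example size are made up for illustration, and the sketch ignores interleaved scans, where blocks are grouped by MCU.

```python
def block_origins(width, height, block=8):
    """Yield (x, y) of each block's top-left corner in raster-scan order.

    Hypothetical helper for a single, non-interleaved component plane.
    """
    # Round up, so partially occupied blocks on the right and bottom count too.
    blocks_x = (width + block - 1) // block
    blocks_y = (height + block - 1) // block
    for by in range(blocks_y):        # top to bottom
        for bx in range(blocks_x):    # left to right
            yield (bx * block, by * block)


origins = list(block_origins(20, 10))
print(origins)
```

For a 20x10 plane this gives 3 blocks per row and 2 rows of blocks; the last block in each row and the whole second row are only partially occupied and need padding.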


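The partly occupied blocks on the right and the bottom are padded out to full Minimum Coded Units (MCUs), which cover $8\times8$ pixels when there is no chroma subsampling and a larger area (for instance $16\times16$) when there is; see JPEG Minimum Coded Unit (MCU) and Partial MCU: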



In the case where there are not enough pixels in a row or column to complete a full tile, a partial MCU is used. A partial MCU is automatically extended to be the size of a full MCU but then the overall image dimensions are used to indicate where to cut off the extra later. This extension is generally done by repeating the last pixel of the row or column as necessary.
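The pixel-repetition strategy described in that quote can be sketched as follows. This is only one possible encoder choice, not something the standard mandates; the helper name `pad_to_block_multiple` and the plain list-of-rows representation are assumptions made for the example.

```python
def pad_to_block_multiple(plane, block=8):
    """Pad a 2-D plane (list of equal-length rows) on the right and bottom.

    Hypothetical helper: fills by repeating the last sample of each row,
    then repeating the last row, until both dimensions are multiples of
    `block`.
    """
    height = len(plane)
    width = len(plane[0])
    padded_w = -(-width // block) * block    # ceiling to a multiple of block
    padded_h = -(-height // block) * block
    # Extend each row to the right with copies of its last sample.
    rows = [row + [row[-1]] * (padded_w - width) for row in plane]
    # Extend downward with copies of the last (already extended) row.
    rows += [list(rows[-1]) for _ in range(padded_h - height)]
    return rows


plane = [[10 * r + c for c in range(10)] for r in range(3)]  # 10 wide, 3 high
padded = pad_to_block_multiple(plane)
print(len(padded[0]), len(padded))  # padded width and height
```

The decoder does not need to know how the padding was produced: it decodes the full blocks and crops the result back to the width and height stored in the file.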



From Baseline JPEG:




The image is partitioned into blocks of size 8x8. Each block is then independently transformed using the 8x8 DCT. If the image dimensions are not exact multiples of 8, the blocks on the lower and right hand boundaries may be only partially occupied. These boundary blocks must be padded to the full 8x8 block size and processed in an identical fashion to every other block. The compressor is free to select the value used to pad partial boundary blocks.



The following figure is taken from Heiko Schwarz, Source Coding and Compression:

[Figure: JPEG tiling]


Plain zero-padding can be applied, but the risk of strong artifacts at the borders is very high. For advanced applications, and with more recent image coders, one can exploit the symmetry of the basis functions to extend the image more naturally, by symmetric or antisymmetric extension. See, for instance, On Reconstruction Methods for Processing Finite-Length Signals with Paraunitary Filter Banks, Oct. 1995 (online version).
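To get a feel for why zero-padding is risky, here is a small 1-D sketch that compares the energy in the non-DC coefficients of an 8-point DCT for a zero-padded row and for an edge-replicated row. The sample values are made up, the JPEG level shift and quantization are ignored, and `dct_1d` is a direct, unoptimized orthonormal DCT-II written for this example rather than any codec's implementation.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II, computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(x)
        ))
    return out


def ac_energy(x):
    """Sum of squared DCT coefficients, excluding the DC term."""
    return sum(c * c for c in dct_1d(x)[1:])


valid = [200, 200, 210]                    # three real samples (made-up values)
zero_padded = valid + [0] * 5              # fill with zeros
replicated = valid + [valid[-1]] * 5       # fill by repeating the last sample

print("zero-padded AC energy:", ac_energy(zero_padded))
print("replicated AC energy: ", ac_energy(replicated))
```

The jump from 210 down to 0 in the zero-padded row spreads a lot of energy over the AC coefficients, which then has to be coded or is distorted by quantization, while the replicated row stays nearly flat.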

