Sunday 6 September 2015

image processing - Gaussian Blur - Standard Deviation, Radius and Kernel Size


I've implemented a gaussian blur fragment shader in GLSL. I understand the main concepts behind all of it: convolution, separation of x and y using linearity, multiple passes to increase radius...


I still have a few questions though:




  • What's the relationship between sigma and radius?


    I've read that sigma is equivalent to radius, I don't see how sigma is expressed in pixels. Or is "radius" just a name for sigma, not related to pixels?




  • How do I choose sigma?



    Considering I use multiple passes to increase sigma, how do I choose a good sigma to obtain the sigma I want at any given pass? If the resulting sigma is equal to the square root of the sum of the squares of the sigmas and sigma is equivalent to radius, what's an easy way to get any desired radius?




  • What's the good size for a kernel, and how does it relate to sigma?


    I've seen most implementations use a 5x5 kernel. This is probably a good choice for a fast implementation with decent quality, but is there another reason to choose another kernel size? How does sigma relate to the kernel size? Should I find the best sigma so that coefficients outside my kernel are negligible and just normalize?





Answer




What's the relationship between sigma and radius? I've read that sigma is equivalent to radius, I don't see how sigma is expressed in pixels. Or is "radius" just a name for sigma, not related to pixels?




There are three things at play here. The variance, ($\sigma^2$), the radius, and the number of pixels. Since this is a 2-dimensional gaussian function, it makes sense to talk of the covariance matrix $\boldsymbol{\Sigma}$ instead. Be that as it may however, those three concepts are weakly related.


First of all, the 2-D gaussian is given by the equation:


$$ g({\bf z}) = \frac{1}{\sqrt{(2 \pi)^2 |\boldsymbol{\Sigma}|}} e^{-\frac{1}{2} ({\bf z}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} \ ({\bf z}-\boldsymbol{\mu})} $$


Where ${\bf z}$ is a column vector containing the $x$ and $y$ coordinate in your image. So, ${\bf z} = \begin{bmatrix} x \\ y\end{bmatrix}$, and $\boldsymbol{\mu}$ is a column vector codifying the mean of your gaussian function, in the $x$ and $y$ directions $\boldsymbol{\mu} = \begin{bmatrix} \mu_x \\ \mu_y\end{bmatrix}$.


Example:


Now, let us say that we set the covariance matrix $\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 0 \\ 0 & 1\end{bmatrix}$, and $\boldsymbol{\mu} = \begin{bmatrix} 0 \\ 0\end{bmatrix}$. I will also set the number of pixels to be $100$ x $100$. Furthermore, my 'grid', where I evaluate this PDF, is going to be going from $-10$ to $10$, in both $x$ and $y$. This means I have a grid resolution of $\frac{10 - (-10)}{100} = 0.2$. But this is completely arbitrary. With those settings, I will get the probability density function image on the left. Now, if I change the 'variance', (really, the covariance), such that $\boldsymbol{\Sigma} = \begin{bmatrix} 9 & 0 \\ 0 & 9\end{bmatrix}$ and keep everything else the same, I get the image on the right.


enter image description here


The number of pixels are still the same for both, $100$ x $100$, but we changed the variance. Suppose instead we do the same experiment, but use $20$ x $20$ pixels instead, but I still ran from $-10$ to $10$. Then, my grid has a resolution of $\frac{10-(-10)}{20} = 1$. If I use the same covariances as before, I get this:


enter image description here



These are how you must understand the interplay between those variables. If you would like the code, I can post that here as well.



How do I choose sigma?



The choice of the variance/covariance-matrix of your gaussian filter is extremely application dependent. There is no 'right' answer. That is like asking what bandwidth should one choose for a filter. Again, it depends on your application. Typically, you want to choose a gaussian filter such that you are nulling out a considerable amount of high frequency components in your image. One thing you can do to get a good measure, is compute the 2D DFT of your image, and overlay its co-efficients with your 2D gaussian image. This will tell you what co-efficients are being heavily penalized.


For example, if your gaussian image has a covariance so wide that it is encompassing many high frequency coefficients of your image, then you need to make its covariance elements smaller.


No comments:

Post a Comment

readings - Appending 内 to a company name is read ない or うち?

For example, if I say マイクロソフト内のパートナーシップは強いです, is the 内 here read as うち or ない? Answer 「内」 in the form: 「Proper Noun + 内」 is always read 「ない...