Saturday 1 August 2015

What distance metric can I use for comparing images?


I usually use the mean squared error (MSE) or peak signal-to-noise ratio (PSNR) to compare two images, but neither is good enough here. I need a formula that returns a very large distance between an image A and its pixelated (or blurred) version B, but I don't know how to proceed. What would be a good metric for my needs?
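For reference, a minimal Python sketch of these baseline metrics, using scikit-image on a synthetic image and a blurred copy (the image size and blur strength are arbitrary placeholders):

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.metrics import mean_squared_error, peak_signal_noise_ratio

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(128, 128)).astype(np.float64)  # synthetic image
b = gaussian_filter(a, sigma=3)                               # blurred version

print("MSE :", mean_squared_error(a, b))
print("PSNR:", peak_signal_noise_ratio(a, b, data_range=255))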



Answer




The following is not intended to be an answer in itself, but rather a statistic that will help us choose an appropriate image-comparison technique based on the characteristics of the images you are analyzing.


The first step is to plot a "delta histogram" as follows:


import numpy as np

# FirstImage and SecondImage: 2-D uint8 arrays of equal shape.
# Cast to int so the subtraction cannot underflow.
delta = np.abs(SecondImage.astype(int) - FirstImage.astype(int))
hist, _ = np.histogram(delta, bins=256, range=(0, 256))
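
To visualize it, a quick matplotlib sketch (assuming the hist array computed above):

import matplotlib.pyplot as plt

plt.bar(range(256), hist, width=1.0)
plt.xlabel("absolute pixel difference")
plt.ylabel("count")
plt.yscale("log")  # small deltas usually dominate, so a log scale helps
plt.show()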

Given a plot of this histogram, we will know a bit more about the "magnitude" of the changes you are looking for, and can make better recommendations.


(Alternatively, post some sample images. Remember that if the sample images aren't representative of the image differences you are interested in, we might make inferior recommendations.)





You can also test Structural Similarity (SSIM) on your image set and post your results here. Remember that SSIM is designed to mimic the human ability to perceive image degradation, so it should detect pixelation but may not detect blurring.
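For example, a short sketch of running SSIM with scikit-image (the synthetic inputs below are placeholders for your image set):

import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(128, 128)).astype(np.float64)
blurred = gaussian_filter(original, sigma=3)

# 1.0 means identical; lower scores indicate more perceived degradation.
print("SSIM:", structural_similarity(original, blurred, data_range=255))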




If your images are not photographic images (or are scientific images that are not ordinary photographic subjects), then please also post examples of their 2D autocorrelation, suitably cropped and scaled.
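If it helps, here is one way to compute a 2D autocorrelation via the FFT (a sketch; cropping and scaling for display are left out):

import numpy as np

def autocorr2d(img):
    # Wiener-Khinchin: autocorrelation = inverse FFT of the power spectrum.
    f = np.fft.fft2(img - img.mean())      # remove DC so the peak is meaningful
    ac = np.fft.ifft2(np.abs(f) ** 2).real
    return np.fft.fftshift(ac / ac.max())  # center the peak, normalize to 1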




Face recognition is too big a topic to be discussed in a single question. Blurring arises in multiple contexts in face recognition: it can be a data quality issue, or it can be applied intentionally as an intermediate step in data processing.


In face recognition we want to determine the identity of a face, so we have to ignore image differences that are not caused by identity differences. The basic categories of differences that should be ignored in face recognition are pose, illumination, and facial expression.


A general approach to ignoring irrelevant differences is normalization: applying various operations and transforms to the input image to obtain a "canonical" or "preprocessed" image, which in turn can be used for identification.
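As one illustration (not the only way to normalize), a minimal geometric-alignment sketch with OpenCV, assuming the two eye centers have already been located by some detector; the function name and the omission of cropping/scaling are my own simplifications:

import cv2
import numpy as np

def align_by_eyes(img, left_eye, right_eye):
    # Rotate the image so the line between the eye centers is horizontal.
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    h, w = img.shape[:2]
    return cv2.warpAffine(img, M, (w, h))  # cropping/scaling steps omitted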


A second approach is to extract features from the images that are highly invariant to the irrelevant factors.
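For instance, local binary patterns are one common choice, since they are largely invariant to monotonic illumination changes; a sketch with scikit-image (the parameters are typical defaults, not a recommendation):

import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray, P=8, R=1):
    # Uniform LBP codes take values 0..P+1; the normalized histogram
    # serves as a compact, illumination-robust descriptor.
    lbp = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2))
    return hist / hist.sum()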


The quality of a face image depends on the capture device and the environment in which it was captured. When a face image is captured without the subject's cooperation (such as from a security camera), poor image quality is an unavoidable consequence and has to be remedied by software so that it does not hamper identification.


In cooperative capture, a computerized measure of image quality is useful: the operator can be notified of quality problems and the image can be retaken.
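A crude but common sharpness measure is the variance of the Laplacian; a sketch (the threshold is a placeholder that would have to be tuned per camera and capture setup):

import cv2

def is_sharp_enough(gray, threshold=100.0):
    # Low Laplacian variance suggests blur; flag the image for re-capture.
    focus = cv2.Laplacian(gray, cv2.CV_64F).var()
    return focus >= threshold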



Blurring can also be a form of malicious tampering with biometrics in order to evade detection (along with occlusion and disguise). If the image is encoded digitally, a digital checksum and cryptographic signature are sufficient to solve the problem completely. If the blurred image is submitted as a physical print by an impersonator, a computerized measure of facial image quality can be used to reject such submissions.
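For the digital case, a minimal integrity-check sketch with Python's standard library (a real deployment would add a cryptographic signature over this digest; only the hash is shown):

import hashlib

def image_digest(path):
    # SHA-256 of the encoded file: any pixel edit (e.g. blurring) changes it.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()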




The lack of 2D-localizable features or interest points in part of a facial image can be a sign of intentional blurring.
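One way to probe this is to count detected interest points per image region; a sketch using ORB from OpenCV (the grid size is an arbitrary choice):

import cv2
import numpy as np

def keypoint_density(gray, grid=4):
    # Count ORB keypoints per cell; near-empty cells may indicate blurring.
    kps = cv2.ORB_create().detect(gray, None)
    h, w = gray.shape
    counts = np.zeros((grid, grid), dtype=int)
    for kp in kps:
        x, y = kp.pt
        counts[min(int(y * grid / h), grid - 1),
               min(int(x * grid / w), grid - 1)] += 1
    return counts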


However, the broad category of digital image tampering (by a skilled user of image-editing software) can only be dealt with by digital image forensics, which compares pixel statistics against known camera models.

