Monday 30 May 2016

sound - Why are zero values added in the FFT of a concatenated noise signal?


After concatenating an 11-second *.WAV file (sample rate 44100 Hz) 27 times (just by gluing the copies head-to-tail), I obtained a ~300-second noise signal. I used the FFT function in MATLAB (R2015b) on both the original file and the concatenated signal. The FFT of the concatenated signal was, as expected, pretty similar to the original. However, for every y-axis point (spectral power) in the concatenated FFT, there were 2 zeros (not precisely zero, but values on the order of $x \times 10^{-19}$, so about 18 orders of magnitude below the 'real' FFT values). The original noise file was specifically designed to allow for looping. Below is a close-up of the concatenated FFT showing the added zeros:


[Figure: close-up of the FFT of the concatenated signal]


Hence, the spectral power plot of the concatenated signal looked like the original, but every frequency power point was preceded and followed by two zeros. How can this be? Does it have to do with the fact that lengthening a signal increases the resolution of the FFT and that frequencies are analyzed in the concatenated signal that are not present in the original signal due to resolution restrictions? Or is this rather an error in my script? Other spectra showed up fine, though.




Answer



You're running into a property of the DFT that is usually used in the opposite direction: stuffing zeros between samples in one domain results in replication of the entire sequence in the opposite domain. Let's start in the frequency domain with the signal you plotted, $X[k]$, and look at the inverse transform to see what time-domain signal it corresponds to.


$$ x[n] = \sum_{k=0}^{N-1}X[k] e^{j 2 \pi k n/N} $$


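$N$ is the length of the DFT that you apply to the data (the constant $\frac{1}{N}$ scale factor of the inverse transform is left out, since it doesn't affect the argument). Let's say that $X[k]$ is sparse, with nonzero values only at multiples of $M$ (so $M-1$ zeros between each pair of nonzero values), meaning $X[k] = 0$ whenever $k$ is not a multiple of $M$. Assume also that $M$ divides $N$. The above then looks like: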


$$ x[n] = \sum_{k=0}^{\left \lfloor{\frac{N-1}{M}}\right\rfloor}X[kM] e^{j 2 \pi k M n/N} $$


Do some rearranging within the exponent to get:


$$ x[n] = \sum_{k=0}^{\left \lfloor{\frac{N-1}{M}}\right\rfloor}X[kM] e^{j 2 \pi \frac{k n}{\frac{N}{M}}} $$


This is nothing but a $\frac{N}{M}$-point inverse DFT of the nonzero samples in your frequency-domain signal $X[k]$. Because $e^{j 2 \pi k (n + N/M)/(N/M)} = e^{j 2 \pi k n/(N/M)} e^{j 2 \pi k} = e^{j 2 \pi k n/(N/M)}$ for integer $k$, every term in the sum repeats when $n$ advances by $\frac{N}{M}$. So $x[n]$ is periodic with period $\frac{N}{M}$, which is exactly the form of signal that you shoved into the DFT to begin with.
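The same effect can be checked numerically in the forward direction. What follows is only a minimal sketch in Python (not the MATLAB script from the question), using a naive DFT from the standard library so it has no dependencies. The helper `dft`, the tiny loop length `L = 8`, and the copy count `M = 3` are all illustrative assumptions, and random numbers stand in for the noise file.

```python
import cmath
import random


def dft(x):
    """Naive O(N^2) forward DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


random.seed(0)
L = 8  # samples in one "loop" of noise (assumed, kept tiny for readability)
M = 3  # number of head-to-tail copies (assumed)

base = [random.uniform(-1.0, 1.0) for _ in range(L)]
looped = base * M  # list repetition glues M copies head-to-tail

X_base = dft(base)
X_looped = dft(looped)

# Magnitudes of the bins whose index is NOT a multiple of M.
off_bins = [abs(X_looped[k]) for k in range(L * M) if k % M != 0]

# Difference between the bins at multiples of M and M times the original DFT.
on_err = [abs(X_looped[k * M] - M * X_base[k]) for k in range(L)]

print("largest 'stuffed' bin magnitude:", max(off_bins))
print("largest mismatch at multiples of M:", max(on_err))
```

Both printed values should come out at floating-point rounding level, which is the same kind of "not precisely zero" value seen in the question. With `M = 3` there are `M - 1 = 2` near-zero bins between consecutive nonzero bins.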


Summary: Replication of the signal in one domain results in a zero-stuffed signal in the opposite domain. This is used in the theory behind interpolating filters and other multirate processing. There is a dual property where if you downsample a signal (by keeping only every $M$-th sample), shifted copies of the data in the opposite domain collapse on top of one another and add (resulting in aliasing).
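The dual property can be sketched the same way. The snippet below is a rough illustration under stated assumptions, not a reference implementation: it keeps every `M`-th sample of a random length-`N` sequence (with `M` dividing `N`) and compares the DFT of the result against the average of `M` shifted segments of the original DFT. The helper `dft` and the sizes `N = 12`, `M = 3` are hypothetical choices.

```python
import cmath
import random


def dft(x):
    """Naive O(N^2) forward DFT: X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


random.seed(1)
N = 12  # original length (assumed)
M = 3   # downsampling factor, must divide N (assumed)
P = N // M  # length after downsampling

x = [random.uniform(-1.0, 1.0) for _ in range(N)]
y = x[::M]  # keep every M-th sample

X = dft(x)
Y = dft(y)

# Aliasing: the M segments of X, each P bins long, fold onto each other and add.
aliased = [sum(X[k + r * P] for r in range(M)) / M for k in range(P)]

alias_err = max(abs(Y[k] - aliased[k]) for k in range(P))
print("largest mismatch between Y and the folded spectrum:", alias_err)
```

The mismatch should again be at rounding level, showing that the downsampled spectrum is the sum (scaled by `1/M`) of overlapping pieces of the original spectrum.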

