Wednesday 13 May 2015

fourier transform - Is interpolation of an audio signal to increase frequency resolution possible?


I apologize if some of what I ask is not entirely correct; I'm new to this field but extremely interested.



I have an audio signal sampled at 44.1 kHz that I want to segment into 30 frames, taking the DFT of each frame to find the magnitude of certain frequencies in that frame. However, this gives me a frequency resolution of 30 Hz per bin, which isn't fine enough.


Is it possible to interpolate the data to obtain more samples? As far as I'm aware, doubling the number of points would give an 88.2 kHz sampling rate but still a frequency resolution of 30 Hz. Would it be possible to treat the interpolated data as still having a sample rate of 44.1 kHz?



Answer



I do not understand why you have a 30 Hz resolution, so I will focus only on the principle behind the question: "does interpolation increase resolution?"


The short answer is no: interpolation adds no new data, so it adds no new information.
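As a rough illustration of the short answer, the sketch below upsamples a frame by 2 with ideal band-limited interpolation (zero-padding the spectrum) and checks that the original samples are untouched and that the newly created frequency bins are empty. The helper name `upsample_by_2`, the random test signal, and the odd frame length of 1471 (chosen to avoid special handling of a Nyquist bin) are assumptions made for this example, not part of the question.

```python
import numpy as np

def upsample_by_2(x):
    """Band-limited interpolation by 2: zero-pad the spectrum, then invert.

    Assumes len(x) is odd, so the input has no Nyquist bin to split.
    """
    n = len(x)
    spectrum = np.fft.rfft(x)
    # A real signal of length 2n has n + 1 non-negative-frequency bins.
    padded = np.zeros(n + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    # The factor 2 compensates for the 1/(2n) scaling of the longer inverse DFT.
    return 2.0 * np.fft.irfft(padded, 2 * n)

rng = np.random.default_rng(0)
x = rng.standard_normal(1471)      # hypothetical frame, odd length
y = upsample_by_2(x)

# Every other sample of the upsampled frame is an original sample.
original_kept = np.allclose(y[::2], x)

# Bins above the original band hold only numerical noise.
new_bins = np.fft.rfft(y)[len(x) // 2 + 1:]
max_new_energy = np.max(np.abs(new_bins))

print(original_kept, max_new_energy)
```

The interpolated frame has twice as many samples, but the extra DFT bins it creates are all (numerically) zero, which is one way to see that nothing new was learned about the signal.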


A longer answer needs the spectrum visualization below, which shows the time domain, the continuous frequency domain, and the discrete frequency domain from left to right.


[Figure: upsampling]


The interpolation technique here is one that preserves the information in the spectrum. The DFT works on the discrete frequency domain, which is the part from $-f_s/2$ to $f_s/2$ of the continuous frequency version (one period of the sampled spectrum).


First, look at the continuous frequency domain. If you upsample your signal correctly, it is equivalent to changing the sampling frequency. You wish to double the number of samples, but all this does is remove every other spectral replica created by the sampling process.


Now, look at the discrete frequency domain. This version is normalized to the range from $-f_s/2$ to $f_s/2$ of the continuous frequency counterpart. Since $f_s$ is doubled, the spectrum is shrunk by a factor of 1/2. If we call $0 < \alpha < 1$ the proportion of non-zero frequencies before upsampling, this proportion is $\alpha/2$ after upsampling. Before upsampling, the DFT gives you $N$ bins in total, of which $N\alpha$ cover the spectrum; after upsampling, it gives $2N$ bins in total, of which $2N \times \alpha/2 = N\alpha$ cover the same spectrum. So no, your resolution does not change at all.
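The bin-counting argument above can be checked numerically. The following is a minimal sketch, assuming a hypothetical frame length of 1470 samples (which gives the 30 Hz spacing mentioned in the question) and assuming the signal content of interest lies below 20 kHz; the helper names are made up for this example.

```python
import numpy as np

def bin_spacing_hz(fs_hz, n_samples):
    """Spacing between adjacent DFT bins for a frame of n_samples at fs_hz."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs_hz)
    return freqs[1] - freqs[0]

def bins_in_band(fs_hz, n_samples, band_hz):
    """Number of non-negative-frequency DFT bins strictly below band_hz."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs_hz)
    return int(np.sum(freqs < band_hz))

fs = 44100.0      # original sampling rate
n = 1470          # hypothetical frame length: 44100 / 1470 = 30 Hz per bin
band = 20000.0    # assumed extent of the signal content, in Hz

# Upsampling by 2 doubles both the sampling rate and the frame length.
spacing_before = bin_spacing_hz(fs, n)
spacing_after = bin_spacing_hz(2 * fs, 2 * n)

bins_before = bins_in_band(fs, n, band)
bins_after = bins_in_band(2 * fs, 2 * n, band)

print(spacing_before, spacing_after)
print(bins_before, bins_after)
```

Both the bin spacing and the number of bins covering the band come out the same before and after, because the factor of 2 in $f_s$ cancels the factor of 2 in $N$.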



To get finer resolution, the only way is to use more data in each frame. In your example, instead of dividing your audio file into 30 frames, divide it into 15 frames: each frame is then twice as long, so the bin spacing is halved.
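The effect of frame length on bin spacing is just $f_s / N$ for a frame of $N$ samples. A small sketch, assuming a hypothetical one-second clip (44100 samples), which is not stated in the question but happens to reproduce its 30 Hz figure:

```python
def bin_spacing_hz(fs_hz, total_samples, n_frames):
    """DFT bin spacing when total_samples are split into n_frames equal frames."""
    frame_len = total_samples // n_frames   # samples per frame
    return fs_hz / frame_len

fs = 44100       # sampling rate in Hz
total = 44100    # hypothetical clip length: one second of audio

spacing_30 = bin_spacing_hz(fs, total, 30)   # 1470 samples per frame
spacing_15 = bin_spacing_hz(fs, total, 15)   # 2940 samples per frame

print(spacing_30, spacing_15)   # prints: 30.0 15.0
```

The trade-off is time resolution: fewer, longer frames resolve frequencies more finely but say less about when within the clip each frequency occurs.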

