Friday, 6 March 2015

c# - How to resample audio using FFT or DFT


I'm downsampling voice audio by first performing an FFT, then keeping only the parts of the result that I need, and then performing an inverse FFT. However, it only works properly when I'm using frequencies that are both powers of two, say downsampling from 32768 to 8192. I perform an FFT on the 32k data, discard the top 3/4 of the data, and then perform an inverse FFT on the remaining 1/4.


However, whenever I try to do this with data that doesn't line up properly, one of two things happens. Either the math library I'm using (AForge.Math) throws a fit because my sample count is not a power of two, or, if I zero-pad the samples so the count becomes a power of two, I get gibberish out the other end. I also tried using a DFT instead, but it ends up being insanely slow (this needs to be done in real time).



How would I go about zero-padding the FFT data properly, both for the initial FFT and for the inverse FFT at the end? Assuming I have a sample at 44.1 kHz that needs to get to 16 kHz, I currently try something like this, with the sample being 1000 in size:



  1. Pad input data to 1024 at the end

  2. Perform FFT

  3. Read the first 512 items into an array (I only need the first 362, but I need a power of two)

  4. Perform inverse FFT

  5. Read the first 362 items into the audio play buffer
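
The sizes in these steps can be checked with a little arithmetic. The following is a minimal sketch in Python (chosen only for brevity, it is not AForge code), and the helper name `effective_rate` is made up for illustration. It assumes that inverse-transforming `kept_bins` bins of an `fft_size`-point spectrum yields `kept_bins` samples covering the same time span as the padded input.

```python
def effective_rate(source_rate_hz, fft_size, kept_bins):
    """Sample rate implied by inverse-transforming `kept_bins` bins
    of an `fft_size`-point spectrum (hypothetical helper)."""
    return source_rate_hz * kept_bins / fft_size

source_rate = 44100   # Hz, from the question
target_rate = 16000   # Hz, from the question
block = 1000          # input samples, from the question

# Output samples that 1000 input samples correspond to at 16 kHz.
wanted_samples = block * target_rate // source_rate

# Rate implied by steps 1-4: pad to 1024, keep 512 bins.
rate_after_steps = effective_rate(source_rate, 1024, 512)

# Bins a 1024-point spectrum would need to keep to land on 16 kHz.
bins_needed = 1024 * target_rate / source_rate

print(wanted_samples)     # 362
print(rate_after_steps)   # 22050.0
print(bins_needed)        # about 371.5, neither an integer nor a power of two
```

Under that assumption, keeping 512 of 1024 bins halves the rate to 22050 Hz instead of reaching 16 kHz, so the first 362 output samples would not line up with the 362 samples wanted at 16 kHz.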


From this, I get garbage out at the end. Doing the same thing without having to pad at steps 1 and 3, because the sample counts are already powers of two, gives a correct result.
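
For reference, here is a minimal sketch of how spectrum truncation is usually indexed so that the shortened spectrum stays conjugate-symmetric and the inverse transform stays real. It is written in Python with a naive O(n²) DFT purely to show the bin bookkeeping. It is far too slow for real-time use, it is not the AForge.Math API, and the function names are made up for illustration. It also assumes the input block is real-valued and that the output length `m` is smaller than the input length.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT (illustration only, O(n^2))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT with 1/n scaling."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def resample_dft(x, m):
    """Downsample real signal x to m samples (m < len(x)) by truncating its spectrum."""
    n = len(x)
    spectrum = dft(x)
    neg = (m - 1) // 2            # number of negative-frequency bins kept
    out = [0j] * m
    for k in range(neg + 1):      # DC and positive frequencies 0..neg
        out[k] = spectrum[k]
    for k in range(1, neg + 1):   # negative frequencies come from the END of the array
        out[m - k] = spectrum[n - k]
    scale = m / n                 # keep the amplitude unchanged
    return [(v * scale).real for v in idft(out)]

# Usage: 3 cycles of a sine in 40 samples, resampled to 16 samples.
n, m = 40, 16
x = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
y = resample_dft(x, m)
```

The point of the sketch is that the kept bins come from both ends of the spectrum array, not only from the first `m` entries, and that the ratio `m / n` has to match the ratio of the two sample rates.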



