I'm downsampling voice audio by performing an FFT, keeping only the part of the result that I need, and then performing an inverse FFT. However, it only works properly when both sample rates are powers of two, say when downsampling from 32768 Hz to 8192 Hz. In that case I perform an FFT on the 32k data, discard the top 3/4 of the data, and then perform an inverse FFT on the remaining 1/4.
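For reference, here is a minimal sketch of that power-of-two case in plain Python (not Aforge.Math). The names `fft` and `downsample_pow2` are made up for illustration. It assumes a full complex FFT, where the upper half of the spectrum holds the negative frequencies, so it keeps the low bins from both ends rather than literally the first quarter; a real-input FFT would be laid out differently.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled). len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def downsample_pow2(samples, factor):
    """Downsample by a power-of-two factor by truncating the spectrum.

    Assumes len(samples) and factor are both powers of two.
    """
    n = len(samples)
    m = n // factor          # number of output samples
    half = m // 2
    spectrum = fft([complex(s) for s in samples])
    # Keep the positive frequencies [0, half) and the negative
    # frequencies (-half, 0); zero the new Nyquist bin so the
    # result stays real.
    kept = spectrum[:half] + [0j] + spectrum[n - half + 1:]
    time = fft(kept, inverse=True)
    # Scale by 1/n (the original length) to preserve amplitude.
    return [v.real / n for v in time]


# Usage: 64 samples holding 3 cycles of a sine, reduced to 16 samples.
signal = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
reduced = downsample_pow2(signal, 4)
```

Here the input and output lengths have the same ratio as the two sample rates (4:1), which is what makes the power-of-two case line up.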
However, whenever I try to do this with data that doesn't line up properly, one of two things happens: the math library I'm using (Aforge.Math) throws a fit because the number of samples is not a power of two, or, if I zero-pad the samples so that the count becomes a power of two, I get gibberish out on the other end. I also tried using a DFT instead, but it ends up being insanely slow (this needs to be done in real time).
How would I go about zero-padding the FFT data properly, both for the initial FFT and for the inverse FFT at the end? Assuming I have audio at 44.1 kHz that needs to get to 16 kHz, I currently try something like this, with the block being 1000 samples in size:
- Pad input data to 1024 at the end
- Perform FFT
- Read the first 512 items into an array (I only need the first 362, but the length must be a power of two)
- Perform inverse FFT
- Read the first 362 items into the audio play buffer
From this, I get garbage out at the end. Doing the same thing without the padding in steps 1 and 3, because the sample counts are already powers of two, gives a correct result.
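To make the numbers in the steps above concrete, here is a small sketch of the arithmetic they imply. The variable names are mine, and it only restates the sizes from the question; it is not a fix. The point it illustrates is that the inverse FFT output spans the same time window as the padded input, so the output rate is set by the ratio of bins kept to bins computed.

```python
from fractions import Fraction

source_rate = 44100   # Hz, from the question
target_rate = 16000   # Hz, from the question
block = 1000          # samples per block before padding
padded = 1024         # FFT length after zero-padding (step 1)
kept_bins = 512       # bins fed to the inverse FFT (step 3)

# Samples wanted per block at the target rate: 1000 * 16000 / 44100.
wanted = block * Fraction(target_rate, source_rate)
wanted_whole = int(wanted)                      # 362

# The inverse FFT yields kept_bins samples covering the same time
# span as the padded input, so its effective sample rate is:
effective_rate = source_rate * Fraction(kept_bins, padded)   # 22050 Hz

# The ratio actually needed, in lowest terms.
needed_ratio = Fraction(target_rate, source_rate)            # 160/441

# Ratio obtained by keeping a power-of-two slice of a power-of-two FFT.
obtained_ratio = Fraction(kept_bins, padded)                 # 1/2
```

So the 512 samples coming out of the inverse FFT are at 22050 Hz, not 16 kHz, and the first 362 of them cover only part of the block. Since 441 = 3² · 7², the needed ratio 160/441 cannot be reached by any pair of power-of-two lengths.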