Friday, 22 May 2015

Estimating the Impulse Response of a Room Using a Sweep Signal and the Microphone-Recorded Signal (Input & Output of a Convolution)


I played this signal A (a 20Hz to 20000Hz sinusoidal sweep in 10 seconds) with a studio monitor speaker in a big church, and I recorded the result B with good microphones.


The result is very reverb-ish, which is exactly what I wanted to capture.



Now, software (such as Deconvolver, which is not open source) can build an impulse response from A and B that can later be used in a convolution reverb.


It works well. But I would like to learn how to do this myself via DSP / programming.


How can I use a sweep (signal A) and the recorded output (signal B) to get an impulse response?




Edit: In other words, if $a$ is the original sweep, $b$ the recorded output, and $h$ the impulse response, how do I get $h$ from


$$a * h = b$$


Is this formulation correct? Is the solution $h = a^{-1} * b$, where $a^{-1}$ is the inverse of $a$ under convolution? How do I compute the convolution inverse of a discrete signal?



Answer



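this is the two-channel FFT method used by spectrum analyzers, where $x[n]$ is the sweep (your $a$), $y[n]$ is the recording (your $b$), and $\circledast$ is circular convolution: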


$$ y[n] = h[n] \ \circledast \ x[n] $$



just make sure that the FFT length $N$ is at least as large as the length of the sound $x[n]$ plus the expected length of the impulse response $h[n]$, so that the circular convolution above gives the same result as ordinary linear convolution. the recorded sound $y[n]$ should span that same length. you can round $N$ up to the next power of two. zero-pad everything to that length and then


$$H[k] = \frac{Y[k]}{X[k]} $$


is functionally true.


you might sometimes have to worry about division by zero, but if your driving signal $x[n]$ is sufficiently broad-banded (which a linear sweep or a maximum-length sequence is), then you don't have to worry too much about division by zero.
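One common way to guard against bins where $X[k]$ is very small is to regularize the division. The following is a minimal sketch, not part of the original answer: the function name `regularized_division` and the parameter `rel_eps` are hypothetical, and the Tikhonov-style floor is an assumption about what a reasonable guard might look like.

```python
import numpy as np

def regularized_division(Y, X, rel_eps=1e-6):
    """Approximate Y / X bin by bin without blowing up where |X| is tiny.

    rel_eps is a hypothetical tuning knob: the floor added to |X|^2,
    expressed relative to the largest |X|^2 in the spectrum.
    Assumes X is not identically zero.
    """
    Y = np.asarray(Y, dtype=complex)
    X = np.asarray(X, dtype=complex)
    power = np.abs(X) ** 2
    floor = rel_eps * power.max()
    # Y * conj(X) / |X|^2 equals Y / X; the floor only matters
    # in bins where |X|^2 is comparable to or smaller than it.
    return Y * np.conj(X) / (power + floor)
```

Where the sweep has plenty of energy, this is practically the same as $Y[k]/X[k]$; where it has almost none (for example outside the swept band), the estimate is pulled toward zero instead of amplifying noise.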


if you know $H[k]$, then you know $h[n]$: it is the inverse DFT of $H[k]$.
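The steps above (zero-pad, FFT, divide, inverse FFT) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `estimate_impulse_response` and the parameter `h_len` (the expected impulse response length in samples) are hypothetical, the signals are assumed to be mono, time-aligned and at the same sample rate, and the plain division assumes $X[k]$ is nowhere close to zero.

```python
import numpy as np

def estimate_impulse_response(x, y, h_len):
    """Estimate h from y = h (*) x by spectral division.

    x     : the signal that was played (e.g. the sweep)
    y     : the signal that was recorded
    h_len : assumed length of the impulse response, in samples
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # FFT length: at least len(x) + h_len (and long enough to hold y),
    # rounded up to the next power of two.
    n_min = max(len(x) + h_len, len(y))
    n_fft = 1 << (n_min - 1).bit_length()

    # rfft zero-pads each signal to n_fft samples.
    X = np.fft.rfft(x, n=n_fft)
    Y = np.fft.rfft(y, n=n_fft)

    # H[k] = Y[k] / X[k]; assumes X[k] is not (nearly) zero in any bin.
    H = Y / X

    # Back to the time domain; keep only the expected response length.
    h = np.fft.irfft(H, n=n_fft)
    return h[:h_len]
```

For a room measurement, `h_len` would be the sample rate times the longest reverberation tail you expect; with a real recording, noise in bins where the sweep has little energy usually makes some form of guarded division preferable to the bare `Y / X`.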

