Thursday 23 June 2016

fft - Find Short Clip of Audio Within Longer Clip Of Audio


I need help figuring out a way to automatically synchronize two audio files that are both different recordings of the same source. More info below.




This example is from a concert, where there is only one audio source - the band.






  • In the first track of the image below, we have three minutes of audio recorded from the soundboard of the concert venue. This contains a crystal clear recording of the band's performance.





  • In the second track of the image below, we have roughly a minute and a quarter of audio recorded using an iPhone in the crowd. This audio still sounds OK, but is not great by any means. Still, it's very
    usable.




Since both of these recordings were taken at roughly the same time during the same performance, the audio in track two lines up somewhere with the audio in track one.




Waveforms Out-Of-Sync





The image below shows where these two snippets of audio synchronize - about the 3 second mark.


Waveforms in Sync




I know that it's pretty easy to line these two audio files up by hand and by ear, but I need to develop a script that synchronizes the two snippets automatically.




The Spectrogram and Spectrogram Log match up pretty well. I'm not sure if I can work with that or not.


Here is the spectrogram comparison:


Spectrogram


Along with the spectrogram log comparison:


Spectrogram Log





I've looked into cross-correlation of waveforms (using this video), FFT-based approaches, and various libraries, but I'm lost. Any help with my college project would be much appreciated!
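For reference, the cross-correlation idea mentioned above can be sketched roughly as follows. This is only a minimal illustration, assuming both recordings have already been decoded to mono sample arrays at the same sample rate. The function name `find_offset` and its parameters are hypothetical, and NumPy is assumed to be available. It computes the cross-correlation through the FFT (multiplying one spectrum by the complex conjugate of the other) and takes the lag with the highest correlation as the likely start of the short clip.

```python
import numpy as np


def find_offset(long_clip, short_clip, sample_rate):
    """Estimate where short_clip starts inside long_clip, in seconds.

    Hypothetical helper: both inputs are 1-D mono sample arrays that
    are assumed to share the same sample rate.
    """
    long_clip = np.asarray(long_clip, dtype=float)
    short_clip = np.asarray(short_clip, dtype=float)
    if len(short_clip) > len(long_clip):
        raise ValueError("short_clip must not be longer than long_clip")

    # Remove any DC offset so a constant bias does not dominate the result.
    long_clip = long_clip - long_clip.mean()
    short_clip = short_clip - short_clip.mean()

    # Zero-pad to at least len(long) + len(short) - 1 samples so the
    # circular correlation computed by the FFT does not wrap around.
    n = len(long_clip) + len(short_clip) - 1
    nfft = 1 << (n - 1).bit_length()

    # Cross-correlation theorem: corr = IFFT(FFT(long) * conj(FFT(short))).
    spectrum = np.fft.rfft(long_clip, nfft) * np.conj(np.fft.rfft(short_clip, nfft))
    corr = np.fft.irfft(spectrum, nfft)

    # Only consider lags where the short clip lies fully inside the long one.
    max_lag = len(long_clip) - len(short_clip)
    lag = int(np.argmax(corr[: max_lag + 1]))
    return lag / sample_rate


if __name__ == "__main__":
    # Synthetic stand-in for the two recordings: the "phone" clip is a
    # quieter, noisier copy of a slice of the "soundboard" clip.
    rng = np.random.default_rng(0)
    soundboard = rng.standard_normal(8000)
    phone = 0.5 * soundboard[3000:5000] + 0.1 * rng.standard_normal(2000)
    print(find_offset(soundboard, phone, sample_rate=1000))
```

A real phone recording differs from the soundboard feed in more than level and noise (room reverb, clipping, frequency response), so the raw waveform correlation peak may be less clear-cut than in this synthetic example. Correlating amplitude envelopes or spectrogram features instead of raw samples is one possible variation, but that is not shown here.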



