As a follow-up to my previous question, I was wondering whether any speech detection libraries exist. By speech detection I mean passing in an audio buffer and getting back the sample indices where speech starts and stops. So if I have 10 seconds of audio sampled at 44 kHz, I would expect an array of numbers such as:
44000
88000
123000
190334
...
This would indicate, for example, that speech starts one second in and finishes at the two-second point, and so on.
What I'm not looking for is speech recognition, which writes out text from the spoken word. Unfortunately, that is what I mostly see when I google 'speech detection'.
It would be great if the library were in C, C++ or even Objective-C, as I'm writing an app for the iPhone.
Thanks!
Answer
In my answer to your previous question, I mentioned that Voice Activity Detection (VAD) is a standard feature of codecs such as G.729.
You should look for reference encoders and decoders for codecs that apply it.
One such example is http://www.voiceage.com/openinit_g729.php
Another possible source is the Speex codec, which implements VAD.
BTW: You should google "Voice Activity Detection" or "Talk Spurt" rather than "Speech Detection".
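To make the interface from the question concrete, here is a minimal sketch in C of an energy-threshold detector that returns alternating start/stop sample indices. It is not the VAD algorithm used by G.729 or Speex; the function name `detect_speech_bounds`, the frame length, and the threshold are all hypothetical choices for illustration, and 16-bit mono PCM input is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of any library.
 * Scans 16-bit PCM in fixed-size frames and writes alternating
 * start/stop sample indices into `bounds`. A frame counts as speech
 * when its mean squared amplitude exceeds `threshold` (an assumed,
 * hand-tuned value). Returns the number of indices written, which
 * never exceeds `max_bounds`. */
size_t detect_speech_bounds(const int16_t *samples, size_t n_samples,
                            size_t frame_len, double threshold,
                            size_t *bounds, size_t max_bounds)
{
    assert(frame_len > 0);

    size_t count = 0;
    int in_speech = 0;

    /* Only complete frames are examined; a trailing partial frame is ignored. */
    for (size_t start = 0; start + frame_len <= n_samples; start += frame_len) {
        double energy = 0.0;
        for (size_t i = 0; i < frame_len; i++) {
            double s = (double)samples[start + i];
            energy += s * s;
        }
        energy /= (double)frame_len;

        int is_speech = energy > threshold;
        if (is_speech != in_speech) {
            if (count == max_bounds) {
                return count; /* output array is full */
            }
            bounds[count++] = start; /* record the transition point */
            in_speech = is_speech;
        }
    }

    /* Close a segment that is still open at the end of the last complete frame. */
    if (in_speech && count < max_bounds) {
        bounds[count++] = n_samples - (n_samples % frame_len);
    }
    return count;
}
```

A plain energy threshold like this is easily fooled by background noise, so treat it only as a starting point; the VAD in a reference codec implementation is likely to be considerably more robust.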