I have a little karaoke-style app where a user sings 4 lines of a song, with a one-second gap between each line. There is no backing music, so the recording is voice only, which hopefully makes the problem easier to solve.
I am looking for the most robust way to detect exactly where in my recording the user starts and ends singing line 1, starts and ends singing line 2, etc.
I have cobbled together a simple-minded algorithm that works when there is very little background noise in the recording (like when does that happen?), but it falls to pieces in the presence of even the slightest noise.
Can anybody point me towards something more robust?
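To make the question concrete, here is a minimal sketch of one common starting point: short-time energy compared against a threshold derived from an estimated noise floor, with short pauses bridged and short blips discarded. It is not the algorithm from my app. The function names (`frame_rms`, `detect_segments`) and all parameter values (20 ms frames, 10th-percentile noise floor, 3x threshold ratio, 300 ms minimum gap, 200 ms minimum length) are assumptions chosen for illustration, and the input is assumed to be a mono list of float samples.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS level of each non-overlapping frame of frame_len samples."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels


def detect_segments(samples, sample_rate, frame_ms=20, noise_percentile=0.1,
                    threshold_ratio=3.0, min_gap_ms=300, min_len_ms=200):
    """Return a list of (start_seconds, end_seconds) for regions louder than
    the estimated noise floor. All defaults are illustrative guesses."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    levels = frame_rms(samples, frame_len)
    if not levels:
        return []

    # Estimate the noise floor as a low percentile of the frame levels,
    # assuming at least some of the recording is silence between lines.
    ordered = sorted(levels)
    noise_floor = ordered[int(noise_percentile * (len(ordered) - 1))]
    threshold = max(noise_floor * threshold_ratio, 1e-6)

    # Collect runs of consecutive frames above the threshold.
    runs = []
    run_start = None
    for i, level in enumerate(levels):
        if level > threshold and run_start is None:
            run_start = i
        elif level <= threshold and run_start is not None:
            runs.append([run_start, i])
            run_start = None
    if run_start is not None:
        runs.append([run_start, len(levels)])

    # Bridge pauses shorter than min_gap_ms (breaths, quiet consonants).
    min_gap_frames = min_gap_ms / frame_ms
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] < min_gap_frames:
            merged[-1][1] = run[1]
        else:
            merged.append(run)

    # Drop runs shorter than min_len_ms (clicks, coughs) and convert to seconds.
    min_len_frames = min_len_ms / frame_ms
    frame_seconds = frame_len / sample_rate
    return [(s * frame_seconds, e * frame_seconds)
            for s, e in merged if e - s >= min_len_frames]
```

Calling `detect_segments(samples, sample_rate)` on a decoded mono recording should yield one `(start, end)` pair per sung line if the gaps between lines are longer than `min_gap_ms`. A fixed energy ratio like this is exactly the kind of thing that breaks down under non-stationary noise, which is why I am asking about something more robust.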