This is a continuation of a previous question.
I'm trying to analyze breathing and snoring sounds, and while I can now detect snoring fairly well, breathing is a bigger challenge.
I've learned that if I break the analyzed frequency range (about 4 kHz, sampled at about 8 kHz, with a frame size of 1024) into about 5 subranges, very often one of the subranges shows good sensitivity (using spectral difference) to a signal that is buried in the noise when the range is taken as a whole. The trick is to determine which subrange to "trust" when.
Presumably the "trustworthy" subrange would exhibit variability at a rate between about 0.05 Hz and 2 Hz, while the "bad" subranges would behave more randomly, with most of their variation being at shorter intervals.
I could cobble together some sort of algorithm to smooth the values at a sub-second resolution and then calculate the variability over longer intervals, but I wonder if there isn't a "canned" algorithm for this sort of thing -- something with maybe a modicum of theory behind it?
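To make the "cobbled together" idea concrete, here is a rough sketch of what I have in mind, not a tested solution. It assumes one amplitude value per band arriving about 8 times per second; the names `moving_average` and `slow_variability`, the 0.5 second smoothing width and the 20 second window are placeholders I picked for illustration.

```python
import statistics

FRAME_RATE_HZ = 8.0  # assumption: about 8 band-amplitude values per second


def moving_average(values, width):
    """Smooth with a trailing moving average over `width` samples."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= width:
            running -= values[i - width]  # drop the sample leaving the window
        out.append(running / min(i + 1, width))
    return out


def slow_variability(values, smooth_seconds=0.5, window_seconds=20.0):
    """Standard deviation of the smoothed series in consecutive long windows."""
    width = max(1, round(smooth_seconds * FRAME_RATE_HZ))
    window = max(2, round(window_seconds * FRAME_RATE_HZ))
    smoothed = moving_average(values, width)
    return [
        statistics.pstdev(smoothed[i:i + window])
        for i in range(0, len(smoothed) - window + 1, window)
    ]
```

A band whose amplitude rises and falls at breathing rates should keep most of its variability after smoothing, while a band that only jitters from frame to frame should lose most of it.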
Any suggestions?
[Note: I realize that one could, in theory, use an FFT to extract this info, but that seems like using a baseball bat to kill a flea. Maybe something a little more lightweight?]
Added:
In a sense (to use an analogy) I'm trying to detect a "baseband" signal in an RF transmission (only the "RF" is audio frequencies, and the "baseband" is below 8 Hz). And, in a sense, the "RF" is "spread spectrum" -- the sounds I want to detect tend to generate lots of harmonics and/or have several separate frequency components, so if one band of the spectrum is too noisy I can probably make use of another. The goal is basically to determine some metric resembling SNR for the various frequency bands, on the assumption that most "noise" is above 2 Hz and my signal is below 2 Hz.
The input to this algorithm is the raw amplitude for each band (the sum of the FFT amplitudes at all included frequencies), measured about 8 times per second.
(It should be noted that, while I have not done any formal SNR measurements, the overall SNR across the processed spectrum frequently appears to be near or below 1.0. If you look at the sound envelope in a tool like Audacity, no modulation of the envelope is noticeable, even though the ear can clearly discern breathing sounds. This is why it's necessary to analyze bands to find those with decent SNR.)
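For illustration, the kind of per-band "SNR-like" metric described above might look like the following. This is only a sketch under my own assumptions: it uses plain box (moving-average) filters rather than an FFT, treats whatever a box of roughly half a second removes as "noise", and treats what is left after subtracting a roughly 20 second average as "signal". The names `box_smooth`, `band_quality` and `pick_band` and all the default values are hypothetical.

```python
def box_smooth(values, half_width):
    """Centered moving average over 2*half_width + 1 samples (truncated at the edges)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half_width): i + half_width + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def band_quality(values, rate_hz=8.0, fast_seconds=0.5, drift_seconds=20.0):
    """Ratio of slow-variation power to fast-variation power for one band."""
    # Short box: suppresses frame-to-frame jitter (the assumed "noise").
    short = box_smooth(values, max(1, round(fast_seconds * rate_hz / 2)))
    # Long box: tracks the mean level and very slow drift, which is subtracted out.
    long_ = box_smooth(values, max(1, round(drift_seconds * rate_hz / 2)))
    slow_power = sum((s - l) ** 2 for s, l in zip(short, long_))
    fast_power = sum((v - s) ** 2 for v, s in zip(values, short))
    return slow_power / (fast_power + 1e-12)  # small constant avoids division by zero


def pick_band(bands, **kwargs):
    """Return (index of the highest-scoring band, list of all scores)."""
    scores = [band_quality(b, **kwargs) for b in bands]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

Calling `pick_band` on a list holding one amplitude series per band returns the index of the band to "trust" for that stretch of audio, along with every band's score so that they can be compared or tracked over time.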