Random Noise: Audio features in two levels

Monday, December 14, 2009

Audio features in two levels

Audio features are usually extracted in two levels: short term (frame) level, and long term (clip) level.

A frame is a group of neighboring samples lasting about 10-40 ms. Within a frame, the audio signal is assumed to be stationary, so short-term features such as energy and Fourier transform coefficients can be extracted.
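As a rough illustration of frame-level extraction, here is a minimal sketch in plain Python. The function names, the 25 ms frame length, the non-overlapping framing, and the 8 kHz test tone are all assumptions made for the example; they are not taken from any particular library. Real systems typically use overlapping, windowed frames and an FFT routine rather than the naive DFT shown here.

```python
import cmath
import math


def split_into_frames(samples, sample_rate, frame_ms=25.0):
    """Split samples into non-overlapping frames of frame_ms milliseconds.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def frame_energy(frame):
    """Short-term energy: mean of the squared sample values in one frame."""
    return sum(x * x for x in frame) / len(frame)


def dft_magnitudes(frame):
    """Magnitudes of the Fourier coefficients for bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


# Hypothetical test signal: 100 ms of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate // 10)]

frames = split_into_frames(tone, sample_rate)      # 25 ms -> 200 samples per frame
energies = [frame_energy(f) for f in frames]       # one energy value per frame
spectrum = dft_magnitudes(frames[0])               # one spectrum per frame
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate / len(frames[0])  # bin index -> frequency in Hz
```

Each frame yields one energy value and one spectrum, so a signal becomes a sequence of short-term feature vectors, one per frame.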

For features to reveal semantic meaning, we use audio clips lasting from one second to several tens of seconds; such a clip is sometimes called a ‘window’.
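The post does not say how clip-level features are computed. One common approach, assumed here, is to summarize the frame-level features over each clip with simple statistics such as the mean and variance. The sketch below does this for frame energy; the function names, the 1 s clip length, and the 25 ms frame length are illustrative choices, not a fixed recipe.

```python
import math
import statistics


def frame_energies(samples, sample_rate, frame_ms=25.0):
    """Mean squared amplitude of each non-overlapping frame in samples."""
    frame_len = int(sample_rate * frame_ms / 1000)
    last_start = len(samples) - frame_len
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, last_start + 1, frame_len)
    ]


def clip_features(samples, sample_rate, clip_s=1.0, frame_ms=25.0):
    """Summarize frame energies over consecutive clips of clip_s seconds.

    Assumption: a clip-level feature is a statistic (mean, variance)
    of the frame-level values inside that clip.
    """
    clip_len = int(sample_rate * clip_s)
    features = []
    for start in range(0, len(samples) - clip_len + 1, clip_len):
        energies = frame_energies(samples[start:start + clip_len], sample_rate, frame_ms)
        features.append({
            "mean_energy": statistics.fmean(energies),
            "energy_variance": statistics.pvariance(energies),
        })
    return features


# Hypothetical signal: one second of a loud 440 Hz tone, then one second of a quiet one.
sample_rate = 8000
loud = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
quiet = [0.1 * x for x in loud]

features = clip_features(loud + quiet, sample_rate)  # one feature dict per 1 s clip
```

Here the two clips differ clearly in mean energy, which is the kind of longer-term contrast that a single 25 ms frame cannot show on its own.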
