Wednesday, October 28, 2009

Short-Time Energy (STE) in Audio

STE is a widely used audio feature and one of the easiest to compute. It is also called volume. STE is a reliable indicator for silence detection. It is normally approximated by the rms (root mean square) of the signal magnitude within each frame. The MATLAB code below computes the mean of the squared samples in each frame; taking the square root of that value gives the rms:

% assume the window size is 2 seconds
% assume the step (shift) between windows is 1 second,
% so consecutive windows overlap by 1 second

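[wav, fs] = wavread('DEMO.wav'); % newer MATLAB releases replace wavread with audioread
wav = mean(wav, 2);              % mix down to mono if the file is stereo
wav = wav / max(abs(wav));       % peak-normalize by the largest absolute sample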
window_length = 2 * fs;
step = 1 * fs; % has overlap
frame_num = floor((length(wav)-window_length)/step) + 1;

energy = zeros(frame_num, 1);
pos = 1;

for i = 1:frame_num
    wav_window = wav(pos:pos + window_length - 1);
    energy(i) = mean(wav_window.^2); % mean of the squared samples in this frame
    pos = pos + step;
end

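The short-time energy of an audio signal depends on the gain of the recording device, so the raw values are not directly comparable across recordings. Usually we normalize the value for each frame, for example by dividing by the maximum frame energy in the recording.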
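As a rough illustration of the normalization and of using STE for silence detection, here is a minimal sketch. The synthetic signal, the 1-second non-overlapping frames, the divide-by-maximum normalization, and the 1% threshold are all assumptions chosen to keep the example easy to check; they are not taken from a real recording or from a standard.

```matlab
% Synthetic test signal (an assumption standing in for a real recording):
% 2 s of a very quiet constant level followed by 2 s of a 440 Hz tone.
fs = 8000;
t = (0:2*fs-1)' / fs;
wav = [0.001 * ones(2*fs, 1); 0.5 * sin(2*pi*440*t)];

window_length = 1 * fs;   % 1-second frames
step = 1 * fs;            % no overlap here, to keep the result easy to read
frame_num = floor((length(wav) - window_length) / step) + 1;

% Short-time energy: mean of the squared samples in each frame.
energy = zeros(frame_num, 1);
for i = 1:frame_num
    pos = (i - 1) * step + 1;
    wav_window = wav(pos:pos + window_length - 1);
    energy(i) = mean(wav_window.^2);
end

% Normalize by the largest frame energy so the result does not depend on gain.
energy_norm = energy / max(energy);

% Hypothetical threshold: frames below 1% of the peak energy count as silent.
silence_threshold = 0.01;
is_silent = energy_norm < silence_threshold;
```

With this signal, the first two frames are flagged as silent and the last two are not. Scaling wav by any constant gain leaves energy_norm, and therefore is_silent, unchanged, which is the point of the normalization.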
