Thursday, October 29, 2009

Zero Crossing Rate (ZCR) in Audio

ZCR is a very useful audio feature. It is defined as the number of times per second that the audio waveform crosses the zero axis.
Picture 1
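
Written out to match the MATLAB code in this post, and assuming N is the window length in samples and x[n] is the n-th sample of the window (notation chosen here for illustration), the rate for one window would be:

$$
\mathrm{ZCR} = \frac{f_s}{2N} \sum_{n=1}^{N-1} \left| \operatorname{sgn}(x[n]) - \operatorname{sgn}(x[n-1]) \right|
$$

Each sign change contributes 2 to the sum, so the factor 1/2 turns the sum into a count of crossings, and multiplying by f_s / N converts that count into crossings per second.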

fs is the sampling rate. ZCR is used to detect unvoiced speech, which usually has low short-term energy (STE) but a high zero crossing rate. Combining ZCR with STE prevents low-energy unvoiced speech frames from being classified as silence. MATLAB code:

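% assume the window size is 2 seconds
% the windows overlap
% assume the shift step is 1 second

[wav, fs] = wavread('DEMO.wav');    % on newer MATLAB releases use audioread
wav = wav(:, 1);                    % keep the first channel if the file is stereo
% normalize by the peak absolute value (max(max(wav)) ignores negative peaks)
wav = wav / max(abs(wav));
window_length = 2 * fs;
step = 1 * fs;
frame_num = floor((length(wav)-window_length)/step) + 1;

zcr_ = zeros(frame_num, 1);
wav_window2 = zeros(window_length, 1);
pos = 1;

for i=1:frame_num
    wav_window = wav(pos:pos + window_length-1);
    % delayed copy of the window; repeat the first sample so that
    % the window edge is not counted as a crossing
    wav_window2(1) = wav_window(1);
    wav_window2(2:end) = wav_window(1:end-1);
    zcr_(i) = 1/2 * sum(abs(sign(wav_window) - ...
        sign(wav_window2))) * fs / window_length;
    pos = pos + step;
end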


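The point about combining ZCR with short-term energy can be sketched as follows. This is only an illustration under stated assumptions: the synthetic test signal, the 30 ms frame with a 10 ms step, and both thresholds (ste_thresh, zcr_thresh) are hypothetical values picked for the example, not tuned settings. A frame with enough energy is labeled voiced; a low-energy frame is labeled unvoiced instead of silence when its ZCR is high.

```matlab
% Sketch: label frames as silence / unvoiced / voiced using STE and ZCR.
% The test signal and both thresholds are illustrative assumptions.
fs = 8000;                              % sampling rate in Hz (assumed)
t  = (0:fs-1)' / fs;                    % one second of sample times
x  = [zeros(fs, 1); ...                 % 1 s of silence
      0.05 * sin(2*pi*3000*t); ...      % 1 s quiet, high-frequency (unvoiced-like)
      1.00 * sin(2*pi*150*t)];          % 1 s loud, low-frequency (voiced-like)

window_length = round(0.030 * fs);      % 30 ms frame
step          = round(0.010 * fs);      % 10 ms step
frame_num     = floor((length(x) - window_length) / step) + 1;

ste_thresh = 0.01;                      % hypothetical energy threshold
zcr_thresh = 1000;                      % hypothetical ZCR threshold (crossings/s)

ste   = zeros(frame_num, 1);
zcr   = zeros(frame_num, 1);
label = zeros(frame_num, 1);            % 0 = silence, 1 = unvoiced, 2 = voiced

for i = 1:frame_num
    pos   = (i - 1) * step + 1;
    frame = x(pos : pos + window_length - 1);
    ste(i) = mean(frame .^ 2);          % short-term energy of the frame
    % crossings per second; diff(sign(...)) compares neighbouring samples
    zcr(i) = 0.5 * sum(abs(diff(sign(frame)))) * fs / window_length;
    if ste(i) >= ste_thresh
        label(i) = 2;                   % enough energy: voiced
    elseif zcr(i) >= zcr_thresh
        label(i) = 1;                   % low energy but many crossings: unvoiced
    end                                 % otherwise the frame stays silence
end
```

With an energy threshold alone, the quiet middle second would fall below ste_thresh and be treated as silence; the ZCR test is what separates it from the true silence in the first second.
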
  1. Anonymous 7:01 AM

    Hi Weis,

    Very good work, but it seems to me that it doesn't work. Does it work for you?

  2. Is the result wrong, or does the program fail to run?

  3. Anonymous 10:21 AM

    the program cannot run

  4. Anonymous 10:23 AM

    Hi Weis,

    Please, I need this program, but it cannot run.

  5. I think you should use:

    wav = wav / max(abs(wav));

    instead of:

    wav = wav / max(max(wav));

  6. Anonymous 7:06 AM

    A frame size of 2 s is too long for diagnostic ultrasound Doppler signal analysis (as a simpler and easier substitute for FFT or wavelet analysis). 30 ms is the maximum allowable frame size. Voiced/unvoiced discrimination also needs such a short frame size. Can this algorithm accommodate that?