Friday, September 18, 2009

Audio Segmentation: overlap, windowing, filtering

Overlap: 25% and 50% overlaps are common. A typical process is:

  1. segment the signal into 50% overlapped frames
  2. window each frame (e.g., Hamming window, Blackman window)
  3. overlap-add the windowed frames to reconstruct the signal

Use ‘segment/window/overlap’ to smooth out the transitions between independently processed frames. Without it, the discontinuity between neighboring frames is heard as a clicking sound.
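The three steps can be sketched in plain Python. This is only an illustration under a few assumptions of mine: the helper names (hann, segment, overlap_add), the frame length of 8, and the test sine are all made up for the example, and I use a periodic Hann window instead of the Hamming or Blackman windows mentioned above, because shifted Hann copies at 50% overlap sum to exactly 1, which makes the reconstruction easy to check.

```python
import math

def hann(n_len):
    """Periodic Hann window; copies shifted by n_len/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def segment(signal, frame_len, hop):
    """Step 1: split the signal into overlapping frames (last ones zero-padded)."""
    frames = []
    for start in range(0, len(signal), hop):
        frame = signal[start:start + frame_len]
        frames.append(frame + [0.0] * (frame_len - len(frame)))
    return frames

def overlap_add(frames, hop, out_len):
    """Step 3: add every frame back at its original offset."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[i * hop + n] += value
    return out[:out_len]

frame_len, hop = 8, 4  # hop = frame_len / 2 gives 50% overlap
signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(40)]
window = hann(frame_len)

frames = segment(signal, frame_len, hop)
windowed = [[v * w for v, w in zip(frame, window)] for frame in frames]  # step 2
rebuilt = overlap_add(windowed, hop, len(signal))
```

Apart from the first half frame, which is covered by only one window and therefore stays tapered, rebuilt matches signal sample for sample. In a real processing chain each windowed frame would be modified between steps 2 and 3, and the overlapping tapers are what hide the seams.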

A single FFT trades frequency resolution (more samples) against time resolution (fewer samples); it cannot have both at once.
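To put numbers on the trade-off: for N samples at sample rate fs, the bin spacing is fs/N and the frame lasts N/fs, so their product is always 1. A small sketch, where the function name fft_resolution and the 44100 Hz sample rate are just assumptions for the example:

```python
def fft_resolution(n_samples, sample_rate_hz):
    """Return (bin spacing in Hz, frame duration in seconds) for one FFT frame."""
    bin_spacing_hz = sample_rate_hz / n_samples
    frame_duration_s = n_samples / sample_rate_hz
    return bin_spacing_hz, frame_duration_s

for n in (256, 1024, 4096):
    df, dt = fft_resolution(n, 44100)
    print(f"N={n:5d}  bin spacing={df:8.2f} Hz  frame duration={dt * 1000:7.2f} ms")
```

Quadrupling N makes the bins four times finer but also makes each frame four times longer, so events in time are located four times less precisely.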


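The windowed signal is the product of the original signal and the window function, and the windowed DFT is the transform of that product. Here x[n] is a signal of infinite extent, n is the sample index, and g[n] is a finite window:

$$x_w[n] = x[n]\, g[n]$$

In the frequency domain this product becomes a convolution of the signal spectrum with the window spectrum:

$$X_w(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\theta})\, G(e^{j(\omega - \theta)})\, d\theta$$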
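The same relationship can be checked numerically for finite-length sequences, where it reads: the DFT of x[n]·g[n] equals 1/N times the circular convolution of the two DFTs. The sketch below uses a naive O(N²) DFT to stay dependency-free; the names dft and circular_convolve, the length N = 16, and the test cosine are my own choices for the illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT, good enough for a small demonstration."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_len) for n in range(n_len))
            for k in range(n_len)]

def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n_len = len(a)
    return [sum(a[m] * b[(k - m) % n_len] for m in range(n_len)) for k in range(n_len)]

N = 16
x = [math.cos(2 * math.pi * 3 * n / N) for n in range(N)]                  # signal
g = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]  # Hamming window

lhs = dft([xn * gn for xn, gn in zip(x, g)])               # DFT of the windowed signal
rhs = [v / N for v in circular_convolve(dft(x), dft(g))]   # (1/N) * (X circularly convolved with G)
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

max_error comes out at the level of floating-point rounding, confirming that multiplying by a window in time is the same as smearing the spectrum with the window's own spectrum.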

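The impact of windowing: the main lobe of the window spectrum smears each spectral line across neighboring bins, and its side lobes leak energy into more distant bins.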


The short-time Fourier transform (STFT) is a Fourier transform over a narrow sliding window, repeated sequentially over a long vector of samples. A spectrogram is essentially a set of STFT frames plotted as frequency against time, with the intensity (z-axis) shown as a grey scale or color.

The horizontal axis of an FFT plot traditionally represents frequency, and the vertical axis represents amplitude. A spectrogram plots time along the horizontal axis, frequency on the vertical axis, and amplitude on the z-axis (as color or grey scale).
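A rough sketch of how the spectrogram data is built: slide a window along the signal, take the magnitude of a DFT of each frame, and stack the results so that one index is time and the other is frequency. The function name stft_magnitudes, the Hamming window, the frame length of 32, and the two-tone test signal are all assumptions for this example, and the DFT is the naive O(N²) form rather than an FFT.

```python
import cmath
import math

def stft_magnitudes(signal, frame_len, hop):
    """Return spec[t][k]: magnitude of bin k (0..frame_len//2) in frame t."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]  # Hamming window
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        column = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            column.append(abs(acc))
        columns.append(column)
    return columns

frame_len, hop = 32, 16  # 50% overlap
# A tone centered on bin 4 followed by a tone centered on bin 10.
signal = ([math.sin(2 * math.pi * 4 * n / frame_len) for n in range(128)] +
          [math.sin(2 * math.pi * 10 * n / frame_len) for n in range(128)])

spec = stft_magnitudes(signal, frame_len, hop)
peak_bins = [column.index(max(column)) for column in spec]
```

Each entry of spec is one vertical slice of the spectrogram. peak_bins shows the strongest bin moving from 4 to 10 as the window slides across the tone change, which is exactly the time information a single long FFT of the whole signal would not show.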


Filters are designed based on specifications given by:

  • spectral magnitude emphasis
  • delay and phase properties through the group delay and phase spectrum
  • implementation and computational structures

MATLAB functions for filter design:

  • (IIR) besself, butter, cheby1, cheby2, ellip, prony, stmcb
  • (FIR) fir1, fir2, kaiserord, firls, firpm, firpmord, fircls, fircls1, cremez
  • (Implementation) filter, filtfilt, dfilt
  • (Analysis) freqz, fdatool, sptool
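For readers without MATLAB, here is a plain-Python sketch of the idea behind a window-method FIR lowpass design, a direct-form filter, and a frequency-response check. It is in the spirit of what fir1, filter, and freqz do, but it is not their implementation; the function names lowpass_fir, fir_filter, and magnitude_response, the 41 taps, and the 0.25 cutoff are assumptions for the example.

```python
import cmath
import math

def lowpass_fir(num_taps, cutoff):
    """Windowed-sinc lowpass; cutoff is normalized so that 1.0 is Nyquist."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff if t == 0 else math.sin(math.pi * cutoff * t) / (math.pi * t)
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hamming)
    gain = sum(taps)
    return [h / gain for h in taps]  # normalize to unity gain at DC

def fir_filter(taps, signal):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
            for n in range(len(signal))]

def magnitude_response(taps, freq):
    """|H| at a normalized frequency (1.0 = Nyquist)."""
    return abs(sum(h * cmath.exp(-1j * math.pi * freq * k) for k, h in enumerate(taps)))

taps = lowpass_fir(41, 0.25)
passband_gain = magnitude_response(taps, 0.05)  # well inside the passband
stopband_gain = magnitude_response(taps, 0.60)  # well inside the stopband
step_output = fir_filter(taps, [1.0] * 60)      # response to a constant input
```

The taps are symmetric, so the filter has linear phase and a constant group delay of (num_taps - 1) / 2 = 20 samples, which ties the magnitude specification to the delay and phase specification listed above.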
