Friday, September 18, 2009

Audio Segmentation: overlap, windowing, filtering

Overlap: 25% and 50% overlaps are common. A typical process is:

  1. segment into 50% overlapped frames
  2. window each frame (e.g., with a Hamming or Blackman window)
  3. overlap-add the windowed frames to reconstruct the signal

Processing each frame independently leaves a discontinuity at the boundary between neighboring frames, which is heard as a clicking sound. Use ‘segment/window/overlap’ to smooth out these transitions.
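
The three steps above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: it uses the standard library alone, the helper names (hann_periodic, segment, overlap_add) are hypothetical, and it swaps in a periodic Hann window because its 50%-overlapped copies sum exactly to one, so unmodified frames reconstruct the interior of the signal. A periodic Hamming window sums to a constant 1.08 instead and would need rescaling.

```python
import math

def hann_periodic(n_len):
    """Periodic Hann window; copies shifted by n_len/2 sum to 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def segment(signal, frame_len, hop):
    """Step 1: split the signal into overlapping frames."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]

def overlap_add(frames, hop):
    """Step 3: sum the frames back at their original offsets."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        for n, value in enumerate(frame):
            out[i * hop + n] += value
    return out

frame_len, hop = 8, 4  # hop = frame_len / 2 gives 50% overlap
signal = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(40)]
window = hann_periodic(frame_len)

frames = segment(signal, frame_len, hop)
# Step 2: window each frame (any per-frame processing would go here too).
windowed = [[x * w for x, w in zip(frame, window)] for frame in frames]
rebuilt = overlap_add(windowed, hop)
```

Away from the first and last half-frame, every sample is covered by two windowed frames, so rebuilt matches signal there. The edges are covered by a single frame and stay tapered.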

A single FFT trades frequency resolution against time resolution: more samples per frame give finer frequency resolution but coarser time resolution, and fewer samples do the opposite. It cannot achieve both simultaneously.
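
The trade-off can be put in numbers: for N samples at sample rate fs, the bin spacing is fs/N and the frame lasts N/fs, so their product is always 1. The short Python sketch below tabulates this for a few frame sizes. The 44100 Hz rate and the function name fft_resolution are assumptions chosen for illustration.

```python
def fft_resolution(sample_rate_hz, n_samples):
    """Return (bin spacing in Hz, frame duration in seconds) for one FFT frame."""
    bin_spacing_hz = sample_rate_hz / n_samples
    frame_duration_s = n_samples / sample_rate_hz
    return bin_spacing_hz, frame_duration_s

# Assumed example sample rate; any rate shows the same inverse relationship.
for n in (256, 1024, 4096):
    df, dt = fft_resolution(44100, n)
    print(f"N={n:5d}  bin spacing={df:8.2f} Hz  frame length={dt * 1000:7.2f} ms")
```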

Windowing:

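Windowing multiplies the original signal by the window function in the time domain, where x[n] is a signal of infinite extent, n is the sample index, and g[n] is a finite-length window:

\[ x_w[n] = x[n] \, g[n] \]

In the frequency domain, this product becomes a (periodic) convolution of the signal spectrum with the window spectrum:

\[ X_w(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\theta}) \, G(e^{j(\omega - \theta)}) \, d\theta \]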
Here is the impact of windowing:
[Figure: audio_windowing]
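
The effect can also be checked numerically. The Python sketch below is an illustration under assumptions, not the code behind the figure: it takes a 64-sample frame holding 10.5 cycles of a cosine (so the frame edges do not line up), computes a naive DFT with and without a periodic Hamming window, and compares how much energy leaks into a bin far from the tone. The names dft_magnitudes, hamming_periodic, and relative_leakage are hypothetical.

```python
import cmath
import math

def dft_magnitudes(x):
    """Naive O(N^2) DFT magnitudes; adequate for short illustrative frames."""
    n_len = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_len)
                    for n in range(n_len)))
            for k in range(n_len)]

def hamming_periodic(n_len):
    """Periodic (DFT-even) form of the Hamming window."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def relative_leakage(mags, k):
    """Magnitude at bin k relative to the largest bin."""
    return mags[k] / max(mags)

N = 64
cycles = 10.5  # non-integer cycle count, so the tone falls between bins
x = [math.cos(2.0 * math.pi * cycles * n / N) for n in range(N)]
g = hamming_periodic(N)

mag_rect = dft_magnitudes(x)                                   # no window
mag_hamm = dft_magnitudes([xn * gn for xn, gn in zip(x, g)])   # x[n] * g[n]

leak_rect = relative_leakage(mag_rect, 25)
leak_hamm = relative_leakage(mag_hamm, 25)
print(f"relative leakage at bin 25: rectangular={leak_rect:.4f}, Hamming={leak_hamm:.4f}")
```

The windowed frame shows much lower leakage far from the tone, at the cost of a wider main lobe around it.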

Spectrogram:

The short-time Fourier transform (STFT) applies a Fourier transform to a short sliding window, repeated sequentially over a long vector of samples. A spectrogram is essentially a set of STFT frames plotted as frequency against time, with the intensity (z-axis) shown as a grey scale or color value.

In a plain FFT plot, the horizontal axis traditionally represents frequency and the vertical axis displays amplitude. A spectrogram instead plots time along the horizontal axis, frequency along the vertical axis, and amplitude on the z-axis (as color or grey scale).
[Figure: spectrogram]
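
A spectrogram matrix can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than the method used for the figure: it uses a Hann window, a naive DFT, 50% overlap, and a hypothetical function name stft_magnitude. The test signal switches from a low tone to a high tone halfway through, so the strongest bin per frame should move accordingly.

```python
import cmath
import math

def hann(n_len):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def stft_magnitude(signal, frame_len, hop):
    """Return one row per time frame; each row holds magnitudes for bins 0..frame_len//2."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    spec = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spec.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ])
    return spec

frame_len, hop = 32, 16  # 50% overlap
low = [math.sin(2.0 * math.pi * 4 * n / frame_len) for n in range(128)]    # tone at bin 4
high = [math.sin(2.0 * math.pi * 12 * n / frame_len) for n in range(128)]  # tone at bin 12
signal = low + high

spec = stft_magnitude(signal, frame_len, hop)
# Strongest frequency bin in each time frame.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
print(peak_bins)
```

Plotting spec with time on the horizontal axis, bin index on the vertical axis, and magnitude as color gives the spectrogram view described above.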

Filters:

Filters are designed based on specifications given by:

  • spectral magnitude emphasis
  • delay and phase properties through the group delay and phase spectrum
  • implementation and computational structures

MATLAB functions for filter design:

  • (IIR) besself, butter, cheby1, cheby2, ellip, prony, stmcb
  • (FIR) fir1, fir2, kaiserord, firls, firpm, firpmord, fircls, fircls1, cremez
  • (Implementation) filter, filtfilt, dfilt
  • (Analysis) freqz, FDAtool, SPtool
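
To make the design/implementation/analysis split concrete, here is a minimal Python sketch of a Hamming-windowed sinc low-pass filter, the same general approach that fir1 takes by default. It is not a reimplementation of any MATLAB function: lowpass_fir, fir_filter, and gain_at are hypothetical names that only loosely mirror the roles of fir1, filter, and freqz, and the tap count and cutoff are arbitrary example values.

```python
import cmath
import math

def lowpass_fir(num_taps, cutoff):
    """Design: Hamming-windowed sinc low-pass; cutoff is normalized so 1.0 = Nyquist."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff if t == 0 else math.sin(math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [b / dc_gain for b in taps]  # normalize to unity gain at DC

def fir_filter(taps, x):
    """Implementation: y[n] = sum_k taps[k] * x[n - k], with zero initial state."""
    return [sum(b * x[n - k] for k, b in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]

def gain_at(taps, freq):
    """Analysis: magnitude response at a normalized frequency (1.0 = Nyquist)."""
    return abs(sum(b * cmath.exp(-1j * math.pi * freq * k) for k, b in enumerate(taps)))

taps = lowpass_fir(num_taps=31, cutoff=0.25)
passband_gain = gain_at(taps, 0.05)
stopband_gain = gain_at(taps, 0.6)
step_response = fir_filter(taps, [1.0] * 50)
print(f"passband gain={passband_gain:.4f}, stopband gain={stopband_gain:.5f}")
```

Because the taps are symmetric, this filter has linear phase and a constant group delay of (num_taps - 1) / 2 = 15 samples, which ties back to the delay and phase specifications listed above.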
