
Classifier Definition

David Wood edited this page Apr 11, 2022 · 3 revisions

Classifier/Models

A model/classifier is made up of the following:

  1. Scaling of PCM data segment (1..N seconds) into doubles in range [-1,1]
  2. Optional data augmentation to produce acoustic variants (pitch, noise, etc) during training only
  3. Feature/Spectrogram Extractor, consisting of the following:
    1. A sub-window size and hop size are defined and used to split the data segment into sub-windows.
    2. A feature extractor (MFCC, MFFB, LogMel, FFT, etc.) is applied to each sub-window. The collection of features computed across all sub-windows produces a spectrogram (aka featuregram). Generally, the x-axis is time and the y-axis is the feature's dimension (e.g., frequency).
    3. An optional Feature Processor (normalization, velocity, acceleration, etc.) can further manipulate the full featuregram.
  4. An algorithm learns from the labeled featuregrams produced by this pipeline. Classification then determines label(s) for unlabeled featuregrams.
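
The pipeline steps above can be sketched in plain Java. This is a minimal illustration and not the project's implementation: the class and method names (`FeaturegramSketch`, `scalePcm16`, `subWindows`, `dftMagnitudes`, `velocity`) are hypothetical, 16-bit signed PCM is assumed, and a naive DFT stands in for a real FFT/MFCC extractor.

```java
import java.util.Arrays;

// Illustrative sketch only; all names here are hypothetical, not the project's API.
final class FeaturegramSketch {

    // Step 1: scale signed 16-bit PCM samples (an assumption) into doubles within [-1, 1].
    static double[] scalePcm16(short[] pcm) {
        double[] out = new double[pcm.length];
        for (int i = 0; i < pcm.length; i++) {
            out[i] = pcm[i] / 32768.0;
        }
        return out;
    }

    // Step 3.1: split the segment into sub-windows of windowSize samples, advancing by hop samples.
    static double[][] subWindows(double[] data, int windowSize, int hop) {
        if (windowSize <= 0 || hop <= 0) {
            throw new IllegalArgumentException("windowSize and hop must be positive");
        }
        if (data.length < windowSize) {
            return new double[0][];
        }
        int count = 1 + (data.length - windowSize) / hop;
        double[][] windows = new double[count][];
        for (int w = 0; w < count; w++) {
            windows[w] = Arrays.copyOfRange(data, w * hop, w * hop + windowSize);
        }
        return windows;
    }

    // Step 3.2: a naive DFT magnitude over the first n/2 bins, standing in for a real extractor.
    static double[] dftMagnitudes(double[] window) {
        int n = window.length;
        double[] mags = new double[n / 2];
        for (int k = 0; k < mags.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += window[t] * Math.cos(angle);
                im -= window[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    // Features across all sub-windows, indexed as featuregram[time][feature].
    static double[][] featuregram(double[] data, int windowSize, int hop) {
        double[][] windows = subWindows(data, windowSize, hop);
        double[][] gram = new double[windows.length][];
        for (int i = 0; i < windows.length; i++) {
            gram[i] = dftMagnitudes(windows[i]);
        }
        return gram;
    }

    // Step 3.3: velocity as the first difference across time; the first row is left as zeros.
    static double[][] velocity(double[][] gram) {
        double[][] out = new double[gram.length][];
        for (int t = 0; t < gram.length; t++) {
            out[t] = new double[gram[t].length];
            if (t == 0) {
                continue;
            }
            for (int f = 0; f < gram[t].length; f++) {
                out[t][f] = gram[t][f] - gram[t - 1][f];
            }
        }
        return out;
    }
}
```

Applying `velocity` twice would give an acceleration-style featuregram under the same assumptions. Data augmentation (step 2) is omitted here because it applies only during training.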

If you are defining your own classifiers (e.g., with JavaScript; see details), it is helpful to be familiar with the Java interfaces/classes associated with these components, as follows:

  1. IDataWindow and ILabeledDataWindow - hold the time series segment (e.g., sound, vibration) and, in the case of the latter, its labels.
  2. IDataWindowTransform - applies transformations to an IDataWindow to produce
  3. IFeatureGramExtractor (FeatureGramExtractor)
    1. IFeatureExtractor - various types exist, including FFT, MFCC, MFFB, LogMel and more
    2. IFeatureProcessor - one primary implementation allows computation of derivatives across time (velocity and/or acceleration) over the spectrogram.
  4. IClassifier - defines an algorithm that can be trained and provide classification.
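
To make the train/classify role of the last component concrete, here is a minimal nearest-centroid sketch over featuregrams. It does not implement or reproduce the real IClassifier interface, whose methods are not shown on this page. The class `NearestCentroidSketch`, its `train` and `classify` methods, and the choice of algorithm are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a trainable classifier; not the project's IClassifier.
final class NearestCentroidSketch {
    private final Map<String, double[]> centroids = new HashMap<>();

    // Collapse a featuregram[time][feature] into one vector by averaging over time.
    static double[] meanOverTime(double[][] gram) {
        double[] mean = new double[gram[0].length];
        for (double[] row : gram) {
            for (int f = 0; f < mean.length; f++) {
                mean[f] += row[f] / gram.length;
            }
        }
        return mean;
    }

    // Learn one centroid per label from labeled featuregrams (labels.get(i) labels grams.get(i)).
    void train(List<String> labels, List<double[][]> grams) {
        Map<String, double[]> sums = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < grams.size(); i++) {
            double[] v = meanOverTime(grams.get(i));
            double[] sum = sums.computeIfAbsent(labels.get(i), k -> new double[v.length]);
            for (int f = 0; f < v.length; f++) {
                sum[f] += v[f];
            }
            counts.merge(labels.get(i), 1, Integer::sum);
        }
        centroids.clear();
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            double[] c = e.getValue().clone();
            int n = counts.get(e.getKey());
            for (int f = 0; f < c.length; f++) {
                c[f] /= n;
            }
            centroids.put(e.getKey(), c);
        }
    }

    // Return the label whose centroid is closest (squared Euclidean distance) to the featuregram.
    String classify(double[][] gram) {
        if (centroids.isEmpty()) {
            throw new IllegalStateException("train must be called before classify");
        }
        double[] v = meanOverTime(gram);
        String best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> e : centroids.entrySet()) {
            double d = 0;
            for (int f = 0; f < v.length; f++) {
                double diff = v[f] - e.getValue()[f];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Averaging over time is just one simple way to handle featuregrams of different lengths. A real implementation would likely preserve more of the temporal structure.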