
This project is trying to position the sound source with ML


ooyang0325/soundar


Soundar - Map the World with Sound

We use an MLP and a polynomial regression model to predict the DOA (direction of arrival) of binaural audio tracks, and a Lasso regression model to predict the distance to the audio source.

Features

  1. Binaural localization mode, replacing traditional multi-microphone array localization
  2. Easier to apply to wearable devices (headphones, hearing aids, etc.)
  3. High accuracy in angle prediction

Application

  1. Assist hearing-impaired individuals in noticing potential sudden threats
  2. Aid in detecting mechanical failures in automated production lines
  3. Auditory systems for bionic robots/animal robots
  4. Enhance surveillance capabilities in security systems

Dataset

The dataset consists of six types of sound sources: sine waves at 130.81 Hz, 261.63 Hz, and 1046.5 Hz, ambulance noise, a gunshot, and a fart.

  • Each dataset format:
    • $R=1\sim 30$, with a tolerance of $0.5$
    • $degree=0,5,10,15,\cdots,175,180$

ML Models

DOA Prediction

  • Features: ITD, ILD
  • Output: DOA
  • Selected Models: MLP, Polynomial Regression, GMM (for validation)
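As an illustration of the polynomial regression option, here is a minimal sketch that fits a cubic mapping from a single normalized ITD value to an angle. It is not the project's code: the helper names `fit_polynomial` and `predict`, the single-feature simplification (the project uses both ITD and ILD), and the synthetic sample points are all assumptions made for this example.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    size = degree + 1
    # Normal equations A c = b with A[r][c] = sum(x^(r+c)), b[r] = sum(y * x^r).
    a = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, size))
        coeffs[row] = (b[row] - tail) / a[row][row]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Synthetic toy points (NOT measured data): angle = 90 + 60*x + 20*x^3,
# where x stands in for an ITD value scaled to [-1, 1].
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [90.0 + 60.0 * x + 20.0 * x ** 3 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=3)
angle = predict(coeffs, 0.5)
```

Because the toy targets are an exact cubic, the fit recovers the generating coefficients up to floating-point error. Real ITD/ILD measurements would be noisy and would need a held-out set to choose the degree.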

Distance Prediction

  • Features: DOA, ITD, ILD, RMS Energy
  • Output: R (distance)
  • Selected Model: Lasso Regression Model
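To show what a Lasso fit does with several candidate features, here is a minimal coordinate-descent sketch. It is an assumption-laden illustration, not the project's implementation: `fit_lasso` and `soft_threshold` are hypothetical helpers, the two-column toy data is synthetic, and the objective used is (1/(2n))·‖y − Xw − b‖² + α‖w‖₁, which is the same form scikit-learn's `Lasso` minimizes.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(rows, targets, alpha, n_iter=200):
    """Return (weights, intercept) via cyclic coordinate descent."""
    n, d = len(rows), len(rows[0])
    # Center features and targets so the intercept can be recovered afterwards.
    x_mean = [sum(r[j] for r in rows) / n for j in range(d)]
    y_mean = sum(targets) / n
    xc = [[r[j] - x_mean[j] for j in range(d)] for r in rows]
    yc = [t - y_mean for t in targets]
    weights = [0.0] * d
    residual = yc[:]  # residual = yc - xc @ weights
    for _ in range(n_iter):
        for j in range(d):
            col = [row[j] for row in xc]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(v * (res + v * weights[j]) for v, res in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - weights[j]
            residual = [res - v * delta for v, res in zip(col, residual)]
            weights[j] = new_w
    intercept = y_mean - sum(w * m for w, m in zip(weights, x_mean))
    return weights, intercept


# Synthetic toy data (NOT the project's dataset): the target depends only on
# the first column, so the L1 penalty should zero out the second weight.
rows = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
targets = [3.0 * r[0] + 1.0 for r in rows]
weights, intercept = fit_lasso(rows, targets, alpha=0.1)
```

In this toy case the irrelevant second weight is driven to exactly zero and the first is shrunk slightly below its true value of 3, which is the feature-selection behavior that makes Lasso a reasonable choice when some of DOA, ITD, ILD, and RMS energy may contribute little to distance.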


ITD and ILD data distribution, different colors represent different angles
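For readers unfamiliar with the two features, the sketch below shows one common way to estimate them from a pair of channels: ITD as the lag that maximizes the cross-correlation, and ILD as the RMS level ratio in decibels. The function name `estimate_itd_ild`, the sign conventions, and the synthetic test burst are assumptions for illustration. The repository may compute these features differently.

```python
import math


def estimate_itd_ild(left, right, sample_rate, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length channels."""
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    # Brute-force cross-correlation over lags in [-max_lag, max_lag].
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    itd = best_lag / sample_rate  # positive: right channel lags the left

    # ILD as the RMS level ratio in decibels (left relative to right).
    rms_left = math.sqrt(sum(x * x for x in left) / n)
    rms_right = math.sqrt(sum(x * x for x in right) / n)
    ild = 20.0 * math.log10(rms_left / rms_right)
    return itd, ild


# Synthetic example (NOT project data): a decaying 1 kHz burst that reaches
# the right channel 12 samples later and at half the amplitude.
sample_rate = 48000
burst = [math.exp(-k / 40.0) * math.sin(2.0 * math.pi * 1000.0 * k / sample_rate)
         for k in range(200)]
left = [0.0] * 512
right = [0.0] * 512
for k, value in enumerate(burst):
    left[100 + k] = value
    right[112 + k] = 0.5 * value

itd, ild = estimate_itd_ild(left, right, sample_rate, max_lag=40)
```

Here the estimated ITD corresponds to the 12-sample delay (0.25 ms at 48 kHz) and the ILD is about 6 dB, matching the halved amplitude. The double loop is quadratic in the window length, so a real pipeline would more likely use an FFT-based correlation.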

Results


MLP prediction results for DOA



Polynomial Regression prediction results for DOA



Cartesian coordinate prediction results
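A predicted DOA and distance can be combined into a Cartesian position with a polar-to-Cartesian conversion. The sketch below assumes the angle is measured in degrees from the positive x-axis, so that 0 to 180 degrees spans the half-plane y ≥ 0. The README does not state its axis convention, and `polar_to_cartesian` is a hypothetical helper.

```python
import math


def polar_to_cartesian(distance, doa_degrees):
    """Convert (R, DOA in degrees) to (x, y), angle measured from the +x axis."""
    theta = math.radians(doa_degrees)
    return distance * math.cos(theta), distance * math.sin(theta)


# A source at R = 2 and DOA = 90 degrees lies on the positive y-axis
# under the assumed convention.
x, y = polar_to_cartesian(2.0, 90.0)
```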
