Arousal Valence (msd-musicnn-1.pb; deam-msd-musicnn-2.pb) #1455

Aradhya-Tripathi · 2025-01-10T17:22:58Z

We are trying to get mood predictions using Essentia models, but we are unsure how to use the outputs. We get a list of value pairs, but we don't know how to convert or interpret it (we assume they are arousal and valence pairs).

palonso · 2025-01-22T15:06:13Z

Hi @Aradhya-Tripathi,
All our arousal/valence models operate on small chunks of 1 to 3 seconds (depending on your chosen embedding model).
The expected output shape is (T, D), where T is the number of chunks (which depends on the audio duration) and D holds the valence and arousal values. To predict overall A/V values for a track, you can compute the average along the time axis (T).
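
As a rough illustration of the averaging step, here is a minimal sketch in plain Python. The function name `average_predictions` and the example numbers are hypothetical, and the column order (valence first, then arousal) is an assumption taken from the wording above, so it is worth checking against the model's metadata before relying on it.

```python
from statistics import fmean


def average_predictions(predictions):
    """Average a (T, D) sequence of per-chunk predictions along the time axis.

    `predictions` is any iterable of T rows with D values each
    (for example, the list of pairs returned for one audio file).
    Returns a list of D track-level averages.
    """
    rows = [list(row) for row in predictions]
    if not rows:
        raise ValueError("no predictions to average")
    # zip(*rows) regroups the T rows into D columns; each column is averaged.
    return [fmean(column) for column in zip(*rows)]


# Hypothetical values for illustration only, not real model output.
example = [
    [5.0, 4.0],
    [6.0, 5.0],
    [7.0, 6.0],
]

# Assumed column order: valence, then arousal.
valence, arousal = average_predictions(example)
```

If the predictions are already a NumPy array of shape (T, D), `predictions.mean(axis=0)` computes the same per-column average.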

palonso closed this as completed Jan 22, 2025
