Poster outline including results

Glenn Thompson edited this page Nov 4, 2021 · 3 revisions

Introduction - previous work

  • Discuss Langer et al. (2003, 2006)
  • Malfante et al. (2018) classified 109,609 transients at Ubinas volcano with 93.5% accuracy.
  • Falcin et al. (2021) achieved 72% accuracy by applying AAA to 845 events labelled by OVSG for 2013-2018 (542 VT, 217 nested, 86 LP). Adding hybrid and tornillo classes (after Moretti et al., 2020) raised accuracy to 84%; it could have reached 86-93% but was dragged down by the hybrid class (64%). The detection rate was twice that of the STA/LTA method used by OVSG.

Original Dataset

  • Events: 217,290 transients detected on the MVO digital seismic network between 1996/10/21 and 2008/10/16.
  • Labels: Five main labels used by MVO for local volcano-seismic events: rockfall (ROC: 58%), hybrid (HYB: 19%), long-period (LPE: 11%), lp-rockfall (LP-ROC: 5.8%), and volcano-tectonic (VT: 3.1%). Minor classes include regional (REG: 1.4%) and gages (or St. Georges?; GAG: 0.7%).
  • Total traces: 4,072,590
  • Traces per event: Average is 18.7
  • I separately counted 235,804 S-files on hal between 1996/10/21 and 2008/10/16; 191,592 of these fall on or before 2004/02/16 (an end date unlikely to cause problems with SRC/MVO). Cumulative totals: cum_N_sfiles = 235,804, cum_N_DSN_wavfiles = 217,290, cum_N_DSN_traces = 4,072,590, cum_N_ASN_wavfiles = 42,110, cum_N_ASN_traces = 439,569.

Methods 1: From the original dataset to our dataset

  • M1.1: We computed a range of data-quality, statistical, and physical metrics on each waveform trace of each transient.
  • M1.2: Waveform QC: we automatically removed waveforms with dropouts (and list other problems we look for here).
  • M1.3: We manually reviewed/reclassified transient classifications until we had approximately 100 transients of each class (total 522).
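The dropout QC in M1.2 could be sketched as a check for long runs of identical consecutive samples (a minimal sketch; the `min_run` threshold and the flat-run criterion are illustrative assumptions, not the actual MVO QC rules):

```python
import numpy as np

def has_dropout(trace, min_run=20):
    """Flag a waveform containing a dropout: a run of `min_run` or more
    identical consecutive samples (typically zeros or a stuck value).
    `min_run` is a hypothetical threshold chosen for illustration."""
    trace = np.asarray(trace, dtype=float)
    same = np.diff(trace) == 0   # True where consecutive samples are equal
    run = 0
    longest = 0
    for s in same:
        run = run + 1 if s else 0
        longest = max(longest, run)
    # a run of k equal diffs means k+1 identical samples
    return longest + 1 >= min_run

# Example: a clean noisy trace vs. one with a 50-sample dropout
rng = np.random.default_rng(0)
clean = rng.normal(size=1000)
dropped = clean.copy()
dropped[400:450] = 0.0
print(has_dropout(clean), has_dropout(dropped))  # False True
```

In practice other problems (spikes, clipping, calibration pulses) would each get their own test, applied per trace before the metrics of M1.1 are trusted.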

Results 1

  • R1.1: ~21% of these transients were incorrectly classified at MVO.

Methods 2: Supervised ML

  • M2.1: The code [http://github.com/malfante/AAA] transforms each waveform into a set of 102 features: 34 features for each of three domains (time, spectral, cepstral).
  • M2.2: We added as features the 5 frequency metrics computed in M1.1: two band ratios, peak frequency, median frequency, and bandwidth. The resulting 107-element feature vectors were then used for modelling.
  • M2.3: The dataset was randomly divided into training and testing sets 50 times to produce a robust model. We use the Random Forest Classifier algorithm from the scikit-learn library. One model is produced per trace ID.
  • M2.4: We try different trace IDs and compare results. For each waveform, a probability is computed for each class.
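Steps M2.2-M2.4 can be sketched as repeated random splits feeding a Random Forest (a minimal sketch using synthetic data in place of the real 107-element feature vectors; the class count, split fraction, and forest size are illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-ins for the 107-element feature vectors of one trace ID;
# in practice X holds one row per waveform, y the MVO class labels.
rng = np.random.default_rng(42)
n, n_features = 500, 107
X = rng.normal(size=(n, n_features))
y = rng.integers(0, 5, size=n)        # 5 hypothetical event classes
X[y == 0, 0] += 3.0                   # make one class separable

accuracies = []
for seed in range(50):                # 50 random train/test splits (M2.3)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_tr, y_tr)
    accuracies.append(accuracy_score(y_te, clf.predict(X_te)))

print(f"mean accuracy over 50 splits: {np.mean(accuracies):.2f}")
# Per-waveform class probabilities (M2.4), one row per test waveform:
proba = clf.predict_proba(X_te)       # shape (n_test, n_classes)
```

Reporting the spread of accuracies across the 50 splits (rather than a single split) is what makes the quoted accuracy ranges like 76-80% meaningful.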

Results 2

  • Results from Paris are at MONTSERRAT/results.csv. These can be compared with results obtained without the extra features I added here.
  • Separate models for 3 channels yield accuracies of 76-80%.
  • If the LP-ROC class is omitted (following Langer et al., 2006), accuracy rises to 82-85%.
  • If only VT and LP classes are considered, accuracy is 96-99%.

Discussion

Conclusions

Further work

For the poster, we intend to:

  • add a frequency change metric to M1.1 and incorporate this as a pre-computed feature at M2.2
  • generate one model per channel at M2.3
  • compute a probability for each trace ID in an event, and take a weighted average of these to automatically label the event
  • expand our labelled dataset to at least 1000 events
  • reclassify the catalog of 217,290 transients
  • repeat the jackknifing
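The weighted-average event-labelling step planned above could be sketched as follows (the probabilities, weights, and class list are hypothetical; the weighting scheme, e.g. by data quality or SNR, is an assumption to be decided):

```python
import numpy as np

# Hypothetical per-trace class probabilities for one event: each row is
# the classifier output for one trace ID, columns are the event classes.
classes = ["ROC", "HYB", "LPE", "LP-ROC", "VT"]
probs = np.array([
    [0.60, 0.20, 0.10, 0.05, 0.05],   # trace 1
    [0.50, 0.30, 0.10, 0.05, 0.05],   # trace 2
    [0.20, 0.50, 0.20, 0.05, 0.05],   # trace 3
])
# Hypothetical per-trace weights, e.g. a data-quality or SNR score.
weights = np.array([1.0, 0.8, 0.3])

# Weighted average across traces gives one probability vector per event.
event_prob = np.average(probs, axis=0, weights=weights)
label = classes[int(np.argmax(event_prob))]
print(label)  # ROC
```

Because each per-trace row sums to 1, the weighted average is itself a valid probability vector, so the event label comes with a usable confidence.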