10_select_best_events.ipynb

Glenn Thompson edited this page Nov 12, 2021 · 3 revisions

10_select_best_events.ipynb: This was my first attempt to find the best N events of each class and to create an input file for ingestion into Malfante's AAA code-base.

It does the following:

  • Calls read_volcano_def() to get a list of subclasses allowed by Seisan
  • Creates a list of subclasses allowed for ML
  • Reads catalog_all.csv if it exists and subsets the DataFrame to certain columns. Otherwise, it calls build_master_event_catalog() on catalog_all_original.csv. Either way, the catalog DataFrame is dfall.
  • Reads STATION0_MVO.HYP by calling parse_STATION0HYP().
  • Begins main loop:
      • calls _count_by_subclass() and asks the user how many more events (N) of each class to reclassify
      • gets/updates fingerprints for each class by calling get_weighted_fingerprints() and save_fingerprints()
      • selects the best N unchecked events of each subclass, based on quality, by calling _select_best_events()
      • the user manually QCs these best N events (of each class) with qc_best_events()
      • these reclassified events are then merged back into dfall based on index, and dfall is written to catalog_all.csv
      • the user is then asked if they want to loop again.
  • remove_marked_events() is called, which subsets the catalog
  • to_AAA() is then called on this subsetted catalog, and writes out aaa_labelled_events.csv
  • report_checked_events() is called to give information about the events reclassified so far
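
The select-and-merge steps of the main loop can be sketched as follows. This is only a minimal illustration and not the notebook's actual code: the column names checked, quality and new_subclass, and the helper names select_best_unchecked() and merge_back(), are assumptions made for the example. The real logic lives in _select_best_events(), qc_best_events() and _merge_dataframes().

```python
import pandas as pd


def select_best_unchecked(dfall, n, quality_col="quality"):
    """Return the n highest-quality unchecked events of each subclass.

    Assumes a boolean 'checked' column and a numeric quality column
    (both hypothetical names).
    """
    unchecked = dfall[~dfall["checked"]]
    return (
        unchecked.sort_values(quality_col, ascending=False)
        .groupby("subclass", sort=False)
        .head(n)
    )


def merge_back(dfall, reviewed):
    """Copy reviewed rows into a copy of dfall, matching rows on index."""
    merged = dfall.copy()
    for col in reviewed.columns:
        merged.loc[reviewed.index, col] = reviewed[col]
    return merged


# Tiny made-up catalog; the index stands in for the event identifier.
dfall = pd.DataFrame(
    {
        "subclass": ["r", "r", "r", "h", "h"],
        "quality": [3.0, 9.0, 5.0, 2.0, 7.0],
        "checked": [False, False, True, False, False],
        "new_subclass": ["", "", "r", "", ""],
    },
    index=[10, 11, 12, 13, 14],
)

best = select_best_unchecked(dfall, n=1)

# Stand-in for the manual QC step: accept the existing subclass.
reviewed = best.copy()
reviewed["checked"] = True
reviewed["new_subclass"] = reviewed["subclass"]

merged = merge_back(dfall, reviewed)
```

Merging on the index only works if dfall keeps the same index between reading and writing catalog_all.csv, so the index would need to be saved and restored along with the data.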

Functions defined within this code are parse_STATION0HYP(), add_station_locations(), plot_amplitude_locations(), deconvolve_instrument_response(), read_volcano_def(), build_master_event_catalog(), _count_by_subclass(), _select_best_events(), get_weighted_fingerprints(), save_fingerprints(), _merge_dataframes(), _guess_subclass(), qc_best_events(), remove_marked_events(), to_AAA(), and report_checked_events().

The notebook has some additional sections that appear to have been migrated to [dfall_tools.ipynb] and extended there:

  • examine the variable onset of events listed in aaa_labelled_events.csv
  • sort catalog_all.csv by trigger_duration, and plot one event at a time
  • as above, but plot the detection window too
  • view events marked for splitting
  • read fingerprint files
  • subset catalog_all.csv for events checked but not marked to ignore, delete or split. Hopefully this logic made its way into remove_marked_events().
  • a fix for what must have been a bug: sometimes, for example when subclass=r but new_subclass=h, an event still had r=100, and this needed to be changed to r=0.
  • a test that subsets dfall by columns
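
Two of the items in this list, the checked-but-not-marked subset and the r=100 fix, can be sketched as follows. This is a hedged illustration only: the flag columns checked, ignore, delete and split, the per-class columns r and h, and both function names are assumptions for the example, not names confirmed by the notebook.

```python
import pandas as pd

MARK_COLS = ["ignore", "delete", "split"]  # hypothetical flag columns


def keep_checked_unmarked(df):
    """Keep events that were checked and carry none of the mark flags."""
    marked = df[MARK_COLS].any(axis=1)
    return df[df["checked"] & ~marked]


def reset_stale_class_values(df, class_cols):
    """Zero the old class column where an event was moved to another class.

    For example, subclass=r with new_subclass=h sets r to 0.
    Rows with no new_subclass are left untouched.
    """
    out = df.copy()
    for cls in class_cols:
        stale = (
            (out["subclass"] == cls)
            & out["new_subclass"].notna()
            & (out["new_subclass"] != cls)
        )
        out.loc[stale, cls] = 0
    return out


# Made-up rows: row 0 was reclassified from r to h but still has r=100.
df = pd.DataFrame(
    {
        "subclass": ["r", "r", "h"],
        "new_subclass": ["h", "r", None],
        "r": [100, 100, 0],
        "h": [0, 0, 100],
        "checked": [True, True, False],
        "ignore": [False, False, False],
        "delete": [False, True, False],
        "split": [False, False, False],
    }
)

fixed = reset_stale_class_values(df, ["r", "h"])
subset = keep_checked_unmarked(fixed)
```

The sketch only resets the old class column, as described above. Whether the new class column should also be set is not stated here, so it is left alone.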