Add prepare_data method to generate and save protocol data on disk #1500

Closed

Commits on Oct 13, 2023

  1. add prepare_data method in Task class

    The goal of this method is to generate the data needed by the task
    and save it on disk for future use, for example by the `setup` method.
    The objective is to avoid systematically recreating the data in each process
    at the beginning of training.
    clement-pages committed Oct 13, 2023
    933a660
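The `prepare_data` / `setup` split described in commit 933a660 can be sketched as follows. This is a minimal illustration, not the pyannote.audio implementation: `ToyTask`, `_build_data`, `build_count`, and the cache file name are hypothetical, and pickle is used only because later commits in this list mention it as the initial format.

```python
import pickle
import tempfile
from pathlib import Path


class ToyTask:
    """Hypothetical stand-in for a task; not the pyannote.audio `Task` class."""

    def __init__(self, cache_path: Path):
        self.cache_path = cache_path
        self.prepared_data = None
        self.build_count = 0  # counts how often the expensive step runs

    def _build_data(self) -> dict:
        # Placeholder for the expensive generation of protocol data.
        self.build_count += 1
        return {"durations": [1.5, 2.0, 0.5]}

    def prepare_data(self) -> None:
        # Meant to run once: generate the data and save it on disk.
        if self.cache_path.exists():
            return
        with open(self.cache_path, "wb") as f:
            pickle.dump(self._build_data(), f)

    def setup(self) -> None:
        # Meant to run in every process: load what prepare_data saved.
        with open(self.cache_path, "rb") as f:
            self.prepared_data = pickle.load(f)


cache = Path(tempfile.mkdtemp()) / "task_cache.pkl"

main = ToyTask(cache)
main.prepare_data()
main.setup()

worker = ToyTask(cache)  # simulates a second process sharing the cache
worker.prepare_data()    # cache already exists, so nothing is rebuilt
worker.setup()
```

In this sketch the second instance never runs the expensive step; it only reads the file written by the first one.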

Commits on Oct 26, 2023

  1. 5257145

Commits on Nov 2, 2023

  1. modify organisation of pyannote segmentation tasks

    All segmentation tasks in `pyannote` now inherit from `SegmentationTask`
    (previously `SegmentationTaskMixin`), which itself inherits from `Task`. This
    commit also adds a `prepared_data` attribute to the `Task` class: a dict
    containing all the data prepared by the `prepare_data` method.
    clement-pages committed Nov 2, 2023
    8829574
  2. Merge branch 'feat/data_preparation' of github.com:clement-pages/pyannote-audio into feat/data_preparation

    clement-pages committed Nov 2, 2023
    fa63c8a
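The class organisation described in commit 8829574 might look roughly like the sketch below. Only the class names and the `prepared_data` / `prepare_data` names come from the commit message; the method bodies, the dict key, and its values are assumptions made for illustration.

```python
class Task:
    """Simplified base class (assumed shape, not the real pyannote.audio code)."""

    def __init__(self):
        # dict holding everything produced by prepare_data
        self.prepared_data = {}

    def prepare_data(self):
        raise NotImplementedError


class SegmentationTask(Task):
    """Replaces the former mixin: shared logic now lives in a Task subclass."""

    def prepare_data(self):
        # Hypothetical key and placeholder values, for illustration only.
        self.prepared_data["annotated_duration"] = [10.0, 20.0]


class MultiLabelSegmentation(SegmentationTask):
    """A concrete segmentation task inherits the shared preparation logic."""


task = MultiLabelSegmentation()
task.prepare_data()
```

With plain inheritance instead of a mixin, every segmentation task is a `Task` and shares one `prepare_data` implementation unless it overrides it.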

Commits on Nov 7, 2023

  1. add two training tests

    One tests the `MultiLabelSegmentation` task, and the other tests the
    `SupervisedRepresentationLearningWithArcFace` task.
    clement-pages committed Nov 7, 2023
    be6f7ec
  2. assign data directly to task in main process, in prepare_data

    This eliminates the need to reload the pickled data in `setup` when running in the main process
    clement-pages committed Nov 7, 2023
    f447bb6
  3. 930deda

Commits on Nov 8, 2023

  1. handle call to Task.prepare_data and Task.setup under different scenarios

    clement-pages committed Nov 8, 2023
    05ccc30
  2. Merge branch 'feat/data_preparation' of github.com:clement-pages/pyannote-audio into feat/data_preparation

    clement-pages committed Nov 8, 2023
    44a01fe

Commits on Nov 9, 2023

  1. add training tests using task caches

    clement-pages committed Nov 9, 2023
    4b8e8a2
  2. update cache_path type and docstrings

    clement-pages committed Nov 9, 2023
    45918bd
  3. fix classes variable used before assignment

    This issue occurred when a list of classes was specified during `MultiLabelSegmentation`
    instantiation.
    clement-pages committed Nov 9, 2023
    980414e

Commits on Nov 14, 2023

  1. a9ea07f

Commits on Nov 20, 2023

  1. 797a8a4

Commits on Nov 21, 2023

  1. improve code readability

    clement-pages committed Nov 21, 2023
    987e702

Commits on Nov 27, 2023

  1. improve: use numpy methods to read/write the task cache instead of pickle (#1)

    * use npz archive instead of pickle to save task data
    
    * improve code readability
    
    * improve(task): update numpy array dtypes
    
    In order to use types whose size better matches the contents of the arrays
    
    * remove `end` entry from `annotated_regions` numpy array
    
    This entry was redundant with the start and duration entries,
    since `end` = `start` + `duration`.
    
    * fix: allow data preparation to be finished when task has no validation
    
    * improve: clear data lists after assignment to `self.prepared_data`
    
    This is to avoid data redundancy in the `prepare_data` method
    
    ---------
    
    Co-authored-by: clement-pages <[email protected]>
    clement-pages and clement-pages authored Nov 27, 2023
    042dc43
  2. 5358986
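The changes listed in commit 042dc43 (an npz archive instead of pickle, compact dtypes, and no stored `end` field) can be illustrated with a small sketch. The field names `file_id`, `start`, and `duration`, the chosen dtypes, the sample values, and the file name are assumptions; only `annotated_regions` and the relation `end` = `start` + `duration` come from the commit message.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical field layout: only `start` and `duration` are stored,
# with dtypes sized to the values they hold.
region_dtype = [("file_id", "i4"), ("start", "f4"), ("duration", "f4")]
annotated_regions = np.array(
    [(0, 0.0, 2.5), (0, 4.0, 1.0), (1, 1.5, 3.0)], dtype=region_dtype
)

cache_path = Path(tempfile.mkdtemp()) / "task_cache.npz"
np.savez_compressed(cache_path, annotated_regions=annotated_regions)

# Structured arrays with plain numeric fields load without allow_pickle=True.
with np.load(cache_path) as archive:
    loaded = archive["annotated_regions"]

# `end` is recomputed on demand instead of being stored in the cache.
end = loaded["start"] + loaded["duration"]
```

Because the archive holds plain numeric arrays, it can be read with NumPy's default `allow_pickle=False`, and dropping `end` removes a field that can always be derived from the other two.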