Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 0.1.2 #47

Merged
merged 40 commits into from
Jun 5, 2024
Merged

Release 0.1.2 #47

merged 40 commits into from
Jun 5, 2024

Conversation

wilke0818
Copy link
Collaborator

  • Created Audio Pydantic model class for storing an Audio representation that can support mono and stereo audio files (Task: Internal architecture (data structures) #42)
    • Audio class offers different formats for creating instances and importantly maintains a consistent internal representation of the Audio as a torch.Tensor of shape (num_channels, num_samples)
    • Audios maintain, but currently do not utilize metadata information
  • Created AudioDataset class to help manage large number of Audios and importantly to offer functionality for running Audio tasks and pipelines with Pydra in an efficient manner
  • Rewrote data_augmentations and preprocessing to show use cases and code simplification that these abstract data types will allow (Task: data augmentation. #43)

@wilke0818 wilke0818 requested a review from fabiocat93 June 2, 2024 18:19
@codecov-commenter
Copy link

codecov-commenter commented Jun 2, 2024

Codecov Report

Attention: Patch coverage is 84.92176% with 106 lines in your changes missing coverage. Please review.

Project coverage is 65.51%. Comparing base (43f451c) to head (e95f8ec).

Files Patch % Lines
src/senselab/utils/data_structures/dataset.py 66.00% 34 Missing ⚠️
src/senselab/utils/data_structures/audio.py 62.06% 22 Missing ⚠️
src/senselab/utils/data_structures/video.py 60.60% 13 Missing ⚠️
src/senselab/utils/device.py 72.41% 8 Missing ⚠️
...lab/audio/tasks/speech_to_text_evaluation_pydra.py 0.00% 7 Missing ⚠️
src/senselab/audio/tasks/data_augmentation.py 68.42% 6 Missing ⚠️
src/senselab/audio/tasks/preprocessing_pydra.py 0.00% 5 Missing ⚠️
src/senselab/audio/tasks/voice_cloning.py 0.00% 4 Missing ⚠️
src/senselab/audio/tasks/speech_to_text.py 0.00% 2 Missing ⚠️
src/senselab/utils/tasks/cca_cka.py 95.55% 2 Missing ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##            main      #47       +/-   ##
==========================================
+ Coverage   9.06%   65.51%   +56.45%     
==========================================
  Files         17       37       +20     
  Lines        298      957      +659     
==========================================
+ Hits          27      627      +600     
- Misses       271      330       +59     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@fabiocat93
Copy link
Collaborator

thank you, @wilke0818; this is great! I have made some changes to align this branch with main and fixed some formatting.

Some suggestions:

  • please, use DeviceType (as in senselab.utils.functions; actually, we can move this to senselab.utils.device).
    Also, we prefer specifying the device instead of using use_gpu (this is not a binary choice; e.g., we do support MPS, too)
  • I have some concerns about the AudioDataset class (do we really need it?). I feel we need to discuss this further. And if the answer is yes, can you make it a child class of Pydantic.BaseModel as you did with Audio?

@fabiocat93 fabiocat93 changed the title Audio abstract dtype Release 0.1.2 Jun 4, 2024
This was linked to issues Jun 4, 2024
@fabiocat93 fabiocat93 added the enhancement New feature or request label Jun 4, 2024
def augment_hf_dataset(
dataset: Dict[str, Any], augmentation: Compose, audio_column: str = "audio"
) -> Dict[str, Any]:
def augment_audio_dataset(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we say "audios" instead of "audio_dataset" since it's a list of audios for how it is rn?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, not urgent, for the very good documentation that we will write, it's a good idea to include some suggestions for parameters to use for the different types of augmentations

"""
augmentation.output_type = "dict"
new_audios = []
device_type, dtype = _select_device_and_dtype(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you manage the scenario when the developer wants to use a device which is not supported? this is the question you asked me

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont

@fabiocat93
Copy link
Collaborator

As a general comment, we have many for loops, which may be a sign that we still need to work on optimizing the code. will keep this for our discussion later today @wilke0818

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general, if we create some workflows, it makes sense having a _pydra version of the scripts. Otherwise, we can probably remove marking functions as pydra tasks or we can do it in the same .py file. will leave this here for our discussion later today @wilke0818

from senselab.utils.data_structures.video import Video


class SenselabDataset(BaseModel):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we said, we need to merge our Dataset classes @wilke0818

"""
pass

def create_audio_split_for_pydra_task(self, batch_size: int = 1) -> List[List[Audio]]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we also need a method for combining results together?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point. Worth considering how useful we think this Pydra split functionality is

(e.g. participant demographics, video settings, location information)
"""

frames: torch.Tensor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate the shape of this torch.Tensor?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can, I think it should be 4 dimensions?

MPS: str = "mps"


def _select_device_and_dtype(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maybe have 3 params?

  • requested_device (optional)
  • available_devices (we can compute this)
  • compatible_devices (this depends on the models)

in this way, we can more easily handle when the user asks for a device that is not available or compatible

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

Audio(waveform=stereo_audio[0].waveform[1], sampling_rate=stereo_audio[0].sampling_rate),
]
).create_audio_split_for_pydra_task(2)
batch_inverted = augment_audio_dataset(batched_audio[0], apply_augmentation)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you tested this code with and without GPU?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, as I mentioned before, I am not sure exactly how we would go about realistically testing on GPU. We could mock it, but that seems like it doesn't really test

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will need to be adjusted based on the merge we will do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably mostly creation logic and testing. Probably easiest to merge your branch in and just clean it up. Hopefully there'll be minimal conflicts.

Copy link
Collaborator

@fabiocat93 fabiocat93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in person

@fabiocat93 fabiocat93 removed the enhancement New feature or request label Jun 5, 2024
@fabiocat93 fabiocat93 merged commit 96b7ff0 into main Jun 5, 2024
4 checks passed
@fabiocat93 fabiocat93 deleted the audio_abstract_dtype branch June 17, 2024 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Task: Internal architecture (data structures) Task: Utilities Task: Preprocessing
4 participants