Echopype: Upgrade robustness and scalability of ocean sonar data processing #41

Open
leewujung opened this issue Jan 25, 2024 · 16 comments
Labels
GSoC24 project idea Designates a proposed project idea

Comments

@leewujung
Contributor

leewujung commented Jan 25, 2024

Project Description

Echosounders, or high-frequency ocean sonar systems, are the workhorses for studying life in the ocean. They provide continuous observations of fish and zooplankton by transmitting sound and analyzing the echoes bounced off these animals, much like how medical ultrasound images the interior of the human body. In recent years, echosounders have been widely deployed on ships, autonomous vehicles, and moorings, bringing in large volumes of data that allow scientists to study rapidly changing marine ecosystems. This project aims to upgrade the robustness and scalability of the Echopype package, which standardizes data from different echosounder instruments into widely accessible netCDF or Zarr files. The work will focus on making the Echopype test suite more robust by overhauling its Continuous Integration (CI) mechanisms, and on tackling distributed computing bottlenecks in processing irregularly spaced echosounder data across computing agents.

Expected Outcomes

[1] Robust Continuous Integration mechanisms that utilize GitHub release assets for hosting test files
[2] Increased test coverage for foundational data conversion functions
[3] Improved distributed computing performance for major processing functions on large (100s of GB) data sets
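
As a rough illustration of outcome [1], the sketch below shows one way a CI job could fetch a test file from a release asset and verify it before use. It is not Echopype's actual mechanism: the function names `sha256_of` and `fetch_test_file`, and the idea of pinning a SHA-256 checksum per file, are assumptions made for this example, and any URL passed in would have to be a real asset URL supplied by the project.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_test_file(url: str, dest: Path, expected_sha256: str) -> Path:
    """Download `url` to `dest` unless a verified copy is already cached.

    Hypothetical helper: `url` stands in for a release asset URL.
    """
    if dest.exists() and sha256_of(dest) == expected_sha256:
        return dest  # cached copy is intact, so skip the download
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    actual = sha256_of(dest)
    if actual != expected_sha256:
        dest.unlink()  # do not leave a corrupt file in the cache
        raise ValueError(f"checksum mismatch for {dest.name}: {actual}")
    return dest
```

Caching on the checksum keeps repeated CI runs from downloading the same large files again, and a mismatch fails loudly instead of letting tests run against a truncated file.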

Skills required

Python; Libraries: Xarray, Dask, Zarr; Interests in working with oceanographic, acoustic or geospatial data

Mentor(s)

Wu-Jung Lee (@leewujung), Valentina Staneva (@valentina-s)

Expected Project Size

175

What is the difficulty of the project?

Intermediate

@leewujung leewujung added GSoC23 project idea Designates a proposed project idea labels Jan 25, 2024
@mwengren mwengren added GSoC24 and removed GSoC23 labels Feb 2, 2024
@leewujung leewujung changed the title [GSoC Project Proposal]: Upgrade robustness and scalability of Echopype for ocean sonar data processing Echopype: Upgrade robustness and scalability of ocean sonar data processing Feb 9, 2024
@mwengren
Member

@leewujung @valentina-s Could you choose either 175 or 350 hours for the project size and update the description above? We got some feedback from GSoC that 'Project sizes need to be scoped to 90, 175 or 350 hours (you can not have a project that is 200 hours, or some other random number of hours)'.

Not entirely sure if that means an option of two of the official project sizes is allowable or not, but to be safe we should just go with one size. I will update the project ideas list accordingly. Thanks!

@leewujung
Contributor Author

@mwengren : Yes! I think we can change it to 175 hours. I'll submit a pull request for that edit. Thanks

@MohamedNasser8

MohamedNasser8 commented Feb 23, 2024

Hello @leewujung @valentina-s
My name is Mohamed Nasser. I am a final-year Biomedical Engineering student with experience in Python and its libraries.
Throughout my academic journey I have dealt with different kinds of biological data and visualizations.
I find this idea very interesting and want to learn more about it.
I don't have all the required skills, but I am eager to learn, so can you tell me how to start?
I think I will begin by getting familiar with Xarray, Dask, and Zarr and learning more about oceanographic data.

@MohamedNasser8

Should I make a contribution to the main repo, or can @leewujung or @valentina-s guide me on how to start?

@leewujung
Contributor Author

leewujung commented Feb 24, 2024

Hey @MohamedNasser8:
Thanks for reaching out! You can start by checking out our contributor's guide and making sure:

  1. You have the dev environment ready to go, and
  2. You can run the notebooks in the echopype-examples repo, which we use to host example notebooks for the echopype package.

In the next few days we'll start marking relevant issues in the echopype repo, but feel free to start by looking into what's needed, asking questions, or proposing anything about upgrading the testing framework and improving test coverage.

@leewujung
Contributor Author

I created a new label "GSoC24" and will continue to add issues to it. Feel free to take a look: https://github.com/OSOceanAcoustics/echopype/labels/GSoC24

@skald1311

Hello @leewujung and @valentina-s,

I hope you guys are doing well! My name is Duong. I'm a 2nd year student majoring in Computer Science at the University of Alberta in Canada. All of my coursework has been taught in Python, so I would say I have a pretty good grasp of the language. I'd love to have the chance to contribute to this project, as working with different echosounder instruments and oceanographic data sounds intriguing.

I'm currently going over the "Contributing to echopype" section but I'm stuck at the "Running the test" part. After installing docker, I typed the first command in the activated conda env (this one: python .ci_helpers/docker/setup-services.py --deploy) but it gave me a bunch of errors. For example, in step 2, I got " TypeError: kwargs_from_env() got an unexpected keyword argument 'ssl_version' ", as well as connection errors. I was wondering if this was on my end and what I should do next.

Regarding the proposal, once I've finished my draft, would it be okay if I emailed it to you guys to get some feedback on how I can refine it? If yes, please tell me the appropriate email address.

Thanks,

@johnathankahn

Hey @leewujung
What will the selection criteria be? How much weight will contributions carry?

@Kshitijpatil16

Greetings @leewujung,
I am Kshitij Patil, a third-year electronics engineering student at Veermata Jijabai Institute of Technology, Mumbai, India.
I am interested in contributing to this project, as I see it as a very good learning opportunity. I am confident that I will be able to work efficiently, as I find this project quite intriguing.

I was going through the "Contributing to echopype" document and successfully completed the installation, but now I'm stuck at the "Running the tests" segment. I already had docker installed, and I am getting stuck at this specific command:
python .ci_helpers/run-test.py --local --pytest-args="-vv"

This is the error I am receiving:
[image: screenshot of the error output]

I tried to resolve the error, as I thought it was caused by pytest missing from my pip list, but that's not the case. I have tried a lot of things and finally got this conclusion from ChatGPT:
If the issue persists, you might need to check for any custom configurations or specific instructions provided by the Echopype project for running tests locally. Additionally, consider reaching out to the project maintainers or community for further assistance, as they may have insights into project-specific configurations. Remember to consult the project's documentation or README file for any additional requirements or setup instructions related to testing.

Can you please assist me with this error?
I would also like to know what exactly is expected in order to proceed with this project, which resources I should go through, and which issues I can look into.

Thanks and regards,
Kshitij Patil

@leewujung
Contributor Author

@skald1311 : Looks like what you ran into is an issue with the latest docker version: https://stackoverflow.com/questions/77641240/getting-docker-compose-typeerror-kwargs-from-env-got-an-unexpected-keyword-ar Try to see if you could downgrade the version and the problem should be resolved -- @ctuguinay who's on our team found this problem last week!
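
For anyone hitting the same `ssl_version` error, here is a small sketch for checking which version is installed before downgrading. It assumes, as the linked Stack Overflow thread suggests, that the error comes from the `docker` Python package (the SDK) and not the Docker engine, and that the `ssl_version` argument was dropped starting with SDK 7.0.0. The helper names are hypothetical and not part of echopype or the SDK.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string like '6.1.3'."""
    return int(version.split(".")[0])


def ssl_version_kwarg_removed(sdk_version: str) -> bool:
    """Assumption: kwargs_from_env() lost `ssl_version` in docker SDK 7.0.0."""
    return major_version(sdk_version) >= 7


def installed_docker_sdk_version():
    """Return the installed `docker` package version, or None if absent."""
    try:
        return metadata.version("docker")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_docker_sdk_version()
    if version is None:
        print("docker Python package is not installed in this environment")
    elif ssl_version_kwarg_removed(version):
        print(f"docker SDK {version}: likely affected, consider a 6.x release")
    else:
        print(f"docker SDK {version}: should not raise the ssl_version error")
```

Run this inside the activated conda environment, so the version reported is the one the setup script will actually import.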

@leewujung
Contributor Author

@Kshitijpatil16 : Seems like it is an environment setup problem. Maybe see if you can verify that you are in the right environment where pytest is available.
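
One quick way to verify this is to ask the interpreter itself which environment it belongs to and whether it can import pytest. The snippet below is a generic sketch using only the standard library; the helper names `module_available` and `describe_environment` are made up for this example.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be imported by the current interpreter."""
    return importlib.util.find_spec(name) is not None


def describe_environment(modules=("pytest",)) -> dict:
    """Collect the interpreter path, its prefix, and module availability."""
    return {
        "python": sys.executable,  # which Python binary is running
        "prefix": sys.prefix,      # root of the active (conda/virtual) env
        "modules": {name: module_available(name) for name in modules},
    }


if __name__ == "__main__":
    info = describe_environment()
    print("interpreter:", info["python"])
    print("environment:", info["prefix"])
    for name, found in info["modules"].items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If the printed environment path is not the conda environment created for echopype, the test command is being run with a different Python than the one where pytest was installed.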

@leewujung
Contributor Author

leewujung commented Mar 4, 2024

I added an issue template for asking questions and discussing ideas under GSoC24. Feel free to give it a try, as well as asking questions directly under existing issues.

I've also added the "GSoC24" labels to more existing issues that are within the realm of GSoC24. The newly added ones are more related to the scalability component, and the previous ones are more related to testing.

I will put together a GSoC contributor's guide in the next couple days, which should answer some of the above questions, but in general:

  • We would like to see some rough prototypes of what you plan to do as PR(s) to go with your proposal
  • Your proposals and/or PR(s) should demonstrate that you have taken concrete steps toward understanding any previously unfamiliar libraries, and show that you will be able to improve the testing framework/tests and enhance scalability. Examples include (but are not limited to) a benchmarking report or potential solutions you find on the internet related to existing issues
  • We are happy to provide feedback on your proposals. You can find my email on my profile
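
As one possible starting point for the benchmarking report mentioned above, the following is a minimal timing harness. It is only a sketch: `benchmark` is a hypothetical helper, wall-clock timing is a simplification, and a real report on large data sets would probably also track memory use and Dask task behavior.

```python
import statistics
import time


def benchmark(func, *args, repeats=5, **kwargs):
    """Time `func(*args, **kwargs)` several times and summarize the results."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        timings.append(time.perf_counter() - start)
    return {
        "repeats": repeats,
        "best_s": min(timings),                   # least-disturbed run
        "median_s": statistics.median(timings),   # typical run
    }


if __name__ == "__main__":
    # Stand-in workload; replace with the processing function under study.
    result = benchmark(sorted, list(range(100_000, 0, -1)), repeats=3)
    print(result)
```

Reporting both the best and the median time makes it easier to separate the cost of the function from noise caused by other processes on the machine.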

@MohamedNasser8

Hello @leewujung, is there a template for the proposal?

@leewujung
Contributor Author

Alright, here's the GSoC24 contributor's guide: https://github.com/OSOceanAcoustics/echopype/blob/main/gsoc_contrib_guide.md

@MohamedNasser8 : Yes, please use the IOOS template. See the contributor's guide linked above for more info.

@yusuf-khaan

yusuf-khaan commented Mar 31, 2024

Greetings @leewujung

I am a 3rd year student pursuing a B.Tech in Computer Science (specialisation in Data Science and AI) at Shri Ramswaroop Memorial University, Lucknow. Throughout my academic journey I have worked with Core Java, Python, machine learning, and visualization, and I have been active on LeetCode, which has greatly improved my coding and analytical skills. This project has piqued my interest in how to maximize its potential, and I want to contribute to it. I am eager to start working on it, so can you please guide me on what the current weaknesses are and how much of an increase in scalability you expect?

@leewujung
Contributor Author

Hey @yusuf-khaan : Thanks for your interest! Please see the links in my response above to get started on your contribution/proposed work.
