This repository contains all relevant information for performing parallel and distributed data analysis on the Poseidon cluster at the Woods Hole Oceanographic Institution (WHOI). It specifically covers Dask, xarray, and requesting resources on Poseidon and your local machine.
The workshop will be held on October 3rd, 2024 and will be co-taught by Katy Abbott and Anthony Meza.
The workshop schedule can be found here: https://docs.google.com/document/d/1vHl_ZYNNWhaYK6h4Y93NFcOhk0zmLGbmzrVpsnM5Zi8/edit?usp=sharing
Our slides can be found here: https://docs.google.com/presentation/d/18fEL94cLxcA-prOxSrREmxWVYxIGuB8MrO2ymTiV2Sc/edit?usp=sharing
This workshop assumes knowledge of some basic programming concepts including variable declaration, boolean operators, loops, lists, dictionaries, conditionals and functions. This workshop will use Python. If you need to brush up on any of these concepts in Python, the WHOI Python Carpentries workshop website is a good place to start.
The goal of this workshop is to provide attendees with an introduction to plotting and processing geophysical data stored in tabular (e.g., CSV) or hierarchical (e.g., NetCDF) formats. In particular, we will cover the following (a short example follows the list):
- Using Dask for parallel computing
- Leveraging xarray for handling multi-dimensional arrays
- Requesting resources on the Poseidon cluster
- Setting up your local machine for compatibility with Poseidon
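To give a flavor of the first two topics, here is a minimal sketch of xarray and Dask working together. The file name, variable name, and chunk size are hypothetical placeholders, not workshop materials:

```python
import xarray as xr

# Hypothetical example: the file and variable names are placeholders.
# Passing `chunks` makes xarray load the data lazily as Dask arrays,
# so computations are split into parallel tasks.
ds = xr.open_dataset("ocean_temperature.nc", chunks={"time": 12})

# This builds a task graph rather than computing immediately...
monthly_mean = ds["temperature"].groupby("time.month").mean()

# ...and nothing runs until the result is explicitly requested.
result = monthly_mean.compute()
```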
Python is not the only language that provides distributed computing tools. However, in our experience, Python has the most mature and accessible tools for big-data exploration and visualization in the climate sciences. Other languages such as MATLAB, Julia, and R offer similar tools for processing geospatial data, but none are as complete as Python's. Many scientific analysis codes are now written exclusively in Python, so we hope that understanding the basic functions that make up these codes will be worthwhile!
We have already created a script for you which installs micromamba
and the packages necessary to participate in the workshop.
To download the script, log in to Poseidon from Terminal (if using Mac/Linux) or PowerShell (Windows):

```bash
ssh -XY username@poseidon.whoi.edu
```
Once logged in, confirm you are in your home directory:

```bash
cd ~
```
Next, download the setup script:

```bash
wget https://raw.githubusercontent.com/anthony-meza/WHOI-PO-HPC/refs/heads/official_pilot_workshop/poseidon_setup.sh
```
Finally, to start the installation, run:

```bash
sh poseidon_setup.sh
```
To check whether your installation succeeded, run:

```bash
source ~/.bash_profile
mamba
```
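As an extra sanity check, you can confirm that the core workshop packages import cleanly. The package list here is an assumption based on the workshop topics; the actual environment created by poseidon_setup.sh may differ:

```python
# Sanity check: confirm the core workshop packages are importable.
# (Assumes poseidon_setup.sh installed dask and xarray -- an assumption
# based on the workshop topics, not on the script's contents.)
import dask
import xarray as xr

print("dask:", dask.__version__)
print("xarray:", xr.__version__)
```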
Running a Jupyter notebook on Poseidon requires you to create an SSH tunnel from Poseidon to your personal computer. Instructions depend on whether you're using Mac/Linux or Windows. Below are instructions for both.
To make sure you're able to start a Jupyter notebook from Poseidon and then access the remote server through SSH tunneling, follow these steps.

For Mac/Linux:

- ssh into Poseidon, copy the script `run_jupyter_on_poseidon.sh` to your home directory, change the placeholder email address in the script to your email, and run `sbatch run_jupyter_on_poseidon.sh` on the command line.
- Check the output from this script, which is piped to `log-jupyter-{jobid}.log`. You can check the job ID by running `mj` (short for "my job") to see which of your jobs are in the queue. Any errors will also be sent to this log.
- Copy the line with a format like `ssh -N -f -L remote-port:remote-server:remote-port username@poseidon.whoi.edu`, which shows the port the server is running on and the node it is using on Poseidon. Paste it into a new terminal window on your local machine and run it. (See this screenshot for more details.)
- Locate the URL in `log-jupyter-{jobid}.log` that begins with `http://127.0.0.1:remote-port...`. Copy this URL, paste it into a browser, and your notebook should pop up!
For Windows:

- Download and install PuTTY.
- ssh into Poseidon, copy the script `run_jupyter_on_poseidon.sh` to your home directory, change the placeholder email address in the script to your email, and run `sbatch run_jupyter_on_poseidon.sh` on the command line.
- Check the output from this script, which is piped to `log-jupyter-{jobid}.log`. You can check the job ID by running `mj` (short for "my job") to see which of your jobs are in the queue. Any errors will also be sent to this log.
- Use the information from the log output to create an SSH tunnel with PuTTY. (See this screenshot for more details.)
- Start the tunnel by clicking "Open" in PuTTY.
- Locate the URL in `log-jupyter-{jobid}.log` that begins with `http://127.0.0.1:remote-port...`. Copy this URL, paste it into a browser, and your notebook should pop up!
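Once the notebook is open, you may want a Dask cluster running on your allocated compute node. Below is a minimal sketch, assuming the dask.distributed package is installed in your environment; the worker count and memory limit are placeholders that should match whatever resources your SLURM job requested:

```python
from dask.distributed import Client, LocalCluster

# Start a Dask cluster confined to the compute node running this notebook.
# n_workers and memory_limit are placeholders -- match them to the
# resources your SLURM job actually requested.
cluster = LocalCluster(n_workers=4, memory_limit="4GB")
client = Client(cluster)

# Displaying the client in a notebook cell shows the dashboard link
# and a summary of the workers.
client
```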