diff --git a/index.rst b/index.rst index 26edafe9..e28c3958 100644 --- a/index.rst +++ b/index.rst @@ -13,17 +13,16 @@ Welcome to CSC Geocomputing course! * Are you curious on how you can take your geospatial data processing and analysis to the next level? * Or maybe you have been using a supercomputer already, but would like to make sure your are getting the most out of it? -→ **This course is intended for you!** +→ This course is intended for you! -In this course we will learn the basics of geocomputing on a supercomputer through a combination of lectures and hands-on activities. The main focus of the course is Puhti supercomputer, were all hands-on exercises will be done. The CSC services discussed in this course are free-of-charge for academic research, education and training purposes for Finnish higher education institutions and state research institutes (subsidized by the Ministry of Education and Culture, Finland). +**In this course we will learn the basics of geocomputing on a supercomputer through a combination of lectures and hands-on activities.** The main focus of the course is the Puhti supercomputer, where all hands-on exercises will be done. The CSC services discussed in this course are free-of-charge for academic research, education and training purposes for Finnish higher education institutions and state research institutes (subsidized by the Ministry of Education and Culture, Finland). -Most of the course content also applies to LUMI supercomputer, which is available for academic users **and companies**. +Most of the course content also applies to the LUMI supercomputer, which is available for academic users and companies. The course is meant both for academic researchers planning to use Puhti supercomputer and for data analysts from private companies planning to use LUMI. -.. warning:: - THIS MATERIAL IS WORK IN PROGRESS, do not trust anything ! ;) - +Table of contents +################### .. 
toctree:: :maxdepth: 2 diff --git a/materials/account_project.md b/materials/account_project.md index 3ad56a1b..ca9348b2 100644 --- a/materials/account_project.md +++ b/materials/account_project.md @@ -6,7 +6,9 @@ Every CSC account must be linked with a **CSC project**, which enables you to sh CSC services are **free of charge** for open science at Finnish higher education institutions and research institutes. -[CSC Docs: Accounts and projects](https://docs.csc.fi/accounts/) +* [CSC Docs: Accounts and projects](https://docs.csc.fi/accounts/) +* [LUMI, get started](https://lumi-supercomputer.eu/get-started/) +* [CSC, LUMI high performance computing services offer companies a competitive advantage](https://csc.fi/web/guest/solutions-for-business) :::{admonition} Course project :class: hint @@ -47,7 +49,8 @@ Your first steps into many CSC services goes via [`https://my.csc.fi`](https://m - Amount of resources allocated: All requested resources are billed ie. number of cores, amount of memory - Time allocated: Resources are billed based on the actual (wall) _time_ a job has **used**, not the reserved maximum time -[CSC Docs: Billing units](https://docs.csc.fi/accounts/billing/) +* [CSC Docs: Billing units](https://docs.csc.fi/accounts/billing/) +* [LUMI Docs: Billing policy](https://docs.lumi-supercomputer.eu/runjobs/lumi_env/billing/) ### Applying for billing units diff --git a/materials/csc.md b/materials/csc.md index 1177c0c0..ca8cd4c0 100644 --- a/materials/csc.md +++ b/materials/csc.md @@ -8,17 +8,28 @@ ![Kajaani](images/kajaani.png) -## [Geoportti](https://www.geoportti.fi) +## Geoportti Geoportti Research Infrastructure (RI) is a shared service for researchers, teachers and students using geospatial data and geocomputing tools. Geoportti RI helps the researchers in Finland to use, to refine, to preserve and to share their geospatial resources. 
+* [GeoPortti web portal](https://www.geoportti.fi) +* GeoPortti services: + * [GeoPortti GeoCubes](https://vm0160.kaj.pouta.csc.fi/geocubes/) - a harmonised, multi-resolution raster geodata repository containing several national datasets + * [GeoPortti GeoPrivacy](https://geoprivacy.fi/#/) - a service where cyclists and pedestrians can donate GPS tracking data for science. + * [UEF Drone Lab](https://www.geoportti.fi/tools/drones/) + * [Geospatial Challenge Camp](https://challenge-camp.geoportti.fi/en/latest/) - a 10-week long challenge-based course (5 ECTS) that aims to provide participants a chance to tackle relevant real-world challenges in cross-disciplinary teams + * At CSC: supercomputer geospatial installations, support and documentation, STAC, GIS training. + ![](./images/geoportti.png) -## [Location Innovation Hub](https://locationinnovationhub.eu) +## Location Innovation Hub The Location Innovation Hub (LIH) is a centre of excellence in location information coordinated by the Finnish Geospatial Research Institute. Our services are produced in conjunction with a partner network. We help companies to grow their business with location information. We also serve the public sector. +* [Location Innovation Hub](https://locationinnovationhub.eu) +* At CSC: introducing LUMI to geospatial companies + ![](./images/lih.png) diff --git a/materials/csc_services.md b/materials/csc_services.md index 0f9d3bf9..033912ef 100644 --- a/materials/csc_services.md +++ b/materials/csc_services.md @@ -20,4 +20,10 @@ :::{admonition} Want to know more? 
:class: seealso See also [CSC service catalog](https://research.csc.fi/en/service-catalog) -::: \ No newline at end of file +::: + +:::{admonition} Sensitive data +:class: important + +Sensitive data should be saved and processed only in services for sensitive data: [SD services](https://research.csc.fi/sensitive-data-services-for-research), [ePouta](https://research.csc.fi/-/epouta). Encrypted files can also be stored in [Allas](https://research.csc.fi/-/allas). Supercomputers and cPouta should not be used for sensitive data. +::: diff --git a/materials/examples.md b/materials/examples.md index 37a45908..174ff37c 100644 --- a/materials/examples.md +++ b/materials/examples.md @@ -45,7 +45,7 @@ Arttu Kivimäki, FGI/NLS: [Mosaicking Sentinel-2 data in Puhti](https://a3s.fi/g Tapio Friberg, ICEYE: [LUMI usecase](https://gis-seminars.a3s.fi/2023-06-08-lumi-for-gis-iceye-use-case.pdf) ``` -You can find all CSC seminar presentations on [CSC geocomputing research pages](https://research.csc.fi/geocomputing-seminars). +You can find more use case presentations on the [CSC: geocomputing seminars page](https://research.csc.fi/geocomputing-seminars). ## Some publications from Finland that used Puhti @@ -83,4 +83,4 @@ Samantha Wittke et al, FGI/Aalto [EODIE - Earth Observation Data Information Ext ``` -Know more? -> Please let us know :) \ No newline at end of file +Know more? -> Please let us know :) diff --git a/materials/exercise_basics.md b/materials/exercise_basics.md index b96ed962..0ab03898 100644 --- a/materials/exercise_basics.md +++ b/materials/exercise_basics.md @@ -35,7 +35,7 @@ In an interactive batch job, an interactive shell session is launched on a compu ### Launching an interactive job / compute node shell -Observe how now you need to define the resources you want to reserve now. +Observe how you now need to define the resources you want to reserve. Let's reserve 10 minutes. 
:::{admonition} Other ways of starting an interactive session @@ -201,4 +201,4 @@ gdalinfo /appl/data/geo/luke/forest_wind_damage_sensitivity/2017/windmap2017_int * Resource request lines start with `#SBATCH` * You can find the jobs output, errors and prints in `slurm-jobid.out` -::: \ No newline at end of file +::: diff --git a/materials/exercise_r.md b/materials/exercise_r.md index 29b9e965..740929c8 100644 --- a/materials/exercise_r.md +++ b/materials/exercise_r.md @@ -1,5 +1,12 @@ # Exercise: R +## R in supercomputers +* `r-env` is the only R module in Puhti; it includes ~1300 packages for all fields of science. +* Mahti does not have R. +* LUMI has only an [EasyBuild recipe for R](https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/r/R/) +* [CSC Docs: r-env](https://docs.csc.fi/apps/r-env/) +* [CSC Docs: R for GIS](https://docs.csc.fi/apps/r-env-for-gis/) + :::{admonition} Timing :class: note @@ -10,23 +17,35 @@ :::{admonition} Goals :class: note -* +* Get to know the `r-env` R environment on Puhti +* Run R code interactively and as a batch job +* Try out different ways of parallelizing R code + ::: :::{admonition} Prerequisites :class: important -* ... +* [CSC user account](https://docs.csc.fi/accounts/how-to-create-new-user-account/) and [project](https://docs.csc.fi/accounts/how-to-create-new-project/) with [access to Puhti](https://docs.csc.fi/accounts/how-to-add-service-access-for-project/) +* Some experience with R spatial +* Basic Linux skills ::: + + [R exercise materials in Geocomputing Github](https://github.com/csc-training/geocomputing/tree/master/R/puhti) +* Interactive working +* Simple batch job +* Parallel job +* Optional, Array job + :::{admonition} Key points :class: important -* ... 
- +* Puhti web interface enables working with RStudio interactively +* `future` can be used for parallelization -::: \ No newline at end of file +::: diff --git a/materials/exercise_webinterface.md b/materials/exercise_webinterface.md index 12824cde..6d4b8a9f 100644 --- a/materials/exercise_webinterface.md +++ b/materials/exercise_webinterface.md @@ -27,6 +27,12 @@ * Open [Puhti web interface](https://puhti.csc.fi) and log in +:::{admonition} Change the default project and username + +* `project_200xxxx` is an example project name, replace with your own CSC project name. +* `cscusername` is an example username, replace with your username. +::: + ### Info * Puhti general status: bottom of front page * Sometimes when the `Disk lag` here is high, reading and writing files might get slow. @@ -34,31 +40,24 @@ * Disk usage of own projects: `Tools` -> `Disk quotas` * Running jobs: `Jobs` -> `Active jobs` -:::{admonition} Change the default project and username - -* `project_200xxxx` is example project name, replace with your own CSC project name. -* `cscusername` is example username, replace with your username. -::: ### Files * Open home directory: `Files` -> `Home Directory` -* Create new directory and open it -* Create new `.txt` file with your name -* Move the new file under scratch: +* Create new `myfile.txt` file and add some text to it. +* Create new directory `mydata` +* Move the new file under `mydata`: * Mark check-box in front of the file * Click `Copy/Move` - * Open `/scratch/project_200xxxx` + * Open `mydata` * Click `Move` -* Open your scratch folder +* Open your `mydata` folder * Download your file to your local computer -* Delete the file :::{admonition} Moving data Web interface is for moving up to 10Gb data, if you have more data use other tools. 
More info in [moving data](moving_data.md) ::: - ### Graphical applications #### Jupyter diff --git a/materials/job_types.md b/materials/job_types.md index cd1b85f1..8097fc00 100644 --- a/materials/job_types.md +++ b/materials/job_types.md @@ -15,9 +15,9 @@ Apart from interactive jobs, a job can be classified as **serial, parallel or GP ## Serial jobs -Serial jobs means that the computer works on only one task at a time following a sequence of instructions, while only using one core. +Serial job means that the computer works on only one task at a time following a sequence of instructions, while only using one core. -Why could your serial job benefit from being executed using CSC's resources instead of on your own computer? +Why would your serial job benefit from being executed using CSC's resources instead of on your own computer? - Part of a larger workflow - Avoid data transfer between CSC and your own computer - Data sharing among other project members diff --git a/materials/moving_data.md b/materials/moving_data.md index 678f5ed1..686debb1 100644 --- a/materials/moving_data.md +++ b/materials/moving_data.md @@ -1,12 +1,10 @@ # Moving data ## Local computer <-> supercomputer - -* [CSC Docs: Moving files between a local computer and a supercomputer](https://docs.csc.fi/data/moving/) - ### Puhti Web Interface -- Very easy, no installations needed. +- Graphical, no installations needed. +- Limited functionality compared to other options. - For smaller amounts of data, < 10 Gb. - Upload, download, moving, creating folders. - [Puhti Web Interface](https://puhti.csc.fi) -> Files @@ -16,7 +14,7 @@ - For example: **FileZilla**, **WinSCP** and **CyberDuck** - For medium amounts of data, < 1 Tb. -- Very easy, but installation required. +- Easy drag-and-drop for moving, but installation required. - WinSCP is slower than others. 
- [CSC Docs: Graphical data transfer tools](https://docs.csc.fi/data/moving/graphical_transfer/) @@ -24,6 +22,7 @@ ### Command line tools on local computer - For any amount of data, practically required if data size > 1 Tb. +- Requires knowing the commands. #### scp @@ -55,7 +54,7 @@ rsync --info=progress2 -a /path/to/a_file cscusername@puhti.csc.fi:/scratch/proj # One folder: rsync --info=progress2 -a /path/to/directory cscusername@puhti.csc.fi:/scratch/project_200xxxx/directory ``` -* `progress2` shows time left and percentage +* `--info=progress2` shows time left and percentage :::{admonition} Firewall limitations @@ -70,8 +69,9 @@ Some organizations, for example research institutes with IT-services from Valtor - When downloading from exernal services try to download directly to CSC, not via your local computer - Check what APIs/tools the service supports: - - OGC APIs, [STAC](https://csc-training.github.io/geocomputing_course/materials/stac.html) - - ftp, rsync + - Standard APIs: OGC APIs, [STAC](https://csc-training.github.io/geocomputing_course/materials/stac.html) + - Custom service APIs + - ftp, rsync - wget/curl if HTTP-urls avaialable ### wget @@ -86,6 +86,12 @@ wget http://wwwd3.ymparisto.fi/d3/gis_data/spesific/syvyyskayra.zip wget -r -nc ftp://ftp.aineistot.metsaan.fi/Metsamaski/Maakunta/ --cut-dirs=2 ``` +:::{admonition} More options :class: note + +* [CSC Docs: Moving files between a local computer and a supercomputer](https://docs.csc.fi/data/moving/) + +::: + :::{admonition} Possible trouble with file transfer between Windows and Linux diff --git a/materials/partitions.md b/materials/partitions.md index a2b20152..332b416c 100644 --- a/materials/partitions.md +++ b/materials/partitions.md @@ -2,9 +2,10 @@ Partitions are logical sets of nodes. Resource limitations for a job are defined by the partition (or queue) the job is submitted to. 
The limitations affect the maximum run time, the amount of memory, and the number of available CPU cores (which are called CPUs in Slurm). In addition, partitions may also define default resources that are automatically allocated for jobs if nothing has been specified. -Jobs should be submitted to the partition that best matches the required resources. That way, as few resources as possible are blocked and another user with a higher demand in RAM can run a job earlier. Of course, other considerations may also influence the choice of a partition. +Jobs should be submitted to the partition that best matches the required resources. That way, as few resources as possible are blocked and another user with a higher demand for memory can run a job earlier. Of course, other considerations may also influence the choice of a partition. -- [CSC Docs: Available batch job partitions](https://docs.csc.fi/computing/running/batch-job-partitions/) +- [CSC Docs: Available batch job partitions](https://docs.csc.fi/computing/running/batch-job-partitions/) +- [LUMI Docs: Slurm partitions](https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/partitions/) - In order to use the resources in an efficient way, it is important to estimate the request as accurately as possible - By avoiding an excessive "just-in-case" request, the job will start earlier diff --git a/materials/prerequisites.md b/materials/prerequisites.md index a8eeb298..1a0677b5 100644 --- a/materials/prerequisites.md +++ b/materials/prerequisites.md @@ -11,8 +11,8 @@ To make this course as enjoyable as possible for you and to make sure you can ge * [UNIX tutorial for beginners](http://www.ee.surrey.ac.uk/Teaching/Unix/) (the first two topics are a good start, try also some editor) * [Basic Linux Commands 10 min tutorial video](https://www.youtube.com/watch?v=uFPly_nGBMg) (sit back and watch) * [CSC and Linux Cheat Sheet](./cheatsheet.md) (one page summary of the most important Linux commands – handy to have near you 
during the course) + * [Terminal intro](terminal.md) ## CSC account -For the exercises, a [CSC account](https://docs.csc.fi/accounts/how-to-create-new-user-account/) is needed. -For self-learning you will also need a project. +For the exercises, a [CSC user account](https://docs.csc.fi/accounts/how-to-create-new-user-account/) and a [project](https://docs.csc.fi/accounts/how-to-create-new-project/) with [access to Puhti](https://docs.csc.fi/accounts/how-to-add-service-access-for-project/) are needed. For the Allas exercise, the Allas service must also be enabled for the project. diff --git a/materials/software.md b/materials/software.md index 58dd18dc..d0dce309 100644 --- a/materials/software.md +++ b/materials/software.md @@ -7,49 +7,51 @@ ## GIS tools available in Puhti -* [Ames Stereo Pipeline](https://docs.csc.fi/apps/ames-stereo.md) for processing stereo images -* [ArcGIS Python API](https://docs.csc.fi/apps/arcgis.md) -* [CloudCompare](https://docs.csc.fi/apps/cloudcompare.md) for visualizing, editing and processing poing clouds -* [FORCE](https://docs.csc.fi/apps/force.md) for mass-processing of medium-resolution satellite images -* [GDAL](https://docs.csc.fi/apps/gdal.md) for geospatial data formats -* **[Geoconda](https://docs.csc.fi/apps/geoconda.md)** - Python spatial analysis libraries -* [GRASS GIS](https://docs.csc.fi/apps/grass.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* [LAStools](https://docs.csc.fi/apps/lastools.md) for LiDAR datasets +* [Ames Stereo Pipeline](https://docs.csc.fi/apps/ames-stereo/) for processing stereo images +* [ArcGIS Python API](https://docs.csc.fi/apps/arcgis/) +* [CloudCompare](https://docs.csc.fi/apps/cloudcompare/) for visualizing, editing and processing point clouds +* [FORCE](https://docs.csc.fi/apps/force/) for mass-processing of medium-resolution satellite images +* [GDAL](https://docs.csc.fi/apps/gdal/) for geospatial data formats +* **[Geoconda](https://docs.csc.fi/apps/geoconda/)** - Python 
spatial analysis libraries +* [GRASS GIS](https://docs.csc.fi/apps/grass/) General purpose GIS software family for viewing, editing and analysing geospatial data +* [LAStools](https://docs.csc.fi/apps/lastools/) for LiDAR datasets * [MATLAB](https://docs.csc.fi/apps/matlab/) -* [OpenDroneMap](https://docs.csc.fi/apps/opendronemap.md) for processing aerial drone imagery -* [Orfeo ToolBox](https://docs.csc.fi/apps/otb.md) for remote sensing applications -* [PCL](https://docs.csc.fi/apps/pcl.md) for 2D/3D image and point cloud processing -* [PDAL](https://docs.csc.fi/apps/pdal.md) for point cloud translations and processing -* [QGIS](https://docs.csc.fi/apps/qgis.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* **[R for GIS](https://docs.csc.fi/apps/r-env-for-gis.md)** R spataial analysis libraries -* [SAGA GIS](https://docs.csc.fi/apps/saga-gis.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* [Sen2Cor](https://docs.csc.fi/apps/sen2cor.md) for atmospheric-, terrain and cirrus correction of the Sentinel-2 products -* [Sen2mosaic](https://docs.csc.fi/apps/sen2mosaic.md) for download, preprocessing and mosaicing of Sentinel-2 products -* **[SNAP](https://docs.csc.fi/apps/snap.md)** for remote sensing applications -* [WhiteboxTools](https://docs.csc.fi/apps/whiteboxtools.md) an advanced geospatial data analysis platform -* [Zonation](https://docs.csc.fi/apps/zonation.md) Spatial conservation prioritization framework -* **[pytorch](https://docs.csc.fi/apps/pytorch.md)** for deep learning -* **[tensorflow](https://docs.csc.fi/apps/tensorflow.md)** for deep learning +* [OpenDroneMap](https://docs.csc.fi/apps/opendronemap/) for processing aerial drone imagery +* [Orfeo ToolBox](https://docs.csc.fi/apps/otb/) for remote sensing applications +* [PCL](https://docs.csc.fi/apps/pcl/) for 2D/3D image and point cloud processing +* [PDAL](https://docs.csc.fi/apps/pdal/) for point cloud 
translations and processing +* [QGIS](https://docs.csc.fi/apps/qgis/) General purpose GIS software family for viewing, editing and analysing geospatial data +* **[R for GIS](https://docs.csc.fi/apps/r-env-for-gis/)** R spataial analysis libraries +* [SAGA GIS](https://docs.csc.fi/apps/saga-gis/) General purpose GIS software family for viewing, editing and analysing geospatial data +* [Sen2Cor](https://docs.csc.fi/apps/sen2cor/) for atmospheric-, terrain and cirrus correction of the Sentinel-2 products +* [Sen2mosaic](https://docs.csc.fi/apps/sen2mosaic/) for download, preprocessing and mosaicing of Sentinel-2 products +* **[SNAP](https://docs.csc.fi/apps/snap/)** for remote sensing applications +* [WhiteboxTools](https://docs.csc.fi/apps/whiteboxtools/) an advanced geospatial data analysis platform +* [Zonation](https://docs.csc.fi/apps/zonation/) Spatial conservation prioritization framework +* **[pytorch](https://docs.csc.fi/apps/pytorch/)** for deep learning +* **[tensorflow](https://docs.csc.fi/apps/tensorflow/)** for deep learning ## GIS tools available in Mahti -* [Geoconda](https://docs.csc.fi/apps/geoconda.md) - Python spatial analysis libraries -* [pytorch](https://docs.csc.fi/apps/pytorch.md) for deep learning -* [tensorflow](https://docs.csc.fi/apps/tensorflow.md) for deep learning +* [Geoconda](https://docs.csc.fi/apps/geoconda/) - Python spatial analysis libraries +* [pytorch](https://docs.csc.fi/apps/pytorch/) for deep learning +* [tensorflow](https://docs.csc.fi/apps/tensorflow/) for deep learning ## GIS tools available in LUMI -* [GDAL](https://docs.csc.fi/apps/gdal.md) for geospatial data formats -* [GRASS GIS](https://docs.csc.fi/apps/grass.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* [PDAL](https://docs.csc.fi/apps/pdal.md) for point cloud translations and processing -* [QGIS](https://docs.csc.fi/apps/qgis.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* 
[SAGA GIS](https://docs.csc.fi/apps/saga-gis.md) General purpose GIS software family for viewing, editing and analysing geospatial data -* [pytorch](https://docs.csc.fi/apps/pytorch.md) for deep learning -* [tensorflow](https://docs.csc.fi/apps/tensorflow.md) for deep learning +* [GDAL](https://docs.csc.fi/apps/gdal/) for geospatial data formats +* [GRASS GIS](https://docs.csc.fi/apps/grass/) General purpose GIS software family for viewing, editing and analysing geospatial data +* [PDAL](https://docs.csc.fi/apps/pdal/) for point cloud translations and processing +* [QGIS](https://docs.csc.fi/apps/qgis/) General purpose GIS software family for viewing, editing and analysing geospatial data +* [SAGA GIS](https://docs.csc.fi/apps/saga-gis/) General purpose GIS software family for viewing, editing and analysing geospatial data +* [pytorch](https://docs.csc.fi/apps/pytorch/) for deep learning +* [tensorflow](https://docs.csc.fi/apps/tensorflow/) for deep learning * Additional, easy to install yourself [EasyBuild recepies](https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs) for CGAL, GDAL, GEOS, ncview, PROJ, R. -## GIS tools NOT available in supercomputers + +:::{admonition} GIS tools NOT available in supercomputers +:class: caution * **Servers** -> these can be run in cPouta * Web map servers: @@ -62,6 +64,8 @@ * Tools available for **Windows only** -> no good option from CSC services * ArcGIS, TerraScan +::: + :::{admonition} Commercial tools :class: seealso, dropdown @@ -95,39 +99,31 @@ Additionally Ames Stereo Pipeline, FORCE, LasTools, OpenDroneMap, PCL and Zonati \* are not available in Puhti currently, but should be possible to install, ask if you need. 
-## Skills needed - -* Some GIS tool listed above -* Basic Linux skills in commandline: changing folder, running tools, file permissions - * [CSC Linux tutorial](https://docs.csc.fi/support/tutorials/env-guide/) -* Scripting skills, one of these: - * Python: [Python GIS learning materials](https://docs.csc.fi/apps/geoconda/#references) - * R: [Spatial R learning materials](https://docs.csc.fi/apps/r-env-for-gis/#references) - * bash: [CSC bash tutorial](https://docs.csc.fi/support/tutorials/env-guide/linux-bash-scripts/) - * ... -* Parallelization: Python/Dask, R, GNU-parallel - -## Modules - -* Puhti is a shared computing environment -* Software is loaded with modules - * Mutually incompatible software - * One module: single program or group of similar programs - * Modules load applications, adjust path settings and set environment variables + + +:::{admonition} Modules +:class: important + +* Supercomputers are shared computing environment with many mutually incompatible tools installed +* By default only basic Linux tools are available +* Pre-installed tools are available via modules +* One module: single program or group of similar programs +* Usage: * Check documentation for available module names and versions. + * Load a module(s) -> the system can find the tools provided by the module + * Use tools from the loaded module -Example. Loading module for R +Example. Loading module for GDAL ``` -module load r-env +module load gdal +gdalinfo /xx/data.tif ``` +::: + ## Documentation * [CSC Docs: Applications -> geosciences](https://docs.csc.fi/apps/by_discipline/#geosciences) * [LUMI Docs: Software pages](https://docs.lumi-supercomputer.eu/software/) * [CSC Research pages: GIS software](https://research.csc.fi/gis-software) - -Something missing? 
- Ask us :) - servicedesk@csc.fi diff --git a/materials/spatial_data_at_csc.md b/materials/spatial_data_at_csc.md index 6dce5735..bc9fae32 100644 --- a/materials/spatial_data_at_csc.md +++ b/materials/spatial_data_at_csc.md @@ -9,6 +9,7 @@ * **Finnish Environmental Institute (SYKE) open datasets**: CORINE land use etc * **LUKE, Multi-source national forest inventory**: 2013, 2015, 2017, 2019 and 2021. * **Forest center: canopy height**, forest mask, gridcells, forest resource plots +* LUMI and Mahti do not have spatial data on local disk. * [CSC Docs: Spatial data in CSC computing environment](https://docs.csc.fi/data/datasets/spatial-data-in-csc-computing-env/) ## Paituli @@ -18,7 +19,7 @@ * All datasets open to everyone * Also **historical versions** of several datasets * Possibility to publish own datasets for universities and research institutes -* OKM supports financially, CSC maintains +* The Ministry of Education and Culture supports financially, CSC maintains * [Paituli](https://paituli.csc.fi) !["Paituli"](./images/paituli.png "Paituli") diff --git a/materials/stac.md b/materials/stac.md index dd8901f9..c044b94b 100644 --- a/materials/stac.md +++ b/materials/stac.md @@ -68,10 +68,14 @@ STAC - Spatio-Temporal Asset Catalog \* These datasets have several bands in one file, Python `stackstac` does not support it, but search works. 
-## Paituli STAC links -* **[Paituli STAC description](https://paituli.csc.fi/stac.html)** -* Paituli STAC end-point: `https://paituli.csc.fi/geoserver/ogc/stac/v1` -* [STAC Browser with Paituli STAC data](https://radiantearth.github.io/stac-browser/#/external/paituli.csc.fi/geoserver/ogc/stac/v1?.language=en) -* Example scripts: + +:::{admonition} Next steps +:class: important + +* Read more about STAC in general in the [Paituli STAC description](https://paituli.csc.fi/stac.html) +* [See what data is available with Paituli STAC](https://radiantearth.github.io/stac-browser/#/external/paituli.csc.fi/geoserver/ogc/stac/v1?.language=en) +* **Test out the example scripts**: * **[Python](https://www.github.com/csc-training/geocomputing/blob/master/python/STAC)** * [R](https://www.github.com/csc-training/geocomputing/blob/master/R/STAC) +* Use Paituli STAC, end-point: `https://paituli.csc.fi/geoserver/ogc/stac/v1` +::: diff --git a/materials/supercomputing.md b/materials/supercomputing.md index 60fa9e94..0488f4c6 100644 --- a/materials/supercomputing.md +++ b/materials/supercomputing.md @@ -13,7 +13,7 @@ * Collaborate with your project members * “Outsource” heavy computations, keep own computer for other usage * **Good documentation, examples for spatial data analysis** -* **CSC specialist support**, +* **CSC specialist support** * **Free of charge for open science** at Finnish universities and research institutes. 
## Supercomputers in Kajaani @@ -22,9 +22,9 @@ Name | CPUs | GPUs | Pre-installed GIS tools | Finnish spatial data locally | Scope | --- | --- | --- | --- | --- | --- | -Puhti | 28 000 | 240 Nvidia V100 | **20** | **Yes** | Finland | -Mahti | 90 000 | 96 Nvidia A100 | 1 | No | Finland | -LUMI | **100 000** | **10 000** AMD MI250X | 5 | No | EU | +**Puhti** | 28 000 | 240 Nvidia V100 | **20** | **Yes** | Finland | +**Mahti** | 90 000 | 96 Nvidia A100 | 1 | No | Finland | +**LUMI** | **100 000** | **10 000** AMD MI250X | 5 | No | EU | Puhti: * From interactive single core to medium scale parallel analysis @@ -42,12 +42,12 @@ LUMI: * [LUMI Docs: Hardware overview](https://docs.lumi-supercomputer.eu/hardware/) -# Puhti compared to other options +## Puhti compared to other options -| | Puhti supercomputer*| cPouta virtual machine| my laptop | +| | Puhti supercomputer| cPouta virtual machine| my laptop | |---|---| ---|---| |Max per job: CPU | **4000** | 48 | 4 | -|Max per job: memory Gb | **1500** | 240 | 18 | +|Max per job: memory, Gb | **1500** | 240 | 18 | |Max per job: GPU | **80** | 4 | 1 | |Pre-installed GIS tools | **Yes** | No | No | |Main Finnish datasets | **Yes** | No | No | @@ -57,12 +57,12 @@ LUMI: :::{admonition} Computing speed :class: important -* Supercomptuer single core speed ~ laptop single core speed -* **For speed up, use many cores**: - * Scripts for parallization +* Single core speed: supercomputer ~ laptop +* **To increase speed, use many cores**: + * **Scripts for parallelization** * Tools that support parallelization out-of-the-box * GPU tools -* Running a single core tool/script on many cores, will not help +* Running a single core tool/script on many cores will not help ::: @@ -70,8 +70,8 @@ LUMI: ![](./images/gui_script.png) * Supercomputers have some support for working with graphical tools -* Main work is done with scripts -* GIS tools have often weak support for parallization +* GIS tools often have weak support for parallelization +* **Main 
work is done with scripts** * Scripts can make analysis parallel * Scripts also increase reproducibility of your work @@ -79,10 +79,31 @@ LUMI: :class: note * Many similar, but independent tasks. -* In GIS scope it is often possible to split or analysis parameters: +* Tasks split by: * Input data: map sheets, rows in a dataframe, data from different time periods etc. * Analysis parameters: different scenarios, different variables etc. - -* Note, that especially with map sheets extra care might be needed for border areas, for example use overlapping map sheets. +* Note, that especially with map sheets extra care might be needed for border areas, for example use overlapping map sheets with [virtual rasters](https://docs.csc.fi/support/tutorials/gis/virtual-rasters/). ::: + +## Technical skills needed for using supercomputers + +* Domain knowledge: + * [GIS tools](software.md) + * [Spatial data sources](https://research.csc.fi/open-gis-data) +* Basic Linux skills: + * [Terminal](terminal.md) + * [Moving data](moving_data.md) + * [CSC Linux tutorial](https://docs.csc.fi/support/tutorials/env-guide/) +* Supercomputer basics +* **Scripting skills and how to write parallel scripts**, one of these: + * Python: + * [CSC Docs: Python GIS learning materials](https://docs.csc.fi/apps/geoconda/#references) + * [CSC Docs: Python parallel jobs](https://docs.csc.fi/apps/python/#python-parallel-jobs) + * [CSC Docs: Dask tutorial](https://docs.csc.fi/support/tutorials/dask-python/) + * R: + * [CSC Docs: Spatial R learning materials](https://docs.csc.fi/apps/r-env-for-gis/#references) + * [CSC Docs: Parallel jobs using R](https://docs.csc.fi/support/tutorials/parallel-r/) + * bash: + * [CSC bash tutorial](https://docs.csc.fi/support/tutorials/env-guide/linux-bash-scripts/) + * Julia, MATLAB etc diff --git a/materials/support.md b/materials/support.md index 15c0b761..d330b983 100644 --- a/materials/support.md +++ b/materials/support.md @@ -2,27 +2,28 @@ ## General problem solving -1. 
Go to [docs.csc.fi](https://docs.csc.fi) and check in the right section in the _navigation_ -2. Try the [FAQ](https://docs.csc.fi/support/faq/) -3. Try the search function in CSC Docs or search the web - - Type a keyword in CSC Docs, copy/paste the error message in your favorite search engine -4. Send an email to [servicedesk@csc.fi](mailto:servicedesk@csc.fi) containing: +1. Study the CSC/LUMI Docs. +3. Try the search function in Docs or search the web + - Type a keyword in Docs + - Copy/paste the error message in your favorite search engine +4. Send an email to servicedesk containing: - A descriptive title - - What you wanted to achieve and on which which computer + - What you wanted to achieve and on which computer - Which commands you have given - - What error messages resulted + - What error messages did you get - [More tips to help us to more quickly solve your issue](https://docs.csc.fi/support/support-howto/) ## Support pages and channels -[`docs.csc.fi`](https://docs.csc.fi) +CSC: + * [CSC Docs: `docs.csc.fi`](https://docs.csc.fi) + * [`research.csc.fi`](https://research.csc.fi) + * servicedesk@csc.fi + * [CSC user support session in Zoom](https://ssl.eventilla.com/usersupportcoffee/EN) every Wednesday at 14.00. Join on 18.10.2023 for course specific questions! :) -[`research.csc.fi`](https://research.csc.fi) - -+ servicedesk@csc.fi - -+ User support session in Zoom every Wednesday at 14.00 (join on 18.10 for course specific questions! 
:) ) +LUMI: + * [LUMI Docs: Helpdesk](https://docs.lumi-supercomputer.eu/helpdesk/) ## How we can help diff --git a/materials/where_to_go.md b/materials/where_to_go.md index f6957b21..ee82a334 100644 --- a/materials/where_to_go.md +++ b/materials/where_to_go.md @@ -45,6 +45,7 @@ We have accessed the supercomputer via the webinterface in order to not overwhel * [Geocomputing using CSC resources](https://research.csc.fi/geocomputing) * [Geocomputing examples in github](https://github.com/csc-training/geocomputing) * [Visit courses, webinars, workshops](https://www.csc.fi/en/training) +* [Sign up for the CSC gis-hpc mailing list](https://postit.csc.fi/sympa/subscribe/gis-hpc) * Ask for help, if needed, we don't bite :) ## How you can help