diff --git a/.gitignore b/.gitignore
index a4be7aea..1d18bb5b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,6 @@
/.idea/
marcelo_zoccoler/entry_user_interf3/scripts/__pycache__/flood_tool.cpython-37.pyc
+.DS_Store
.ipynb_checkpoints
\ No newline at end of file
diff --git a/till_korten/devbio-napari_cluster/images/10_imshow.png b/till_korten/devbio-napari_cluster/images/10_imshow.png
new file mode 100644
index 00000000..72029d6f
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/10_imshow.png differ
diff --git a/till_korten/devbio-napari_cluster/images/1_login.png b/till_korten/devbio-napari_cluster/images/1_login.png
new file mode 100644
index 00000000..c868815c
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/1_login.png differ
diff --git a/till_korten/devbio-napari_cluster/images/2_start_server.png b/till_korten/devbio-napari_cluster/images/2_start_server.png
new file mode 100644
index 00000000..3f397992
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/2_start_server.png differ
diff --git a/till_korten/devbio-napari_cluster/images/3_configure_node.png b/till_korten/devbio-napari_cluster/images/3_configure_node.png
new file mode 100644
index 00000000..1ba61d1b
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/3_configure_node.png differ
diff --git a/till_korten/devbio-napari_cluster/images/4_wait.png b/till_korten/devbio-napari_cluster/images/4_wait.png
new file mode 100644
index 00000000..948cda35
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/4_wait.png differ
diff --git a/till_korten/devbio-napari_cluster/images/7_open_notebook.png b/till_korten/devbio-napari_cluster/images/7_open_notebook.png
new file mode 100644
index 00000000..38329b74
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/7_open_notebook.png differ
diff --git a/till_korten/devbio-napari_cluster/images/8_select_kernel.png b/till_korten/devbio-napari_cluster/images/8_select_kernel.png
new file mode 100644
index 00000000..8472ba21
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/8_select_kernel.png differ
diff --git a/till_korten/devbio-napari_cluster/images/9_select_devbio_napari.png b/till_korten/devbio-napari_cluster/images/9_select_devbio_napari.png
new file mode 100644
index 00000000..0ec09b91
Binary files /dev/null and b/till_korten/devbio-napari_cluster/images/9_select_devbio_napari.png differ
diff --git a/till_korten/devbio-napari_cluster/readme.md b/till_korten/devbio-napari_cluster/readme.md
new file mode 100644
index 00000000..2ae1d361
--- /dev/null
+++ b/till_korten/devbio-napari_cluster/readme.md
@@ -0,0 +1,191 @@
+# Using GPU-accelerated image processing on the TUD HPC cluster
+
+[Till Korten](https://biapol.github.io/blog/till_korten), Oct 21st 2022
+
+The [High Performance Computing (HPC) cluster at the compute center (ZIH) of the TU Dresden](https://tu-dresden.de/zih/hochleistungsrechnen/hpc?set_language=en) provides a lot of computational resources including GPU support, which we can use for analyzing data in the life-sciences.
+This blog post explains how you can run your own [jupyter notebooks](https://jupyter.org/) using some [napari](https://napari.org) plugins and GPU-accelerated image processing python libraries such as [clEsperanto](https://clesperanto.net) on the cluster.
+
+### This blog post is for you if
+
+* you want to try out using napari plugins in jupyter notebooks without a local installation
+* data processing takes a significant amount of time on your computer
+* the time-intensive part of your data processing works without user interaction
+* you want to use your computer for other important tasks (such as life after five) while your data are being processed
+* you have a working python script or jupyter notebook that processes your data
+* the computers available to you limit you because:
+ * no GPU (or too little GPU RAM)
+ * not enough RAM
+ * not enough disk space
+ * not enough CPUs
+
+### You may need to look elsewhere if
+
+* you are still actively developing your workflow and are installing/removing python packages on a regular basis. We are working with [singularity containers](https://sylabs.io/singularity/) and it is not feasible to frequently build new containers for you with new python packages. You may want to look at Robert's article on [using Google colab](../../robert_haase/clesperanto_google_colab/) if you need more resources for your workflow but are still in the process of active development rather than deployment.
+* your workflow needs a graphical user interface other than what jupyter notebooks can provide
+
+### See also
+
+* [ZIH HPC Documentation](https://doc.zih.tu-dresden.de/)
+* [Detailed cluster setup instructions](../devbio-napari_cluster_setup/readme.md)
+* [Using Google colab](../../robert_haase/clesperanto_google_colab/)
+
+## Step 1: Set up your account on the ZIH cluster
+
+Before you get started, you need to set up an account, which is [explained in this blog post](../devbio-napari_cluster_setup/readme.md) (and we are happy to help you with that).
+
+## Step 2: Start a Jupyter session on the ZIH cluster
+
+Go to the [jupyter hub of the ZIH cluster](https://taurus.hrsk.tu-dresden.de/jupyter).
+You will be greeted with the TUD login screen. Log in with your ZIH user name and password:
+
+![TUD login screen](images/1_login.png)
+
+Afterwards, you should see a single button `Start My Server`. Click on it:
+
+![Start My Server button](images/2_start_server.png)
+
+Now you get to configure the computing node you want your session to run on. Switch to the advanced configuration by clicking the button `Advanced`. Then you should see something like the image below.
+
+1. Start by choosing a preset (click on 1).
+2. Choose the "GPU Ampere A100" preset (2).
+3. Click the orange button `Spawn` at the very bottom.
+
+![Advanced configuration of the computing node with the GPU Ampere A100 preset](images/3_configure_node.png)
+
+You will now see a wait bar. Do not worry if it does not move; this bar is always at 50%. It usually takes 2-5 min to get a node.
+
+
+![Wait bar while the node is being allocated](images/4_wait.png)
+
+
+## Step 3: Open a Jupyter Notebook with the devbio-napari environment
+
+Now open a new notebook by clicking on `File` (1 in the image below) -> `New` (2) -> `Notebook` (3)
+
+![Open a new notebook via File, New, Notebook](images/7_open_notebook.png)
+
+Now you are asked to select a kernel. Click on the drop down button (red rectangle in the image below).
+
+![Select kernel dialog](images/8_select_kernel.png)
+
+Choose the kernel that starts with **devbio-napari** (`devbio-napari-0.2.1` in the image below).
+
+![Choose the devbio-napari kernel](images/9_select_devbio_napari.png)
+
+Note: for an existing notebook, you can click on the kernel name (by default `Python 3`) in the top right corner of the notebook and select the devbio-napari kernel as described above.
+
+## Step 4: Accessing your data on the HPC cluster
+
+After [setting up the ZIH account](../devbio-napari_cluster_setup/readme.md), your fileserver space is mapped to a folder on the cluster. It should look something like this: `/grp//`.
+
+Note: For security reasons, this folder is read-only. Therefore, you need to transfer the data from the fileserver to a temporary folder on the cluster before you can start working with it. Note: the data in this temporary folder will be **automatically deleted** after 10 days, so please make sure to **transfer the data back once you are done** (see Step 6 below).
+
+To transfer your data, please insert the following **after your import statements** into your notebook:
+
+```Python
+from biapol_taurus import ProjectFileTransfer
+
+# We get files from the fileserver:
+source_dir = "/grp//path/to/your/data/"
+pft = ProjectFileTransfer(source_dir)
+pft.sync_from_fileserver()
+
+# make sure that images are read from the correct location
+imread = pft.imread
+```
+
+Waiting .............sending incremental file list
+./
+folder/
+folder/filename001.tif
+
+sent 65,467 bytes received 65 bytes 43,688.00 bytes/sec
+total size is 65,220 speedup is 1.00
+
+## Step 5: Work with your data
+
+1. you can list files locally available on the cluster
+
+ ```python
+ pft.list_files()
+ ```
+
+ ['/scratch/ws/0/username-cache/is36zwh_',
+ '/scratch/ws/0/username-cache/folder',
+ '/scratch/ws/0/username-cache/folder/filename001.tif']
+
+ note: the folder with the cryptic name (`is36zwh_`) is a temporary folder created and managed by `pft`.
+
+2. you can read single images:
+
+ ```Python
+ image = imread("folder/filename001.tif")
+ ```
+
+ note that this is fast after syncing, while it will be slow the first time you do this without syncing.
+
+3. you can read all `.tif` images:
+
+ ```Python
+ images = []
+ for filename in pft.list_files():
+ if filename.endswith(".tif"):
+ images.append(imread(filename))
+
+ from skimage.io import imshow
+ imshow(images[0])
+ ```
+
+![Output of imshow showing the first image](images/10_imshow.png)
+
+4. After you have analyzed your data, you may want to save the results from a pandas dataframe to a csv file (see also the consolidated sketch after this list):
+ * if you need the file again, or don't want to wait until it is transferred to the fileserver, save it locally:
+
+ ```python
+ full_path = pft.cache_path / "folder/results.csv"
+ my_pandas_dataframe.to_csv(full_path)
+ ```
+
+ * if you don't need the file again on the cluster, you can save it directly to the fileserver:
+
+ ```python
+ pft.csv_save("folder/results.csv", my_pandas_dataframe)
+ ```
+
+ Waiting ................target file: /grp//path/to/your/data/folder/results.csv
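+
+Putting the pieces together, here is a minimal sketch of such a workflow (assuming the `pft` object and `imread` from Step 4; the Otsu-threshold analysis is only a placeholder for your own processing, and `folder` is the example folder synced above):
+
+```Python
+import pandas as pd
+from skimage.filters import threshold_otsu
+from skimage.measure import label, regionprops_table
+
+results = []
+for filename in pft.list_files():
+    if not filename.endswith(".tif"):
+        continue
+    image = imread(filename)
+    # placeholder analysis: threshold the image and measure the labeled objects
+    labels = label(image > threshold_otsu(image))
+    measurements = pd.DataFrame(regionprops_table(labels, properties=("label", "area")))
+    measurements["filename"] = filename
+    results.append(measurements)
+
+# write the combined table into the cache folder, ready to be synced back (see Step 6)
+pd.concat(results).to_csv(pft.cache_path / "folder/all_results.csv")
+```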
+
+## Step 6: Put your data back on the fileserver
+
+Note: This step is **important**! If you don't do this, **you will lose any data you created/changed on the cluster** because it is automatically deleted after 10 days!
+
+Put the following at the end of your jupyter notebook:
+
+```python
+pft.sync_to_fileserver()
+```
+
+Waiting .............sending incremental file list
+./
+folder/
+folder/results.csv
+
+sent 467 bytes received 65 bytes 43,688.00 bytes/sec
+total size is 567 speedup is 1.00
+
+## Step 7: Clean up
+
+This step is optional (but encouraged if your data is hundreds of GB). You can skip it if you want to re-analyze the same data later; otherwise, the cleanup will happen automatically after 10 days.
+
+```python
+pft.cleanup()
+```
+
+## Troubleshooting
+
+* If jupyter lab does not start within 10-15 min, maybe all A100 GPUs are in use. In that case, you can either wait (usually it is easier to get a node in the mornings before 10:00 am), or choose the "GPU Tesla K80" preset in step 2 above. Those GPUs are much less powerful and therefore in much lower demand, so you should get one more easily.
+* If you run out of memory or need more CPU cores, increase the number of CPUs in the advanced configuration. Note that the memory is per CPU, so if you choose more CPUs, you automatically get more memory.
+
+## Acknowledgements
+
+I would like to thank Fabian Rost for sharing his extensive experience of how to run python notebooks within singularity containers on the TUD cluster.
diff --git a/till_korten/devbio-napari_cluster_setup/images/1_login.png b/till_korten/devbio-napari_cluster_setup/images/1_login.png
new file mode 100644
index 00000000..c868815c
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/1_login.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/2_start_server.png b/till_korten/devbio-napari_cluster_setup/images/2_start_server.png
new file mode 100644
index 00000000..3f397992
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/2_start_server.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/3_configure_node.png b/till_korten/devbio-napari_cluster_setup/images/3_configure_node.png
new file mode 100644
index 00000000..1ba61d1b
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/3_configure_node.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/4_wait.png b/till_korten/devbio-napari_cluster_setup/images/4_wait.png
new file mode 100644
index 00000000..948cda35
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/4_wait.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/5_open_terminal.png b/till_korten/devbio-napari_cluster_setup/images/5_open_terminal.png
new file mode 100644
index 00000000..3bc36424
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/5_open_terminal.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/6_terminal_output.png b/till_korten/devbio-napari_cluster_setup/images/6_terminal_output.png
new file mode 100644
index 00000000..e17bff2d
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/6_terminal_output.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/7_open_notebook.png b/till_korten/devbio-napari_cluster_setup/images/7_open_notebook.png
new file mode 100644
index 00000000..38329b74
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/7_open_notebook.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/8_select_kernel.png b/till_korten/devbio-napari_cluster_setup/images/8_select_kernel.png
new file mode 100644
index 00000000..8472ba21
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/8_select_kernel.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/9_select_devbio_napari.png b/till_korten/devbio-napari_cluster_setup/images/9_select_devbio_napari.png
new file mode 100644
index 00000000..0ec09b91
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/9_select_devbio_napari.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/a1_new_project_application.png b/till_korten/devbio-napari_cluster_setup/images/a1_new_project_application.png
new file mode 100644
index 00000000..321ae660
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/a1_new_project_application.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/a2_trial_project.png b/till_korten/devbio-napari_cluster_setup/images/a2_trial_project.png
new file mode 100644
index 00000000..3b8ce2a1
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/a2_trial_project.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/a3_PI_PC_email.png b/till_korten/devbio-napari_cluster_setup/images/a3_PI_PC_email.png
new file mode 100644
index 00000000..152d9d6d
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/a3_PI_PC_email.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/a4_CPU_GPU.png b/till_korten/devbio-napari_cluster_setup/images/a4_CPU_GPU.png
new file mode 100644
index 00000000..8dadc6fd
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/a4_CPU_GPU.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/images/a6_data_management.png b/till_korten/devbio-napari_cluster_setup/images/a6_data_management.png
new file mode 100644
index 00000000..c5571cfa
Binary files /dev/null and b/till_korten/devbio-napari_cluster_setup/images/a6_data_management.png differ
diff --git a/till_korten/devbio-napari_cluster_setup/readme.md b/till_korten/devbio-napari_cluster_setup/readme.md
new file mode 100644
index 00000000..a55ecc9b
--- /dev/null
+++ b/till_korten/devbio-napari_cluster_setup/readme.md
@@ -0,0 +1,306 @@
+# Setting up GPU-accelerated image processing on the TUD HPC cluster
+[Till Korten](https://biapol.github.io/blog/till_korten), July 29th 2022
+
+The [High Performance Computing (HPC) cluster at the compute center (ZIH) of the TU Dresden](https://tu-dresden.de/zih/hochleistungsrechnen/hpc?set_language=en) provides a lot of computational resources including GPU support, which we can use for analyzing data in the life-sciences.
+This blog post explains how to set up an account on the cluster. [This follow-up article](../devbio-napari_cluster/readme.md) explains how you can then run your own [jupyter notebooks](https://jupyter.org/) using some [napari](https://napari.org) plugins and GPU-accelerated image processing python libraries such as [clEsperanto](https://clesperanto.net) on the cluster.
+
+### This blog post is for you if
+* you want to [run devbio-napari tools](../devbio-napari_cluster/readme.md) on the cluster.
+* you are comfortable with the command line
+
+If you need help with the setup, please do not hesitate to contact us.
+
+### See also
+* [ZIH HPC Documentation](https://doc.zih.tu-dresden.de/)
+
+## Step 1: Get access to the ZIH cluster
+Before you can use the cluster, you need to [apply for an HPC project](https://tu-dresden.de/zih/hochleistungsrechnen/zugang/projektantrag?set_language=en).
+
+In most cases, an NHR trial project will provide plenty of resources for your needs, so you can follow our recommendations for filling out the application:
+1. Log in to the [online project application form](https://projects.hpc.tu-dresden.de/jards/WEB/application/login.php?appkind=nhr)
+2. Application List: Scroll down and click on the button "New Project Application"
+
+![New Project Application button](images/a1_new_project_application.png)
+
+3. Project Type: Select "trial project (< 42000 CPU hours per year)"
+
+![Trial project selection](images/a2_trial_project.png)
+
+4. Choose PI and PC: Enter the email addresses of the principal investigator (PI) and a person to contact (PC). If you are not the PI, click on the button "change my role to Person to Contact" so that you can enter the email address of your PI.
+
+![PI and PC email entry](images/a3_PI_PC_email.png)
+
+5. Principal Investigator: Enter the personal data of your PI
+6. Person to Contact: Enter the personal data of the PC
+7. Resources: check the boxes for both CPU and GPU:
+
+![CPU and GPU resource checkboxes](images/a4_CPU_GPU.png)
+
+8. Project data:
+ * Choose a project duration. Enter the estimated duration of the project here (e.g. the duration of the contract of the PC). Note that no matter the total estimated project duration, all projects need to be extended after one year.
+ * Choose a title
+ * Choose a few keywords, e.g. `image processing, machine learning, data analysis`
+ * Enter a short project description, e.g. `Analysis of light-sheet microscopy images of developing organoids using python machine-learning tools such as napari`. If you already have an abstract of a project proposal, we recommend using parts of it (note the 1500 character limit).
+ * Commissioned research: `No` (unless the project you are doing was specifically commissioned by a company)
+ * Select sponsorship (DFG, BMBF, EU, Industry, Other)
+ * If you selected `Other`, enter the sponsor in the text field below. e.g. `TU Dresden`
+ * Classify your project according to predefined categories. E.g. "Main category": `201 Basic Research in Biology and Medicine` and "Sub category": `201-02 Biophysics`
+ * Give reasons that make the use of this supercomputer necessary: e.g. `We need to process large amounts of image data that would require hours to days on a normal computer. Furthermore, we need a standardized, reproducible environment for data processing.`
+ * If the PI or PC were involved in other HPC projects within the last 3 years, you need to fill in the table "Other applications for compute time"
+ * Check the box that you are aware that incomplete information may lead to a significant cutback of resources or even to the rejection of the proposal
+ * Add other project members that need access to the HPC cluster (note, you can also easily add members later as needed)
+
+9. Software packages: Here is what we recommend for bioimage analysis
+ * Compilers: GCC (during installation of some python packages),
+ * Programming Languages: Python (and R)
+ * Other packages: `singularity, OpenCL, CUDA (Inside singularity containers: PyTorch, Tensorflow, napari)`
+ * Packages developed by yourself: `(Inside singularity containers: devbio-napari, pyclesperanto)`
+ * Open source: Yes
+ * Links: `https://sylabs.io/singularity/, https://www.khronos.org/opencl/, https://developer.nvidia.com/cuda-toolkit, https://pytorch.org/, https://www.tensorflow.org/, https://napari.org/stable/, https://github.com/haesleinhuepf/devbio-napari, https://github.com/clEsperanto/pyclesperanto_prototype`
+ * Parallelization strategy: Hybrid
+10. Data management: Fill out your estimated data requirements. Here is what we estimated for an average image analysis project:
+
+![Data management form](images/a6_data_management.png)
+
+11. Upload of PDF files: For a trial project, you can skip this step.
+12. Finalize application: Click the button "Finalize" at the bottom.
+13. You will receive an email with a PDF document that needs to be signed (electronically) by the PI and sent back via email.
+
+For further information, refer to the [ZIH documentation for HPC project applications](https://doc.zih.tu-dresden.de/application/project_request_form/).
+
+## Step 2: Get access from the cluster to the fileserver
+
+1. If you don't already have one, apply for a [group space on the fileserver](https://selfservice.zih.tu-dresden.de/l/index.php/spor/request-form/)
+2. Ask for access to the fileserver from the cluster: Write an email to [hpcsupport](mailto:hpcsupport@zih.tu-dresden.de) and ask them to give you access to your fileserver account. They will then tell you a mount point that usually starts with `/grp//`.
+
+## Step 3: Start a Jupyter session on the ZIH cluster
+Go to the [jupyter hub of the ZIH cluster](https://taurus.hrsk.tu-dresden.de/jupyter).
+You will be greeted with the TUD login screen. Log in with your ZIH user name and password:
+
+![TUD login screen](images/1_login.png)
+
+
+Afterwards, you should see a single button `Start My Server`. Click on it:
+
+![Start My Server button](images/2_start_server.png)
+
+Now you get to configure the computing node you want your session to run on. Switch to the advanced configuration by clicking the button `Advanced`. Then you should see something like the image below.
+
+1. Start by choosing a preset (click on 1).
+2. You should choose a GPU node preset (2). You can choose between
+ * Ampere A100 -> This is what you want if you really need to crunch some numbers. The A100 is more powerful than any gaming GPU and has 40GB graphics memory. However, these machines are in higher demand and it may take longer to get a node on the cluster.
+ * Tesla K80 -> This is what you want for testing whether your workflows work on a GPU on the cluster and for tasks that are not time critical. The K80 is half as powerful as a GTX 1080 and has 12GB graphics memory.
+
+Once you are happy with your configuration, click the orange button `Spawn` at the very bottom.
+
+![Advanced configuration of the computing node](images/3_configure_node.png)
+
+You will now see a wait bar. Do not worry if it does not move; this bar is always at 50%. It usually takes 2-5 min to get a node.
+
+![Wait bar while the node is being allocated](images/4_wait.png)
+
+#### Troubleshooting
+
+* Check the current utilization (the bars above the preset chooser)
+ * the A100 are on the bar labeled `alpha`
+ * the K80 are on the bar labeled `gpu2`
+
+ If the partition is very full, you may have to wait a long time or not get a session at all.
+* If you run out of memory or need more CPU cores, increase the number of CPUs. Note that the memory is per CPU, so if you choose more CPUs, you automatically get more memory.
+
+
+## Step 4: Open a terminal
+
+Open a terminal by clicking on `File` (1 in the image below) -> `New` (2) -> `Terminal` (3)
+
+![Open a terminal via File, New, Terminal](images/5_open_terminal.png)
+
+## Step 5: Install a custom jupyter kernel for your user
+
+One of the advantages of our approach is that you can always execute your code with the exact same python environment, so that you always get the same result for the same data. Therefore, it is important that you know which version of our python environment you are working with. You can find the available [versions here](https://gitlab.mn.tu-dresden.de/bia-pol/singularity-devbio-napari/-/releases). Look for the version number at the end of the title, for example `v0.2.1`.
+
+To install a devbio-napari python environment, execute the following code in the terminal:
+
+```bash
+git clone https://gitlab.mn.tu-dresden.de/bia-pol/singularity-devbio-napari.git
+cd singularity-devbio-napari
+./install.sh
+```
+
+Replace `` with the latest version shown at the top of [the version list](https://gitlab.mn.tu-dresden.de/bia-pol/singularity-devbio-napari/-/releases).
+
+*Note*: In order to ensure [repeatability](https://en.wikipedia.org/wiki/Repeatability) and [reproducibility](https://en.wikipedia.org/wiki/Reproducibility) of the results you obtained from using a singularity container, we strongly recommend and encourage you to keep track of the version of the container you used. This way, you can ensure the integrity of your analysis workflow further down the line.
+
+Wait 2-15 min until the image is downloaded and verified (the time depends on how much network and disk load is on the cluster). The output should look something like this:
+
+![Terminal output of the install script](images/6_terminal_output.png)
+
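+Optionally, you can check that the new kernel was registered for your user. This is a generic Jupyter command (assuming the `jupyter` command is available in your terminal session), not part of the install script:
+
+```bash
+# list the Jupyter kernels available to your user;
+# a devbio-napari entry should appear after the installation
+jupyter kernelspec list
+```
+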
+If everything went well, close the terminal by clicking on the small X at the top of the terminal window.
+
+## Step 6: Open a Jupyter Notebook with the newly installed environment
+
+Reload the browser tab. Now open a new notebook by clicking on `File` (1 in the image below) -> `New` (2) -> `Notebook` (3)
+
+![Open a new notebook via File, New, Notebook](images/7_open_notebook.png)
+
+Now you are asked to select a kernel. Click on the drop down button (red rectangle in the image below).
+
+![Select kernel dialog](images/8_select_kernel.png)
+
+Choose the kernel you just installed (`devbio-napari-0.2.1` in the image below).
+
+![Choose the devbio-napari kernel](images/9_select_devbio_napari.png)
+
+NB: for an existing notebook, you can click on the kernel name (by default `Python 3`) in the top right corner of the notebook and select the devbio-napari kernel as described above.
+
+## Step 7: Verify that the environment works
+
+Run some test code to verify that the environment has everything you need. For example:
+
+```python
+import pyclesperanto_prototype as cle
+from skimage.io import imread, imsave
+```
+```python
+# initialize GPU
+device = cle.select_device("A100")
+print("Used GPU: ", device)
+```
+Used GPU: \
+```python
+# load data
+image = imread('https://imagej.nih.gov/ij/images/blobs.gif')
+
+# process the image
+inverted = cle.subtract_image_from_scalar(image, scalar=255)
+blurred = cle.gaussian_blur(inverted, sigma_x=1, sigma_y=1)
+binary = cle.threshold_otsu(blurred)
+labeled = cle.connected_components_labeling_box(binary)
+
+# The maximum intensity in a label image corresponds to the number of objects
+num_labels = cle.maximum_of_all_pixels(labeled)
+
+# print out result
+print("Num objects in the image: " + str(num_labels))
+
+# save image to disc
+imsave("result.tif", cle.pull(labeled))
+```
+Num objects in the image: 62.0
+
+*/tmp/ipykernel_13/2798962990.py:17: UserWarning: result.tif is a low contrast image imsave("result.tif", cle.pull(labeled))*
+```python
+cle.available_device_names()
+```
+\['NVIDIA A100-SXM4-40GB', 'cupy backend (experimental)'\]
+
+## Hints
+
+### Copy data directly to the cluster
+
+This option targets more advanced users. It is faster because it skips the transfer of data between the fileserver and the project space on the cluster. However, it requires specialized file-transfer tools like [WinSCP](http://winscp.net/eng/download.php), [Cyberduck](https://cyberduck.io/) or [Rsync](https://man7.org/linux/man-pages/man1/rsync.1.html).
+
+Please follow the [instructions on how to use the ZIH Export Nodes](https://doc.zih.tu-dresden.de/data_transfer/export_nodes/).
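+
+As a rough sketch of what such a transfer could look like with rsync (the export node hostname and the target workspace path are assumptions based on the linked documentation and the cache paths shown above; adapt them to your setup):
+
+```bash
+# copy a local data folder into a workspace on the cluster via an export node
+# <username> is your ZIH login; hostname and target path are assumptions
+rsync -avz --progress ./my_data/ <username>@taurusexport.hrsk.tu-dresden.de:/scratch/ws/0/<username>-cache/my_data/
+```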
+
+### What hardware is the current node running on?
+If you are using an NVidia GPU, you can use the [NVidia System Management Interface](https://developer.nvidia.com/nvidia-system-management-interface):
+```
+!nvidia-smi
+```
+It will give an overview of what's currently going on on your GPU:
+```
++-----------------------------------------------------------------------------+
+| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
+|-------------------------------+----------------------+----------------------+
+| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
+| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
+| | | MIG M. |
+|===============================+======================+======================|
+| 0 NVIDIA A100-SXM... On | 00000000:11:00.0 Off | 0 |
+| N/A 52C P0 60W / 400W | 0MiB / 40536MiB | 0% Default |
+| | | Disabled |
++-------------------------------+----------------------+----------------------+
+
++-----------------------------------------------------------------------------+
+| Processes: |
+| GPU GI CI PID Type Process name GPU Memory |
+| ID ID Usage |
+|=============================================================================|
+| No running processes found |
++-----------------------------------------------------------------------------+
+```
+
+Another option is
+```
+!numba -s
+```
+which will tell you everything you ever wanted to know about your system (the output below is just the hardware and system information sections; numba actually provides much more information):
+```
+System info:
+--------------------------------------------------------------------------------
+__Hardware Information__
+Machine : x86_64
+CPU Name : znver2
+CPU Count : 96
+Number of accessible CPUs : 6
+List of accessible CPUs cores : 32 33 34 80 81 82
+CFS Restrictions (CPUs worth of runtime) : None
+
+CPU Features : 64bit adx aes avx avx2 bmi bmi2
+ clflushopt clwb clzero cmov cx16
+ cx8 f16c fma fsgsbase fxsr lzcnt
+ mmx movbe mwaitx pclmul popcnt
+ prfchw rdpid rdrnd rdseed sahf sha
+ sse sse2 sse3 sse4.1 sse4.2 sse4a
+ ssse3 wbnoinvd xsave xsavec
+ xsaveopt xsaves
+
+Memory Total (MB) : 1031711
+Memory Available (MB) : 974317
+
+__OS Information__
+Platform Name : Linux-3.10.0-1160.11.1.el7.x86_64-x86_64-with-glibc2.31
+Platform Release : 3.10.0-1160.11.1.el7.x86_64
+OS Name : Linux
+OS Version : #1 SMP Fri Dec 18 16:34:56 UTC 2020
+OS Specific Version : ?
+Libc Version : glibc 2.31
+
+__Python Information__
+Python Compiler : GCC 10.3.0
+Python Implementation : CPython
+Python Version : 3.9.13
+Python Locale : en_US.UTF-8
+
+__LLVM Information__
+LLVM Version : 10.0.1
+
+__CUDA Information__
+CUDA Device Initialized : True
+CUDA Driver Version : 11040
+CUDA Detect Output:
+Found 4 CUDA devices
+id 0 b'NVIDIA A100-SXM4-40GB' [SUPPORTED]
+ compute capability: 8.0
+ pci device id: 0
+ pci bus id: 139
+id 1 b'NVIDIA A100-SXM4-40GB' [SUPPORTED]
+ compute capability: 8.0
+ pci device id: 0
+ pci bus id: 144
+id 2 b'NVIDIA A100-SXM4-40GB' [SUPPORTED]
+ compute capability: 8.0
+ pci device id: 0
+ pci bus id: 187
+id 3 b'NVIDIA A100-SXM4-40GB' [SUPPORTED]
+ compute capability: 8.0
+ pci device id: 0
+ pci bus id: 193
+Summary:
+ 4/4 devices are supported
+```
+Note the 1TB (sic!) of total RAM available on the nodes in the alpha partition of the TUD cluster.
+
+
+# Acknowledgements
+I would like to thank Fabian Rost for sharing his extensive experience of how to run python notebooks within singularity containers on the TUD cluster.
diff --git a/till_korten/multi-gpu-training/images/loss_multi_GPU.png b/till_korten/multi-gpu-training/images/loss_multi_GPU.png
new file mode 100644
index 00000000..5186aede
Binary files /dev/null and b/till_korten/multi-gpu-training/images/loss_multi_GPU.png differ
diff --git a/till_korten/multi-gpu-training/images/loss_single_GPU.png b/till_korten/multi-gpu-training/images/loss_single_GPU.png
new file mode 100644
index 00000000..01b04963
Binary files /dev/null and b/till_korten/multi-gpu-training/images/loss_single_GPU.png differ
diff --git a/till_korten/multi-gpu-training/images/n2v_result.png b/till_korten/multi-gpu-training/images/n2v_result.png
new file mode 100644
index 00000000..ff562c66
Binary files /dev/null and b/till_korten/multi-gpu-training/images/n2v_result.png differ
diff --git a/till_korten/multi-gpu-training/readme.md b/till_korten/multi-gpu-training/readme.md
new file mode 100644
index 00000000..d6983168
--- /dev/null
+++ b/till_korten/multi-gpu-training/readme.md
@@ -0,0 +1,161 @@
+# Training Noise to Void on multiple GPUs
+
+Training Noise to Void (N2V) models can be tedious. [On the taurus cluster](../devbio-napari_cluster/readme.md), you have multiple A100 GPUs available. Training on one of these is already 5-10 times faster than on a typical laptop GPU. However, on the cluster you even have the opportunity to use up to 8 A100 GPUs - sounds promising, right? In this blog post, we explore how multi-GPU training works technically and what speed gain it brings.
+
+## Training on a single GPU
+
+We are going to use a slightly modified version of [the example notebook for training 3D models from the N2V repository](https://github.com/juglab/n2v/blob/main/examples/3D/01_training.ipynb).
+
+We only modify the cell with the hyperparameters in order to get better results than in their example and to be closer to real training times:
+
+```Python
+# You can increase "train_steps_per_epoch" to get even better results at the price of longer computation.
+num_GPU = 1
+batch_size = int(16 * num_GPU)
+learning_rate = 0.0004 * num_GPU
+epochs = int(200 / num_GPU)
+config = N2VConfig(X, unet_kern_size=3,
+ train_steps_per_epoch=200,train_epochs=epochs, train_loss='mse', batch_norm=True,
+ train_batch_size=batch_size, n2v_perc_pix=0.198, n2v_patch_shape=patch_shape,
+ n2v_manipulator='uniform_withCP', n2v_neighborhood_radius=5, train_learning_rate=learning_rate)
+
+# Let's look at the parameters stored in the config-object.
+vars(config)
+```
+
+{'means': ['1152.6188'],
+ 'stds': ['2689.561'],
+ 'n_dim': 3,
+ 'axes': 'ZYXC',
+ 'n_channel_in': 1,
+ 'n_channel_out': 1,
+ 'unet_residual': False,
+ 'unet_n_depth': 2,
+ 'unet_kern_size': 3,
+ 'unet_n_first': 32,
+ 'unet_last_activation': 'linear',
+ 'unet_input_shape': (None, None, None, 1),
+ 'train_loss': 'mse',
+ 'train_epochs': 200,
+ 'train_steps_per_epoch': 200,
+ 'train_learning_rate': 0.0004,
+ 'train_batch_size': 16,
+ 'train_tensorboard': True,
+ 'train_checkpoint': 'weights_best.h5',
+ 'train_reduce_lr': {'factor': 0.5, 'patience': 10},
+ 'batch_norm': True,
+ 'n2v_perc_pix': 0.198,
+ 'n2v_patch_shape': (16, 64, 64),
+ 'n2v_manipulator': 'uniform_withCP',
+ 'n2v_neighborhood_radius': 5,
+ 'single_net_per_channel': True,
+ 'blurpool': False,
+ 'skip_skipone': False,
+ 'structN2Vmask': None,
+ 'probabilistic': False}
+
+Note that we already calculate batch size, learning rate and number of epochs based on the number of GPUs we want to use. For now, we use just one GPU.
+
+With these parameters, the training takes about 42 min on a single A100 GPU:
+
+```
+CPU times: user 26min 42s, sys: 6min 19s, total: 33min 1s
+Wall time: 42min 31s
+```
+
+And we get a decently trained model:
+
+![Training loss on a single GPU](images/loss_single_GPU.png)
+
+## Training on 4 GPUs
+
+Here comes the cool part: in order to enable training on multiple GPUs with tensorflow, we only need to change a few lines.
+
+First we adapt the hyperparameters to 4 GPUs:
+
+```Python
+# You can increase "train_steps_per_epoch" to get even better results at the price of longer computation.
+num_GPU = 4
+batch_size = int(16 * num_GPU)
+learning_rate = 0.0004 * num_GPU
+epochs = int(200 / num_GPU)
+config = N2VConfig(X, unet_kern_size=3,
+ train_steps_per_epoch=200,train_epochs=epochs, train_loss='mse', batch_norm=True,
+ train_batch_size=batch_size, n2v_perc_pix=0.198, n2v_patch_shape=patch_shape,
+ n2v_manipulator='uniform_withCP', n2v_neighborhood_radius=5, train_learning_rate=learning_rate)
+
+# Let's look at the parameters stored in the config-object.
+vars(config)
+```
+
+{'means': ['1152.6188'],
+ 'stds': ['2689.561'],
+ 'n_dim': 3,
+ 'axes': 'ZYXC',
+ 'n_channel_in': 1,
+ 'n_channel_out': 1,
+ 'unet_residual': False,
+ 'unet_n_depth': 2,
+ 'unet_kern_size': 3,
+ 'unet_n_first': 32,
+ 'unet_last_activation': 'linear',
+ 'unet_input_shape': (None, None, None, 1),
+ 'train_loss': 'mse',
+ 'train_epochs': 50,
+ 'train_steps_per_epoch': 200,
+ 'train_learning_rate': 0.0016,
+ 'train_batch_size': 64,
+ 'train_tensorboard': True,
+ 'train_checkpoint': 'weights_best.h5',
+ 'train_reduce_lr': {'factor': 0.5, 'patience': 10},
+ 'batch_norm': True,
+ 'n2v_perc_pix': 0.198,
+ 'n2v_patch_shape': (16, 64, 64),
+ 'n2v_manipulator': 'uniform_withCP',
+ 'n2v_neighborhood_radius': 5,
+ 'single_net_per_channel': True,
+ 'blurpool': False,
+ 'skip_skipone': False,
+ 'structN2Vmask': None,
+ 'probabilistic': False}
+
+Note how we use a four times larger batch size and learning rate, because we distribute the batches over four GPUs. It has been shown [that a larger batch size justifies an equal increase in learning rate](https://www.baeldung.com/cs/learning-rate-batch-size). Because we expect this to converge faster, we reduce the number of epochs accordingly.
+
+In order to enable tensorflow to parallelize to multiple GPUs, we simply add three lines of code before we create the model:
+
+```python
+import tensorflow as tf
+mirrored_strategy = tf.distribute.MirroredStrategy()
+with mirrored_strategy.scope():
+ model = N2V(config=config, name=model_name, basedir=basedir)
+```
+
+The first line imports tensorflow, the second line defines a strategy for how tensorflow synchronizes data between the GPUs, and the third line makes sure that the model uses variables that are synchronized between GPUs. See the [tensorflow documentation](https://www.tensorflow.org/guide/distributed_training) for more details.
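+
+As a quick sanity check (a small addition, assuming the `mirrored_strategy` object from the snippet above), you can ask the strategy how many replicas, i.e. GPUs, it keeps in sync:
+
+```python
+# should report 4 when four GPUs are visible to tensorflow
+print("Number of devices:", mirrored_strategy.num_replicas_in_sync)
+```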
+
+With these parameters, the training takes about 20 min on four A100 GPUs:
+
+```
+CPU times: user 27min 41s, sys: 3min 45s, total: 31min 26s
+Wall time: 20min 26s
+```
+
+And we get a decently trained model:
+
+![Training loss on four GPUs](images/loss_multi_GPU.png)
+
+Note how the loss is less noisy because of the larger batch size, but the overall shape of the loss is very similar.
+
+## Denoising performance
+
+Let's plot the denoising results side-by-side:
+
+![Denoising results of single-GPU and multi-GPU training side-by-side](images/n2v_result.png)
+
+Both networks perform very similarly.
+
+## Conclusions
+
+So we got just a 2-times speedup with 4 GPUs - what gives?
+
+The problem is that after each batch is processed, the parameters must be synchronized between the GPUs, resulting in data transfer overhead. Therefore, it seems that this synchronization overhead eats up about half of the theoretical 4-fold speedup in our case.
\ No newline at end of file