🎨 speed up tutorial execution in CI, add some hyperlinks
Henry Webel committed May 28, 2024
1 parent 05eb01c commit 522b519
Showing 2 changed files with 27 additions and 14 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test_tutorial.yaml
@@ -23,5 +23,5 @@ jobs:
 - name: Test Tutorial
   run: |
     cd scripts
-    papermill 0_Tutorial.ipynb 0_Tutorial_out.ipynb
+    papermill 0_Tutorial.ipynb 0_Tutorial_out.ipynb -p epochs 4
39 changes: 26 additions & 13 deletions scripts/0_Tutorial.ipynb
@@ -9,13 +9,13 @@
 "\n",
 "*Authors:* \n",
 "\n",
-"Marc Pielies Avelli ([email protected], [email protected] )\n",
+"Marc Pielies Avelli ([[email protected]](mailto:[email protected]), [[email protected]](mailto:[email protected]) )\n",
 "\n",
-"Arnor Sigurdsson ([email protected], [email protected])\n",
+"Arnor Sigurdsson ([[email protected]](mailto:[email protected]), [[email protected]](mailto:[email protected]))\n",
 "\n",
-"Henry Webel ([email protected])\n",
+"Henry Webel ([[email protected]](mailto:[email protected]))\n",
 " \n",
-"Simon Rasmussen ([email protected], [email protected])\n",
+"Simon Rasmussen ([[email protected]](mailto:[email protected]), [[email protected]](mailto:[email protected]))\n",
 "\n",
 "## Introduction ##\n",
 "\n",
@@ -47,9 +47,7 @@
 "- **Fusion module**: The extracted high-level, abstract features are then combined in a number of dense layers.\n",
 "- **Output module**: A final set of dense layers maps the feature vectors to the targets in a regression task per node, which will represent the EU-seq signal at a distance from the TSS of the gene defining the sample (located in the middle).\n",
 "\n",
-"CLASTER was built using the EIR framework, a python package developed by Arnor Sigurdsson ([email protected]) which makes it easy to replicate and adapt it to new tasks. Documentation on EIR can be found at https://eir.readthedocs.io/en/latest/. Have a look at the tutorials to get a feel for the config files required and all possibilities that EIR offers. The framework uses hydra (https://hydra.cc/docs/intro/) to manage a set of configuration files, which allow you to streamline the process.\n",
-"\n",
-"\n"
+"CLASTER was built using the EIR framework, a python package developed by Arnor Sigurdsson ([[email protected]](mailto:[email protected])) which makes it easy to replicate and adapt it to new tasks. Documentation on EIR can be found at [eir.readthedocs.io](https://eir.readthedocs.io/en/latest/). Have a look at the tutorials to get a feel for the config files required and all possibilities that EIR offers. The framework uses hydra (https://hydra.cc/docs/intro/) to manage a set of configuration files, which allow you to streamline the process."
 ]
 },
 {
@@ -812,6 +810,20 @@
 "## 2. Training and validating CLASTER"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": 15,
+"metadata": {
+"tags": [
+"Parameters",
+"parameters"
+]
+},
+"outputs": [],
+"source": [
+"epochs: int = 120"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -845,11 +857,11 @@
 "output_path.mkdir(parents=True, exist_ok=True)\n",
 "\n",
 "# Write config files explicitly:\n",
-"training_microc_rotated_yaml_contents = {\"globals.yaml\":\"\"\"\n",
+"training_microc_rotated_yaml_contents = {\"globals.yaml\": f\"\"\"\n",
 "output_folder: ../runs_tutorial/gene_expression_microc_rotated_pure_conv_tutorial/ \n",
-"checkpoint_interval: 860\n",
-"sample_interval: 860 \n",
-"n_epochs: 120\n",
+"checkpoint_interval: {int(860 /120 * epochs)}\n",
+"sample_interval: {int(860 /120 * epochs)} \n",
+"n_epochs: {epochs}\n",
 "batch_size: 64\n",
 "optimizer: \"adamw\"\n",
 "lr: 0.0001 \n",
@@ -1570,7 +1582,8 @@
 "results_path = Path(\"../runs_tutorial/gene_expression_microc_rotated_pure_conv_tutorial/results/\")\n",
 "N_BINS= 200\n",
 "condition_list = [\"_ctrl\"]\n",
-"SPLIT = 4300 #860\n",
+"SPLIT = int(860 /120 * epochs)\n",
+"# SPLIT = 4300 # for 860 samples, 120 epochs\n",
 "\n",
 "ids, predicted, actual = _get_predictions(results_path, N_BINS, condition_list, SPLIT)\n",
 "\n",
Expand All @@ -1597,7 +1610,7 @@
"\n",
"CLASTER was designed to predict nascent transcription, but the EIR framework can be reused and expanded for any other purpose. It can handle plenty of other data modalities as well! You just need a well defined input containing some information or signal that can be used to infer the values of another, as an output. You can then customize your network using EIR, provide the inputs and targets in one of the supported data modalities, and name samples with the same ID for input and output.\n",
"\n",
"If in doubt, feel free to reach us!\n"
"If in doubt, feel free to reach out to us!\n"
]
}
],
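
Note on the interval arithmetic: the notebook hunks scale `checkpoint_interval`, `sample_interval` and `SPLIT` with `int(860 / 120 * epochs)`, which goes through a float before truncating. Below is a minimal sketch of the same linear scaling in pure integer arithmetic, so the result does not depend on floating-point rounding. The helper name `scaled_interval`, its default arguments, and the clamp to 1 are hypothetical and not part of the commit.

```python
def scaled_interval(epochs: int, base_interval: int = 860, base_epochs: int = 120) -> int:
    """Scale an iteration interval linearly with the number of epochs.

    Hypothetical helper: mirrors int(860 / 120 * epochs) from the diff,
    but floors with integer division instead of truncating a float.
    """
    if epochs < 1:
        raise ValueError("epochs must be a positive integer")
    # Multiply first, then floor-divide, so no float is ever created.
    # Clamp to 1 so the interval can never become zero (an added assumption).
    return max(1, base_interval * epochs // base_epochs)


if __name__ == "__main__":
    for n in (4, 120):
        print(n, scaled_interval(n))
```

With the CI setting `-p epochs 4` this gives 28, and with the notebook default of 120 it gives 860, matching the values that were hard-coded in the config before the change.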
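
Note on the parameters cell: the code cell added in the `-812,6 +810,20` hunk carries a `parameters` tag, which is what allows `papermill ... -p epochs 4` in the workflow to override the default `epochs: int = 120`. The following is a minimal standard-library sketch of how such a tagged cell can be located in notebook JSON. The function name `find_parameters_cell` and the small in-memory notebook are illustrative assumptions, and this is not papermill's own implementation.

```python
import json
from typing import Optional


def find_parameters_cell(notebook: dict, tag: str = "parameters") -> Optional[int]:
    """Return the index of the first code cell carrying `tag`, or None."""
    for index, cell in enumerate(notebook.get("cells", [])):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and tag in tags:
            return index
    return None


# Tiny stand-in notebook mirroring the cell added in the diff (assumed layout).
NOTEBOOK_JSON = """
{
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["## 2. Training and validating CLASTER"]},
    {"cell_type": "code", "metadata": {"tags": ["Parameters", "parameters"]},
     "outputs": [], "source": ["epochs: int = 120"]}
  ]
}
"""

if __name__ == "__main__":
    nb = json.loads(NOTEBOOK_JSON)
    print(find_parameters_cell(nb))
```

The match in this sketch is case-sensitive, so it is the lowercase `parameters` entry that is found; the capitalised `Parameters` tag plays no role here.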
