Welcome to the repository of the website that presents the paper's results.
The website is written in Python using the Dash framework. The data is stored on AWS S3, and the website is hosted with Cloud Run on Google Cloud Platform.
Since we do not share the credentials, you cannot access the data stored on AWS, which makes it harder to contribute to the project. However, you can still propose changes with a pull request.
Once you have forked the repository and cloned it, you can install the package with its development dependencies using:
```
pip install -e .[env]
```
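For context, the `[env]` suffix selects an optional dependency group declared in `setup.py`'s `extras_require`. A minimal sketch of that mapping, with placeholder package names (the repository's actual dependency list may differ):

```python
# Illustrative only: `pip install -e .[env]` installs the package in editable
# mode plus the optional dependencies grouped under the "env" extra.
# The package names below are placeholders, not the repository's actual list.
extras_require = {
    "env": [
        "black",   # formatting
        "pytest",  # testing
    ],
}

# The "env" key is what the bracket syntax in `pip install -e .[env]` selects.
assert "env" in extras_require
```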
The command `launch_local_website` allows you to test the website locally.
If you are using Visual Studio Code, a .devcontainer folder is already prepared so that you can work in a dedicated container.
Feel free to discuss your ideas in the Discussions section.
📦Website
┣ 📂.devcontainer
┃ ┣ 📜Dockerfile
┃ ┗ 📜devcontainer.json
┣ 📂post_processing
┃ ┣ 📂all_categories
┃ ┃ ┣ 📜feature_importances_correlation.py
┃ ┃ ┣ 📜information.py
┃ ┃ ┣ 📜scores_feature_importances.py
┃ ┃ ┗ 📜scores_residual.py
┃ ┣ 📂custom_categories
┃ ┃ ┣ 📜create_custom_data.py
┃ ┃ ┗ 📜select_categories.py
┃ ┗ 📜__init__.py
┣ 📂website
┃ ┣ 📂dataset
┃ ┃ ┗ 📜page.py
┃ ┣ 📂feature_importances
┃ ┃ ┗ 📜page.py
┃ ┣ 📂feature_importances_correlations
┃ ┃ ┣ 📂between_algorithms
┃ ┃ ┃ ┣ 📂tabs
┃ ┃ ┃ ┃ ┣ 📜all_categories.py
┃ ┃ ┃ ┃ ┣ 📜custom_categories.py
┃ ┃ ┃ ┃ ┗ 📜shared_plotter.py
┃ ┃ ┃ ┗ 📜page.py
┃ ┃ ┗ 📂between_targets
┃ ┃ ┃ ┣ 📂tabs
┃ ┃ ┃ ┃ ┣ 📜all_categories.py
┃ ┃ ┃ ┃ ┣ 📜custom_categories.py
┃ ┃ ┃ ┃ ┗ 📜shared_plotter.py
┃ ┃ ┃ ┗ 📜page.py
┃ ┣ 📂prediction_performances
┃ ┃ ┣ 📂feature_importances
┃ ┃ ┃ ┣ 📂tabs
┃ ┃ ┃ ┃ ┣ 📜all_categories.py
┃ ┃ ┃ ┃ ┣ 📜custom_categories.py
┃ ┃ ┃ ┃ ┗ 📜shared_plotter.py
┃ ┃ ┃ ┗ 📜page.py
┃ ┃ ┗ 📂residual
┃ ┃ ┃ ┣ 📂tabs
┃ ┃ ┃ ┃ ┣ 📜all_categories.py
┃ ┃ ┃ ┃ ┣ 📜custom_categories.py
┃ ┃ ┃ ┃ ┗ 📜shared_plotter.py
┃ ┃ ┃ ┗ 📜page.py
┃ ┣ 📂residual_correlations
┃ ┃ ┣ 📂tabs
┃ ┃ ┃ ┣ 📜all_categories.py
┃ ┃ ┃ ┣ 📜custom_categories.py
┃ ┃ ┃ ┗ 📜share_plotter.py
┃ ┃ ┗ 📜page.py
┃ ┣ 📂utils
┃ ┃ ┣ 📜__init__.py
┃ ┃ ┣ 📜aws_loader.py
┃ ┃ ┣ 📜controls.py
┃ ┃ ┣ 📜graphs.py
┃ ┃ ┗ 📜rename.py
┃ ┣ 📜__init__.py
┃ ┣ 📜app.py
┃ ┣ 📜index.py
┃ ┗ 📜introduction.py
┣ 📜.dockerignore
┣ 📜.gitignore
┣ 📜Dockerfile
┣ 📜LICENSE
┣ 📜README.md
┗ 📜setup.py
📦age-vs-survival
┣ 📂all_categories
┃ ┣ 📂correlations
┃ ┃ ┣ 📂feature_importances
┃ ┃ ┃ ┣ 📜pearson_between_algorithms.feather
┃ ┃ ┃ ┣ 📜pearson_between_targets.feather
┃ ┃ ┃ ┣ 📜pearson_[...].feather # Original data not used in website
┃ ┃ ┃ ┣ 📜spearman_between_algorithms.feather
┃ ┃ ┃ ┣ 📜spearman_between_targets.feather
┃ ┃ ┃ ┗ 📜spearman_[...].feather # Original data not used in website
┃ ┃ ┗ 📂residual
┃ ┃ ┣ 📜number_participants_[...].feather
┃ ┃ ┣ 📜pearson_[...].feather
┃ ┃ ┗ 📜spearman_[...].feather
┃ ┣ 📜information.feather
┃ ┣ 📜scores_feature_importances.feather
┃ ┗ 📜scores_residual.feather
┣ 📂custom_categories
┃ ┣ 📂correlations
┃ ┃ ┣ 📂feature_importances
┃ ┃ ┃ ┣ 📜pearson_between_algorithms.feather
┃ ┃ ┃ ┣ 📜pearson_between_targets.feather
┃ ┃ ┃ ┣ 📜spearman_between_algorithms.feather
┃ ┃ ┃ ┗ 📜spearman_between_targets.feather
┃ ┃ ┗ 📂residual
┃ ┃ ┣ 📜number_participants_[...].feather
┃ ┃ ┣ 📜pearson_[...].feather
┃ ┃ ┗ 📜spearman_[...].feather
┃ ┣ 📜information.feather
┃ ┣ 📜scores_feature_importances.feather
┃ ┗ 📜scores_residual.feather
┣ 📂examination
┃ ┗ 📜category.feather
┣ 📂feature_importances
┃ ┣ 📂examination
┃ ┃ ┗ 📜category.feather
┃ ┣ 📂laboratory
┃ ┃ ┗ 📜category.feather
┃ ┗ 📂questionnaire
┃ ┃ ┗ 📜category.feather
┣ 📂laboratory
┃ ┗ 📜category.feather
┣ 📂questionnaire
┃ ┗ 📜category.feather
┣ 📜Results.xlsx # Original data not used in website
┗ 📜favicon.ico
A CI/CD workflow has been set up with GitHub Actions to deploy the website automatically on demand. You can find the development version of the website here.
Under the data folder, the following files can be updated:
- Results.xlsx
- correlations/feature_importances/correlation_type_main_category.feather
- correlations/feature_importances/correlation_type_std_main_category.feather
- correlations/residual/number_participants_type_of_death_type_of_death.feather
- correlations/residual/correlation_type_type_of_death_type_of_death.feather
- main_category/category.feather
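File names such as correlation_type_main_category.feather are templates: correlation_type and main_category stand for concrete values. A minimal sketch of how such a template expands into actual paths, using placeholder values (the real lists of correlation types and categories may differ):

```python
from itertools import product

# Placeholder values -- the actual correlation types and main categories
# used by the website may differ.
correlation_types = ["pearson", "spearman"]
main_categories = ["examination", "laboratory", "questionnaire"]

# Expand the template correlation_type_main_category.feather into paths.
feature_importance_paths = [
    f"correlations/feature_importances/{corr}_{cat}.feather"
    for corr, cat in product(correlation_types, main_categories)
]
```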
To generate the other files, all the scripts under the post_processing folder have to be executed. For that, run:
```
./post_processing/update_all_categories.sh
```
Then, the script select_categories.py outputs the 30 best models:
```
python ./post_processing/custom_categories/select_categories.py
```
This list has to match the list named CUSTOM_CATEGORIES_INDEX in the website code. Once it is updated, you can execute a last script that prunes the files generated for all the categories:
```
python ./post_processing/custom_categories/create_custom_data.py
```
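The consistency check between the script's output and the website constant can be sketched as follows; both lists here are placeholders, not the repository's actual categories:

```python
# Placeholder output of select_categories.py (the real script selects the
# 30 best models).
selected_categories = [
    "examination_blood_pressure",
    "laboratory_complete_blood_count",
]

# Placeholder for the CUSTOM_CATEGORIES_INDEX constant in the website code;
# it must list exactly the same categories.
CUSTOM_CATEGORIES_INDEX = [
    "laboratory_complete_blood_count",
    "examination_blood_pressure",
]

# The two lists must match (order aside) for the website to find its data.
lists_match = sorted(selected_categories) == sorted(CUSTOM_CATEGORIES_INDEX)
```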
Once everything is done, don't forget to push the new __init__.py file to GitHub and the new data to AWS S3, so that you can use the CI/CD pipeline.
To sync the data folder to AWS S3, you can use the following command:
```
aws s3 sync data/ s3://age-vs-survival/ --delete --exclude "*.zip"
```
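If you prefer to drive the sync step from Python, e.g. inside a deployment helper, the same command can be assembled with the standard library. This is a sketch, not part of the repository; the `--dryrun` flag lets you preview what would change before `--delete` removes anything from the bucket:

```python
import subprocess


def build_sync_command(local_dir="data/", bucket="s3://age-vs-survival/", dry_run=True):
    """Build the aws s3 sync command used to upload the data folder."""
    cmd = ["aws", "s3", "sync", local_dir, bucket, "--delete", "--exclude", "*.zip"]
    if dry_run:
        cmd.append("--dryrun")  # preview the sync without touching the bucket
    return cmd


# To actually run it (requires the AWS CLI and configured credentials):
# subprocess.run(build_sync_command(dry_run=False), check=True)
```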