Doc build #166

Merged
merged 25 commits into from
Jan 27, 2025
2272df9
Refactor documentation workflow and deploy process
jay-m-dev Sep 20, 2024
9e8695a
Add legacy source files
jay-m-dev Sep 20, 2024
69989e6
Update site name and css path
jay-m-dev Sep 20, 2024
0dacd2d
Add logo
jay-m-dev Sep 21, 2024
1607551
Add back to top, copyright and primary color
jay-m-dev Sep 21, 2024
50d0df3
Add deprecation warning
jay-m-dev Sep 21, 2024
c336279
Update legacy documentation with no longer maintained warning
jay-m-dev Sep 23, 2024
7c683c9
Add favicon and cache dependencies
jay-m-dev Sep 23, 2024
4d3fe33
Update old doc label to archived
jay-m-dev Sep 23, 2024
1829eeb
Add logo on readme
jay-m-dev Sep 25, 2024
057766a
Merge branch 'main' into doc_build
jay-m-dev Dec 10, 2024
7cdf52e
Update TPOT2 in readme
jay-m-dev Dec 11, 2024
e963d48
Update TPOT2 in cite.md
jay-m-dev Dec 11, 2024
30caae4
Update TPOT2 in contribute.md
jay-m-dev Dec 11, 2024
b4b7c2f
Update TPOT2 in installation.md
jay-m-dev Dec 11, 2024
a39aac2
Update TPOT2 in support and using readmes
jay-m-dev Dec 11, 2024
a27d86b
Update TPOT2 in build_mkdocs
jay-m-dev Dec 11, 2024
f86b1cc
Update TPOT2 in get_configspace docstring
jay-m-dev Dec 11, 2024
1f953b3
Update TPOT2 in base_evolver docstring
jay-m-dev Dec 11, 2024
42d378c
Update TPOT2 in docstring
jay-m-dev Dec 11, 2024
c1c6085
Skip test_tpot_estimator_predict temporarily
jay-m-dev Dec 11, 2024
46dd968
refactoring tpot2 to tpot
nickmatsumoto Dec 23, 2024
83de78f
compat issues
nickmatsumoto Dec 23, 2024
2afe0c9
Merge branch 'main' into doc_build
jay-m-dev Jan 23, 2025
ed8669e
Merge branch 'main' into doc_build, fix library import conflicts
jay-m-dev Jan 23, 2025
24 changes: 22 additions & 2 deletions .github/workflows/docs.yml
@@ -17,6 +17,14 @@ jobs:
with:
python-version: '3.10'

- name: Cache dependencies
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-${{ hashFiles('docs/requirements_docs.txt') }}
restore-keys: |
${{ runner.os }}-pip-

- name: Install dependencies
run: |
pip install --upgrade pip
@@ -41,6 +49,18 @@ jobs:
run: |
bash docs/scripts/build_mkdocs.sh

- name: Build and Deploy Docs
- name: Build and Deploy Latest Docs
run: |
mike deploy --push --branch gh-pages latest

- name: Build and Deploy Archived Docs
run: |
mike deploy --config-file mkdocs_archived.yml --push --branch gh-pages archived

- name: Set Default Version
run: |
mike set-default latest --push --branch gh-pages

- name: Create alias for Latest Docs
run: |
mkdocs gh-deploy --force --clean --verbose
mike alias latest stable --push --branch gh-pages
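
A note on the `Cache dependencies` step added in the first hunk: `actions/cache` looks for an exact match on `key` first and, if none exists, falls back to the most recently created entry whose key starts with one of the `restore-keys` prefixes. The Python sketch below mimics that lookup. It is an illustration only: `cache_key` and `restore` are hypothetical helpers, the requirements contents are made up, and a plain SHA-256 of the file text stands in for `hashFiles`, whose real digest is computed differently in detail.

```python
import hashlib
from typing import List, Optional


def cache_key(runner_os: str, requirements_text: str) -> str:
    """Build a key shaped like `<os>-pip-<digest>` (digest is a stand-in for hashFiles)."""
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()
    return f"{runner_os}-pip-{digest}"


def restore(key: str, restore_prefixes: List[str], stored: List[str]) -> Optional[str]:
    """Pick the cache entry to restore.

    `stored` is ordered oldest to newest. An exact key match wins;
    otherwise the newest entry matching a restore prefix is used.
    """
    if key in stored:
        return key
    for prefix in restore_prefixes:
        matches = [entry for entry in stored if entry.startswith(prefix)]
        if matches:
            return matches[-1]
    return None


# Hypothetical requirements files before and after adding a dependency.
old_key = cache_key("Linux", "mkdocs\nmike\n")
new_key = cache_key("Linux", "mkdocs\nmike\nmkdocs-material\n")
stored_entries = [old_key]

exact_hit = restore(old_key, ["Linux-pip-"], stored_entries)
fallback_hit = restore(new_key, ["Linux-pip-"], stored_entries)
miss = restore(new_key, ["Linux-pip-"], [])
```

With this layout, editing `docs/requirements_docs.txt` changes the key, so the job starts from the previous pip cache through the prefix fallback rather than from an empty one.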
7 changes: 5 additions & 2 deletions .gitignore
@@ -1,6 +1,7 @@
*.pyc
.pytest_cache/
TPOT2.egg-info
TPOT.egg-info
TPOT.egg-info
*.tar.gz
*.pkl
*.json
@@ -14,4 +15,6 @@ target/
.venv/
build/*
*.egg
*.coverage*
*.coverage*
docs/documentation/
mkdocs.yml
56 changes: 31 additions & 25 deletions README.md
@@ -1,8 +1,14 @@
# TPOT2
# TPOT

![Tests](https://github.com/EpistasisLab/tpot2/actions/workflows/tests.yml/badge.svg)
[![PyPI Downloads](https://img.shields.io/pypi/dm/tpot2?label=pypi%20downloads)](https://pypi.org/project/TPOT2)
[![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/tpot2?label=conda%20downloads)](https://anaconda.org/conda-forge/tpot2)
<center>
<img src="https://raw.githubusercontent.com/EpistasisLab/tpot/master/images/tpot-logo.jpg" width=300 />
</center>

<br>

![Tests](https://github.com/EpistasisLab/tpot/actions/workflows/tests.yml/badge.svg)
[![PyPI Downloads](https://img.shields.io/pypi/dm/tpot?label=pypi%20downloads)](https://pypi.org/project/TPOT)
[![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/tpot?label=conda%20downloads)](https://anaconda.org/conda-forge/tpot)

TPOT stands for Tree-based Pipeline Optimization Tool. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. Consider TPOT your Data Science Assistant.

@@ -33,8 +39,8 @@ The original version of TPOT was primarily developed at the University of Pennsy

## License

Please see the [repository license](https://github.com/EpistasisLab/tpot2/blob/main/LICENSE) for the licensing and usage information for TPOT2.
Generally, we have licensed TPOT2 to make it as widely usable as possible.
Please see the [repository license](https://github.com/EpistasisLab/tpot/blob/main/LICENSE) for the licensing and usage information for TPOT.
Generally, we have licensed TPOT to make it as widely usable as possible.

TPOT is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as
@@ -51,23 +57,23 @@ License along with TPOT. If not, see <http://www.gnu.org/licenses/>.

## Documentation

[The documentation webpage can be found here.](https://epistasislab.github.io/tpot2/)
[The documentation webpage can be found here.](https://epistasislab.github.io/tpot/)

We also recommend looking at the Tutorials folder for jupyter notebooks with examples and guides.

## Installation

TPOT2 requires a working installation of Python.
TPOT requires a working installation of Python.

### Creating a conda environment (optional)

We recommend using conda environments for installing TPOT2, though it would work equally well if manually installed without it.
We recommend using conda environments for installing TPOT, though it would work equally well if manually installed without it.

[More information on making anaconda environments found here.](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)

```
conda create --name tpot2env python=3.10
conda activate tpot2env
conda create --name tpotenv python=3.10
conda activate tpotenv
```

### Packages Used
@@ -99,7 +105,7 @@ Many of the hyperparameter ranges used in our configspaces were adapted from eit

### Note for M1 Mac or other Arm-based CPU users

You need to install the lightgbm package directly from conda using the following command before installing TPOT2.
You need to install the lightgbm package directly from conda using the following command before installing TPOT.

This is to ensure that you get the version that is compatible with your system.

@@ -109,10 +115,10 @@ conda install --yes -c conda-forge 'lightgbm>=3.3.3'

### Installing Extra Features with pip

If you want to utilize the additional features provided by TPOT2 along with `scikit-learn` extensions, you can install them using `pip`. The command to install TPOT2 with these extra features is as follows:
If you want to utilize the additional features provided by TPOT along with `scikit-learn` extensions, you can install them using `pip`. The command to install TPOT with these extra features is as follows:

```
pip install tpot2[sklearnex]
pip install tpot[sklearnex]
```

Please note that while these extensions can speed up scikit-learn packages, there are some important considerations:
@@ -126,11 +132,11 @@ We recommend using Python 3.9 when installing these extra features, as it provid


```
pip install -e /path/to/tpot2repo
pip install -e /path/to/tpotrepo
```

If you downloaded with git pull, then the repository folder will be named TPOT2. (Note: this folder is the one that includes setup.py inside of it and not the folder of the same name inside it).
If you downloaded as a zip, the folder may be called tpot2-main.
If you downloaded with git pull, then the repository folder will be named TPOT. (Note: this folder is the one that includes setup.py inside of it and not the folder of the same name inside it).
If you downloaded as a zip, the folder may be called tpot-main.


## Usage
@@ -140,17 +146,17 @@ See the Tutorials Folder for more instructions and examples.
### Best Practices

#### 1
TPOT2 uses dask for parallel processing. When Python is parallelized, each module is imported within each process. Therefore, it is important to protect all code within an `if __name__ == "__main__"` block when running TPOT2 from a script. This is not required when running TPOT2 from a notebook.
TPOT uses dask for parallel processing. When Python is parallelized, each module is imported within each process. Therefore, it is important to protect all code within an `if __name__ == "__main__"` block when running TPOT from a script. This is not required when running TPOT from a notebook.

For example:

```
#my_analysis.py

import tpot2
import tpot
if __name__ == "__main__":
    X, y = load_my_data()
    est = tpot2.TPOTClassifier()
    est = tpot.TPOTClassifier()
    est.fit(X,y)
    #rest of analysis
```
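
To see why the guard matters, the sketch below executes a small stand-in script twice with the standard-library `runpy` module: once as `__main__` (run as a script) and once as `__mp_main__`, the name the standard library's `multiprocessing` spawn start method gives the re-imported main module. This is an assumption-laden illustration, not TPOT or dask code: `SCRIPT` and `execute_as` are hypothetical, and how dask workers actually import your script depends on the scheduler configuration.

```python
import os
import runpy
import tempfile
import textwrap

# Hypothetical stand-in for my_analysis.py: it records which parts executed.
SCRIPT = textwrap.dedent(
    """
    events = ["module-level code ran"]
    if __name__ == "__main__":
        events.append("guarded analysis ran")
    """
)


def execute_as(run_name):
    """Write SCRIPT to a temporary file and execute it under the given module name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my_analysis.py")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(SCRIPT)
        module_globals = runpy.run_path(path, run_name=run_name)
    return module_globals["events"]


as_script = execute_as("__main__")      # launched directly by the user
as_worker = execute_as("__mp_main__")   # re-imported inside a worker process

print(as_script)
print(as_worker)
```

Only the module-level code runs in the simulated worker, so anything left outside the guard (loading data, calling `fit`) would be repeated in every process.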
@@ -207,15 +213,15 @@ good_function = lambda est, a=a, b=b : new_objective(est=est, a=a, b=b)

### Tips

TPOT2 will not check if your data is correctly formatted. It will assume that you have passed in operators that can handle the type of data that was passed in. For instance, if you pass in a pandas dataframe with categorical features and missing data, then you should also include in your configuration operators that can handle those features of the data. Alternatively, if you pass in `preprocessing = True`, TPOT2 will impute missing values, one-hot encode categorical features, then standardize the data. (Note that this is currently fitted and transformed on the entire training set before splitting for CV. Later there will be an option to apply per fold, and have the parameters be learnable.)
TPOT will not check if your data is correctly formatted. It will assume that you have passed in operators that can handle the type of data that was passed in. For instance, if you pass in a pandas dataframe with categorical features and missing data, then you should also include in your configuration operators that can handle those features of the data. Alternatively, if you pass in `preprocessing = True`, TPOT will impute missing values, one-hot encode categorical features, then standardize the data. (Note that this is currently fitted and transformed on the entire training set before splitting for CV. Later there will be an option to apply per fold, and have the parameters be learnable.)
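
The parenthetical about fitting before the CV split can be made concrete with a small standard-library sketch. It is not TPOT's implementation: `fit_scaler` and `transform` are hypothetical helpers and the numbers are made up. It only shows that standardization parameters fitted on the whole set differ from those fitted on the training part of a fold, because the held-out rows leak into the former.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn standardization parameters (mean, population std) from values."""
    return mean(values), pstdev(values)


def transform(values, params):
    """Standardize values with previously learned parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


# Hypothetical single feature; the last row plays the held-out fold.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
train, held_out = data[:4], data[4:]

# Fitted once on everything: the held-out value shapes the parameters.
whole_set_params = fit_scaler(data)

# Fitted per fold: only the training rows are seen.
per_fold_params = fit_scaler(train)

scaled_train = transform(train, per_fold_params)
scaled_held_out = transform(held_out, per_fold_params)

print(whole_set_params)
print(per_fold_params)
```

The gap between the two parameter sets is the information that reaches the model from the held-out fold when preprocessing is fitted up front.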


Setting `verbose` to 5 can be helpful during debugging as it will print out the error generated by failing pipelines.


## Contributing to TPOT2
## Contributing to TPOT

We welcome you to check the existing issues for bugs or enhancements to work on. If you have an idea for an extension to TPOT2, please file a new issue so we can discuss it.
We welcome you to check the existing issues for bugs or enhancements to work on. If you have an idea for an extension to TPOT, please file a new issue so we can discuss it.

## Citing TPOT

@@ -281,8 +287,8 @@ BibTeX entry:
}
```

### Support for TPOT2
## Support for TPOT

TPOT2 was developed in the [Artificial Intelligence Innovation (A2I) Lab](http://epistasis.org/) at Cedars-Sinai with funding from the [NIH](http://www.nih.gov/) under grants U01 AG066833 and R01 LM010098. We are incredibly grateful for the support of the NIH and the Cedars-Sinai during the development of this project.
TPOT was developed in the [Artificial Intelligence Innovation (A2I) Lab](http://epistasis.org/) at Cedars-Sinai with funding from the [NIH](http://www.nih.gov/) under grants U01 AG066833 and R01 LM010098. We are incredibly grateful for the support of the NIH and the Cedars-Sinai during the development of this project.

The TPOT logo was designed by Todd Newmuis, who generously donated his time to the project.