Refactor of template files #55

Merged
merged 97 commits into from
Apr 18, 2024
Changes from 95 commits
53ce6a8  Create base class (jteijema, Mar 29, 2024)
92812e9  Change basic (jteijema, Mar 29, 2024)
7dd722f  Change arfi (jteijema, Mar 29, 2024)
feb9498  Move render to base class (jteijema, Mar 29, 2024)
31ba847  Remove messy naming of templates (jteijema, Mar 29, 2024)
5a46597  cleanup filehandler (jteijema, Mar 29, 2024)
f053794  pass wordcloud to doc renderer (jteijema, Mar 29, 2024)
04172cd  Start on multimodel class (jteijema, Mar 29, 2024)
c60bcf9  Update multimodel (jteijema, Mar 29, 2024)
f84c237  Update template_basic.py (jteijema, Mar 29, 2024)
a71bc39  lowcase fix (jteijema, Mar 29, 2024)
152f719  Update ruff workflow (jteijema, Mar 29, 2024)
ecea649  Noqa base tempalte (jteijema, Mar 29, 2024)
c1b036a  Update workflows (jteijema, Mar 29, 2024)
973ab2f  move dynamic parameter collection to render func (jteijema, Mar 29, 2024)
e36e46c  Update README.md (jteijema, Mar 29, 2024)
10e6835  Add some text (jteijema, Mar 29, 2024)
2957622  refactor template finder (jteijema, Mar 30, 2024)
c005cce  Rewrite default makita console output (jteijema, Mar 30, 2024)
f7830e6  Ruff! (jteijema, Mar 30, 2024)
d5bc734  Ruff! 2 (jteijema, Mar 30, 2024)
aa0868b  Rename Template Classes (jteijema, Apr 3, 2024)
7002fa9  Format with Ruff (jteijema, Apr 3, 2024)
97cf357  Update noqa (jteijema, Apr 3, 2024)
dd14db4  noqa E501 (jteijema, Apr 3, 2024)
bcbe68c  Update add Fallback term to fallback print (jteijema, Apr 3, 2024)
05a04d2  format the scripts (jteijema, Apr 3, 2024)
fd25f7a  Extra space at the end of templates (jteijema, Apr 3, 2024)
0ead841  Update basic example (jteijema, Apr 3, 2024)
78fa2be  Update basic readme (jteijema, Apr 3, 2024)
040bc81  Update arfi example (jteijema, Apr 3, 2024)
d682adb  Update arfi readme (jteijema, Apr 3, 2024)
6a0e8e9  Update MM example (jteijema, Apr 3, 2024)
229b106  Add linebreak at the end of the file using filehandler (jteijema, Apr 3, 2024)
61574e3  Update pyproject.toml (jteijema, Apr 3, 2024)
87219a6  Update ci-workflow.yml (jteijema, Apr 3, 2024)
5f6a75e  Remove file renaming (jteijema, Apr 3, 2024)
cec0e6f  Include py.template in ruff linter (jteijema, Apr 3, 2024)
5955362  Run Basic Template (jteijema, Apr 3, 2024)
ab3e789  Fix typo in workflow (jteijema, Apr 3, 2024)
dc31bad  same as before (jteijema, Apr 3, 2024)
9dfb057  increase simulation steps (jteijema, Apr 3, 2024)
4da49e0  Add synergy to workflow (jteijema, Apr 3, 2024)
13ce171  a different dir for the test run (jteijema, Apr 3, 2024)
977a5f8  Update ci-workflow.yml (jteijema, Apr 3, 2024)
9efbe9c  restrict sim to ubuntu (jteijema, Apr 3, 2024)
68d1c5d  fix workflow for non ubuntu (jteijema, Apr 3, 2024)
470f041  Fix small ruff errors (jteijema, Apr 3, 2024)
af40028  Merge remote-tracking branch 'upstream/main' into base-template-class (jteijema, Apr 4, 2024)
d9c4856  Add defaults to config (jteijema, Apr 4, 2024)
63a761a  remove query strategy from MM readme (jteijema, Apr 4, 2024)
44819e8  Improve platform handling (jteijema, Apr 4, 2024)
35d4167  update workflow (jteijema, Apr 4, 2024)
9ad220e  add test data folder to workflow (jteijema, Apr 4, 2024)
6bad78b  remove unused imports (jteijema, Apr 4, 2024)
9e7a057  Update ci-workflow.yml (jteijema, Apr 4, 2024)
d854693  Update ci-workflow.yml (jteijema, Apr 4, 2024)
d998706  Move template finder to inside template class (jteijema, Apr 5, 2024)
2d6349b  Move n_runs to static params (jteijema, Apr 5, 2024)
e3deaca  Ruff format (jteijema, Apr 5, 2024)
69341bc  Update entrypoint.py (jteijema, Apr 5, 2024)
7ff8603  n_runs fix in templates (jteijema, Apr 5, 2024)
29600f8  Update ci-workflow.yml (jteijema, Apr 5, 2024)
4132893  add modelmatrix to mm template (jteijema, Apr 5, 2024)
cf64ca4  Remove modifications from examples (jteijema, Apr 11, 2024)
a82338b  rename jobs (jteijema, Apr 11, 2024)
c21f05b  Update names in workflow (jteijema, Apr 11, 2024)
f41d64f  rename total files var (jteijema, Apr 11, 2024)
21bab0d  Add python version (jteijema, Apr 11, 2024)
e13fb4b  Update pyproject.toml (jteijema, Apr 11, 2024)
e29a8e8  Fix arguments passing (jteijema, Apr 11, 2024)
a02e587  Update template_multimodel.txt.template (jteijema, Apr 11, 2024)
afee598  Clean up impossible models (jteijema, Apr 11, 2024)
df915e7  Organize imports for arfi (jteijema, Apr 11, 2024)
4d3a2ef  add default min (jteijema, Apr 11, 2024)
4e507a9  not not (jteijema, Apr 11, 2024)
470eae3  set defaults (jteijema, Apr 11, 2024)
f00790b  rename some arguments (jteijema, Apr 12, 2024)
83f472e  Update args passing (jteijema, Apr 12, 2024)
1d59c15  Update config.py (jteijema, Apr 12, 2024)
8919c16  Update ci-workflow.yml (jteijema, Apr 12, 2024)
b930c59  Update ci-workflow.yml (jteijema, Apr 12, 2024)
8c44764  Update ci-workflow.yml (jteijema, Apr 12, 2024)
78c1181  lint after installing linter (jteijema, Apr 12, 2024)
1596b6d  Update ci-workflow.yml (jteijema, Apr 12, 2024)
c713f99  os agnostic directories (jteijema, Apr 12, 2024)
a1ee441  rename step in workflow (jteijema, Apr 12, 2024)
bb0fa0f  Scitree error windows (jteijema, Apr 12, 2024)
bc629b1  Update pip cache for windows (jteijema, Apr 12, 2024)
af2c247  Windows workflow workaround (jteijema, Apr 12, 2024)
2b90a2a  update windows cache (jteijema, Apr 12, 2024)
6de93e9  Update doc_README.md.template (jteijema, Apr 12, 2024)
2f34b36  Update doc_README.md.template (jteijema, Apr 12, 2024)
7cc387f  Merge remote-tracking branch 'upstream/main' into base-template-class (jteijema, Apr 18, 2024)
5bc7711  rename to balance_strategies (jteijema, Apr 18, 2024)
2955ef4  Rename template functions (jteijema, Apr 18, 2024)
8ad435d  Ruff formatter (jteijema, Apr 18, 2024)
66 changes: 38 additions & 28 deletions .github/workflows/ci-workflow.yml
@@ -4,51 +4,61 @@ jobs:
   test-template-and-lint:
     strategy:
       matrix:
-        os: [macos-latest, windows-latest, ubuntu-latest]
+        os: [windows-latest, ubuntu-latest]
+        python-version: ['3.8', '3.12']
     runs-on: ${{ matrix.os }}
     steps:
-      - uses: actions/checkout@master
-      - uses: actions/setup-python@v4
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
         with:
-          python-version: '3.8'
+          python-version: ${{ matrix.python-version }}
           architecture: 'x64'
-      - name: Install makita
+      - name: Cache Python packages
+        uses: actions/cache@v4
+        with:
+          path: |
+            ${{ runner.os == 'Windows' && 'C:\users\runneradmin\appdata\local\pip\cache' || '~/.cache/pip' }}
+          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
+          restore-keys: |
+            ${{ runner.os }}-pip-
+      - name: Install dependencies
         run: |
-          pip install .
-      - name: Install ruff
+          pip install . ruff scitree asreview-datatools asreview-insights synergy-dataset
+      - name: Lint python with ruff
         run: |
-          pip install ruff
+          ruff check .
+      - name: Create directories using Python
+        run: python -c "import os; [os.makedirs(path, exist_ok=True) for path in ['./tmp/basic/data-test', './tmp/arfi/data', './tmp/multimodel/data', './tmp/scripts', './tmp/synergy/data']]"
       - name: set up environment
         run: |
-          mkdir tmp
-          cd tmp
-          mkdir -p basic/data
-          mkdir -p arfi/data
-          mkdir -p multimodel/data
-          cp ../.github/workflows/test_data/labels.csv basic/data/labels.csv
-          cp ../.github/workflows/test_data/labels.csv arfi/data/labels.csv
-          cp ../.github/workflows/test_data/labels.csv multimodel/data/labels.csv
-      - name: Test makita templates
+          cp .github/workflows/test_data/labels.csv ./tmp/basic/data-test/labels.csv
+          cp .github/workflows/test_data/labels.csv ./tmp/arfi/data/labels.csv
+          cp .github/workflows/test_data/labels.csv ./tmp/multimodel/data/labels.csv
+      - name: Render makita templates
         run: |
           cd tmp/basic
-          asreview makita template basic | tee output.txt
+          asreview makita template basic --classifier nb --feature_extractor tfidf --query_strategy max --n_runs 1 -s data-test -o output-test --init_seed 1 --model_seed 2 --skip_wordclouds --overwrite --instances_per_query 2 --stop_if min --balance_strategy double | tee output.txt
           grep -q "ERROR" output.txt && exit 1 || true
           cd ../arfi
           asreview makita template arfi | tee output.txt
           grep -q "ERROR" output.txt && exit 1 || true
           cd ../multimodel
           asreview makita template multimodel | tee output.txt
           grep -q "ERROR" output.txt && exit 1 || true
-      - name: Run ShellCheck
+      - name: Render makita scripts
+        run: |
+          asreview makita add-script --all -o ./tmp/scripts | tee output.txt
+          grep -q "ERROR" output.txt && exit 1 || true
+      - name: Run SciTree
         if: ${{ matrix.os != 'windows-latest' }}
-        uses: ludeeus/action-shellcheck@master
-        with:
-          scandir: './tmp'
-        env:
-          SHELLCHECK_OPTS: -e SC2148
-      - name: Generate makita scripts
         run: |
-          asreview makita add-script --all
-      - name: Lint python with ruff
+          cd ./tmp/
+          scitree
+      - name: Execute basic template jobs file
+        if: ${{ matrix.os != 'windows-latest' }}
         run: |
-          ruff .
+          cd tmp/synergy
+          synergy_dataset get -d van_de_Schoot_2018 -o ./data -l
+          asreview makita template basic --instances_per_query 100 --skip_wordclouds --overwrite --n_runs 2
+          sh jobs.sh
+          scitree
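
Review note: the new workflow replaces the `mkdir -p` calls with a `python -c` one-liner so that directory creation behaves the same on Windows and Ubuntu, then copies `labels.csv` with `cp` and scans each command's output for `ERROR`. A minimal Python sketch of those three steps follows, for readers who want to reproduce the setup locally. The directory names come from the diff above; the function names `prepare_workspace` and `output_has_error` and the constants `WORK_DIRS` and `DATA_DIRS` are hypothetical and not part of Makita or the workflow.

```python
import shutil
from pathlib import Path
from typing import List

# Directories created by the "Create directories using Python" step.
WORK_DIRS = [
    "tmp/basic/data-test",
    "tmp/arfi/data",
    "tmp/multimodel/data",
    "tmp/scripts",
    "tmp/synergy/data",
]

# Only these three receive labels.csv in the "set up environment" step.
DATA_DIRS = ["tmp/basic/data-test", "tmp/arfi/data", "tmp/multimodel/data"]


def prepare_workspace(root: Path, labels: Path) -> List[Path]:
    """Create the test directories under `root` and copy `labels` into the data dirs."""
    for rel in WORK_DIRS:
        # parents/exist_ok give the same effect as os.makedirs(path, exist_ok=True).
        (root / rel).mkdir(parents=True, exist_ok=True)
    copied = []
    for rel in DATA_DIRS:
        target = root / rel / labels.name
        shutil.copyfile(labels, target)
        copied.append(target)
    return copied


def output_has_error(text: str) -> bool:
    """Mirror `grep -q "ERROR"`: true when the literal string ERROR appears anywhere."""
    return "ERROR" in text
```

Because `Path.mkdir(parents=True, exist_ok=True)` is idempotent, the sketch can be rerun without first cleaning up `tmp`, just as the workflow's one-liner can.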
4 changes: 2 additions & 2 deletions .github/workflows/pythonpackage.yml
@@ -13,9 +13,9 @@ jobs:
     runs-on: ubuntu-latest
 
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
       - name: Set up Python
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: '3.x'
       - name: Install dependencies
7 changes: 5 additions & 2 deletions README.md
@@ -121,6 +121,7 @@ optional arguments:
   --platform PLATFORM Platform to run jobs: Windows, Darwin, Linux. Default: the system of rendering templates.
   --n_runs N_RUNS Number of runs. Default: 1.
   --no_wordclouds Disables the generation of wordclouds.
+ --overwrite Automatically accepts all overwrite requests.
   --classifier CLASSIFIER Classifier to use. Default: nb.
   --feature_extractor FEATURE_EXTRACTOR Feature_extractor to use. Default: tfidf.
   --query_strategy QUERY_STRATEGY Query strategy to use. Default: max.
@@ -148,6 +149,7 @@ optional arguments:
   --platform PLATFORM Platform to run jobs: Windows, Darwin, Linux. Default: the system of rendering templates.
   --n_priors N_PRIORS Number of priors. Default: 10.
   --no_wordclouds Disables the generation of wordclouds.
+ --overwrite Automatically accepts all overwrite requests.
   --classifier CLASSIFIER Classifier to use. Default: nb.
   --feature_extractor FEATURE_EXTRACTOR Feature_extractor to use. Default: tfidf.
   --query_strategy QUERY_STRATEGY Query strategy to use. Default: max.
@@ -175,18 +177,19 @@ optional arguments:
   --platform PLATFORM Platform to run jobs: Windows, Darwin, Linux. Default: the system of rendering templates.
   --n_runs N_RUNS Number of runs. Default: 1.
   --no_wordclouds Disables the generation of wordclouds.
+ --overwrite Automatically accepts all overwrite requests.
   --instances_per_query INSTANCES_PER_QUERY Number of instances per query. Default: 1.
   --stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
   --classifiers CLASSIFIERS Classifiers to use Default: ['logistic', 'nb', 'rf', 'svm']
   --feature_extractors FEATURE_EXTRACTOR Feature extractors to use Default: ['doc2vec', 'sbert', 'tfidf']
   --query_strategies QUERY_STRATEGY Query strategies to use Default: ['max']
- --balancing_strategies BALANCE_STRATEGY Balance strategies to use Default: ['double']
+ --balance_strategies BALANCE_STRATEGY Balance strategies to use Default: ['double']
   --impossible_models IMPOSSIBLE_MODELS Model combinations to exclude Default: ['nb,doc2vec', 'nb,sbert']
  ```
 
  If you want to specify certain combinations of classifiers and feature
  extractors that should and should not be used, you can use the `--classifiers`,
- `--feature_extractors`, `--query_strategies`, `--balancing_strategies` and `--impossible_models` option. For instance, if you
+ `--feature_extractors`, `--query_strategies`, `--balance_strategies` and `--impossible_models` option. For instance, if you
  want to exclude the combinations of `nb` with `doc2vec` and `logistic` with
  `tfidf`, use the following command:

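Review note: the README hunk above documents `--classifiers`, `--feature_extractors` and `--impossible_models` for the multimodel template but does not show how the exclusions are applied. One plausible reading is that every classifier is paired with every feature extractor and the pairs listed as `classifier,feature_extractor` are dropped. The sketch below illustrates that reading only; `build_model_matrix` is a hypothetical name and this is not taken from Makita's source.

```python
from itertools import product
from typing import Iterable, List, Tuple


def build_model_matrix(
    classifiers: Iterable[str],
    feature_extractors: Iterable[str],
    impossible_models: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return every (classifier, feature_extractor) pair not listed as impossible."""
    # Assumption: each exclusion is a "classifier,feature_extractor" string,
    # matching the shape of the README defaults.
    excluded = {
        tuple(part.strip() for part in item.split(","))
        for item in impossible_models
    }
    return [
        pair
        for pair in product(classifiers, feature_extractors)
        if pair not in excluded
    ]


# Defaults as listed in the README help text.
pairs = build_model_matrix(
    ["logistic", "nb", "rf", "svm"],
    ["doc2vec", "sbert", "tfidf"],
    ["nb,doc2vec", "nb,sbert"],
)
print(len(pairs))  # 10: 4 x 3 combinations minus the 2 excluded pairs
```

Passing `["nb,doc2vec", "logistic,tfidf"]` as the third argument gives the exclusion set described in the README paragraph above, under the same assumption about how the pairs are interpreted.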