[ENH] update ICA to sklearn from mdp #44

emdupre · 2018-05-13T03:10:40Z

Pending discussion in #14.

This is a breaking change, since the implementation of FastICA is slightly different across MDP and sklearn.

tedana/decomposition/eigendecomp.py

tedana/workflows/tedana.py

tedana/cli/run_tedana.py

tsalo · 2018-05-18T17:04:52Z

I don't know if backwards compatibility is a requirement for new versions before 1.0.0, but could this PR be added to a 0.0.2 or 1.0.0 milestone? I don't think it belongs in 0.0.1.

emdupre · 2018-05-18T18:05:23Z

Yes, agreed ! This is a 0.1.0 feature, in my mind— the main one, I think.
I don't think I have it on the 0.0.1 milestone, but I'll go ahead and make the 0.1.0 milestone now !

codecov · 2018-05-18T18:09:19Z

Codecov Report

Merging #44 into master will decrease coverage by 0.02%.
The diff coverage is 9.09%.

@@            Coverage Diff            @@
##           master     #44      +/-   ##
=========================================
- Coverage   48.72%   48.7%   -0.03%     
=========================================
  Files          32      32              
  Lines        2079    2080       +1     
=========================================
  Hits         1013    1013              
- Misses       1066    1067       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| tedana/info.py | `100% <ø> (ø)` | ⬆️ |
| tedana/workflows/tedana.py | `12% <0%> (+0.09%)` | ⬆️ |
| tedana/decomposition/eigendecomp.py | `10.81% <11.11%> (-0.15%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update 87cb2e2...1025ace.

rmarkello · 2018-05-18T18:34:01Z

No big surprise here: from the most recent Circle build it looks like the first file saved out post-ICA is different. It might be worth doing some manual inspection to see how different these are, and if it's to a tolerance we're comfortable with...
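One caveat for any manual inspection: ICA components are only defined up to ordering and sign, so a file-level diff will overstate the difference between two runs. Below is a minimal sketch of a comparison that accounts for this, assuming the two decompositions are available as arrays with one component per column. The helper name `match_components`, the greedy pairing, and the toy data are all illustrative and not part of tedana.

```python
import numpy as np


def match_components(ref, new):
    """Greedily pair columns of `new` with columns of `ref` by absolute correlation.

    Returns (order, signs, corrs) such that new[:, order] * signs lines up with ref.
    Assumes both arrays have the same number of columns.
    """
    n = ref.shape[1]
    # Cross-correlation block between the columns of ref and the columns of new.
    corr = np.corrcoef(ref.T, new.T)[:n, n:]
    abs_corr = np.abs(corr)
    order = np.full(n, -1)
    signs = np.ones(n)
    corrs = np.zeros(n)
    for _ in range(n):
        # Take the strongest remaining pair, then retire its row and column.
        i, j = np.unravel_index(np.argmax(abs_corr), abs_corr.shape)
        order[i] = j
        signs[i] = np.sign(corr[i, j])
        corrs[i] = abs(corr[i, j])
        abs_corr[i, :] = -1.0
        abs_corr[:, j] = -1.0
    return order, signs, corrs


# Toy check: shuffle and sign-flip a reference, add a little noise, then realign.
rng = np.random.default_rng(0)
ref = rng.standard_normal((200, 4))
perm = np.array([2, 0, 3, 1])
flips = np.array([1.0, -1.0, 1.0, -1.0])
new = ref[:, perm] * flips + 0.01 * rng.standard_normal((200, 4))

order, signs, corrs = match_components(ref, new)
aligned = new[:, order] * signs
```

The per-component values in `corrs` are then what a tolerance would be applied to. What threshold counts as "comfortable" is still a judgment call for the project.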

emdupre · 2018-05-18T18:40:52Z

Yes, agreed. @prantikk specifically asked that we

> try to make sure final SNR and contrast stay comparable.

which I think is a good starting point for manual inspection !

tsalo · 2018-11-06T14:34:36Z

Now that we've committed to merging this, I wanted to check in about it. Other than dealing with the conflicts, is there anything that needs to be done before this one can be merged?

emdupre · 2018-11-06T14:38:39Z

Thanks for checking in on this, @tsalo !! I think I just need to resolve the conflicts (which I'm happy to do later today !), but I wanted to give a little more time for feedback on the roadmap.

Also, I realized I never did the promised comparison 😞 Do you think it would still be useful to do ?

tsalo · 2018-11-06T14:48:22Z

I agree that varying the seed will probably result in variability on par with the differences between the mdp and sklearn implementations. I personally don't think we need those comparisons to merge. We'll need to run the new version and inspect the results when we update the integration tests anyway, right?

emdupre · 2018-11-06T14:51:29Z

That's ok by me, unless @handwerkerd or @KirstieJane disagree ! Either way, I'll fix these merge conflicts later today and wait until we've given a full week for the roadmap RFC (in #151) before merging in :)

tedana/workflows/tedana.py

tsalo · 2018-11-11T15:32:03Z

I know we've probably discussed this before, but I can't see where. Does it matter that FastICA whitens the data?

emdupre · 2018-11-11T15:48:03Z

MDP was already whitening the data -- we had not supplied a whitened param, so by default we were whitening the data !

tsalo · 2018-11-11T16:01:09Z

Is it okay that MDP was doing it too though?

emdupre · 2018-11-11T17:27:38Z

Yes, it should be fine. We actually want whitened data, especially if (as previously) we weren't selecting principal components based on descending eigenvalues. Even when we are, though, we still want to account for the fact that components with descending eigenvalues explain different amounts of variance (whereas, in whitened data, they should all explain equal amounts). I just found this review and think it explains the idea well !
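As a rough illustration of what "equal amounts" means here, this is a minimal NumPy sketch of PCA whitening on toy data. The `whiten` helper and the toy mixing matrix are assumptions for the example and are not what MDP or sklearn run internally, though the idea is the same: rotate onto the eigenvectors of the covariance, then rescale each direction by the square root of its eigenvalue.

```python
import numpy as np


def whiten(data):
    """PCA-whiten `data` of shape (n_samples, n_features)."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto the eigenvectors, then scale each direction to unit variance.
    return (centered @ eigvecs) / np.sqrt(eigvals)


rng = np.random.default_rng(0)
# Toy data whose directions carry very different amounts of variance.
mixing = np.array([[3.0, 1.0, 0.0],
                   [0.0, 2.0, 0.5],
                   [0.0, 0.0, 0.2]])
toy = rng.standard_normal((500, 3)) @ mixing

raw_variances = np.linalg.eigvalsh(np.cov(toy, rowvar=False))
white_cov = np.cov(whiten(toy), rowvar=False)
```

Before whitening, `raw_variances` spans a wide range. Afterwards, `white_cov` is the identity matrix up to floating-point error, so every direction contributes the same variance.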

tsalo · 2018-11-11T18:15:36Z

If the whitening is performed by FastICA, and we aren't using a decision tree to select PCA components during that whitening stage, do we still need the PCA step?

emdupre · 2018-11-11T18:50:59Z

@tsalo and I chatted off-line and got a better handle on this point (since it is quite confusing !).

We do need the PCA step. This is confusing because, if we allow for whitening within the ICA, we are introducing a second PCA, which makes the first seem redundant. But it's not ! The first PCA allows us to reduce the dimensionality of the data by keeping only the dimensions that meet some criteria -- we can (and I think should !) keep discussing what those criteria are in #101.

But the second PCA, performed inside the fastICA call, does not dimensionally reduce the data. Instead, it just orthogonalizes components which can then be statistically whitened to help the ICA converge.

There's a CrossValidated answer making exactly this point, which might be a useful reference !
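To make the distinction between the two PCA steps concrete, here is a minimal sketch, assuming toy Laplacian sources and a fixed component count standing in for the real selection criteria. The variable names and data are illustrative and this is not the tedana workflow. It relies only on sklearn's `FastICA` whitening its input by default.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(42)
n_samples, n_features, n_sources = 400, 20, 3

# Toy non-Gaussian sources mixed into a higher-dimensional space, plus noise.
sources = rng.laplace(size=(n_samples, n_sources))
mixing = rng.standard_normal((n_sources, n_features))
data = sources @ mixing + 0.05 * rng.standard_normal((n_samples, n_features))

# Step 1: the explicit PCA reduces dimensionality (20 features down to 3).
# A fixed count is an assumption here; the real criteria are discussed in #101.
pca = PCA(n_components=n_sources)
reduced = pca.fit_transform(data)

# Step 2: FastICA whitens internally by default, but keeps all 3 dimensions.
ica = FastICA(n_components=n_sources, random_state=0, max_iter=1000)
components = ica.fit_transform(reduced)

# The ICA outputs are mutually uncorrelated, as expected after whitening.
component_corr = np.corrcoef(components, rowvar=False)
```

Only the first step changes the shape of the data. The second takes three dimensions in and returns three components out.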

emdupre · 2018-11-13T15:55:11Z

Merging this, since #151 is merged in ! Thanks everyone for all of your feedback, here !


[ENH] update ICA to sklearn from mdp

750a232

tsalo reviewed May 13, 2018

tedana/decomposition/eigendecomp.py (outdated, resolved)

tsalo reviewed May 13, 2018

tedana/decomposition/eigendecomp.py (outdated, resolved)

emdupre added 2 commits May 17, 2018 14:39

Address review comments

14d5ed1

Resolve merge conflicts

a16fd1c

tsalo reviewed May 18, 2018

tedana/workflows/tedana.py (outdated, resolved)

tsalo reviewed May 18, 2018

tedana/cli/run_tedana.py (outdated, resolved)

Patch merge errors

2097bd0

emdupre added this to the 0.1.0 milestone May 18, 2018

Merge remote-tracking branch 'upstream/master' into sklearn-ica

b26ddfb

emdupre force-pushed the sklearn-ica branch 5 times, most recently from 50591b1 to 652a795 Compare May 28, 2018 03:28

Update testing environment

c7241b7

emdupre force-pushed the sklearn-ica branch 7 times, most recently from 38071b1 to 694be16 Compare May 28, 2018 21:53

Modify circle yaml

a1c1f4c

emdupre force-pushed the sklearn-ica branch from 694be16 to a1c1f4c Compare May 30, 2018 03:33

KirstieJane modified the milestones: 0.1.0, transparent and reproducible processing Oct 31, 2018

emdupre mentioned this pull request Nov 5, 2018

[ENH] Rename modules #136

Merged

tsalo reviewed Nov 9, 2018

tedana/workflows/tedana.py (outdated, resolved)

Merge remote-tracking branch 'upstream/master' into sklearn-ica

a8cafce

emdupre force-pushed the sklearn-ica branch from 380f549 to a8cafce Compare November 9, 2018 18:27

emdupre added 2 commits November 12, 2018 16:49

[DOC] remove mdp package reference

1c65fa3

[FIX] Fix bad merge

1025ace

emdupre merged commit 0641061 into ME-ICA:master Nov 13, 2018

tsalo mentioned this pull request Jan 31, 2019

MLE estimation and convergence #200

Closed

emdupre mentioned this pull request Feb 1, 2019

Should PCA component selection leverage multi-echo information? #101

Closed

rmarkello mentioned this pull request Feb 8, 2019

Ambiguous error during "spatial clustering of components" step #181

Closed

tsalo added the output-change label Nov 20, 2020

tsalo mentioned this pull request Nov 20, 2020

Examine differences in results between original MEICA & current tedana #622

Open

jbteves added the breaking change label (will make a non-trivial change to outputs) and removed the output-change label Apr 19, 2021
