This repo hacks SetFit so that it works with integrated gradients. See demo.ipynb for an example.
Integrated gradients is an attribution method that explains a model's decisions by scoring how much each part of the input influenced a particular prediction.
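For intuition, here is a minimal sketch of the standard Riemann-sum approximation of integrated gradients over an embedded input. This is not this library's actual code; `model_fn` is a hypothetical function that maps an embeddings tensor to a scalar score.

    import torch

    def integrated_gradients(model_fn, embeddings, baseline=None, steps=50):
        # model_fn: maps an embeddings tensor to a scalar score (assumption).
        if baseline is None:
            baseline = torch.zeros_like(embeddings)
        total = torch.zeros_like(embeddings)
        for alpha in torch.linspace(0.0, 1.0, steps):
            # Interpolate between the baseline and the actual input.
            point = baseline + alpha * (embeddings - baseline)
            point.requires_grad_(True)
            score = model_fn(point)
            grad = torch.autograd.grad(score, point)[0]
            total += grad
        # Scale the averaged gradients by the input-baseline difference.
        return (embeddings - baseline) * total / steps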
Note: This mini-library only supports binary classification with a scikit-learn logistic-regression head.
pip install -e .
The "-e" switch installs the package in develop mode.
I wrote this mini-library before SetFit 0.6.0. At the time, there was no SetFitHead class yet, so I took the fitted sklearn LogisticRegression and copied its parameters into an equivalent Torch module. I did my best to break the forward pass of SetFit into pieces so that I could push gradients through the head and up to the token embeddings.
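The head conversion amounts to something like the following sketch (illustrative, not the exact code in this repo): the learned coef_ and intercept_ of the fitted LogisticRegression are copied into a torch.nn.Linear so that gradients can flow through the head.

    import torch
    from torch import nn

    def torch_head_from_sklearn(logreg):
        # logreg: a fitted sklearn LogisticRegression for binary classification,
        # so coef_ has shape (1, n_features) and intercept_ has shape (1,).
        head = nn.Linear(logreg.coef_.shape[1], 1)
        with torch.no_grad():
            head.weight.copy_(torch.as_tensor(logreg.coef_, dtype=torch.float32))
            head.bias.copy_(torch.as_tensor(logreg.intercept_, dtype=torch.float32))
        # sigmoid(head(x)) then matches logreg.predict_proba(x)[:, 1].
        return head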
Attributions from integrated gradients are computed per token and then averaged to get word-level attributions.
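The word-level aggregation is a plain mean over each word's sub-token scores. A sketch, assuming a `word_ids` mapping of the kind returned by Hugging Face fast tokenizers (one word index per sub-token, None for special tokens):

    from collections import defaultdict

    def word_attributions(token_attributions, word_ids):
        # token_attributions: one attribution score per sub-token.
        # word_ids: word index per sub-token, None for special tokens.
        buckets = defaultdict(list)
        for score, word_id in zip(token_attributions, word_ids):
            if word_id is not None:
                buckets[word_id].append(score)
        # Average the sub-token scores within each word, in word order.
        return [sum(s) / len(s) for _, s in sorted(buckets.items())]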
I'm leaving this here for posterity and in case it is useful to others for further hacking.