
add checks to prevent negative outputs of map fst's that will be set to zero by the ReLU activation #13

Open · wants to merge 2 commits into main
Conversation

William-Baker (Contributor)

Currently there is no validation check to prevent negative values being produced by a Map operation. These values are set to 0 by the ReLU in the MLP layers, which causes craft models to behave unexpectedly.

The simplest example that triggers this is a program like the following:

from tracr.rasp import rasp

def test_prog():
    # Numerical Map whose output is x - 1: -1 for token 0 and 0 for token 1.
    SO1 = rasp.Map(lambda x: x - 1, rasp.tokens).annotated(
        name="SO1", encoding=rasp.Encoding.NUMERICAL)
    return SO1

vocab = [0, 1]
sequence_length = 1

You would expect the compiled model to output -1 and 0 for inputs 0 and 1 respectively, but instead the craft model outputs 0 and 0.
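For reference, a minimal sketch of how this can be reproduced end to end, assuming tracr's compiling.compile_rasp_to_model and apply APIs (the "BOS" token name here is an arbitrary choice):

from tracr.compiler import compiling

model = compiling.compile_rasp_to_model(
    test_prog(), vocab=set(vocab), max_seq_len=sequence_length,
    compiler_bos="BOS")

# The element after the BOS position is the model's output for the input token.
print(model.apply(["BOS", 0]).decoded)  # expected -1, but the craft model gives 0
print(model.apply(["BOS", 1]).decoded)  # 0, as expected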

I have added a validation check within the dynamic program validator, and a check in the compiler that raises an exception if any negative values are found at the output of the first layer of a Map MLP.
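Not the actual diff, just a rough sketch of the kind of dynamic check this adds; the helper name validate_map_outputs is made up, and it assumes the Map's function and label are accessible as map_expr.f and map_expr.label:

def validate_map_outputs(map_expr, vocab):
    # Evaluate the Map's function over the whole vocabulary and flag any
    # negative outputs, since the ReLU in the compiled MLP will clip them to 0.
    negative_inputs = [x for x in vocab if map_expr.f(x) < 0]
    if negative_inputs:
        raise ValueError(
            f"Map {map_expr.label} produces negative outputs for inputs "
            f"{negative_inputs}; these would be clipped to 0 by the ReLU.")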

@langosco (Contributor)

I think the checks should only look for negative values on numerical Maps. Currently an error is also raised on categorical Maps, which we don't want, since categorical Maps handle negative values fine.
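For example (sketch only, assuming the encoding annotation is readable via map_expr.annotations and reusing the hypothetical validate_map_outputs helper from above):

# Only run the negativity check for numerical Maps; categorical Maps are unaffected.
if map_expr.annotations["encoding"] == rasp.Encoding.NUMERICAL:
    validate_map_outputs(map_expr, vocab)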

@langosco (Contributor)

langosco commented Jan 19, 2024

On another note: I wonder if this is fixable in the compiler? Naively I would think it shouldn't be hard to fix, since rasp.LinearSequenceMap is able to handle negative output values, but I haven't looked at what's going on at the compiler level here.

@William-Baker (Contributor, Author)

I've now added a check to verify that the Map is numerical.
The SequenceMap operation does, at the very least, require the non-linearity of the ReLU in the dense layers to function. But perhaps a bias could be added to fst before the activation and subtracted from snd?
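As a quick numeric illustration of that bias idea (not the PR's implementation, just a sketch of why shifting before the ReLU preserves negative values):

import numpy as np

x = np.array([-1.0, 0.0, 1.0])  # outputs of fst, possibly negative
bias = 10.0                     # chosen large enough that x + bias >= 0 for all inputs

# ReLU(x + bias) - bias recovers x exactly whenever x >= -bias,
# so negative outputs survive the activation.
recovered = np.maximum(x + bias, 0.0) - bias
print(recovered)  # [-1.  0.  1.]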

@david-lindner (Collaborator)

As discussed offline: let's make the check in the basis inference step print a warning instead of raising an exception, and let's add a test case for the validator check.
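A rough shape for the validator test, reusing the hypothetical validate_map_outputs helper from above (the real test would target whatever entry point the validator actually exposes):

import pytest

def test_validator_flags_negative_numerical_map():
    # A numerical Map that produces -1 for input 0 should be flagged.
    prog = rasp.Map(lambda x: x - 1, rasp.tokens).annotated(
        name="SO1", encoding=rasp.Encoding.NUMERICAL)
    with pytest.raises(ValueError):
        validate_map_outputs(prog, vocab=[0, 1])

def test_validator_allows_nonnegative_numerical_map():
    # A numerical Map whose outputs are all non-negative should pass.
    prog = rasp.Map(lambda x: x + 1, rasp.tokens).annotated(
        name="SO2", encoding=rasp.Encoding.NUMERICAL)
    validate_map_outputs(prog, vocab=[0, 1])  # should not raise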
