Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A complete re-write of the ABNF parser. #200

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

ldthomas
Copy link

This is a complete rewrite of the ABNF parser in abnf.ts. It uses a fixed, pre-generated ABNF grammar object instead of re-generating the object each time a message is parsed. It runs approximately twice as fast as the original parser, is more accurate in URI validation and removes the dependencies on uri-js and valid-url.

Notes:

  • packages/siwe/package.json - removed uri-js and valid-url dependencies.
    • note that a global search indicated that uri-js was never used anyway
    • suggest bumping the version to 2.4.0 but versioning is your prerogative
  • packages/siwe-parser/package.json - removed uri-js and valid-url dependencies. Added a script for re-generating the fixed ABNF grammar object, packages/siwe-parser/siwe-grammar.js, from the ABNF grammar, packages/siwe-parser/siwe-abnf.txt. npm run apg needs to be run each time a change is made to the ABNF grammar.
    • suggest bumping the version to 2.2.0 but versioning is your prerogative
  • even after deleting the dependence on uri-js and valid-url, the package-lock.json files still contain dozens of references to them. Deleting the package-lock.json files and re-generating them from scratch cleans them up considerably. However, I'm aware there may be unintended consequences from that maneuver. Just a suggestion if you want to clean them up.
  • packages/siwe/client.ts - contains a workaround for

import { ParsedMessage, parseIntegerNumber } from '@spruceid/siwe-parser';

  • as mentioned in issue Suggestion regarding siwe-parser usage of apg-js. #177, in my environment a build of the cloned version of this repository fails to find '@spruceid/siwe-parser'. I've not found a fix for this problem yet.
  • a large number of unit tests have been added for testing the accuracy of parsing both the message URI and the resource URIs

…s and valid-url. Added many new unit tests for validating both the message URI and the resource URIs. Added a script to re-generate the ABNF grammar object whenever the ABNF grammar needs to be changed.
Copy link
Member

@sbihel sbihel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

There are some lint errors (you can run npm run lint at the root of the repo).

packages/siwe-parser/lib/callbacks.ts Show resolved Hide resolved
packages/siwe-parser/lib/abnf.ts Outdated Show resolved Hide resolved
packages/siwe-parser/lib/t-ipv4-ipv6.test.ts Outdated Show resolved Hide resolved
this.address
);
}
// this test is done in siwe-parser, callback function "address"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all the checks done in this function are done as part of the parsing, then I suggest no to modify this function but instead move its call in the constructor to only the case where it's initialised from an object.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand this correctly you still want to validate the EIP-55 address and the URI in the constructor block for the message object. The EIP-55 address will be simple enough, but I'll need to build a special validator that validates only the URI and not an entire message. The ABNF grammar object, siwe-grammar.ts and the existing call back functions give me everything I need for that, but it will require having an option in the ParsedMessage constructor to parse only the URI. How about the resource URIs? Will they need to be validated as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to write a new isUri() function to use in the object block of the constructor. I just need to assemble it from existing pieces which won't be difficult. And, of course, do the unit testing. I'll need a few more days on this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit should resolve this conversation. An isUri() function has been added, unit tested and the message block of the SiweMessage constructor updated to test the EIP-55 address, the URI and all of the resource URIs if any are present. However, see the additional question in the following conversation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've extracted the URI and EIP55 validation out of the validation function, but remains:

  • domain
  • version
  • nonce
  • dates

These can (or already are) validated in the parser, so can't the validate function be left untouched and only called when constructing the message from an object?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed the proper flow of the validation in the message object case. In particular I missed the validateMessage() function. All I really needed to do was replace the valid-url isUri() function with the new APG isUri() function there. All other validations will then remain the same. I can easily make that change. However, I notice there is no validation of the resources URIs. Do you have any need for that? It is easily added.

Here’s a better suggestion. Remove the validateMessage() function altogether and replace it with

// this.validateMessage();
new ParsedMessage(this.prepareMessage());

I’ve tried it and it passes all of my objects unit tests. This will also validate all of the resources URIs. (NOTE: that perpareMessage() needs a fix for the “empty statement” case that #30 pointed out was allowed in the spec.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW all of the client.test.ts unit tests pass with this fix also. But I have a question. When I run jest on client.test.ts it prints out several screens of red error text and a lot of warnings that about validate() has been deprecated, But then at the end it shows that all tests have passed. What's a person to make of this? Does something need to be fixed here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it prints out several screens of red error text and a lot of warnings

This PR will require a new major version and the deprecated function will be removed then, so that will remove those warnings.

Here’s a better suggestion. Remove the validateMessage() function altogether and replace it with new ParsedMessage(this.prepareMessage());

I was worried we would lose the information provided by SiweErrorType (e.g. if a nonce isn't long enough it would clearly state so), and we do a bit but it still points to the field (e.g. now it says something like "line 9: invalid nonce") so I'm ok with this change. But could you remove the variants from SiweErrorType that aren't used anymore?

Copy link

@Cali93 Cali93 Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related, EIP55 validation should be optional #53 (comment)

packages/siwe/lib/client.ts Outdated Show resolved Hide resolved
packages/siwe-parser/lib/siwe-abnf.txt Show resolved Hide resolved
… is the newly published version 4.4.0, 2)the script "apg" will generate a typescript grammar object from the ABNF grammar siwe-abnf.txt. Added an optional "doTrace" parameter to the constructor of ParsedMessage. The default is "false". If set to "true" apg-js will trace the parse tree of parsed message and write it to output/siwe-parser-trace.html. Consequently the output directory has been added to ".gitignore" and ".npmignore".
…dded "packages/siwe-parser/lib/t-isUri.test.ts" for unit testing of the "isUri" function. Added tests in the object block of the “SiweMessage” constructor for testing the EIP-55 address, the URI and the resources URIs. Added “packages/siwe/lib/objects.test.ts” for unit testing of the object block.
@ldthomas
Copy link
Author

ldthomas commented May 8, 2024

I think I’m done here except for

  1. I’ve included a workaround in client.ts. For importing isUri. I had to point to the specific directory and file. I’ll need some advice on how to fix that.
  2. I’ve not used your JSON object file methodology for the unit tests. As is mentioned previously, I’m willing to familiarize myself with that and do the (tedious) work of moving all of the test data into the proper JSON format. But I’d like to finish here for now and leave that to a later PR.

… as the syntax of date times. Removed the SiweMessage private function validateMessage(). Replaced it in the message object block of the constructor by parsing the stringified message object. In this way both the message and the message object get exactly the same validation. Fixed a bug for the case of an empty statement in the function toMessage().
@ldthomas
Copy link
Author

Trace: After looking at it, redesigning the trace module and publishing a new release of apg-js is way too much work for a feature that is not necessary for the operation of siwe and probably wouldn’t get much use anyway. The code is still there in comments if someone really wants to use the trace feature to debug a message.

Date times: In the process of analyzing the validation of the message objects in SiweMessage I’ve noticed that there is an asymmetry between the way the messages and message objects are currently validated. The messages are validated by parsing them but the parser does not check the semantics of the date times. RFC 3339 defines the syntax of the date time but much of the semantics is simply mentioned in comments. For example, the syntax of the date-month is simply 2 integers but the semantics requires the number to be in the range 1-12. In the case of a message object the semantics are validated with the regex ISO8601. I’ve copied this regex to the parser callbacks.ts and there I use it to validate the date time semantics while parsing. I’ve added unit tests in siwe/lib/objects.test.ts to cover both syntax and semantic date time errors.

validateMessage(): I’ve removed this function altogether and replaced it by parsing the “stringified” version of the message object. In this way, there is complete symmetry between the validation of a message and a message object. (Incidentally, I also fixed a small bug in the toMessage() function so that the case of an empty statement is handled according to spec.)

From the root lint and test both run without errors. I hope this latest commit works for you. If so then I can move on to updating the unit testing to your JSON object methodology.

@ldthomas
Copy link
Author

My apologies for an oversight. I’m still getting myself up to speed with the different ways a message and message object are handled in the SiweMessage constructor. I missed a bracket paring and ended up parsing the message twice. Taking a more careful look at what validateMessage() does, the only thing is can see that it is checking that the parser doesn’t is the validation of the date-times. isValidISO8601Date() does a little more than just the regex. On the other hand, there are some message fields that the parser validates that validateMessage() does not:

  • scheme
  • statement
  • Chain ID

So in this latest commit, I’ve moved the complete date-time validation into the parser’s callback functions and fixed the parsing/validation in the message/object blocks so that there is no redundancy. My intention here now is that all fields are both syntactically and semantically correct if the parser succeeds.

BTW siwe/lib/utils.ts isValidISO8601Date(), which I copied into the parser’s callback functions has a bug. Line 67

if (inputDate) {

should read

if (inputMatch) {

Copy link

@Bettyrockergit Bettyrockergit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages/siwe-parser/lib/t-chars.test.ts

@ldthomas
Copy link
Author

Thanks. Missed the debugging code left in this test.

@ldthomas
Copy link
Author

Hi @sbihel. Anything else you need? From my end I'm done here unless there is something else required for a merge. I'm waiting on that before I begin working on converting the unit tests to your JSON file methodology.

@sbihel
Copy link
Member

sbihel commented May 24, 2024

Thank you, it looks great. In addition to the clean-up of SiweErrorType:

BTW siwe/lib/utils.ts isValidISO8601Date(), which I copied into the parser’s callback functions has a bug

Would you mind fixing it as part of this PR please?

before I begin working on converting the unit tests to your JSON file methodology

Thank you for doing that, you can just move the tests from objects.test.ts and t-spec.test.ts -- the other tests are a bit too "low-level" and won't provide much value to other implementations.

@ldthomas
Copy link
Author

The fix to siwe/lib/utils.ts is done.

Converted siwe/lib/objects.test.ts to JSON file methodology.

I had already converted all of the siwe-parser/lib/t-*.test.ts to the JSON methodology and they are all included in unit.test.ts. If others are only interested in the specification tests they only need to use the valid-specification.json file and ignore the others. But I think having all of the unit tests available is nice for a quick check that anytime a change is made it hasn’t broken anything.

I really don’t want to mess with the SiweErrorType enum. The siwe-parser, past and present, never uses them. They only seem to be used in the siwe client and since I don’t really know what is going on there I really need to stay out of it. However, looking into that I did notice a big problem in client.test.ts. I’m not great with async programming so I’m not going to attempt to fix it but have a look at this code:

(n, test) => {
try {
new SiweMessage(test);
} catch (error) {
expect(Object.values(SiweErrorType).includes(error));
}
}
It can never fail and really does no testing at all. 1) If for some reason SiweMessage() unexpectedly succeeds, the test will just fall through without announcing failure. 2) expect(true) and expect(false), without any expect matchers, both succeed. So no matter what the value of Object.values(SiweErrorType).includes(error) the test succeeds. Not to mention that SiweMessage() doesn’t throw SiweErrorType anymore. My suggestion is to just replace this with an expect().toThrow() test. But I really don’t know what is going on with the tests that use async so I’m not going to touch that. Just pointing it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants