DATETIME_RFC339 (sic) is not a correct regex for an RFC 3339 datetime #78
Comments
Current recommendation is to use https://github.com/closeio/ciso8601 for RFC3339 parsing.
In #131 we decided to switch to pydantic for datetime parsing/validation. If you feel that this should be changed, please let us know, or close this issue 🙏
That sounds good, as long as it's not both custom and wrong :)
The datetime pattern in https://github.com/stac-utils/stac-pydantic/blob/master/stac_pydantic/shared.py#L18 (strictly a strptime-style format string rather than a regex)
DATETIME_RFC339 = "%Y-%m-%dT%H:%M:%SZ"
is not correct wrt RFC 3339.
This is important because it is used in Search object validation, and incorrectly returns an error on some valid datetime values.
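To make the failure concrete, here is a minimal sketch using only the standard library. It assumes the format string is applied with datetime.strptime (I have not confirmed how stac-pydantic uses it internally), and the helper name accepted_by_format is hypothetical.

```python
from datetime import datetime

# Format string quoted in the issue above.
DATETIME_RFC339 = "%Y-%m-%dT%H:%M:%SZ"


def accepted_by_format(value: str) -> bool:
    """Return True if strptime accepts the value with the quoted format.

    Assumption: validation is done with datetime.strptime; this helper is
    hypothetical and is not part of stac-pydantic.
    """
    try:
        datetime.strptime(value, DATETIME_RFC339)
        return True
    except ValueError:
        return False


# All three samples are valid RFC 3339 datetimes.
samples = [
    "2020-02-12T23:20:50Z",       # whole seconds, UTC designator
    "2020-02-12T23:20:50.52Z",    # fractional seconds
    "2020-02-12T23:20:50+00:00",  # numeric offset instead of Z
]

for sample in samples:
    print(sample, accepted_by_format(sample))
```

Only the first sample gets through. The format has no slot for fractional seconds, and the trailing Z is a literal character rather than a timezone directive, so numeric offsets are rejected and an accepted value parses to a naive datetime.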
Copying this from my personal notes on RFC 3339 datetimes:
RFC 3339 is a profile of ISO 8601, adding these constraints:
- a complete representation of date and time (fractional seconds optional).
- requires 4-digit years
- only allows a period character to be used as the decimal point for fractional seconds
- requires the zone offset to be Z or like +00:00, while ISO8601 allows like +0000
Also note that the 'T' separator is required, and there's quite a bit of confusion over this, see https://stackoverflow.com/questions/63783868/what-are-valid-date-time-separators-in-rfc3339-strings
These are a few examples of what would be allowed for ISO8601 but not RFC 3339:
Below are all valid RFC 3339 datetimes. Note the fractional seconds, Z or z as a timezone, positive and negative arbitrary offset timezones, T or any other character as a separator between date and time.
I think this is the correct regex for an RFC 3339 datetime:
r"^(\d\d\d\d)\-(\d\d)\-(\d\d)(T|t)(\d\d):(\d\d):(\d\d)(\.\d+)?(Z|([-+])(\d\d):(\d\d))$"
This is slightly different from the one in python-strict-rfc3339, as it allows T or t for the separator, and puts - before + in the sign character class ([-+]) so that the - doesn't need to be escaped as \-
Matching the datetime string to this will ensure it is a valid RFC 3339 (not just an ISO 8601 datetime), and then an ISO8601 parser can be used to parse it further if need be.
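A minimal sketch of that validate-then-parse idea, using the regex above unchanged. The function name parse_rfc3339 is hypothetical. Rather than handing the string to a separate ISO 8601 parser, this version builds the datetime directly from the captured groups, which is a substitution on my part to keep the example stdlib-only.

```python
import re
from datetime import datetime, timedelta, timezone

# Regex copied verbatim from the issue text above.
RFC3339_RE = re.compile(
    r"^(\d\d\d\d)\-(\d\d)\-(\d\d)(T|t)(\d\d):(\d\d):(\d\d)(\.\d+)?(Z|([-+])(\d\d):(\d\d))$"
)


def parse_rfc3339(value: str) -> datetime:
    """Validate value against the regex, then build an aware datetime.

    Hypothetical helper for illustration only. Fractional digits beyond
    microsecond precision are truncated, and a leap second (":60") raises
    ValueError because datetime cannot represent it.
    """
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    match = RFC3339_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 datetime: {value!r}")

    (year, month, day, _sep, hour, minute, second,
     frac, zone, sign, off_h, off_m) = match.groups()

    # ".52" -> "52" -> "520000" -> 520000 microseconds
    microsecond = int(frac[1:7].ljust(6, "0")) if frac else 0

    if zone == "Z":
        tz = timezone.utc
    else:
        delta = timedelta(hours=int(off_h), minutes=int(off_m))
        tz = timezone(-delta if sign == "-" else delta)

    return datetime(
        int(year), int(month), int(day),
        int(hour), int(minute), int(second),
        microsecond, tzinfo=tz,
    )


print(parse_rfc3339("2020-02-12T23:20:50Z"))
print(parse_rfc3339("2020-02-12t23:20:50.52+01:00"))
```

A value with an ISO 8601 style offset such as 2020-02-12T23:20:50+0100 raises ValueError here, since it fails the regex. Note that, as written, the regex accepts only an uppercase Z, so a lowercase z would also be rejected by this sketch.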
The built-in Python datetime library is not sufficient to parse all valid datetimes here; notably, it doesn't parse Z as a timezone. There are two options for this:
Additionally, hypothesis-jsonschema has support for generating datetimes for testing: https://github.com/Zac-HD/hypothesis-jsonschema/blob/1c5f107230ccbd48c66d7c6693833745a598e294/src/hypothesis_jsonschema/_from_schema.py