Problem
We need our logs to have unique ID values that we can use to index them efficiently in memory. Each ID should be tied to the timestamp of its log so that IDs can be sorted chronologically, and ID values must never collide. When we start building out segment functionality, we will want to operate on these logs efficiently in a data structure (binary tree, min-heap, etc.), so incorporating the timestamp into the ID is important.
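To illustrate why a time-ordered ID matters for structures like a min-heap, here is a minimal sketch. It does not use PKSUIDs: `make_sortable_id` is a hypothetical stand-in that puts a zero-padded timestamp in front of a random suffix, which is enough to show that plain ID comparison then follows log time.

```python
import heapq
import secrets


def make_sortable_id(prefix: str, timestamp: int) -> str:
    """Hypothetical stand-in for a time-sortable ID (NOT the PKSUID format).

    The timestamp is zero-padded to a fixed width so that lexicographic
    order matches chronological order for IDs sharing the same prefix.
    """
    return f"{prefix}_{timestamp:010d}{secrets.token_hex(8)}"


# IDs created out of chronological order.
ids = [make_sortable_id("log", ts) for ts in (1643510628, 1643510623, 1643510625)]

# A min-heap keyed on the ID alone yields the oldest log first.
heap = list(ids)
heapq.heapify(heap)
oldest = heapq.heappop(heap)

print(oldest.startswith("log_1643510623"))
```

Because the timestamp is part of the ID, no separate sort key has to be stored next to each entry.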
Approach
We should use Prefixed K-Sortable Unique IDentifiers (PKSUIDs) to accomplish this. A Python module, pksuid, already exists for them. The "prefix" part attaches a short type identifier to the front of the ID, such as log_1032HU2eZvKYlo2CEPtcnUvl. This will let us use PKSUIDs for other objects in the future, such as segments (ex: seg_1032HU2eZvKYlo2CEPtcnUvl).
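As a rough illustration of how the prefix tells object types apart, the sketch below splits an ID string at its first underscore. `split_prefixed_id` is a hypothetical helper, not part of the pksuid module, and it assumes the `prefix_body` layout seen in the IDs above.

```python
from typing import Tuple


def split_prefixed_id(value: str) -> Tuple[str, str]:
    """Split 'prefix_body' into (prefix, body).

    Assumption: the prefix is everything before the first underscore.
    """
    prefix, separator, body = value.partition("_")
    if not separator or not prefix or not body:
        raise ValueError(f"not a prefixed ID: {value!r}")
    return prefix, body


print(split_prefixed_id("log_1032HU2eZvKYlo2CEPtcnUvl"))
# ('log', '1032HU2eZvKYlo2CEPtcnUvl')
print(split_prefixed_id("seg_1032HU2eZvKYlo2CEPtcnUvl")[0])
# seg
```

In real code the module's own parsing would likely be the better choice; this only shows the string layout.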
Example
Below is an example of using the module. For our logs, we want to set the timestamp ourselves, using each log's own timestamp value. These timestamps should be normalized to Coordinated Universal Time (UTC) (see ticket #42).
    import time

    from pksuid import PKSUID

    # generate a new unique identifier with the prefix usr
    uid = PKSUID('usr')

    # returns 'usr_24OnhzwMpa4sh0NQmTmICTYuFaD'
    print(uid)

    # returns: usr
    print(uid.get_prefix())

    # returns: 1643510623
    print(uid.get_timestamp())

    # returns: 2022-01-30 02:43:43
    print(uid.get_datetime())

    # returns: b'\x81>*\xccDJT\xf1\xbe\xa9\xf3&\xe8\xa5\xb2\xc1'
    print(uid.get_payload())

    # convert from a str representation back to PKSUID
    uid_from_string = PKSUID.parse('usr_24OnhzwMpa4sh0NQmTmICTYuFaD')

    # this can now be used as usual
    # returns: 1643510623
    print(uid_from_string.get_timestamp())

    # conversion to and parsing from bytes is also possible
    uid_as_bytes = uid.bytes()
    uid_from_bytes = PKSUID.parse_bytes(uid_as_bytes)

    # returns: 2022-01-30 02:43:43
    print(uid_from_bytes.get_datetime())

    # all the standard comparison operators are available
    ts = int(time.time())

    # OUR USE CASE
    lesser_uid, greater_uid = PKSUID('usr', timestamp=ts), PKSUID('usr', timestamp=ts + 5)

    # returns True
    print(lesser_uid < greater_uid)

    # except for the case of equivalence operators (eq, ne), the prefix is not taken into account when comparing
    prefixed_uid_1, prefixed_uid_2 = PKSUID('diff', timestamp=ts), PKSUID('prefix', timestamp=ts + 5)

    # returns True
    print(prefixed_uid_1 < prefixed_uid_2)
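Since the timestamp passed to the ID should come from the log itself and be normalized to UTC, a conversion step along these lines may be useful. This is only a sketch using the standard library: `to_utc_epoch` is a hypothetical helper, and it assumes that log timestamps arrive as ISO-8601 strings and that strings without an offset are already in UTC. The actual rules belong to the UTC ticket.

```python
from datetime import datetime, timezone


def to_utc_epoch(raw: str) -> int:
    """Convert an ISO-8601 timestamp string to whole UTC epoch seconds."""
    moment = datetime.fromisoformat(raw)
    if moment.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return int(moment.astimezone(timezone.utc).timestamp())


# The same instant expressed with a -05:00 offset and as naive UTC.
print(to_utc_epoch("2022-01-29T21:43:43-05:00"))  # 1643510623
print(to_utc_epoch("2022-01-30T02:43:43"))        # 1643510623
```

The resulting integer is the kind of value that would be passed as the `timestamp` argument shown in the example above.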
Definition of Done
Build functionality into the log class to create an ID when parsing JSON
Tied to #40
Remember to use Python type hints when implementing
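One possible shape for this, with type hints, is sketched below. Everything here is an assumption for illustration: the `Log` fields, the `from_json` name, the `timestamp` JSON key (taken to already hold UTC epoch seconds), and the `IdFactory` callable. The ID generator is injected so that the sketch runs without the pksuid module installed.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical signature: (prefix, UTC epoch seconds) -> ID string.
IdFactory = Callable[[str, int], str]


@dataclass(frozen=True)
class Log:
    id: str
    timestamp: int
    payload: dict[str, Any]

    @classmethod
    def from_json(cls, raw: str, make_id: IdFactory) -> Log:
        """Parse one JSON log record and assign its ID from the log's own timestamp."""
        data = json.loads(raw)
        # Assumption: "timestamp" already holds UTC epoch seconds.
        ts = int(data["timestamp"])
        return cls(id=make_id("log", ts), timestamp=ts, payload=data)


def fake_id(prefix: str, ts: int) -> str:
    """Placeholder generator used only to keep this sketch dependency-free."""
    return f"{prefix}_{ts}"


log = Log.from_json('{"timestamp": 1643510623, "message": "hello"}', fake_id)
print(log.id)  # log_1643510623
```

With pksuid available, a factory such as `lambda prefix, ts: str(PKSUID(prefix, timestamp=ts))` could presumably be passed in place of `fake_id`.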