feat(LogUniqueId): Create unique ID for logs #41

rc10house · 2024-06-26T16:26:59Z

Tied to #40

Problem

We need each log to have a unique ID that we can use to index logs in memory efficiently. IDs should be derived from the log's timestamp so that they sort chronologically, and ID values must not collide. When we start building out segment functionality, we will want to operate on these logs efficiently in a data structure (binary tree, min-heap, etc.), so incorporating the timestamp into the ID is important.
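
To illustrate why a timestamp-leading ID matters for the data structures mentioned above, here is a minimal sketch using Python's heapq. The (timestamp, payload) tuple and the make_stand_in_id helper are hypothetical stand-ins for a real ID, not the final format. They only show that IDs which compare by timestamp first can be pushed onto a min-heap and popped in chronological order.

```python
import heapq
import os
from typing import List, Tuple

# Hypothetical stand-in for a sortable ID: (epoch seconds, random payload).
# Tuples compare element by element, so the timestamp decides the order and
# the random payload only breaks ties between logs from the same second.
StandInId = Tuple[int, bytes]


def make_stand_in_id(timestamp: int) -> StandInId:
    return (timestamp, os.urandom(16))


# Logs arrive out of chronological order.
arrival_timestamps = [1643510630, 1643510623, 1643510628]

heap: List[StandInId] = []
for ts in arrival_timestamps:
    heapq.heappush(heap, make_stand_in_id(ts))

# Popping from the min-heap yields the IDs oldest first.
popped_timestamps = [heapq.heappop(heap)[0] for _ in range(len(heap))]

# prints [1643510623, 1643510628, 1643510630]
print(popped_timestamps)
```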

Approach

We should use Prefixed K-Sortable Unique IDentifiers (PKSUID) to accomplish this. A Python module exists here. The "prefix" part lets us attach a short type prefix to the ID, such as log_1032HU2eZvKYlo2CEPtcnUvl. This will allow us to use PKSUIDs for other objects in the future, such as segments (e.g., seg_1032HU2eZvKYlo2CEPtcnUvl).
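
As a small illustration of what the prefix buys us, the sketch below splits an ID string into its prefix and body so that code can tell a log ID from a segment ID. The helper name split_prefix is hypothetical and is not part of the pksuid module. It also assumes the prefix itself never contains an underscore.

```python
from typing import Tuple


def split_prefix(identifier: str) -> Tuple[str, str]:
    """Split 'prefix_body' into (prefix, body). Hypothetical helper."""
    prefix, separator, body = identifier.partition("_")
    if not separator or not prefix or not body:
        raise ValueError(f"not a prefixed ID: {identifier!r}")
    return prefix, body


# Map known prefixes to object kinds (assumed prefixes from this issue).
KINDS = {"log": "log", "seg": "segment"}

for identifier in ("log_1032HU2eZvKYlo2CEPtcnUvl", "seg_1032HU2eZvKYlo2CEPtcnUvl"):
    prefix, body = split_prefix(identifier)
    print(f"{identifier} -> {KINDS.get(prefix, 'unknown')}")
```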

Example

Below is an example of using the module. We want to set the timestamp ourselves, using the timestamp value of each log. These timestamps should be normalized to Coordinated Universal Time (UTC) (ticket here: #42).

from pksuid import PKSUID

# generate a new unique identifier with the prefix usr
uid = PKSUID('usr')

# returns: usr_24OnhzwMpa4sh0NQmTmICTYuFaD
print(uid)

# returns: usr
print(uid.get_prefix())

# returns: 1643510623
print(uid.get_timestamp())

# returns: 2022-01-30 02:43:43
print(uid.get_datetime())

# returns: b'\x81>*\xccDJT\xf1\xbe\xa9\xf3&\xe8\xa5\xb2\xc1'
print(uid.get_payload())

# convert from a str representation back to PKSUID
uid_from_string = PKSUID.parse('usr_24OnhzwMpa4sh0NQmTmICTYuFaD')

# this can now be used as usual
# returns: 1643510623
print(uid_from_string.get_timestamp())

# conversion to and parsing from bytes is also possible
uid_as_bytes = uid.bytes()
uid_from_bytes = PKSUID.parse_bytes(uid_as_bytes)

# returns: 2022-01-30 02:43:43
print(uid_from_bytes.get_datetime())

# all the standard comparison operators are available
import time
ts = int(time.time())

# OUR USE CASE: pass the timestamp explicitly instead of letting the module pick one
lesser_uid = PKSUID('usr', timestamp=ts)
greater_uid = PKSUID('usr', timestamp=ts + 5)

# returns: True
print(lesser_uid < greater_uid)

# the prefix is ignored by the ordering operators; only the equality operators (eq, ne) take it into account
prefixed_uid_1 = PKSUID('diff', timestamp=ts)
prefixed_uid_2 = PKSUID('prefix', timestamp=ts + 5)

# returns: True
print(prefixed_uid_1 < prefixed_uid_2)

Definition of Done

Build functionality into log class to create ID when parsing JSON
Normalize time to UTC -> feat(NormalizeTimestamp): Normalize UserAle and arbitrary log timestamps to UTC #42
Create PKSUID and store as field within log object

Remember to use Python type hints when implementing
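
One possible shape for the items above, offered as a hedged sketch rather than the final implementation. The Log class, its from_json method, the clientTime field name (assumed here to hold epoch milliseconds), and the stand_in_id factory are all assumptions made for illustration. The stand-in keeps the sketch dependency-free; in the real implementation the factory would presumably wrap PKSUID('log', timestamp=...) as in the example above.

```python
import json
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict

# Assumption: an ID factory takes a prefix and epoch seconds and returns a str.
IdFactory = Callable[[str, int], str]


def stand_in_id(prefix: str, timestamp: int) -> str:
    """Dependency-free placeholder, NOT a real PKSUID.

    Zero-padded seconds come first so IDs with the same prefix sort by
    time; the random hex suffix makes collisions unlikely.
    """
    return f"{prefix}_{timestamp:010d}{secrets.token_hex(8)}"


@dataclass(frozen=True)
class Log:
    id: str
    timestamp: datetime  # always timezone-aware UTC
    data: Dict[str, Any]

    @classmethod
    def from_json(cls, raw: str, make_id: IdFactory = stand_in_id) -> "Log":
        data: Dict[str, Any] = json.loads(raw)
        # Assumption: "clientTime" holds epoch milliseconds.
        ts = datetime.fromtimestamp(data["clientTime"] / 1000, tz=timezone.utc)
        return cls(id=make_id("log", int(ts.timestamp())), timestamp=ts, data=data)


log = Log.from_json('{"clientTime": 1643510623000, "type": "click"}')
print(log.id.startswith("log_"), log.timestamp.isoformat())
```

Passing the ID factory as a parameter keeps the parsing logic testable without the third-party module and leaves room to reuse the same pattern for segment IDs later.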


jlhitzeman · 2024-06-27T17:27:28Z

Commenting for access

rc10house · 2024-07-10T22:20:45Z

Closed via #47

EandrewJones added the enhancement label Jun 26, 2024

EandrewJones added this to the Library Primitives milestone Jun 26, 2024

EandrewJones assigned EandrewJones and jlhitzeman and unassigned EandrewJones Jun 27, 2024

rc10house closed this as completed Jul 10, 2024

This was referenced Jul 11, 2024

getUUID refactor #30

Closed

Pull request for issue #28 #29

Closed
