pd.NaT creates unexpected behavior in TimeSeries #230

Open
@nsteins

When pandas Timestamps are used as keys to create a TimeSeries, any NaT key produces unexpected results. For example, the following snippet

import traces
import pandas as pd

ts = traces.TimeSeries({pd.Timestamp('2018-10-15 16:45:01'): 1,
                        pd.Timestamp('2019-02-22 12:05:08'): 2,
                        pd.NaT: 3,
                        pd.Timestamp('2019-04-16 13:08:26'): 4})

ts[pd.Timestamp('2019-02-21')]

returns 3, where I would expect 1 (the value at the most recent timestamp before 2019-02-21).

I suspect this is because pd.NaT handles comparisons differently from standard Python objects: ordering comparisons against NaT return False instead of raising an error. For example:

>>> pd.Timestamp('2019-01-01') >= pd.NaT
False
>>> pd.Timestamp('2019-01-01') <= pd.NaT
False
>>> pd.Timestamp('2019-01-01') > None
TypeError: '>' not supported between instances of 'Timestamp' and 'NoneType'
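
To illustrate how a key that answers False to every ordering comparison can derail a sorted lookup, here is a rough sketch using only the standard library. It is not the traces or sortedcontainers code: `float("nan")` stands in for pd.NaT (assuming NaT compares the same way, as the session above suggests), plain floats stand in for timestamps, and `bisect` stands in for the sorted container. The `lookup` helper is hypothetical.

```python
import bisect

NAT_STAND_IN = float("nan")  # assumption: compares like pd.NaT (always False)


def lookup(keys, values, t):
    """Return the value of the last key <= t (hypothetical helper)."""
    index = bisect.bisect_right(keys, t) - 1
    return values[index]


# Build the "sorted" key list by inserting in the same order as the issue.
keys, values = [], []
for key, value in [(1.0, 1), (2.0, 2), (NAT_STAND_IN, 3), (4.0, 4)]:
    position = bisect.bisect_right(keys, key)
    keys.insert(position, key)
    values.insert(position, value)

# The NaN key lands in the middle, and the binary search for 1.5 probes it
# first. Because 1.5 < nan is False, the search moves right and skips 1.0.
print(lookup(keys, values, 1.5))  # prints 3, not 1
```

If the NaN entry is left out, the same lookup returns 1, which is consistent with the behavior described above.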

This difference is probably breaking the sortedcontainers implementation that TimeSeries relies on. I'm not sure what the best way to handle this is. I could see either raising an error when a NaT is added to a TimeSeries, or converting it to a type that compares more consistently.
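
The first option, raising an error, could look something like the sketch below. This is only an illustration, not a patch against traces: `reject_unordered_keys` is a hypothetical helper that a caller would run on the data before building a TimeSeries. It relies on the assumption that NaN-like values, including pd.NaT, are not equal to themselves. With pandas available, `pd.isna(key)` would be a more explicit check.

```python
from datetime import datetime


def reject_unordered_keys(data):
    """Return a copy of `data`, raising ValueError on NaN-like keys.

    Hypothetical helper: a key that is not equal to itself (float('nan'),
    and by assumption pd.NaT) cannot be ordered consistently.
    """
    for key in data:
        if key != key:
            raise ValueError(f"cannot use {key!r} as a time key")
    return dict(data)


clean = {datetime(2018, 10, 15, 16, 45, 1): 1,
         datetime(2019, 2, 22, 12, 5, 8): 2}
checked = reject_unordered_keys(clean)  # passes through unchanged

try:
    reject_unordered_keys({1.0: 1, float("nan"): 3})
except ValueError as error:
    print(error)
```

Failing early like this keeps the bad key out of the sorted structure, at the cost of forcing callers to clean their data first.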
