Skip to content

DEPR: Timestamp.freq #41586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jun 7, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
212 changes: 212 additions & 0 deletions doc/source/reference/offset_frequency.rst

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.3.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -690,6 +690,7 @@ Deprecations
- Deprecated special treatment of lists with first element a Categorical in the :class:`DataFrame` constructor; pass as ``pd.DataFrame({col: categorical, ...})`` instead (:issue:`38845`)
- Deprecated behavior of :class:`DataFrame` constructor when a ``dtype`` is passed and the data cannot be cast to that dtype. In a future version, this will raise instead of being silently ignored (:issue:`24435`)
- Deprecated passing arguments as positional (except for ``"method"``) in :meth:`DataFrame.interpolate` and :meth:`Series.interpolate` (:issue:`41485`)
- Deprecated the :attr:`Timestamp.freq` attribute. For the properties that use it (``is_month_start``, ``is_month_end``, ``is_quarter_start``, ``is_quarter_end``, ``is_year_start``, ``is_year_end``), when you have a ``freq``, use e.g. ``freq.is_month_start(ts)`` (:issue:`15146`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find the explanation of "for properties that use it .., when you have a freq, ..." very clear. Adding an actual example might clarify.

First, I assume people who do ts.is_month_start (with ts being a Timestamp) don't necessarily know that they are relying on some freq.
Also, the attributes are not going away, right? It's only the fastpath for computing it based on the freq that goes away? Are there actually cases that will change behaviour, or is it only performance? (and if only performance, that might not matter much for scalars, or maybe not worth warning about).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there actually cases that will change behaviour, or is it only performance?

Yes, e.g.

dti = pd.date_range("2017-01-01", periods=30, freq="B")
dti[0].is_month_start  # <- True now, False once freq is removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in principle we can know when this will change? (depending on the freq, we can know whether it can give different answers?)
Could we warn only in those cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, will do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to warn in fewer cases. Still haven't updated the whatsnew note per the first comment here. suggestions on wording?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree here. i think we are simply deprecating freq. why do we need to even say anything about is_month_start for example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bc the result of is_month_start is going to change in some cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One consequence of deprecating Timestamp.freq is that when having a timeseries with a freq, the following equivalency won't no longer always hold:

dtidx.is_month_start[0] == dtidx[0].is_month_start

That's a bit unfortunate, I would say, and probably something we should explicitly document.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for this as a followup documentation (but for 1.3)

- Deprecated passing arguments as positional in :meth:`DataFrame.ffill`, :meth:`Series.ffill`, :meth:`DataFrame.bfill`, and :meth:`Series.bfill` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.sort_values` (other than ``"by"``) and :meth:`Series.sort_values` (:issue:`41485`)
- Deprecated passing arguments as positional in :meth:`DataFrame.dropna` and :meth:`Series.dropna` (:issue:`41485`)
Expand Down
20 changes: 20 additions & 0 deletions pandas/_libs/tslibs/offsets.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -716,6 +716,26 @@ cdef class BaseOffset:
# if there were a canonical docstring for what is_anchored means.
return self.n == 1

# ------------------------------------------------------------------

def is_month_start(self, _Timestamp ts):
return ts._get_start_end_field("is_month_start", self)

def is_month_end(self, _Timestamp ts):
return ts._get_start_end_field("is_month_end", self)

def is_quarter_start(self, _Timestamp ts):
return ts._get_start_end_field("is_quarter_start", self)

def is_quarter_end(self, _Timestamp ts):
return ts._get_start_end_field("is_quarter_end", self)

def is_year_start(self, _Timestamp ts):
return ts._get_start_end_field("is_year_start", self)

def is_year_end(self, _Timestamp ts):
return ts._get_start_end_field("is_year_end", self)


cdef class SingleConstructorOffset(BaseOffset):
@classmethod
Expand Down
6 changes: 4 additions & 2 deletions pandas/_libs/tslibs/timestamps.pxd
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,15 @@ cdef object create_timestamp_from_ts(int64_t value,
cdef class _Timestamp(ABCTimestamp):
cdef readonly:
int64_t value, nanosecond
object freq
object _freq

cdef bint _get_start_end_field(self, str field)
cdef bint _get_start_end_field(self, str field, freq)
cdef _get_date_name_field(self, str field, object locale)
cdef int64_t _maybe_convert_value_to_local(self)
cdef bint _can_compare(self, datetime other)
cpdef to_datetime64(self)
cpdef datetime to_pydatetime(_Timestamp self, bint warn=*)
cdef bint _compare_outside_nanorange(_Timestamp self, datetime other,
int op) except -1
cpdef void _set_freq(self, freq)
cdef _warn_on_field_deprecation(_Timestamp self, freq, str field)
3 changes: 3 additions & 0 deletions pandas/_libs/tslibs/timestamps.pyi
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ from typing import (
import numpy as np

from pandas._libs.tslibs import (
BaseOffset,
NaT,
NaTType,
Period,
Expand Down Expand Up @@ -57,6 +58,8 @@ class Timestamp(datetime):
fold: int | None= ...,
) -> _S | NaTType: ...

def _set_freq(self, freq: BaseOffset | None) -> None: ...

@property
def year(self) -> int: ...
@property
Expand Down
138 changes: 108 additions & 30 deletions pandas/_libs/tslibs/timestamps.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ cdef inline object create_timestamp_from_ts(int64_t value,
dts.day, dts.hour, dts.min,
dts.sec, dts.us, tz, fold=fold)
ts_base.value = value
ts_base.freq = freq
ts_base._freq = freq
ts_base.nanosecond = dts.ps // 1000

return ts_base
Expand Down Expand Up @@ -155,6 +155,21 @@ cdef class _Timestamp(ABCTimestamp):
dayofweek = _Timestamp.day_of_week
dayofyear = _Timestamp.day_of_year

cpdef void _set_freq(self, freq):
# set the ._freq attribute without going through the constructor,
# which would issue a warning
# Caller is responsible for validation
self._freq = freq

@property
def freq(self):
warnings.warn(
"Timestamp.freq is deprecated and will be removed in a future version",
FutureWarning,
stacklevel=1,
)
return self._freq

def __hash__(_Timestamp self):
if self.nanosecond:
return hash(self.value)
Expand Down Expand Up @@ -263,7 +278,9 @@ cdef class _Timestamp(ABCTimestamp):

if is_any_td_scalar(other):
nanos = delta_to_nanoseconds(other)
result = type(self)(self.value + nanos, tz=self.tzinfo, freq=self.freq)
result = type(self)(self.value + nanos, tz=self.tzinfo)
if result is not NaT:
result._set_freq(self._freq) # avoid warning in constructor
return result

elif is_integer_object(other):
Expand Down Expand Up @@ -361,18 +378,17 @@ cdef class _Timestamp(ABCTimestamp):
val = self.value
return val

cdef bint _get_start_end_field(self, str field):
cdef bint _get_start_end_field(self, str field, freq):
cdef:
int64_t val
dict kwds
ndarray[uint8_t, cast=True] out
int month_kw

freq = self.freq
if freq:
kwds = freq.kwds
month_kw = kwds.get('startingMonth', kwds.get('month', 12))
freqstr = self.freqstr
freqstr = self._freqstr
else:
month_kw = 12
freqstr = None
Expand All @@ -382,6 +398,31 @@ cdef class _Timestamp(ABCTimestamp):
field, freqstr, month_kw)
return out[0]

cdef _warn_on_field_deprecation(self, freq, str field):
"""
Warn if the removal of .freq change the value of start/end properties.
"""
cdef:
bint needs = False

if freq is not None:
kwds = freq.kwds
month_kw = kwds.get("startingMonth", kwds.get("month", 12))
freqstr = self._freqstr
if month_kw != 12:
needs = True
if freqstr.startswith("B"):
needs = True

if needs:
warnings.warn(
"Timestamp.freq is deprecated and will be removed in a future "
"version. When you have a freq, use "
f"freq.{field}(timestamp) instead",
FutureWarning,
stacklevel=1,
)

@property
def is_month_start(self) -> bool:
"""
Expand All @@ -397,10 +438,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_month_start
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return self.day == 1
return self._get_start_end_field("is_month_start")
self._warn_on_field_deprecation(self._freq, "is_month_start")
return self._get_start_end_field("is_month_start", self._freq)

@property
def is_month_end(self) -> bool:
Expand All @@ -417,10 +459,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_month_end
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return self.day == self.days_in_month
return self._get_start_end_field("is_month_end")
self._warn_on_field_deprecation(self._freq, "is_month_end")
return self._get_start_end_field("is_month_end", self._freq)

@property
def is_quarter_start(self) -> bool:
Expand All @@ -437,10 +480,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_quarter_start
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return self.day == 1 and self.month % 3 == 1
return self._get_start_end_field("is_quarter_start")
self._warn_on_field_deprecation(self._freq, "is_quarter_start")
return self._get_start_end_field("is_quarter_start", self._freq)

@property
def is_quarter_end(self) -> bool:
Expand All @@ -457,10 +501,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_quarter_end
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return (self.month % 3) == 0 and self.day == self.days_in_month
return self._get_start_end_field("is_quarter_end")
self._warn_on_field_deprecation(self._freq, "is_quarter_end")
return self._get_start_end_field("is_quarter_end", self._freq)

@property
def is_year_start(self) -> bool:
Expand All @@ -477,10 +522,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_year_start
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return self.day == self.month == 1
return self._get_start_end_field("is_year_start")
self._warn_on_field_deprecation(self._freq, "is_year_start")
return self._get_start_end_field("is_year_start", self._freq)

@property
def is_year_end(self) -> bool:
Expand All @@ -497,10 +543,11 @@ cdef class _Timestamp(ABCTimestamp):
>>> ts.is_year_end
True
"""
if self.freq is None:
if self._freq is None:
# fast-path for non-business frequencies
return self.month == 12 and self.day == 31
return self._get_start_end_field("is_year_end")
self._warn_on_field_deprecation(self._freq, "is_year_end")
return self._get_start_end_field("is_year_end", self._freq)

cdef _get_date_name_field(self, str field, object locale):
cdef:
Expand Down Expand Up @@ -673,11 +720,11 @@ cdef class _Timestamp(ABCTimestamp):

def __setstate__(self, state):
self.value = state[0]
self.freq = state[1]
self._freq = state[1]
self.tzinfo = state[2]

def __reduce__(self):
object_state = self.value, self.freq, self.tzinfo
object_state = self.value, self._freq, self.tzinfo
return (Timestamp, object_state)

# -----------------------------------------------------------------
Expand Down Expand Up @@ -719,7 +766,7 @@ cdef class _Timestamp(ABCTimestamp):
pass

tz = f", tz='{zone}'" if zone is not None else ""
freq = "" if self.freq is None else f", freq='{self.freqstr}'"
freq = "" if self._freq is None else f", freq='{self._freqstr}'"

return f"Timestamp('{stamp}'{tz}{freq})"

Expand Down Expand Up @@ -877,7 +924,13 @@ cdef class _Timestamp(ABCTimestamp):
)

if freq is None:
freq = self.freq
freq = self._freq
warnings.warn(
"In a future version, calling 'Timestamp.to_period()' without "
"passing a 'freq' will raise an exception.",
FutureWarning,
stacklevel=2,
)

return Period(self, freq=freq)

Expand Down Expand Up @@ -1147,7 +1200,7 @@ class Timestamp(_Timestamp):
nanosecond=None,
tzinfo_type tzinfo=None,
*,
fold=None
fold=None,
):
# The parameter list folds together legacy parameter names (the first
# four) and positional and keyword parameter names from pydatetime.
Expand Down Expand Up @@ -1276,9 +1329,16 @@ class Timestamp(_Timestamp):

if freq is None:
# GH 22311: Try to extract the frequency of a given Timestamp input
freq = getattr(ts_input, 'freq', None)
elif not is_offset_object(freq):
freq = to_offset(freq)
freq = getattr(ts_input, '_freq', None)
else:
warnings.warn(
"The 'freq' argument in Timestamp is deprecated and will be "
"removed in a future version.",
FutureWarning,
stacklevel=1,
)
if not is_offset_object(freq):
freq = to_offset(freq)

return create_timestamp_from_ts(ts.value, ts.dts, ts.tzinfo, freq, ts.fold)

Expand Down Expand Up @@ -1551,12 +1611,21 @@ timedelta}, default 'raise'
"Use tz_localize() or tz_convert() as appropriate"
)

@property
def _freqstr(self):
return getattr(self._freq, "freqstr", self._freq)

@property
def freqstr(self):
"""
Return the total number of days in the month.
"""
return getattr(self.freq, 'freqstr', self.freq)
warnings.warn(
"Timestamp.freqstr is deprecated and will be removed in a future version.",
FutureWarning,
stacklevel=1,
)
return self._freqstr

def tz_localize(self, tz, ambiguous='raise', nonexistent='raise'):
"""
Expand Down Expand Up @@ -1647,12 +1716,18 @@ default 'raise'
value = tz_localize_to_utc_single(self.value, tz,
ambiguous=ambiguous,
nonexistent=nonexistent)
return Timestamp(value, tz=tz, freq=self.freq)
out = Timestamp(value, tz=tz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we some just avoid setting this?

if out is not NaT:
out._set_freq(self._freq) # avoid warning in constructor
return out
else:
if tz is None:
# reset tz
value = tz_convert_from_utc_single(self.value, self.tz)
return Timestamp(value, tz=tz, freq=self.freq)
out = Timestamp(value, tz=tz)
if out is not NaT:
out._set_freq(self._freq) # avoid warning in constructor
return out
else:
raise TypeError(
"Cannot localize tz-aware Timestamp, use tz_convert for conversions"
Expand Down Expand Up @@ -1707,7 +1782,10 @@ default 'raise'
)
else:
# Same UTC timestamp, different time zone
return Timestamp(self.value, tz=tz, freq=self.freq)
out = Timestamp(self.value, tz=tz)
if out is not NaT:
out._set_freq(self._freq) # avoid warning in constructor
return out

astimezone = tz_convert

Expand Down Expand Up @@ -1840,7 +1918,7 @@ default 'raise'
if value != NPY_NAT:
check_dts_bounds(&dts)

return create_timestamp_from_ts(value, dts, tzobj, self.freq, fold)
return create_timestamp_from_ts(value, dts, tzobj, self._freq, fold)

def to_julian_date(self) -> np.float64:
"""
Expand Down
6 changes: 5 additions & 1 deletion pandas/core/arrays/datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -1189,7 +1189,11 @@ def _addsub_object_array(self, other: np.ndarray, op):
# Caller is responsible for broadcasting if necessary
assert self.shape == other.shape, (self.shape, other.shape)

res_values = op(self.astype("O"), np.asarray(other))
with warnings.catch_warnings():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouln't these not have freq to begin with?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self here is DTA/TDA/PA, which all have freq. there is an issue about removing freq from DTA/TDA, which this deprecation is a blocker for

# filter out warnings about Timestamp.freq
warnings.filterwarnings("ignore", category=FutureWarning)
res_values = op(self.astype("O"), np.asarray(other))

result = pd_array(res_values.ravel())
# error: Item "ExtensionArray" of "Union[Any, ExtensionArray]" has no attribute
# "reshape"
Expand Down
Loading