Make *_range functions consistent (pandas-dev#17482)
jschendel authored and jreback committed Sep 14, 2017
1 parent fa557f7 commit 2cf2566
Showing 14 changed files with 747 additions and 130 deletions.
9 changes: 9 additions & 0 deletions doc/source/api.rst
@@ -218,10 +218,19 @@ Top-level dealing with datetimelike
to_timedelta
date_range
bdate_range
cdate_range
period_range
timedelta_range
infer_freq

Top-level dealing with intervals
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autosummary::
:toctree: generated/

interval_range

Top-level evaluation
~~~~~~~~~~~~~~~~~~~~

9 changes: 9 additions & 0 deletions doc/source/timeseries.rst
@@ -1705,6 +1705,15 @@ has multiplied span.
pd.PeriodIndex(start='2014-01', freq='3M', periods=4)
If ``start`` or ``end`` are ``Period`` objects, they will be used as anchor
endpoints for a ``PeriodIndex`` with frequency matching that of the
``PeriodIndex`` constructor.

.. ipython:: python
pd.PeriodIndex(start=pd.Period('2017Q1', freq='Q'),
end=pd.Period('2017Q2', freq='Q'), freq='M')
Just like ``DatetimeIndex``, a ``PeriodIndex`` can also be used to index pandas
objects:

55 changes: 54 additions & 1 deletion doc/source/whatsnew/v0.21.0.txt
@@ -218,7 +218,7 @@ Furthermore this will now correctly box the results of iteration for :func:`Data
.. ipython:: ipython

d = {'a':[1], 'b':['b']}
- df = pd,DataFrame(d)
+ df = pd.DataFrame(d)

Previously:

@@ -358,6 +358,59 @@ Previously, :func:`to_datetime` did not localize datetime ``Series`` data when `

Additionally, DataFrames with datetime columns that were parsed by :func:`read_sql_table` and :func:`read_sql_query` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns.

.. _whatsnew_0210.api.consistency_of_range_functions:

Consistency of Range Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In previous versions, there were some inconsistencies between the various range functions: :func:`date_range`, :func:`bdate_range`, :func:`cdate_range`, :func:`period_range`, :func:`timedelta_range`, and :func:`interval_range` (:issue:`17471`).

One of the inconsistent behaviors occurred when the ``start``, ``end`` and ``periods`` parameters were all specified, potentially leading to ambiguous ranges. When all three parameters were passed, ``interval_range`` ignored the ``periods`` parameter, ``period_range`` ignored the ``end`` parameter, and the other range functions raised an error. To promote consistency among the range functions, and to avoid potentially ambiguous ranges, ``interval_range`` and ``period_range`` will now raise when all three parameters are passed.

Previous Behavior:

.. code-block:: ipython

In [2]: pd.interval_range(start=0, end=4, periods=6)
Out[2]:
IntervalIndex([(0, 1], (1, 2], (2, 3]]
closed='right',
dtype='interval[int64]')

In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
Out[3]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1', '2018Q2'], dtype='period[Q-DEC]', freq='Q-DEC')

New Behavior:

.. code-block:: ipython

In [2]: pd.interval_range(start=0, end=4, periods=6)
---------------------------------------------------------------------------
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified

In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
---------------------------------------------------------------------------
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified
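
The rule enforced above can be sketched in plain Python. This is a minimal illustration and not the pandas implementation: ``count_not_none`` and ``check_range_args`` are hypothetical names chosen for this example, and only the error message is taken from the text above.

```python
def count_not_none(*args):
    """Return how many of the given arguments are not None."""
    return sum(arg is not None for arg in args)


def check_range_args(start=None, end=None, periods=None):
    """Raise ValueError unless exactly two of the three arguments are given.

    Hypothetical helper for illustration; compares against None explicitly
    so that falsy values such as start=0 still count as specified.
    """
    if count_not_none(start, end, periods) != 2:
        raise ValueError('Of the three parameters: start, end, and periods, '
                         'exactly two must be specified')


# Two of the three parameters: accepted silently.
check_range_args(start=0, end=4)

# All three parameters: rejected, mirroring the new behavior described above.
try:
    check_range_args(start=0, end=4, periods=6)
except ValueError as err:
    message = str(err)
```

Passing only one parameter, or none, is rejected by the same check, since the count must be exactly two.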

Additionally, ``interval_range`` previously did not include the endpoint parameter ``end`` in the intervals it produced, whereas all other range functions include ``end`` in their output. To promote consistency among the range functions, ``interval_range`` will now include ``end`` as the right endpoint of the final interval, unless ``freq`` is specified in a way that skips ``end``.

Previous Behavior:

.. code-block:: ipython

In [4]: pd.interval_range(start=0, end=4)
Out[4]:
IntervalIndex([(0, 1], (1, 2], (2, 3]]
closed='right',
dtype='interval[int64]')


New Behavior:

.. ipython:: python

pd.interval_range(start=0, end=4)
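
The endpoint rule described above can be sketched without pandas. The function below is a hypothetical illustration for integer endpoints and right-closed intervals only; ``interval_pairs`` is not a pandas function, and it returns plain tuples rather than an ``IntervalIndex``.

```python
def interval_pairs(start, end, freq=1):
    """Return (left, right) pairs stepping by freq from start, never past end.

    Hypothetical sketch: the final pair ends exactly at ``end`` whenever the
    step size divides the span evenly; otherwise ``end`` is skipped.
    """
    if freq <= 0:
        raise ValueError('freq must be positive')
    pairs = []
    left = start
    while left + freq <= end:
        pairs.append((left, left + freq))
        left += freq
    return pairs


# With the default step, ``end`` is the right endpoint of the last pair.
default_step = interval_pairs(0, 4)

# A step of 3 cannot land on 4, so ``end`` is skipped.
skipping_step = interval_pairs(0, 4, freq=3)
```

Here ``default_step`` ends with the pair ``(3, 4)``, while ``skipping_step`` contains only ``(0, 3)``.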

.. _whatsnew_0210.api:

Other API Changes
58 changes: 31 additions & 27 deletions pandas/core/indexes/datetimes.py
@@ -292,8 +292,8 @@ def __new__(cls, data=None,
if is_float(periods):
periods = int(periods)
elif not is_integer(periods):
- raise ValueError('Periods must be a number, got %s' %
- str(periods))
+ msg = 'periods must be a number, got {periods}'
+ raise TypeError(msg.format(periods=periods))

if data is None and freq is None:
raise ValueError("Must provide freq argument if no data is "
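
The ``periods`` handling changed in this hunk (floats are coerced to ``int``, and any other non-integer now raises ``TypeError`` instead of ``ValueError``) can be sketched in isolation. This is a standalone approximation: ``normalize_periods`` is a hypothetical name, the standard-library checks stand in for the ``is_float`` and ``is_integer`` helpers used in the diff, and rejecting ``bool`` is an assumption of this sketch.

```python
import numbers


def normalize_periods(periods):
    """Coerce float periods to int; reject anything that is not a number.

    Hypothetical sketch. isinstance checks replace the is_float/is_integer
    helpers seen in the diff, and bool is rejected here by assumption.
    """
    if periods is None:
        return None
    if isinstance(periods, float):
        return int(periods)
    if isinstance(periods, bool) or not isinstance(periods, numbers.Integral):
        msg = 'periods must be a number, got {periods}'
        raise TypeError(msg.format(periods=periods))
    return periods


coerced = normalize_periods(3.0)   # a float is accepted and truncated to int

try:
    normalize_periods('foo')       # a string is not a number
except TypeError as err:
    type_error_message = str(err)
```

Raising ``TypeError`` rather than ``ValueError`` signals that the argument has the wrong type, as opposed to an acceptable type with an unacceptable value.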
@@ -412,7 +412,8 @@ def __new__(cls, data=None,
def _generate(cls, start, end, periods, name, offset,
tz=None, normalize=False, ambiguous='raise', closed=None):
if com._count_not_none(start, end, periods) != 2:
- raise ValueError('Must specify two of start, end, or periods')
+ raise ValueError('Of the three parameters: start, end, and '
+ 'periods, exactly two must be specified')

_normalized = True

@@ -2004,7 +2005,7 @@ def _generate_regular_range(start, end, periods, offset):
def date_range(start=None, end=None, periods=None, freq='D', tz=None,
normalize=False, name=None, closed=None, **kwargs):
"""
- Return a fixed frequency datetime index, with day (calendar) as the default
+ Return a fixed frequency DatetimeIndex, with day (calendar) as the default
frequency
Parameters
@@ -2013,24 +2014,25 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
Left bound for generating dates
end : string or datetime-like, default None
Right bound for generating dates
- periods : integer or None, default None
- If None, must specify start and end
+ periods : integer, default None
+ Number of periods to generate
freq : string or DateOffset, default 'D' (calendar daily)
Frequency strings can have multiples, e.g. '5H'
- tz : string or None
+ tz : string, default None
Time zone name for returning localized DatetimeIndex, for example
Asia/Hong_Kong
normalize : bool, default False
Normalize start/end dates to midnight before generating date range
- name : str, default None
- Name of the resulting index
- closed : string or None, default None
+ name : string, default None
+ Name of the resulting DatetimeIndex
+ closed : string, default None
Make the interval closed with respect to the given frequency to
the 'left', 'right', or both sides (None)
Notes
-----
- 2 of start, end, or periods must be specified
+ Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+ must be specified.
To learn more about the frequency strings, please see `this link
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
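
To illustrate why exactly two of ``start``, ``end``, and ``periods`` are enough to determine a range, here is a minimal standard-library sketch for a fixed daily frequency. It is not how ``date_range`` is implemented: ``daily_range`` is a hypothetical function that returns a list of ``datetime.date`` objects and ignores ``freq``, ``tz``, ``normalize``, and ``closed``.

```python
from datetime import date, timedelta


def daily_range(start=None, end=None, periods=None):
    """Build a list of consecutive dates from any two of the three arguments.

    Hypothetical sketch with a fixed one-day step; both bounds are inclusive.
    """
    given = sum(arg is not None for arg in (start, end, periods))
    if given != 2:
        raise ValueError('Of the three parameters: start, end, and periods, '
                         'exactly two must be specified')
    one_day = timedelta(days=1)
    if start is None:
        # end and periods are known: walk back to find the first date.
        start = end - (periods - 1) * one_day
    elif periods is None:
        # start and end are known: count the days, inclusive of both ends.
        periods = (end - start).days + 1
    return [start + i * one_day for i in range(periods)]


from_start = daily_range(start=date(2017, 1, 1), periods=3)
from_end = daily_range(end=date(2017, 1, 3), periods=3)
from_bounds = daily_range(start=date(2017, 1, 1), end=date(2017, 1, 3))
```

All three calls describe the same three days, which is why supplying a third argument is redundant at best and contradictory at worst.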
@@ -2047,7 +2049,7 @@ def date_range(start=None, end=None, periods=None, freq='D', tz=None,
def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
normalize=True, name=None, closed=None, **kwargs):
"""
- Return a fixed frequency datetime index, with business day as the default
+ Return a fixed frequency DatetimeIndex, with business day as the default
frequency
Parameters
@@ -2056,24 +2058,25 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
Left bound for generating dates
end : string or datetime-like, default None
Right bound for generating dates
- periods : integer or None, default None
- If None, must specify start and end
+ periods : integer, default None
+ Number of periods to generate
freq : string or DateOffset, default 'B' (business daily)
Frequency strings can have multiples, e.g. '5H'
tz : string or None
Time zone name for returning localized DatetimeIndex, for example
Asia/Beijing
normalize : bool, default False
Normalize start/end dates to midnight before generating date range
- name : str, default None
- Name for the resulting index
- closed : string or None, default None
+ name : string, default None
+ Name of the resulting DatetimeIndex
+ closed : string, default None
Make the interval closed with respect to the given frequency to
the 'left', 'right', or both sides (None)
Notes
-----
- 2 of start, end, or periods must be specified
+ Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+ must be specified.
To learn more about the frequency strings, please see `this link
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
@@ -2091,7 +2094,7 @@ def bdate_range(start=None, end=None, periods=None, freq='B', tz=None,
def cdate_range(start=None, end=None, periods=None, freq='C', tz=None,
normalize=True, name=None, closed=None, **kwargs):
"""
- **EXPERIMENTAL** Return a fixed frequency datetime index, with
+ **EXPERIMENTAL** Return a fixed frequency DatetimeIndex, with
CustomBusinessDay as the default frequency
.. warning:: EXPERIMENTAL
@@ -2105,29 +2108,30 @@ def cdate_range(start=None, end=None, periods=None, freq='C', tz=None,
Left bound for generating dates
end : string or datetime-like, default None
Right bound for generating dates
- periods : integer or None, default None
- If None, must specify start and end
+ periods : integer, default None
+ Number of periods to generate
freq : string or DateOffset, default 'C' (CustomBusinessDay)
Frequency strings can have multiples, e.g. '5H'
- tz : string or None
+ tz : string, default None
Time zone name for returning localized DatetimeIndex, for example
Asia/Beijing
normalize : bool, default False
Normalize start/end dates to midnight before generating date range
- name : str, default None
- Name for the resulting index
- weekmask : str, Default 'Mon Tue Wed Thu Fri'
+ name : string, default None
+ Name of the resulting DatetimeIndex
+ weekmask : string, Default 'Mon Tue Wed Thu Fri'
weekmask of valid business days, passed to ``numpy.busdaycalendar``
holidays : list
list/array of dates to exclude from the set of valid business days,
passed to ``numpy.busdaycalendar``
- closed : string or None, default None
+ closed : string, default None
Make the interval closed with respect to the given frequency to
the 'left', 'right', or both sides (None)
Notes
-----
- 2 of start, end, or periods must be specified
+ Of the three parameters: ``start``, ``end``, and ``periods``, exactly two
+ must be specified.
To learn more about the frequency strings, please see `this link
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.