Multiple aggregates and TZ error fixed #2

Open: wants to merge 2 commits into base: master
3 changes: 2 additions & 1 deletion AUTHORS.rst
@@ -5,4 +5,5 @@ Authors & Contributors
* Mikhail Korobov;
* Pawel Tomasiewicz;
* Steve Jones;
* @ivirabyan.
* @ivirabyan;
* Abd Allah Diab.
74 changes: 52 additions & 22 deletions README.rst
@@ -36,11 +36,11 @@ How many users signed up today? this month? this year?
qs = User.objects.all()
qss = qsstats.QuerySetStats(qs, 'date_joined')

print '%s new accounts today.' % qss.this_day()
print '%s new accounts this week.' % qss.this_week()
print '%s new accounts this month.' % qss.this_month()
print '%s new accounts this year.' % qss.this_year()
print '%s new accounts until now.' % qss.until_now()
print '%s new accounts today.' % qss.this_day()[0]
print '%s new accounts this week.' % qss.this_week()[0]
print '%s new accounts this month.' % qss.this_month()[0]
print '%s new accounts this year.' % qss.this_year()[0]
print '%s new accounts until now.' % qss.until_now()[0]

This might print something like::

@@ -61,9 +61,6 @@ Aggregating time-series data suitable for graphing
qs = User.objects.all()
qss = qsstats.QuerySetStats(qs, 'date_joined')

today = datetime.date.today()
seven_days_ago = today - datetime.timedelta(days=7)

time_series = qss.time_series(seven_days_ago, today)
print 'New users in the last 7 days: %s' % [t[1] for t in time_series]

@@ -74,6 +71,32 @@ This might print something like::

Please see qsstats/tests.py for similar usage examples.

Multiple aggregates
-------------------

::

from my_store_app.models import Purchase
from django.db.models import Sum, Count
import datetime, qsstats

qs = Purchase.objects.all()
qss = qsstats.QuerySetStats(qs, 'date_purchased', aggregates=[Count('id'), Sum('amount')])

print '%s Purchases of value %s today.' % tuple(qss.this_day())
print '%s Purchases of value %s this week.' % tuple(qss.this_week())
print '%s Purchases of value %s this month.' % tuple(qss.this_month())
print '%s Purchases of value %s this year.' % tuple(qss.this_year())
print '%s Purchases of value %s until now.' % tuple(qss.until_now())

This might print something like::

5 Purchases of value 50 today.
11 Purchases of value 110 this week.
27 Purchases of value 270 this month.
377 Purchases of value 3770 this year.
409 Purchases of value 4090 until now.

API
===

@@ -96,11 +119,12 @@ without providing enough information.

Default: ``None``

``aggregate``
The django aggregation instance. Can be set also set when
instantiating or calling one of the methods.
``aggregates``
A list of Django aggregation instances. Can also be set when
instantiating or calling one of the methods. A single
aggregation instance is also accepted.

Default: ``Count('id')``
Default: ``[Count('id')]``

``operator``
The default operator to use for the ``pivot`` function. Can be also set
@@ -117,7 +141,7 @@ without providing enough information.

All of the documented methods take a standard set of keyword arguments
that override any information already stored within the ``QuerySetStats``
object. These keyword arguments are ``date_field`` and ``aggregate``.
object. These keyword arguments are ``date_field`` and ``aggregates``.

Once you have a ``QuerySetStats`` object instantiated, you can receive a
single aggregate result by using the following methods:
@@ -152,7 +176,7 @@ time-series data which may be extremely using in plotting data:
the start and stop of the time series data.

Keyword arguments: In addition to the standard ``date_field`` and
``aggregate`` keyword argument, ``time_series`` takes an optional
``aggregates`` keyword argument, ``time_series`` takes an optional
``interval`` keyword argument used to mark which interval to use while
calculating aggregate data between ``start`` and ``end``. This argument
defaults to ``'days'`` and can accept ``'years'``, ``'months'``,
@@ -161,7 +185,7 @@ time-series data which may be extremely using in plotting data:

This methods returns a list of tuples. The first item in each
tuple is a ``datetime.datetime`` object for the current inverval. The
second item is the result of the aggregate operation. For
other items are the results of the aggregate operations. For
example::

[(datetime.datetime(2010, 3, 28, 0, 0), 12), (datetime.datetime(2010, 3, 29, 0, 0), 0), ...]
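With several aggregates, each row of the series carries more than one value. Reading the `qsstats/__init__.py` changes in this pull request, the single-query path appears to build flat rows `(dt, v0, v1, ...)`, while the per-interval fallback appears to append `(dt, [v0, v1, ...])`. The sketch below is one way a caller could accept both shapes; `split_row` and the sample data are hypothetical and not part of the library.

```python
import datetime


def split_row(row):
    """Return (dt, values) for either row shape described above."""
    dt, rest = row[0], list(row[1:])
    # Nested shape: (dt, [v0, v1, ...]) -> unwrap the inner list.
    if len(rest) == 1 and isinstance(rest[0], (list, tuple)):
        rest = list(rest[0])
    return dt, rest


# Hypothetical sample rows, one in each shape.
series = [
    (datetime.datetime(2010, 3, 28), 12, 120),   # flat row
    (datetime.datetime(2010, 3, 29), [0, 0]),    # nested row
]

for row in series:
    dt, values = split_row(row)
    print('%s %s' % (dt.date(), values))
```

A row with a single aggregate, such as `(dt, 12)`, comes back as `(dt, [12])`, so callers can always index into `values`.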
@@ -176,15 +200,15 @@ time-series data which may be extremely using in plotting data:
Positional arguments: ``dt`` a ``datetime.date`` or ``datetime.datetime``
object to be used for filtering the queryset since.

Keyword arguments: ``date_field``, ``aggregate``.
Keyword arguments: ``date_field``, ``aggregates``.

``until_now``
Aggregate information until now.

Positional arguments: ``dt`` a ``datetime.date`` or ``datetime.datetime``
object to be used for filtering the queryset since (using ``lte``).

Keyword arguments: ``date_field``, ``aggregate``.
Keyword arguments: ``date_field``, ``aggregates``.

``after``
Aggregate information after a given date or time, filtering the queryset
@@ -193,15 +217,15 @@ time-series data which may be extremely using in plotting data:
Positional arguments: ``dt`` a ``datetime.date`` or ``datetime.datetime``
object to be used for filtering the queryset since.

Keyword arguments: ``date_field``, ``aggregate``.
Keyword arguments: ``date_field``, ``aggregates``.

``after_now``
Aggregate information after now.

Positional arguments: ``dt`` a ``datetime.date`` or ``datetime.datetime``
object to be used for filtering the queryset since (using ``gte``).

Keyword arguments: ``date_field``, ``aggregate``.
Keyword arguments: ``date_field``, ``aggregates``.

``pivot``
Used by ``since``, ``after``, and ``until_now`` but potentially useful if
@@ -210,7 +234,7 @@ time-series data which may be extremely using in plotting data:
Positional arguments: ``dt`` a ``datetime.date`` or ``datetime.datetime``
object to be used for filtering the queryset since (using ``lte``).

Keyword arguments: ``operator``, ``date_field``, ``aggregate``.
Keyword arguments: ``operator``, ``date_field``, ``aggregates``.

Raises ``InvalidOperator`` if the operator provided is not one of ``'lt'``,
``'lte'``, ``gt`` or ``gte``.
@@ -235,8 +259,8 @@ Difference from django-qsstats

1. Faster time_series method using 1 sql query (currently works for MySQL and
PostgreSQL, with a fallback to the old method for other DB backends).
2. Single ``aggregate`` parameter instead of ``aggregate_field`` and
``aggregate_class``. Default value is always ``Count('id')`` and can't be
2. Single ``aggregates`` parameter instead of ``aggregate_field`` and
``aggregate_class``. Default value is always ``[Count('id')]`` and can't be
specified in settings.py. ``QUERYSETSTATS_DEFAULT_OPERATOR`` option is also
unsupported now.
3. Support for minute and hour aggregates.
@@ -247,3 +271,9 @@ Difference from django-qsstats
I don't know if original author (Matt Croydon) would like my changes so
I renamed a project for now. If the changes will be merged then
django-qsstats-magic will become obsolete.

New in 0.8.0
============

* Renamed ``aggregate`` to ``aggregates``; methods now return a list of
aggregate values instead of a single value.
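Callers written against 0.7.x pass `aggregate=` and expect a scalar back. Below is a minimal sketch of a caller-side shim, assuming nothing beyond the rename and the list return value described in this diff; `upgrade_kwargs` and `first_value` are hypothetical helpers, not part of django-qsstats-magic.

```python
def upgrade_kwargs(kwargs):
    """Map the pre-0.8.0 ``aggregate`` keyword onto ``aggregates``."""
    kwargs = dict(kwargs)  # copy so the caller's dict is not mutated
    if 'aggregate' in kwargs:
        if 'aggregates' in kwargs:
            raise TypeError("pass either 'aggregate' or 'aggregates', not both")
        kwargs['aggregates'] = [kwargs.pop('aggregate')]
    return kwargs


def first_value(result):
    """Return the first aggregate from a 0.8.0-style list result."""
    return result[0]


# Strings stand in for Django aggregate instances in this illustration.
print(upgrade_kwargs({'date_field': 'date_joined', 'aggregate': 'Count(id)'}))
print(first_value([5, 50]))
```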
91 changes: 55 additions & 36 deletions qsstats/__init__.py
@@ -1,5 +1,5 @@
__author__ = 'Matt Croydon, Mikhail Korobov, Pawel Tomasiewicz'
__version__ = (0, 7, 0)
__author__ = 'Matt Croydon, Mikhail Korobov, Pawel Tomasiewicz, Abd Allah Diab'
__version__ = (0, 8, 0)

from functools import partial
import datetime
@@ -20,12 +20,17 @@ class QuerySetStats(object):
is able to handle snapshots of data (for example this day, week, month, or
year) or generate time series data suitable for graphing.
"""
def __init__(self, qs=None, date_field=None, aggregate=None, today=None):
def __init__(self, qs=None, date_field=None, aggregates=None, today=None):
self.qs = qs
self.date_field = date_field
self.aggregate = aggregate or Count('id')
self.aggregates = aggregates and self._get_aggregates(aggregates) or [Count('id')]
self.today = today or self.update_today()

def _get_aggregates(self, aggregates=None):
if aggregates and not isinstance(aggregates, list):
aggregates = [aggregates]
return aggregates or self.aggregates

def _guess_engine(self):
if hasattr(self.qs, 'db'): # django 1.2+
engine_name = settings.DATABASES[self.qs.db]['ENGINE']
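The normalization done by `_get_aggregates` can be tried in isolation. The sketch below mirrors its logic as a free function; `normalize_aggregates`, the `default` parameter, and the string stand-ins for Django aggregates are assumptions made for illustration only.

```python
def normalize_aggregates(aggregates, default):
    """Return a list of aggregates, wrapping a single instance if needed."""
    if aggregates and not isinstance(aggregates, list):
        # Any truthy non-list value (including a tuple) is wrapped
        # as a one-element list.
        aggregates = [aggregates]
    # None or an empty list falls back to the default.
    return aggregates or default


DEFAULT = ['Count(id)']  # stand-in for [Count('id')]

print(normalize_aggregates(None, DEFAULT))
print(normalize_aggregates('Sum(amount)', DEFAULT))
print(normalize_aggregates(['Count(id)', 'Sum(amount)'], DEFAULT))
print(normalize_aggregates(('Count(id)', 'Sum(amount)'), DEFAULT))
```

Because the check is `isinstance(aggregates, list)`, the last call wraps the tuple as a single element rather than treating it as two aggregates. Callers would likely want to pass a list.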
@@ -40,15 +45,15 @@ def _guess_engine(self):

# Aggregates for a specific period of time

def for_interval(self, interval, dt, date_field=None, aggregate=None):
def for_interval(self, interval, dt, date_field=None, aggregates=None):
start, end = get_bounds(dt, interval)
date_field = date_field or self.date_field
kwargs = {'%s__range' % date_field : (start, end)}
return self._aggregate(date_field, aggregate, kwargs)
return self._aggregate(date_field, self._get_aggregates(aggregates), kwargs)

def this_interval(self, interval, date_field=None, aggregate=None):
def this_interval(self, interval, date_field=None, aggregates=None):
method = getattr(self, 'for_%s' % interval)
return method(self.today, date_field, aggregate)
return method(self.today, date_field, self._get_aggregates(aggregates))

# support for this_* and for_* methods
def __getattr__(self, name):
@@ -59,11 +64,11 @@ def __getattr__(self, name):
raise AttributeError

def time_series(self, start, end=None, interval='days',
date_field=None, aggregate=None, engine=None):
date_field=None, aggregates=None, engine=None):
''' Aggregate over time intervals '''

end = end or self.today
args = [start, end, interval, date_field, aggregate]
args = [start, end, interval, date_field, self._get_aggregates(aggregates)]
engine = engine or self._guess_engine()
sid = transaction.savepoint()
try:
@@ -73,7 +78,7 @@ def time_series(self, start, end=None, interval='days',
return self._slow_time_series(*args)

def _slow_time_series(self, start, end, interval='days',
date_field=None, aggregate=None):
date_field=None, aggregates=None):
''' Aggregate over time intervals using 1 sql query for one interval '''

num, interval = _parse_interval(interval)
@@ -88,17 +93,17 @@ def _slow_time_series(self, start, end, interval='days',
stat_list = []
dt, end = _to_datetime(start), _to_datetime(end)
while dt <= end:
value = method(dt, date_field, aggregate)
value = method(dt, date_field, self._get_aggregates(aggregates))
stat_list.append((dt, value,))
dt = dt + relativedelta(**{interval : 1})
return stat_list

def _fast_time_series(self, start, end, interval='days',
date_field=None, aggregate=None, engine=None):
date_field=None, aggregates=None, engine=None):
''' Aggregate over time intervals using just 1 sql query '''

date_field = date_field or self.date_field
aggregate = aggregate or self.aggregate
aggregates = self._get_aggregates(aggregates)
engine = engine or self._guess_engine()

num, interval = _parse_interval(interval)
@@ -109,70 +114,84 @@ def _fast_time_series(self, start, end, interval='days',

kwargs = {'%s__range' % date_field : (start, end)}
aggregate_data = self.qs.extra(select = {'d': interval_sql}).\
filter(**kwargs).order_by().values('d').\
annotate(agg=aggregate)
filter(**kwargs).order_by().values('d')
for i, aggregate in enumerate(aggregates):
aggregate_data = aggregate_data.annotate(**{'agg_%d' % i: aggregate})

today = _remove_time(compat.now())
def to_dt(d):
if isinstance(d, basestring):
return parse(d, yearfirst=True, default=today)
return d

data = dict((to_dt(item['d']), item['agg']) for item in aggregate_data)
data = dict((to_dt(item['d']), [item['agg_%d' % i] for i in range(len(aggregates))]) for item in aggregate_data)

stat_list = []
dt = start
try:
try:
from django.utils.timezone import utc
except ImportError:
from django.utils.timezones import utc
dt = dt.replace(tzinfo=utc)
end = end.replace(tzinfo=utc)
except ImportError:
pass
zeros = [0 for i in range(len(aggregates))]

while dt < end:
idx = 0
value = 0
value = []
for i in range(num):
value = value + data.get(dt, 0)
value = map(lambda a, b: (a or 0) + (b or 0), value, data.get(dt, zeros[:]))
if i == 0:
stat_list.append((dt, value,))
stat_list.append(tuple([dt] + value))
idx = len(stat_list) - 1
elif i == num - 1:
stat_list[idx] = (dt, value,)
stat_list[idx] = tuple([dt] + value)
dt = dt + relativedelta(**{interval : 1})

return stat_list

# Aggregate totals using a date or datetime as a pivot

def until(self, dt, date_field=None, aggregate=None):
return self.pivot(dt, 'lte', date_field, aggregate)
def until(self, dt, date_field=None, aggregates=None):
return self.pivot(dt, 'lte', date_field, self._get_aggregates(aggregates))

def until_now(self, date_field=None, aggregate=None):
return self.pivot(compat.now(), 'lte', date_field, aggregate)
def until_now(self, date_field=None, aggregates=None):
return self.pivot(compat.now(), 'lte', date_field, self._get_aggregates(aggregates))

def after(self, dt, date_field=None, aggregate=None):
return self.pivot(dt, 'gte', date_field, aggregate)
def after(self, dt, date_field=None, aggregates=None):
return self.pivot(dt, 'gte', date_field, self._get_aggregates(aggregates))

def after_now(self, date_field=None, aggregate=None):
return self.pivot(compat.now(), 'gte', date_field, aggregate)
def after_now(self, date_field=None, aggregates=None):
return self.pivot(compat.now(), 'gte', date_field, self._get_aggregates(aggregates))

def pivot(self, dt, operator=None, date_field=None, aggregate=None):
def pivot(self, dt, operator=None, date_field=None, aggregates=None):
operator = operator or self.operator
if operator not in ['lt', 'lte', 'gt', 'gte']:
raise InvalidOperator("Please provide a valid operator.")

kwargs = {'%s__%s' % (date_field or self.date_field, operator) : dt}
return self._aggregate(date_field, aggregate, kwargs)
return self._aggregate(date_field, self._get_aggregates(aggregates), kwargs)

# Utility functions
def update_today(self):
_now = compat.now()
self.today = _remove_time(_now)
return self.today

def _aggregate(self, date_field=None, aggregate=None, filter=None):
def _aggregate(self, date_field=None, aggregates=None, filters=None):
date_field = date_field or self.date_field
aggregate = aggregate or self.aggregate

aggregates = self._get_aggregates(aggregates)

if not date_field:
raise DateFieldMissing("Please provide a date_field.")
raise DateFieldMissing("Please provide a date_field.")

if self.qs is None:
raise QuerySetMissing("Please provide a queryset.")

agg = self.qs.filter(**filter).aggregate(agg=aggregate)
return agg['agg']
qs = self.qs.filter(**filters).aggregate(**{'agg_%d' % i: aggregate for i, aggregate in enumerate(aggregates)})

return [qs['agg_%d' % i] for i in range(len(aggregates))]
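The bucket merge in `_fast_time_series` starts from `value = []` and combines lists with a two-iterable `map`. Under Python 2, `map` pads the shorter input with `None`, which the `(a or 0)` guard absorbs. Under Python 3, `map` stops at the shortest input, so the same expression would yield nothing. The sketch below shows a version-neutral alternative that starts from zeros; `add_bucket` and the sample data are hypothetical, and this is not the code in the pull request.

```python
def add_bucket(total, bucket):
    """Element-wise sum of two equal-length lists, treating None as 0."""
    return [(a or 0) + (b or 0) for a, b in zip(total, bucket)]


n_aggregates = 2
# None mimics an aggregate that came back as NULL for that bucket.
data = {'day1': [3, 30], 'day2': [None, 5]}
zeros = [0] * n_aggregates

value = list(zeros)  # start from zeros so zip() never truncates
for key in ('day1', 'day2', 'day3'):
    value = add_bucket(value, data.get(key, zeros))

print(value)  # [3, 35]
```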
16 changes: 8 additions & 8 deletions qsstats/tests.py
@@ -21,7 +21,7 @@ def test_basic_today(self):
qss = QuerySetStats(qs, 'date_joined')

# We should only see a single user
self.assertEqual(qss.this_day(), 1)
self.assertEqual(qss.this_day()[0], 1)

def assertTimeSeriesWorks(self, today):
seven_days_ago = today - datetime.timedelta(days=7)
@@ -66,10 +66,10 @@ def test_until(self):
qs = User.objects.all()
qss = QuerySetStats(qs, 'date_joined')

self.assertEqual(qss.until(now), 1)
self.assertEqual(qss.until(today), 1)
self.assertEqual(qss.until(yesterday), 0)
self.assertEqual(qss.until_now(), 1)
self.assertEqual(qss.until(now)[0], 1)
self.assertEqual(qss.until(today)[0], 1)
self.assertEqual(qss.until(yesterday)[0], 0)
self.assertEqual(qss.until_now()[0], 1)

def test_after(self):
now = compat.now()
@@ -83,11 +83,11 @@ def test_after(self):
qs = User.objects.all()
qss = QuerySetStats(qs, 'date_joined')

self.assertEqual(qss.after(today), 1)
self.assertEqual(qss.after(now), 0)
self.assertEqual(qss.after(today)[0], 1)
self.assertEqual(qss.after(now)[0], 0)
u.date_joined=tomorrow
u.save()
self.assertEqual(qss.after(now), 1)
self.assertEqual(qss.after(now)[0], 1)

# MC_TODO: aggregate_field tests

8 changes: 4 additions & 4 deletions setup.py
@@ -8,12 +8,12 @@

setup(
name='django-qsstats-magic',
version='0.7.2',
version='0.8.0',
description='A django microframework that eases the generation of aggregate data for querysets.',
long_description = open('README.rst').read(),
author='Matt Croydon, Mikhail Korobov',
author_email='mcroydon@gmail.com, kmike84@gmail.com',
url='http://bitbucket.org/kmike/django-qsstats-magic/',
author='Matt Croydon, Mikhail Korobov, Abd Allah Diab',
author_email='mcroydon@gmail.com, kmike84@gmail.com, mpcabd@gmail.com',
url='https://github.com/mpcabd/django-qsstats-magic/',
packages=['qsstats'],
requires=['dateutil(>=1.4.1, < 2.0)'],
classifiers=[