Skip to content

Data analysis and reporting tool for quick access to custom charts and tables in Jupyter Notebooks and in the shell.

License

Notifications You must be signed in to change notification settings

data-processing/fireant

 
 

Repository files navigation

FireAnt - Analytics and Reporting

BuildStatus CoverageStatus Codacy Docs PyPi License

fireant is a a data analysis tool used for quickly building charts, tables, reports, and dashboards. It defines a schema for configuring metrics and dimensions which removes most of the leg work of writing queries and formatting charts. fireant even works great with Jupyter notebooks and in the Python shell providing quick and easy access to your data.

Read more at http://fireant.readthedocs.io/en/latest/

Installation

To install fireant, run the following command in the terminal:

pip install fireant

Slicers

Slicers are the core component of fireant. A Slicer is a configuration of two types of elements, metrics and dimensions, which represent what kinds of data exist and how the data can be organized. A metric is a type of data, a measurement such as clicks and a dimension is a range over which metrics can be extended or grouped by. Concretely, metrics represent the data in a chart or table and dimensions represent the rows and columns, axes, or categories.

To configure a slicer, instantiate a fireant.Slicer with either a fireant.database.VerticaDatabase, fireant.database.MySQLDatabase, fireant.database.PostgreSQLDatabase or a fireant.database.RedshiftDatabase and a pypika.Table or pypika.Tables, with a list of fireant.slicer.Metric and fireant.slicer.Dimension.

from fireant.slicer import *
from fireant.database import VerticaDatabase
from pypika import Tables, functions as fn

vertica_database = VerticaDatabase(user='fakeuser', password='fakepassword')
analytics, accounts = Tables('analytics', 'accounts')

my_slicer = Slicer(
    # This is the primary database table that our slicer uses
    table=analytics,

    # Define the database connection object
    database=vertica_database,

    joins=[
        # Metrics and dimensions can use columns from joined tables by
        # configuring the join here.  Joins will only be used when necessary.
        Join('accounts', accounts, analytics.account_id == accounts.id),
    ],

    metrics=[
        # A unique key is required for each metric
        Metric('impressions'),
        Metric('clicks'),
        Metric('conversions'),
        Metric('cost'),
        Metric('revenue'),

        # By default, a metric maps one-to-one with a column in the database
        # but it can also be given a more complex definition.
        Metric('cpc', label='CPC',
               definition=fn.Sum(analytics.cost) / fn.Sum(analytics.clicks)),
        Metric('rpc', label='RPC',
               definition=fn.Sum(analytics.revenue) / fn.Sum(analytics.clicks)),
        Metric('roi', label='ROI',
               definition=fn.Sum(analytics.revenue) / fn.Sum(analytics.cost)),
    ],

    dimensions=[
        # Datetime Dimensions are continuous and must be truncated to an interval
        # like hour, day, week. Day is the default.
        DatetimeDimension('date', definition=analytics.dt),

        # Categorical dimensions are ones with a fixed number of options.
        CategoricalDimension('device', display_options=[DimensionValue('desktop'),
                                                DimensionValue('tablet'),
                                                DimensionValue('mobile')]),

        # Unique dimensions are used for entities that have a unique ID and
        # a display name field
        UniqueDimension('account', label='Account Name', definition=analytics.account_id,

                        # The accounts table is joined to get more data about the
                        # account.
                        display_field=accounts.name,

                        # Just a list of keys of the required joins is needed.
                        joins=['accounts']),
    ],
)

Querying Data and Rendering Charts

Once a slicer is configured, it is ready to be used. Each slicer comes with a fireant.slicer.SlicerManager and several |ClassTransformerManager| which expose an interface for executing queries and transforming the results. Each function in the manager uses the same signature. The principal function is data and all othe functions call this function first. The additional functions provide a transformation to the data.

The notebooks transformer bundle includes different functions for use in Jupyter notebooks. Other formats return results in JSON format.

  • my_slicer.manager.data - A Pandas data frame indexed by the selected dimensions.
  • my_slicer.manager.query_string - A string containing the raw SQL query that FireAnt is running.
  • my_slicer.notebooks.row_index_table - A Datatables row-indexed table.
  • my_slicer.notebooks.column_index_table - A Datatables column-indexed table.
  • my_slicer.notebooks.line_chart - A Matplotlib line chart. (Requires [matplotlib] dependency)
  • my_slicer.notebooks.column_chart - A Matplotlib column chart. (Requires [matplotlib] dependency)
  • my_slicer.notebooks.bar_chart - A Matplotlib bar chart. (Requires [matplotlib] dependency)
  • my_slicer.highcharts.line_chart - A Highcharts line chart.
  • my_slicer.highcharts.column_chart - A Highcharts column chart.
  • my_slicer.highcharts.bar_chart - A Highcharts bar chart.
  • my_slicer.datatables.row_index_table - A Datatables row-indexed table.
  • my_slicer.datatables.column_index_table - A Datatables column-indexed table.
def data(self, metrics, dimensions, metric_filters, dimension_filters, references, operations):

Examples

Use the data function to get a Pandas data frame or series. The following example will result in a data frame with 'device' as the index, containing the values 'Desktop', 'Tablet', and 'Mobile', and the columns 'Clicks' and 'ROI'.

df = my_slicer.manager.data(
    metrics=['clicks', 'roi'],
    dimensions=['device']
)

Removing the dimension will yield a similar result except as a Pandas series containing 'Clicks' and 'ROI'. These are the aggregated values over the entire data base table.

df = my_slicer.manager.data(
    metrics=['clicks', 'roi'],
)

The transformer functions us the data function but then apply a transformation to convert the data into formats for Highcharts or Datatables. The results for these can be serialized directly into json objects.

import json

result = my_slicer.manager.line_chart(
    metrics=['clicks', 'roi'],
    dimensions=['date', 'device'],
)

json.dumps(result)
import json

result = my_slicer.manager.row_index_table(
    metrics=['clicks', 'revenue', 'cost', 'roi'],
    dimensions=['account', 'device'],
)

json.dumps(result)

License

Copyright 2016 KAYAK Germany, GmbH

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Crafted with ♥ in Berlin.

About

Data analysis and reporting tool for quick access to custom charts and tables in Jupyter Notebooks and in the shell.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 99.6%
  • SQLPL 0.4%