Skip to content

Filter pushdown issue with aggregating inline views and CTE's #640

@ralfbecher

Description

@ralfbecher

Filters are applied to the outer SELECT. This cannot work with aggregating inline views and CTE's (common table expressions if dimensionality is reduced. With FILTER_PARAMS we can push down the filter inside the inline view or CTS like this:

Having this simple example with a transaction table:

CREATE TABLE food_sales (
        CREATED_TS      timestamp PRIMARY KEY,
        ITEM            string,
        AMOUNT          integer
);

and a Cube defined for aggregating top-5 with all others (typical pie chart) we want to filter for AMOUNT < n (eg. finding top-5 items in a lower segment):

cube(`FoodSales_Aggr_Top5`, {
  sql: `WITH top5 AS (
      SELECT ITEM, sum(AMOUNT) AS AMOUNT_SUM
      FROM food_sales
      WHERE ${FILTER_PARAMS.FoodSales_Aggr_Top5.amount.filter('AMOUNT')}
      GROUP BY ITEM
      ORDER BY AMOUNT_SUM DESC 
      LIMIT 5
    ) 
    SELECT * FROM top5
    UNION ALL
    SELECT 'Others' as ITEM, sum(AMOUNT) AS AMOUNT_SUM
    FROM food_sales
    WHERE ${FILTER_PARAMS.FoodSales_Aggr_Top5.amount.filter('AMOUNT')} 
    AND ITEM NOT IN (SELECT ITEM FROM top5)`,
  dataSource: `default`,

  title: `Food Sales Top 5`,
  description: `Food Sales aggregated Top5 sales`,

  dimensions: {
    created: {
      sql: 'CREATED_TS',
      type: 'time',
      primaryKey: true,
      shown: false
    },
    item: {
      sql: `ITEM`,
      type: `string`,
    },
    amount: {
      sql: `AMOUNT`,
      type: `number`,
      shown: false
    }
  },

  measures: {
    sumAmount: {
      sql: `AMOUNT_SUM`,
      type: `sum`
    }

  }
});

Now we define this query with the filter AMOUNT < 1000:

{
  measures: [
    'FoodSales_Aggr_Top5.sumAmount'
  ],
  dimensions: [
    'FoodSales_Aggr_Top5.item'
  ],
  filters: [{
    member: 'FoodSales_Aggr_Top5.amount',
    operator: 'lt',
    values: ['1000']
  }]
}

This is the generated SQL:

SELECT  "food_sales__aggr__top5".ITEM "food_sales__aggr__top5__item",
        sum("food_sales__aggr__top5".AMOUNT_SUM) "food_sales__aggr__top5__sum_amount"
FROM (
        WITH top5 AS (
                SELECT  ITEM,
                        sum(AMOUNT) AS AMOUNT_SUM
                FROM food_sales 
                WHERE AMOUNT < ?
                GROUP BY ITEM
                ORDER BY AMOUNT_SUM DESC
                LIMIT 5)
        SELECT * FROM top5
        UNION ALL
        SELECT  'Others' as ITEM,
                sum(AMOUNT) AS AMOUNT_SUM
        FROM food_sales
        WHERE AMOUNT < ? AND ITEM NOT IN (SELECT ITEM FROM top5)
) AS "food_sales__aggr__top5"
WHERE ("food_sales__aggr__top5".AMOUNT < ?)
GROUP BY 1
ORDER BY 2 DESC

As we see we have an additional WHERE clause on the outer SELECT where we do not have the AMOUNT field because of aggregation.

My expectation is that this is not needed here because we have explicitly used FILTER_PARAMS for this field inside the SQL. Also, I would expect the field AMOUNT would not appear in the outer SELECT since I set shown: false in the Cube definition.

Maybe there is already a solution to such kind of queries and I haven't found it yet?

Metadata

Metadata

Assignees

No one assigned

    Labels

    backend:serverIssues relating to Cube Core's Server

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions