Commit

addressed check warnings and notes
rgurlek committed Mar 8, 2020
1 parent 050390b commit 8a2e1b8
Showing 8 changed files with 26 additions and 41 deletions.
6 changes: 5 additions & 1 deletion .Rbuildignore
@@ -1,3 +1,7 @@
^.*\.Rproj$
^\.Rproj\.user$
^\sample_data
^sample_data$
^index.html$
^_config.yml$
^README.md$
^\.travis\.yml$
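Each `.Rbuildignore` line is a Perl-style regular expression that `R CMD build` matches case-insensitively against paths relative to the package root. The sketch below approximates that matching with `grepl()`. It assumes that `^\sample_data` is the line this hunk removes (the +/- markers are not visible here); the `paths` vector and the `ignored()` helper are hypothetical and exist only for illustration.

```r
# Hypothetical top-level paths, chosen only for illustration.
paths <- c("sample_data", "README.md", ".travis.yml", "R/others.R")

# Rough stand-in for how one .Rbuildignore pattern is applied
# (Perl-style regex, case-insensitive). Not the actual build code.
ignored <- function(pattern, x) {
  grepl(pattern, x, perl = TRUE, ignore.case = TRUE)
}

# "\s" is the whitespace class, so this pattern needs a leading
# whitespace character followed by "ample_data".
old_hit <- ignored("^\\sample_data", paths)

# Anchored literal match of the directory name.
new_hit <- ignored("^sample_data$", paths)

# Escaped dots match literal "." characters only.
travis_hit <- ignored("^\\.travis\\.yml$", paths)

print(data.frame(path = paths, old_hit, new_hit, travis_hit))
```

Under these assumptions the old pattern matches none of the paths, while `^sample_data$` matches only the `sample_data` entry.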
4 changes: 4 additions & 0 deletions .travis.yml
@@ -0,0 +1,4 @@
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r

language: R
cache: packages
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -2,7 +2,7 @@ Package: FAIRforecast
Type: Package
Title: A Package for FAIR Forecast
Version: 0.1.0
Description: FAIR is a forecasting tool developed to support decision making in a retail environment. It provides multi-step-ahead sales forecasts at the category-store level, which are based on an interpretable and transparent model. These aspects make it an objective tool with which different promotional strategies (scenarios) can be compared and insights can be generated. Additionally, the FAIRforecast package provides plotting functions that generate figures showing the relative strength of the interactions between categories and important promotional variables. For more information, see vignette("FAIRforecast") and Gür Ali & Gürlek (2020).
Description: FAIR is a forecasting tool developed to support decision making in a retail environment. It provides multi-step-ahead sales forecasts at the category-store level, which are based on an interpretable and transparent model. These aspects make it an objective tool with which different promotional strategies (scenarios) can be compared and insights can be generated. Additionally, the FAIRforecast package provides plotting functions that generate figures showing the relative strength of the interactions between categories and important promotional variables. For more information, see the vignette and original paper provided in the GitHub URL below.
Authors@R: c(
person("Özden", "Gür Ali", email = "[email protected]", role = "aut"),
person("Ragip", "Gürlek", email = "[email protected]", role = c("aut", "cre")))
@@ -17,6 +17,8 @@ Imports:
glmnet,
plotly,
plyr,
scales,
stats,
tidyr
RoxygenNote: 7.0.2
Suggests:
1 change: 0 additions & 1 deletion NAMESPACE
@@ -4,5 +4,4 @@ export(FAIR_predict)
export(FAIR_train)
export(cross_category)
export(sample_data)
export(var_cre)
export(variable_importance)
2 changes: 1 addition & 1 deletion R/fair_predict.R
@@ -84,7 +84,7 @@ FAIR_predict <- function(object, new_data, parallel = F) {
return(rep(my_model, nrow(my_data)))
if (is.null(my_model))
return(rep(NA, nrow(my_data)))
return(suppressWarnings(predict.lm(my_model, my_data)))
return(suppressWarnings(stats::predict.lm(my_model, my_data)))
}

new_vars <- lapply(var_list, fit)
6 changes: 3 additions & 3 deletions R/fair_train.R
@@ -110,7 +110,7 @@ FAIR_train <- function(train_data,
for (t1 in t) {
#block loop
#deseasonalize data
is_val <- t1 != tail(t, 1)
is_val <- t1 != utils::tail(t, 1)
if (is_val) {
temp <- plyr::ddply(
train_data,
@@ -201,15 +201,15 @@ FAIR_train <- function(train_data,
best_pars <- function(my_data) {
my_data$alpha <- as.numeric(as.character(my_data$alpha))
my_data$lambda <- as.numeric(as.character(my_data$lambda))
parameter_ev1 <- aggregate(
parameter_ev1 <- stats::aggregate(
my_data$RMSE,
by = list(my_data$alpha, my_data$lambda),
FUN = "mean",
na.rm = TRUE,
drop = TRUE
)
names(parameter_ev1)[ncol(parameter_ev1)] <- "mean"
parameter_ev2 <- aggregate(
parameter_ev2 <- stats::aggregate(
my_data$RMSE,
by = list(my_data$alpha, my_data$lambda),
FUN = "sd",
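The `best_pars` hunk above summarises RMSE per (alpha, lambda) pair with `stats::aggregate`. A minimal sketch of the first call follows. The data frame is made up and only its column names follow the diff; naming the grouping columns in `by` is an addition for readability and is not in the original.

```r
# Made-up grid-search results; only the column names follow the hunk above.
my_data <- data.frame(
  alpha  = c(0, 0, 1, 1),
  lambda = c(0.1, 0.1, 0.1, 0.1),
  RMSE   = c(1, 3, 4, NA)
)

# Mean RMSE per (alpha, lambda) pair; na.rm is forwarded to mean().
parameter_ev1 <- stats::aggregate(
  my_data$RMSE,
  by = list(alpha = my_data$alpha, lambda = my_data$lambda),
  FUN = "mean",
  na.rm = TRUE,
  drop = TRUE
)

# The aggregated column is called "x" by default; rename it as the diff does.
names(parameter_ev1)[ncol(parameter_ev1)] <- "mean"
print(parameter_ev1)
```

Writing `stats::aggregate` rather than a bare `aggregate` makes the dependency explicit, which is the kind of change `R CMD check` notes about undefined global functions usually ask for.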
21 changes: 10 additions & 11 deletions R/others.R
@@ -98,21 +98,21 @@ deseason <- function(my_data,
var_list <- c(sales, marketing)
fit <- function(my_var) {
train <- my_data[my_data[, time_id] < t1, c(my_var, seasonality)]
if (sd(train[, my_var]) == 0){
if (stats::sd(train[, my_var]) == 0){
s1_models[[my_var]] <<- train[1, my_var]
return(my_data[, my_var])
#If there is no variance in the training data, return
#training + validation as it is
} else{
myformula <- as.formula(paste0("log(", my_var, "+1) ~ ."))
my_model <- lm(formula = myformula,
myformula <- stats::as.formula(paste0("log(", my_var, "+1) ~ ."))
my_model <- stats::lm(formula = myformula,
data = train,
model = F)
s1_models[[my_var]] <<- my_model
#above line ensures obs until t1+horizon-1 are predicted.
#If the data is not validation (training), it is same as t1-1 since
#all points are less than t1
return(log(my_data[, my_var] + 1) - predict.lm(my_model, my_data))
return(log(my_data[, my_var] + 1) - stats::predict.lm(my_model, my_data))
}
}
new_vars <- lapply(var_list, fit)
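The `fit` helper in the hunk above regresses `log(variable + 1)` on the seasonality columns using only rows before `t1`, then subtracts the fitted seasonal component from every row. A self-contained sketch of that idea on toy data is shown here. The data frame, the column names `week`, `season` and `sales`, and the value of `t1` are all assumptions for illustration, and `stats::predict` is used in place of `stats::predict.lm`.

```r
set.seed(1)

# Toy series with a repeating four-level seasonal effect (illustrative only).
toy <- data.frame(
  week   = 1:48,
  season = factor(rep(1:4, times = 12)),
  sales  = 100 + rep(c(0, 10, 20, 5), times = 12) + rnorm(48)
)
t1 <- 37  # rows with week < t1 form the training block

train <- toy[toy$week < t1, c("sales", "season")]

if (stats::sd(train$sales) == 0) {
  # No variance in the training block: return the series unchanged.
  deseasonalized <- toy$sales
} else {
  # "." expands to the remaining columns of train, here only season.
  my_formula <- stats::as.formula("log(sales + 1) ~ .")
  my_model <- stats::lm(formula = my_formula, data = train, model = FALSE)
  # Subtract the fitted seasonal component for training and later rows alike.
  deseasonalized <- log(toy$sales + 1) - stats::predict(my_model, toy)
}

summary(deseasonalized)
```

Because the model is fitted with an intercept, the deseasonalized values average to zero over the training rows; the later rows are adjusted with the same seasonal coefficients.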
@@ -123,7 +123,7 @@ deseason <- function(my_data,
my_list <- list(my_data, s1_models)
return(my_list)
}
#'@export

var_cre <- function(my_data,
category,
category_list,
@@ -191,7 +191,7 @@ step2 <-
}

train <- my_data[my_data[, time_id] < t1,]
train <- train[complete.cases(train[, marketing]),]
train <- train[stats::complete.cases(train[, marketing]),]

if (is_val) {
test <- my_data[!my_data[, time_id] < t1,]
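The `stats::complete.cases` call in this hunk keeps only the training rows with no missing marketing values. A small sketch with invented data (the columns `price` and `ad` stand in for the package's marketing variables):

```r
# Invented training rows; price and ad stand in for the marketing variables.
train <- data.frame(
  week  = 1:4,
  price = c(1.5, NA, 2.0, 1.8),
  ad    = c(0.1, 0.2, NA, 0.0)
)
marketing <- c("price", "ad")

# Keep rows where every marketing column is non-missing.
train_complete <- train[stats::complete.cases(train[, marketing]), ]
print(train_complete)
```

Only weeks 1 and 4 survive, since weeks 2 and 3 each have one missing marketing value.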
@@ -259,7 +259,7 @@ step2 <-

time_series <- function(my_data, store, time_id, category) {
sum_func <- function(x) if (all(is.na(x))) NA_integer_ else sum(x)
series <- aggregate(
series <- stats::aggregate(
my_data$residual_2,
by = list(my_data[, store], my_data[, time_id], my_data[, category]),
FUN = sum_func,
@@ -292,7 +292,7 @@ time_series <- function(my_data, store, time_id, category) {

step3 <- function(x, time_id, frequency) {
x <- x[order(x[, time_id]),]
my_series <- ts(x$residual_2, frequency = frequency)
my_series <- stats::ts(x$residual_2, frequency = frequency)
my_model <- forecast::stlm(forecast::na.interp(my_series))
x$STL <- as.numeric(my_model$fitted)
return(list(x, my_model))
@@ -371,7 +371,7 @@ cross_category <- function(object){
my_mat <- my_mat[my_mat[, 'variable'] %in% cc_list, ]
my_mat$category <- substr(my_mat$variable, 1,
regexpr("_", my_mat$variable) - 1)
my_mat <- aggregate(my_mat$magnitude, list(my_mat$category), FUN = sum)
my_mat <- stats::aggregate(my_mat$magnitude, list(my_mat$category), FUN = sum)
colnames(my_mat) <- c('Category', 'magnitude')
for(j in 1:nrow(my_mat)){
cross_mat[names(models)[i], my_mat[j, 'Category']] <-
@@ -383,8 +383,7 @@ cross_category <- function(object){
p <- plotly::layout(
plotly::plot_ly(x = colnames(cross_mat), y = rownames(cross_mat),
z = cross_mat,
colorscale = scales::col_numeric("Blues", domain = NULL)(unique(scales::rescale(c(volcano)))),
type = "heatmap"),
colors = "Reds", type = "heatmap"),
title = "Cross-category effects",
xaxis = list(title = "Influencial Category"),
yaxis = list(title = "Influenced Category")
23 changes: 0 additions & 23 deletions vignettes/FAIRforecast.Rmd
@@ -128,29 +128,6 @@ cross_category(my_model)
```


```{r eval=FALSE, include=FALSE}
### This part is not included in the vignette. Because, the current version of FAIR does not have a variable to distinguish the scenarios with the same store-category-week combination. Also, with the sample data we have, the other categories also should be provided to the predict function because of cc_marketing. Here is the paragraph I wrote for this example:
### The user can input a set of scenarios to `FAIR_predict` function to guide a decision. For example, the below code compares the sales for a base scenario, an agressive-ad scenario, and a scenario where the same agressive ad strategy is followed on a holiday.
# The base scenario. We take a random observation from the test data.
scenarios <- test_data[3757, ]
scenarios
# The agressive-ad scenario. We change the average ad in the category.
scenarios[2, ] <- scenarios[1, ]
scenarios[2, "ad"] <- 0.30
# Agressive-ad-on-holiday scenario. We take an observation on a holiday for the same store-category and change its marketing variables to the those of scenario 2.
scenarios[3, ] <- test_data[3766, ]
marketing_variables = c("price", "display", "discount", "dist_5", "dist_10",
"dist_15", "dist_20", "dist_30", "dist_40", "dist_50",
"dist_100", "ad")
scenarios[3, marketing_variables] <- scenarios[2, marketing_variables]
FAIR_predict(my_model, scenarios)
```


## References

Gür Ali, Ö. and Gürlek, R. (2019) Automatic Interpretable Retail Forecasting with Promotional Scenarios