Missing exception handling for infinite / missing predictions #136

Open
@MalteKurz

There is no exception handling in place for the case where a learner produces infinite or missing predictions. The estimates silently become NA, without a warning or an exception.

See for example:

library(DoubleML)

g = function(x) {
  res = sin(x)^2
  return(res)
}

m = function(x, nu = 0, gamma = 1) {
  xx = sinh(gamma) / (cosh(gamma) - cos(x - nu))
  res = 0.5 / pi * xx
  return(res)
}

dgp1_irmiv = function(theta, N, k) {
  
  b = 1 / (1:k)
  sigma = clusterGeneration::genPositiveDefMat(k, "unifcorrmat")$Sigma
  
  X = mvtnorm::rmvnorm(N, sigma = sigma)
  G = g(as.vector(X %*% b))
  M = m(as.vector(X %*% b))
  
  pr_z = 1 / (1 + exp(-(1) * X[, 1] * b[5] + X[, 2] * b[2] + rnorm(N)))
  z = rbinom(N, 1, pr_z)
  
  U = rnorm(N)
  pr = 1 / (1 + exp(-(1) * (0.5 * z + X[, 1] * (-0.5) + X[, 2] * 0.25 - 0.5 * U + rnorm(N))))
  d = rbinom(N, 1, pr)
  err = rnorm(N)
  
  y = theta * d + G + 4 * U + err
  
  data = data.frame(y, d, z, X)
  
  return(data)
}

set.seed(1282)
df = dgp1_irmiv(0.5, 1000, 20)
Xnames = names(df)[names(df) %in% c("y", "d", "z") == FALSE]
dml_data = double_ml_data_from_data_frame(df,
                                          y_col = "y",
                                          d_cols = "d", x_cols = Xnames, z_col = "z")

ml_g = mlr3::lrn("regr.rpart", cp = 0.01, minsplit = 20)
ml_m = mlr3::lrn("classif.rpart", cp = 0.01, minsplit = 20)
ml_r = mlr3::lrn("classif.rpart", cp = 0.01, minsplit = 20)

set.seed(3141)
double_mliivm_obj = DoubleMLIIVM$new(
  data = dml_data,
  n_folds = 5,
  ml_g = ml_g,
  ml_m = ml_m,
  ml_r = ml_r,
  dml_procedure = "dml2",
  trimming_threshold = 0,
  score = "LATE")
double_mliivm_obj$fit()
print(double_mliivm_obj$coef)
print(double_mliivm_obj$se)
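
One plausible way such estimates end up as NA, sketched here independently of DoubleML: assuming the score involves dividing by a predicted probability (as inverse-probability weighting does) and no trimming is applied, a single prediction of exactly 0 is enough to turn the average into NaN without any warning. The vectors below are made-up illustrative values, not output from the example above.

```r
# Illustrative values only (assumption): one predicted probability is exactly 0.
m_hat = c(0.3, 0.0, 0.7)
z = c(1, 0, 1)
resid = c(0.5, 0.0, -0.2)

# Dividing by the predicted probability yields 0 / 0 = NaN for the second entry.
weighted = z * resid / m_hat
print(weighted)

# The non-finite value propagates silently into the average.
est = mean(weighted)
print(est)
print(is.na(est))
```

Note that R raises neither a warning nor an error at any step here, which matches the silent behaviour described above.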

It gets even more confusing if one then calls the method bootstrap(), which results in the following exception:

double_mliivm_obj$bootstrap()
Error in double_mliivm_obj$bootstrap(): Apply fit() before bootstrap().

This message obviously does not point to the root cause, and following its advice to apply fit() will not fix the issue.

I propose to implement a check for finite predictions similar to the check in the Python package: https://github.com/DoubleML/doubleml-for-py/blob/b3cbdb572fce435c18ec67ca323645900fc901b5/doubleml/_utils.py#L204-L208
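
A minimal sketch of what such a check could look like on the R side. The function name check_finite_predictions and its arguments are hypothetical and not part of the DoubleML API; where it would be called inside the package is left open.

```r
# Hypothetical helper (assumption): not an existing DoubleML function.
# Stops with an informative error if any prediction is NA, NaN, Inf or -Inf.
check_finite_predictions = function(preds, learner_name) {
  n_bad = sum(!is.finite(preds))
  if (n_bad > 0) {
    stop(sprintf(
      "Predictions from learner '%s' are not finite (%d of %d values are NA, NaN or Inf).",
      learner_name, n_bad, length(preds)))
  }
  invisible(preds)
}

# Finite predictions pass through unchanged.
check_finite_predictions(c(0.2, 0.8, 0.5), "ml_m")

# Non-finite predictions raise an error that names the learner.
res = tryCatch(
  check_finite_predictions(c(0.2, NA, Inf), "ml_m"),
  error = function(e) conditionMessage(e))
print(res)
```

Failing early like this would surface the actual problem at fit() time, instead of the misleading bootstrap() message.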