Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request]: Don't use @datanames anymore #333

Open
gogonzo opened this issue Sep 10, 2024 · 5 comments · May be fixed by #347
Open

[Feature Request]: Don't use @datanames anymore #333

gogonzo opened this issue Sep 10, 2024 · 5 comments · May be fixed by #347
Assignees

Comments

@gogonzo
Copy link
Contributor

gogonzo commented Sep 10, 2024

Feature description

@datanames slot has been introduced only to distinguish temporary/proxy objects from "real" datanames. It was useful in teal where app didn't have to create a filter-panel. Now (after this) teal ignores teal.data::datanames and simply runs ls(x@env) to determine "datanames".

We can keep datanames function but names (implemented in qenv) is more relevant in my opinion.

names.qenv <- function(x) ls(x@env)
names(teal_data(a = 1, .b = 2))
#> [1] "a"

Plan:

  • introduce names method for qenv and qenv.error
  • remove @datanames slot
  • deprecate datanames<- in teal.data - function has no effect in teal anymore. Please point to use . for exclusions or teal::set_datanames
  • deprecate datanames() in teal.data - point to the names
  • replace teal.data::datanames in the modules in favour of names()
    • tmc, tmg, teal.gallery, tlg-catalog

Related issues:

@gogonzo gogonzo transferred this issue from insightsengineering/teal.logger Sep 10, 2024
@gogonzo
Copy link
Contributor Author

gogonzo commented Sep 26, 2024

@pawelru I'd like to continue discussion here:

I have an idea that @datanames shouldn't exist

I'm all for it!

  • no dataname slot - makes it simpler and reduce the clutter code made by end-user on the preprocessing part
  • topological sort functionality should live in the teal-data class

One question tough. Can we avoid extending the base class in teal.code and apply extension only in teal.data? I would like to avoid edits in teal.code. It's good as is.

If we decide to use names to read the object names from @env then we don't even need to make generic method because this is base S3 method - in this case I don't see why we should avoid teal.code. It would look like this:

names.qenv <- function(x) ls(x)
names.teal_data <- function(x) {
  unsorted <- names.qenv(x)
  .topological_sort(unsorted, join_keys(x))
}

names unlike datanames is more general name and fits to both classes.


We can consider also other methods names:

  • ls_env
  • env_ls
  • get_ls
  • list_objects
  • ...
  • get_env_names

Yes I am aware. It very connected to my old issue insightsengineering/teal#969. I guess we won't solve it here. But I think we should be good on this. I think that we are on the same page that get_datanames() functionality (however its implemented) should exclude datanames starting with dot. That check function you referenced should be called after that filter.

  • ls by default excludes . so I consider this more natural this this is why ls_env is closer by its function. We can't implement ls.qenv as ls has no S3 generic.
  • names however is much easier to implement and most of the R class have this method. Problem is that names.environment doesn't exclude . elements.

@gogonzo
Copy link
Contributor Author

gogonzo commented Sep 26, 2024

By looking at the NEWS entries of this incoming release it feels like we should address this issue sooner. It doesn't make sense to enhance datanames and then deprecate them. Improvements will be still relevant but datanames() function might be deprecated soon in favour of something else (ls_env or names).

Or maybe we should keep datanames() getter for backwards compatibility and not bother with more precise name like ls_env?

# teal.data 0.6.0.9011

### Enhancements

- `datanames()`
    - if `join_keys` are provided, the `datanames()` are now sorted in topological way (`Kahn` algorithm),
    which means the parent dataset always precedes the child dataset.
    - are extended by the parent dataset name, if one of the child dataset exist in `datanames()` and
    the connection between child-parent is set through `join_keys` and `parent` exist in `teal_data` environment.
    - do not allow to set a dataset name that do not exist in `teal_data` environment.
    - `teal_data` no longer set default `datanames()` based on `join_keys` names - it uses only data names.

@gogonzo
Copy link
Contributor Author

gogonzo commented Sep 26, 2024

After short call with @pawelru we initially agreed to the names with logic described in this comment

@averissimo
Copy link
Contributor

averissimo commented Oct 29, 2024

Disruptive question here as qenv design predates my time at the project

Was it ever considered to use an S4 object that inherits from "environment". A preliminary test while I was trying to make ls() internal function to work with qenv shows it's possible.

It adds a slot called .xData where we can store the current @env. A simple find and replace makes it very functional.

The only problem so far is that the environment's parent is not .GlobalEnv edit: nevermind

The only problem so far is that rlang does not recognize these are environments. rlang::is_environment(q) returns FALSE

# The required change in the class
setClass(
  "qenv",
  slots = c(
---    env = "environment",
    code = "character",
    id = "integer",
    warnings = "character",
    messages = "character"
  ),
+++  contains = "environment",
  prototype = list(
    env = new.env(parent = parent.env(.GlobalEnv)), code = character(0), id = integer(0),
    warnings = character(0), messages = character(0)
  )
)

Note that this is not a reproducible code, it's just a sample of the results

pkgload::load_all("teal.code")
#> ℹ Loading teal.code
q2 <- qenv() |> within({
  iris <- iris
  mtcars <- mtcars
})
ls(q2)
#> [1] "iris"   "mtcars"
q2$iris |> tibble::as_tibble()
#> # A tibble: 150 × 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # ℹ 140 more rows
q2[["iris"]] |> tibble::as_tibble()
#> # A tibble: 150 × 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # ℹ 140 more rows

q2$yada <- 2
#> Error in q2$yada <- 2: cannot add bindings to a locked environment
q2[["yada"]] <- 2
#> Error in q2[["yada"]] <- 2: cannot add bindings to a locked environment

Created on 2024-10-29 with reprex v2.1.1

@m7pr
Copy link
Contributor

m7pr commented Oct 30, 2024

Wow @averissimo , very promising finding

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

3 participants