Potential functions to add to chk (first version of the diagram) #140

Closed
flor14 opened this issue Oct 9, 2024 · 8 comments

Comments

@flor14
Contributor

flor14 commented Oct 9, 2024

Hello

Here is a first version of the chk diagram. Part of the diagram is based on the book Advanced R.

The existence of objects mentioned in Advanced R that do not currently have a corresponding chk function should not be the sole criterion for deciding which new functions to include in the package. The goal of this issue is to list these objects as sources of potential new chk functions for further discussion.

Both the diagram and the final comments are open to feedback and modifications.

[Image: first version of the chk diagram]

Note: The functions highlighted in pink may be confusing. I marked them this way to remind myself to add some clarifying comments in the documentation.

What is currently not covered?

  • The data types complex and raw don't have specific functions to check them (but they are mentioned as less frequently used).
  • Tibbles.
    There are no chk_* and vld_* functions for tibbles. Tibbles are not identical to data frames: they have a different class vector and can behave differently, as described in the tibble section of Advanced R:
#> $class
#> [1] "tbl_df"     "tbl"        "data.frame"
  • Durations.
    There are no chk_* and vld_* functions for the difftime class.

"Durations, which represent the amount of time between pairs of dates or date-times, are stored in difftime. difftime class is built on top of doubles, and have a units attribute that determines how the value should be interpreted"

  • Units attribute
    Since the chk_tz function checks the tz attribute of the POSIXct class, a similar chk_* and vld_* function could be developed to check the units attribute of the difftime class.

  • Double data type scalars
    Some scalar forms of the double data type don't have specific functions to check them: hexadecimal, NaN, Inf, -Inf, and scientific notation.

  • More functions for data.frames
    length() may not be intuitive for data.frames.

"A data frame has nrow() rows and ncol() columns. The length() of a data frame gives the number of columns."
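The gaps listed above can be seen with base R alone. The following is a minimal sketch that does not use chk; the variable names are only for illustration. It also shows that hexadecimal and scientific notation are just ways of writing a double, so a check on the stored value cannot tell them apart from an ordinary number.

```r
# Base R only; nothing here calls the chk package.

# Rarely used atomic types
z <- complex(real = 1, imaginary = 2)
r <- as.raw(c(1, 255))
typeof(z)  # "complex"
typeof(r)  # "raw"

# difftime is a class built on a double, with a units attribute
d <- as.difftime(90, units = "mins")
inherits(d, "difftime")  # TRUE
typeof(d)                # "double"
units(d)                 # "mins"

# Special double values
x <- c(1, NaN, Inf, -Inf)
is.nan(x)       # FALSE  TRUE FALSE FALSE
is.infinite(x)  # FALSE FALSE  TRUE  TRUE

# Hexadecimal and scientific literals are stored as ordinary doubles
identical(0x10, 16)   # TRUE
identical(1e3, 1000)  # TRUE

# length() of a data frame counts columns, not rows
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
length(df)  # 2
ncol(df)    # 2
nrow(df)    # 3
```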

@flor14 flor14 changed the title Potential functions to add to chk (first version diagram) Potential functions to add to chk (first version of the diagram) Oct 9, 2024
@joethorley
Member

joethorley commented Oct 16, 2024

@flor14 - can you provide a checklist of possible functions to add, such as the following, so that I can quickly review it and select the ones that make sense?

  • chk_raw()
  • chk_complex()
  • chk_nan()

@flor14
Contributor Author

flor14 commented Oct 20, 2024

Data types missing

  • chk_raw
  • chk_bytes (The raw data type is used to store raw bytes of data. Individual bytes are represented in hexadecimal format, so this may or may not overlap with chk_hexadecimal.)
  • chk_complex (The complex data type represents complex numbers, which have both real and imaginary parts.)
  • chk_im - The functions Im() and Re() extract the imaginary and real parts of a complex number, so each part could be checked separately.
  • chk_re

Double data type

  • chk_nan
  • chk_hexadecimal
  • chk_scientific
  • chk_infinite (could be -Inf and/or Inf)

class difftime

  • chk_difftime
  • chk_units

Tibble

  • chk_tibble

I will not go into detail regarding my data.frame comment because I am still not familiar with all the functions that are part of check_data
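To make the class-based items in this checklist more concrete, here is a rough sketch of what the validators might look like. All three function names are hypothetical (hence the `_sketch` suffix) and none of them exists in chk; the only assumption about chk itself is that a `vld_*` function returns TRUE or FALSE while the matching `chk_*` function throws an error.

```r
# Hypothetical helpers for illustration only; none of these is part of chk.

# TRUE if x carries the difftime class
vld_difftime_sketch <- function(x) {
  inherits(x, "difftime")
}

# TRUE if x is a difftime whose units attribute equals `units`
vld_units_sketch <- function(x, units = "secs") {
  inherits(x, "difftime") && identical(attr(x, "units"), units)
}

# TRUE if x carries the tibble class "tbl_df"
vld_tibble_sketch <- function(x) {
  inherits(x, "tbl_df")
}

d <- as.difftime(2, units = "hours")
vld_difftime_sketch(d)          # TRUE
vld_units_sketch(d, "hours")    # TRUE
vld_units_sketch(d, "secs")     # FALSE
vld_tibble_sketch(data.frame()) # FALSE
```

Checking the class with inherits() avoids a dependency on the tibble package, at the cost of trusting the class attribute rather than the object's structure.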

@flor14
Contributor Author

flor14 commented Oct 20, 2024

Also, #129

  • chk_namespace

@joethorley
Member

I think chk_bytes() should be called chk_byte() and it should check that a raw vector is length 1 and not missing.
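A minimal sketch of that rule, using a hypothetical helper name rather than the eventual chk implementation:

```r
# Hypothetical validator for illustration; not the chk implementation.
vld_byte_sketch <- function(x) {
  # Base R raw vectors cannot store NA, so the anyNA() test is kept only
  # to mirror the "not missing" wording; it is always FALSE for raw input.
  is.raw(x) && length(x) == 1L && !anyNA(x)
}

vld_byte_sketch(as.raw(255))      # TRUE
vld_byte_sketch(as.raw(c(1, 2)))  # FALSE: length 2
vld_byte_sketch(255)              # FALSE: double, not raw
```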

@joethorley
Member

joethorley commented Oct 21, 2024

Also

  • chk_complex_number() for a non-missing complex vector of length 1
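As a rough illustration of that definition (the helper name is hypothetical and this is not the code from the PRs):

```r
# Hypothetical validator for illustration; not the chk implementation.
vld_complex_number_sketch <- function(x) {
  is.complex(x) && length(x) == 1L && !anyNA(x)
}

vld_complex_number_sketch(1 + 2i)       # TRUE
vld_complex_number_sketch(NA_complex_)  # FALSE: missing
vld_complex_number_sketch(c(1i, 2i))    # FALSE: length 2
vld_complex_number_sketch(1)            # FALSE: double, not complex
```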

@flor14
Contributor Author

flor14 commented Oct 29, 2024

This is the final diagram

[Image: final version of the chk diagram]

@joethorley
Member

Going with chk_complex_number(), chk_complex() and chk_raw(). Have PRs for all three.

@joethorley
Member

chk_time() is in dttr2
