Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deal with tibbles and 1 column data frames gracefully #169

Open
cjyetman opened this issue Jan 6, 2018 · 7 comments
Open

deal with tibbles and 1 column data frames gracefully #169

cjyetman opened this issue Jan 6, 2018 · 7 comments

Comments

@cjyetman
Copy link
Collaborator

cjyetman commented Jan 6, 2018

We could add here something like...

if (mode(sourcevar) == "list" && length(sourcevar) == 1) {
  sourcevar <- sourcevar[[1]]
}

to essentially cast a single column tibble or data frame (or single element list) to a vector and avoid the following warning, i.e. sourcevar must be a character or numeric vector. This error often arises when users pass a tibble (e.g., from dplyr) instead of a column vector from a data.frame (i.e., my_tbl[, 2] vs. my_df[, 2] vs. my_tbl[[2]]).

Not 100% sure that's the best thing, so leaving this here simply as an idea.

@vincentarelbundock
Copy link
Owner

This seems like a good thing overall. Two questions:

  1. How can we be as explicit as possible w.r.t. tibbles?
  2. Is it desirable to output data in the same format that we got it in, or would it be better for countrycode to settle on a single output format that is well documented? If the former, do we also insert some code at then end to reformat?

@cjyetman
Copy link
Collaborator Author

cjyetman commented Jan 6, 2018

  1. "tbl_df" %in% class(sourcevar) maybe? not 100% explicit, but pretty close (I suppose if someone created a class on top of tbl_df and added other charcateristics this would still assume it's a tbl_df)

  2. I think it would be best to always return a single vector, which can easily be extended by the user for whatever structure they need. This proposal would simply work around an unintended passing of a sourcevar that doesn't quite conform to what's necessary, but can be obviously cast into something appropriate... rather than trying to open up a whole new realm of converting various different data structures.

@cjyetman
Copy link
Collaborator Author

cjyetman commented Jan 6, 2018

It could also throw a mandatory warning that tells the user that it's converting sourcevar so eventually they might learn that they should probably do it themselves first.

@vincentarelbundock
Copy link
Owner

That all sounds very good.

I've now dog-fooded the package a fair amount, and I'm thinking about preparing a release for CRAN using the major improvements we've implemented.

Do you think we should sneak this change in before I hit "GO"? Any other easy changes to make before release?

@cjyetman
Copy link
Collaborator Author

cjyetman commented Jan 6, 2018

There are a few regex improvements that I could probably get around to in the next week or so.

@vincentarelbundock
Copy link
Owner

Cool. No rush.

@vincentarelbundock
Copy link
Owner

My thinking on this has evolved.

I now believe that we should not convert inputs automagically, but that we should rather build in input checks and exit gracefully with a warning and instructions on common fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants