readr no longer reproduces the problems from challenge.csv #1398
@jennybc It seems like this is due to vroom, since I can replicate the parsing error with edition 1:
library(readr)
with_edition(
1,
read_csv(readr_example("challenge.csv"), show_col_types = FALSE)
)
#> Warning: 1000 parsing failures.
#> row col expected actual file
#> 1001 y 1/0/T/F/TRUE/FALSE 2015-01-16 '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/readr/extdata/challenge.csv'
#> 1002 y 1/0/T/F/TRUE/FALSE 2018-05-18 '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/readr/extdata/challenge.csv'
#> 1003 y 1/0/T/F/TRUE/FALSE 2015-09-05 '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/readr/extdata/challenge.csv'
#> 1004 y 1/0/T/F/TRUE/FALSE 2012-11-28 '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/readr/extdata/challenge.csv'
#> 1005 y 1/0/T/F/TRUE/FALSE 2020-01-13 '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/readr/extdata/challenge.csv'
#> .... ... .................. .......... ..................................................................................................
#> See problems(...) for more details.
#> # A tibble: 2,000 × 2
#> x y
#> <dbl> <lgl>
#> 1 404 NA
#> 2 4172 NA
#> 3 3004 NA
#> 4 787 NA
#> 5 37 NA
#> 6 2332 NA
#> 7 2489 NA
#> 8 1449 NA
#> 9 3665 NA
#> 10 3863 NA
#> # … with 1,990 more rows
#> # ℹ Use `print(n = ...)` to see more rows
Created on 2022-08-30 by the reprex package (v2.0.1.9000)
This is happening in a vignette that we already want/need to update, so maybe a minimal solution is best for now. We could specify the column types like we do for
library(readr)
read_csv(
readr_example("challenge.csv"),
show_col_types = FALSE,
col_types = "dl"
)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
#> # A tibble: 2,000 × 2
#> x y
#> <dbl> <lgl>
#> 1 404 NA
#> 2 4172 NA
#> 3 3004 NA
#> 4 787 NA
#> 5 37 NA
#> 6 2332 NA
#> 7 2489 NA
#> 8 1449 NA
#> 9 3665 NA
#> 10 3863 NA
#> # … with 1,990 more rows
#> # ℹ Use `print(n = ...)` to see more rows
Created on 2022-08-30 by the reprex package (v2.0.1.9000)
Otherwise we'd need to modify/replace
library(readr)
# create a file like this
df <- glue::glue('x,y
1,2
3,4
5,6
7
8,9
10')
tf <- withr::local_tempfile(lines = df)
read_csv(tf, show_col_types = FALSE)
#> Warning: One or more parsing issues, call `problems()` on your data frame for details,
#> e.g.:
#> dat <- vroom(...)
#> problems(dat)
#> # A tibble: 6 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 2
#> 2 3 4
#> 3 5 6
#> 4 7 NA
#> 5 8 9
#> 6 10 NA
Created on 2022-08-30 by the reprex package (v2.0.1.9000)
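For discussion, here is a minimal sketch of how the parsing issues in a small ragged file like the one above could be inspected with `problems()`. It builds the file with base R (`tempfile()` and `writeLines()`) instead of glue/withr purely to keep the example self-contained; the names `lines`, `dat`, and `probs` are illustrative, not anything the vignette uses.

```r
library(readr)

# Same contents as the small example above: two rows are missing the y field
lines <- c("x,y", "1,2", "3,4", "5,6", "7", "8,9", "10")
tf <- tempfile(fileext = ".csv")
writeLines(lines, tf)

# Reading emits the "One or more parsing issues" warning
dat <- read_csv(tf, show_col_types = FALSE)

# problems() returns a tibble describing each parsing issue
probs <- problems(dat)
print(probs)

# Clean up the temporary file
unlink(tf)
```

Calling `problems()` right after the read, while the file still exists, avoids any surprises if the file were read lazily.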
I think you should update that section of the vignette for the modern readr 2e / vroom era.
^ this is no longer true, needs rewording
You can either force parsing problems to happen with challenge.csv, which at least allows some discussion of
It would be better (but harder) to think about what sort of realistic problems we want to demonstrate and create a new small dataset that has such a problem. Issues relating to parsing problems in readr and vroom would be a good source of inspiration.
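A rough sketch of the first option, assuming the vignette would force the failure by declaring `y` as logical (as in the `col_types = "dl"` example earlier in this thread) and then correct it with `col_date()`. The object names `forced` and `fixed` are hypothetical, and this is only one way the section could be structured.

```r
library(readr)

path <- readr_example("challenge.csv")

# Force the problem: y holds dates from row 1001 onwards, so declaring it
# logical makes those values fail to parse
forced <- read_csv(
  path,
  col_types = cols(x = col_double(), y = col_logical()),
  show_col_types = FALSE
)

# Inspect the recorded parsing failures
print(problems(forced))

# Fix: declare y as a date column instead
fixed <- read_csv(
  path,
  col_types = cols(x = col_double(), y = col_date()),
  show_col_types = FALSE
)
print(tail(fixed))
```

This keeps the existing dataset and still gives the vignette something concrete to say about diagnosing and repairing a wrong column type.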
The most common usage of challenge.csv is designed to teach some key challenges of parsing and features of readr. However, the example is broken (source).
Here's what I get when I run this code on my computer (this is a screen cap from the vignette, but I have the same issue on my computer)