Move Data Package compatibility to vignettes #246

Merged: 41 commits, Aug 27, 2024
Changes from 11 commits
Commits
1d09d94
Ignore vignette extras
peterdesmet Jul 9, 2024
c9b9bf5
Copy paste read_resource info to vignettes
peterdesmet Jul 9, 2024
c64d19d
Describe all Data Resource properties
peterdesmet Jul 9, 2024
364c9c0
Update data-resource and table-dialect
peterdesmet Jul 10, 2024
a2df6e6
Merge branch 'main' into vignettes
peterdesmet Jul 10, 2024
1fc0edc
Update table-schema.Rmd
peterdesmet Jul 10, 2024
5470999
Add alt to logo to avoid pkgdown 2.1.0 warning
peterdesmet Jul 10, 2024
fffc5b3
a bit more concise
sannegovaert Jul 10, 2024
46e4043
more clear
sannegovaert Jul 10, 2024
4b5b2b7
small update
sannegovaert Jul 10, 2024
0923df9
3rd group Mutating added
sannegovaert Jul 10, 2024
41f1d7d
more concise
sannegovaert Jul 10, 2024
250c838
update links
sannegovaert Jul 10, 2024
08d7f79
use present tense
sannegovaert Jul 10, 2024
d546e47
use present tense
sannegovaert Jul 10, 2024
a6ac11f
Merge branch 'main' into vignettes
peterdesmet Jul 15, 2024
7ebee57
Add URL, remove author properties
peterdesmet Jul 15, 2024
88f042e
Update Data Resource and Table Dialect
peterdesmet Jul 15, 2024
d438907
Rework data-resource
peterdesmet Aug 22, 2024
90c628c
Update data-resource and table-dialect
peterdesmet Aug 22, 2024
1040c5f
Update title
peterdesmet Aug 22, 2024
8977690
Add author + update some phrasing
peterdesmet Aug 23, 2024
58aecce
Finalize table-schema vignette
peterdesmet Aug 23, 2024
c6f4631
Create data-package.Rmd
peterdesmet Aug 23, 2024
19969b4
Merge branch 'main' into vignettes
peterdesmet Aug 23, 2024
0526367
Fix todo and make use of observations_1.tsv
peterdesmet Aug 23, 2024
d3fe50f
Rephrase "support" as "implementation"
peterdesmet Aug 23, 2024
eff34a2
Finalize Data Package vignette
peterdesmet Aug 23, 2024
cff1cdc
Simplify titles and use custom navbar
peterdesmet Aug 23, 2024
8189dc7
Avoid "will"
peterdesmet Aug 23, 2024
aaeb785
Rather than custom navbar, group articles
peterdesmet Aug 23, 2024
5491f26
Link to articles from README
peterdesmet Aug 23, 2024
adfdf39
Avoid use of "pkg" + link to package as {pkg}
peterdesmet Aug 23, 2024
5d0b565
Clarify v1 vs v2 support
peterdesmet Aug 23, 2024
aeb670c
Link functions to vignettes instead of verbosely describing
peterdesmet Aug 23, 2024
e38d56e
Use [function()]
peterdesmet Aug 23, 2024
6f1c61b
Indicate warn
peterdesmet Aug 23, 2024
fe30d7d
Describe change
peterdesmet Aug 23, 2024
e1ba4ec
replace with working link
sannegovaert Aug 26, 2024
55511cc
Review suggestions
peterdesmet Aug 26, 2024
5e46bed
Remove link to issues
peterdesmet Aug 27, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -24,6 +24,8 @@
# produced vignettes
vignettes/*.html
vignettes/*.pdf
vignettes/*.R
inst/doc

# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
150 changes: 3 additions & 147 deletions R/read_resource.R
@@ -23,161 +23,17 @@
#' @family read functions
#' @export
#' @section Resource properties:
#' The [Data Resource
#' properties](https://specs.frictionlessdata.io/data-resource/) are handled as
#' follows:
#'
#' ## Path
#'
#' [`path`](https://specs.frictionlessdata.io/data-resource/#data-location) is
#' required.
#' It can be a local path or URL, which must resolve.
#' Absolute path (`/`) and relative parent path (`../`) are forbidden to avoid
#' security vulnerabilities.
#'
#' When multiple paths are provided (`"path": [ "myfile1.csv", "myfile2.csv"]`)
#' then data are merged into a single data frame, in the order in which the
#' paths are listed.
#'
#' ## Data
#'
#' If `path` is not present, the function will attempt to read data from the
#' `data` property.
#' **`schema` will be ignored**.
#'
#' ## Name
#'
#' `name` is [required](https://specs.frictionlessdata.io/data-resource/#name).
#' It is used to find the resource with `name` = `resource_name`.
#'
#' ## Profile
#'
#' `profile` is
#' [required](https://specs.frictionlessdata.io/tabular-data-resource/#specification)
#' to have the value `tabular-data-resource`.
#'
#' ## File encoding
#'
#' `encoding` (e.g. `windows-1252`) is
#' [required](https://specs.frictionlessdata.io/data-resource/#optional-properties)
#' if the resource file(s) is not encoded as UTF-8.
#' The returned data frame will always be UTF-8.
#' See `vignette("data-resource")`.
#'
#' ## CSV Dialect
#'
#' `dialect` properties are
#' [required](https://specs.frictionlessdata.io/csv-dialect/#specification) if
#' the resource file(s) deviate from the default CSV settings (see below).
#' It can either be a JSON object or a path or URL referencing a JSON object.
#' Only deviating properties need to be specified, e.g. a tab delimited file
#' without a header row needs:
#' ```json
#' "dialect": {"delimiter": "\t", "header": "false"}
#' ```
#'
#' These are the CSV dialect properties.
#' Some are ignored by the function:
#' - `delimiter`: default `,`.
#' - `lineTerminator`: ignored, line terminator characters `LF` and `CRLF` are
#' interpreted automatically by [readr::read_delim()], while `CR` (used by
#' Classic Mac OS, final release 2001) is not supported.
#' - `doubleQuote`: default `true`.
#' - `quoteChar`: default `"`.
#' - `escapeChar`: anything but `\` is ignored and it will set `doubleQuote` to
#' `false` as these fields are mutually exclusive.
#' You can thus not escape with `\"` and `""` in the same file.
#' - `nullSequence`: ignored, use `missingValues`.
#' - `skipInitialSpace`: default `false`.
#' - `header`: default `true`.
#' - `commentChar`: not set by default.
#' - `caseSensitiveHeader`: ignored, header is not used for column names, see
#' Schema.
#' - `csvddfVersion`: ignored.
#'
#' ## File compression
#' See `vignette("table-dialect")`.
#'
#' Resource file(s) with `path` ending in `.gz`, `.bz2`, `.xz`, or `.zip` are
#' automatically decompressed using default [readr::read_delim()]
#' functionality.
#' Only `.gz` files can be read directly from URL `path`s.
#' Only the extension in `path` can be used to indicate compression type,
#' the `compression` property is
#' [ignored](https://specs.frictionlessdata.io/patterns/#specification-3).
#'
#' ## Ignored resource properties
#'
#' - `title`
#' - `description`
#' - `format`
#' - `mediatype`
#' - `bytes`
#' - `hash`
#' - `sources`
#' - `licenses`
#' @section Table schema properties:
#' `schema` is required and must follow the [Table
#' Schema](https://specs.frictionlessdata.io/table-schema/) specification.
#' It can either be a JSON object or a path or URL referencing a JSON object.
#'
#' - Field `name`s are used as column headers.
#' - Field `type`s are use as column types (see further).
#' - [`missingValues`](https://specs.frictionlessdata.io/table-schema/#missing-values)
#' are used to interpret as `NA`, with `""` as default.
#'
#' ## Field types
#'
#' Field `type` is used to set the column type, as follows:
#' See `vignette("table-schema")`.
#'
#' - [string](https://specs.frictionlessdata.io/table-schema/#string) as
#' `character`; or `factor` when `enum` is present.
#' `format` is ignored.
#' - [number](https://specs.frictionlessdata.io/table-schema/#number) as
#' `double`; or `factor` when `enum` is present.
#' Use `bareNumber: false` to ignore whitespace and non-numeric characters.
#' `decimalChar` (`.` by default) and `groupChar` (undefined by default) can
#' be defined, but the most occurring value will be used as a global value for
#' all number fields of that resource.
#' - [integer](https://specs.frictionlessdata.io/table-schema/#integer) as
#' `double` (not integer, to avoid issues with big numbers); or `factor` when
#' `enum` is present.
#' Use `bareNumber: false` to ignore whitespace and non-numeric characters.
#' - [boolean](https://specs.frictionlessdata.io/table-schema/#boolean) as
#' `logical`.
#' Non-default `trueValues/falseValues` are not supported.
#' - [object](https://specs.frictionlessdata.io/table-schema/#object) as
#' `character`.
#' - [array](https://specs.frictionlessdata.io/table-schema/#array) as
#' `character`.
#' - [date](https://specs.frictionlessdata.io/table-schema/#date) as `date`.
#' Supports `format`, with values `default` (ISO date), `any` (guess `ymd`)
#' and [Python/C strptime](https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior)
#' patterns, such as `%a, %d %B %Y` for `Sat, 23 November 2013`.
#' `%x` is `%m/%d/%y`.
#' `%j`, `%U`, `%w` and `%W` are not supported.
#' - [time](https://specs.frictionlessdata.io/table-schema/#time) as
#' [hms::hms()].
#' Supports `format`, with values `default` (ISO time), `any` (guess `hms`)
#' and [Python/C strptime](https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior)
#' patterns, such as `%I%p%M:%S.%f%z` for `8AM30:00.300+0200`.
#' - [datetime](https://specs.frictionlessdata.io/table-schema/#datetime) as
#' `POSIXct`.
#' Supports `format`, with values `default` (ISO datetime), `any`
#' (ISO datetime) and the same patterns as for `date` and `time`.
#' `%c` is not supported.
#' - [year](https://specs.frictionlessdata.io/table-schema/#year) as `date`,
#' with `01` for month and day.
#' - [yearmonth](https://specs.frictionlessdata.io/table-schema/#yearmonth) as
#' `date`, with `01` for day.
#' - [duration](https://specs.frictionlessdata.io/table-schema/#duration) as
#' `character`.
#' Can be parsed afterwards with [lubridate::duration()].
#' - [geopoint](https://specs.frictionlessdata.io/table-schema/#geopoint) as
#' `character`.
#' - [geojson](https://specs.frictionlessdata.io/table-schema/#geojson) as
#' `character`.
#' - [any](https://specs.frictionlessdata.io/table-schema/#any) as `character`.
#' - Any other value is not allowed.
#' - Type is guessed if not provided.
#' @examples
#' # Read a datapackage.json file
#' package <- read_package(
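
The path rule in the `Path` section of the hunk above (absolute and parent-relative local paths are rejected, URLs are allowed) can be sketched in base R. This is a minimal illustration only: `is_unsafe_path()` is a hypothetical helper, not a function from the package, and the URL pattern is an assumption about what counts as a URL.

```r
# Hypothetical helper (not part of the package): flags local paths that the
# documentation describes as forbidden.
is_unsafe_path <- function(path) {
  # Assumption: anything starting with http://, https:// or ftp:// is a URL
  is_url <- grepl("^(http|https|ftp)://", path)
  # Absolute path: starts with "/"
  is_absolute <- startsWith(path, "/")
  # Relative parent path: a "../" segment at the start or after a "/"
  is_parent <- grepl("(^|/)\\.\\./", path)
  !is_url & (is_absolute | is_parent)
}

paths <- c(
  "data/observations.csv",              # relative path inside the package
  "/etc/passwd",                        # absolute path
  "../observations.csv",                # relative parent path
  "https://example.org/observations.csv" # URL
)
is_unsafe_path(paths)
```

The check is vectorized so that it can be applied to a `path` array in one call, matching the case where a resource lists multiple files.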
2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -11,7 +11,7 @@ authors:
href: https://mastodon.social/@peterdesmet
LifeWatch Belgium:
href: https://lifewatch.be
- html: "<img src='https://oscibio.inbo.be/assets/logos/lifewatch-belgium.svg' width=72>"
+ html: "<img src='https://oscibio.inbo.be/assets/logos/lifewatch-belgium.svg' width=72 alt='LifeWatch Belgium logo'>"

reference:
- title: "Read a Data Package"
162 changes: 3 additions & 159 deletions man/read_resource.Rd

Some generated files are not rendered by default.
