You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I would like to use dplyr::tbl to create a reference to a table in bigquery. I want to be able to use dbplyr::in_schema to select both the dataset and table I want. Even though bigquery connections have a default dataset it is possible to query tables in datasets (aka schemas) other than the default.
library(bigrquery)
bq_auth(path= Sys.getenv("BIGQUERY_SERVICE_ACCOUNT_JSON_PATH"))
cdm_schema<- Sys.getenv("BIGQUERY_CDM_SCHEMA")
write_schema<- Sys.getenv("BIGQUERY_SCRATCH_SCHEMA")
con<-DBI::dbConnect(
bigrquery::bigquery(),
project= Sys.getenv("BIGQUERY_PROJECT_ID"),
dataset=cdm_schema
)
DBI::dbGetQuery(con, "select count(*) as n from person")
#> # A tibble: 1 × 1#> n#> <int>#> 1 116352# try to create a tbl reference using fully qualified dataset.table namedplyr::tbl(con, DBI::Id(schema="synpuf_110k", table="person"))
#> Warning: <BigQueryConnection> uses an old dbplyr interface#> ℹ Please install a newer version of the package or contact the maintainer#> This warning is displayed once every 8 hours.#> Error in `signal_reason()`:#> ! Invalid dataset ID "`synpuf_110k`". Dataset IDs must be alphanumeric (plus underscores and dashes) and must be at most 1024 characters long. [invalid]#> Backtrace:#> ▆#> 1. ├─dplyr::tbl(con, DBI::Id(schema = "synpuf_110k", table = "person"))#> 2. └─dplyr:::tbl.DBIConnection(...)#> 3. ├─dplyr::tbl(...)#> 4. └─dbplyr:::tbl.src_dbi(...)#> 5. └─dbplyr::tbl_sql(c(subclass, "dbi"), src = src, from = from, ...)#> 6. ├─vars %||% dbplyr_query_fields(src$con, from_sql)#> 7. └─dbplyr:::dbplyr_query_fields(src$con, from_sql)#> 8. └─dbplyr:::dbplyr_fallback(con, "db_query_fields", ...)#> 9. ├─rlang::eval_bare(expr((!!fun)(con, ...)))#> 10. ├─dplyr::db_query_fields(con, ...)#> 11. └─bigrquery:::db_query_fields.BigQueryConnection(con, ...)#> 12. └─bigrquery::bq_table_fields(tb)#> 13. └─bigrquery::bq_table_meta(x, fields = "schema")#> 14. └─bigrquery:::bq_get(url, query = list(fields = fields))#> 15. └─bigrquery:::process_request(req, raw = raw)#> 16. └─bigrquery:::bq_check_response(status, type, content)#> 17. └─bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)#> 18. └─rlang::abort(message, class = paste0("bigrquery_", reason))dplyr::tbl(con, dbplyr::sql("synpuf_110k.person"))
#> Error in `signal_reason()`:#> ! Syntax error: Expected end of input but got identifier "synpuf_110k" at [1:1] [invalidQuery]#> Backtrace:#> ▆#> 1. ├─dplyr::tbl(con, dbplyr::sql("synpuf_110k.person"))#> 2. └─dplyr:::tbl.DBIConnection(con, dbplyr::sql("synpuf_110k.person"))#> 3. ├─dplyr::tbl(...)#> 4. └─dbplyr:::tbl.src_dbi(...)#> 5. └─dbplyr::tbl_sql(c(subclass, "dbi"), src = src, from = from, ...)#> 6. ├─vars %||% dbplyr_query_fields(src$con, from_sql)#> 7. └─dbplyr:::dbplyr_query_fields(src$con, from_sql)#> 8. └─dbplyr:::dbplyr_fallback(con, "db_query_fields", ...)#> 9. ├─rlang::eval_bare(expr((!!fun)(con, ...)))#> 10. ├─dplyr::db_query_fields(con, ...)#> 11. └─bigrquery:::db_query_fields.BigQueryConnection(con, ...)#> 12. └─bigrquery:::bq_query_fields(sql, con@billing, default_dataset = ds)#> 13. └─bigrquery:::bq_post(url, body = bq_body(body, ...), query = list(fields = "statistics(query(schema(fields)))"))#> 14. └─bigrquery:::process_request(req)#> 15. └─bigrquery:::bq_check_response(status, type, content)#> 16. └─bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)#> 17. └─rlang::abort(message, class = paste0("bigrquery_", reason))dplyr::tbl(con, dbplyr::in_schema("synpuf_110k", "person"))
#> Error in `signal_reason()`:#> ! Invalid dataset ID "`synpuf_110k`". Dataset IDs must be alphanumeric (plus underscores and dashes) and must be at most 1024 characters long. [invalid]#> Backtrace:#> ▆#> 1. ├─dplyr::tbl(con, dbplyr::in_schema("synpuf_110k", "person"))#> 2. └─dplyr:::tbl.DBIConnection(...)#> 3. ├─dplyr::tbl(...)#> 4. └─dbplyr:::tbl.src_dbi(...)#> 5. └─dbplyr::tbl_sql(c(subclass, "dbi"), src = src, from = from, ...)#> 6. ├─vars %||% dbplyr_query_fields(src$con, from_sql)#> 7. └─dbplyr:::dbplyr_query_fields(src$con, from_sql)#> 8. └─dbplyr:::dbplyr_fallback(con, "db_query_fields", ...)#> 9. ├─rlang::eval_bare(expr((!!fun)(con, ...)))#> 10. ├─dplyr::db_query_fields(con, ...)#> 11. └─bigrquery:::db_query_fields.BigQueryConnection(con, ...)#> 12. └─bigrquery::bq_table_fields(tb)#> 13. └─bigrquery::bq_table_meta(x, fields = "schema")#> 14. └─bigrquery:::bq_get(url, query = list(fields = fields))#> 15. └─bigrquery:::process_request(req, raw = raw)#> 16. └─bigrquery:::bq_check_response(status, type, content)#> 17. └─bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)#> 18. └─rlang::abort(message, class = paste0("bigrquery_", reason))# this worksDBI::dbGetQuery(con, "select count(*) as n from `synpuf_110k`.`person`")
#> # A tibble: 1 × 1#> n#> <int>#> 1 116352# db_query_fields seems to be the problemdplyr::db_query_fields(con, "synpuf_110k.person")
#> [1] "person_id" "gender_concept_id" #> [3] "year_of_birth" "month_of_birth" #> [5] "day_of_birth" "birth_DATETIME" #> [7] "race_concept_id" "ethnicity_concept_id" #> [9] "location_id" "provider_id" #> [11] "care_site_id" "person_source_value" #> [13] "gender_source_value" "gender_source_concept_id" #> [15] "race_source_value" "race_source_concept_id" #> [17] "ethnicity_source_value" "ethnicity_source_concept_id"dplyr::db_query_fields(con, dbplyr::sql("synpuf_110k.person"))
#> Error in `signal_reason()`:#> ! Syntax error: Expected end of input but got identifier "synpuf_110k" at [1:1] [invalidQuery]#> Backtrace:#> ▆#> 1. ├─dplyr::db_query_fields(con, dbplyr::sql("synpuf_110k.person"))#> 2. └─bigrquery:::db_query_fields.BigQueryConnection(con, dbplyr::sql("synpuf_110k.person"))#> 3. └─bigrquery:::bq_query_fields(sql, con@billing, default_dataset = ds)#> 4. └─bigrquery:::bq_post(url, body = bq_body(body, ...), query = list(fields = "statistics(query(schema(fields)))"))#> 5. └─bigrquery:::process_request(req)#> 6. └─bigrquery:::bq_check_response(status, type, content)#> 7. └─bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)#> 8. └─rlang::abort(message, class = paste0("bigrquery_", reason))dplyr::db_query_fields(con, dbplyr::sql("`synpuf_110k`.`person`"))
#> Error in `signal_reason()`:#> ! Syntax error: Expected end of input but got identifier `synpuf_110k` at [1:1] [invalidQuery]#> Backtrace:#> ▆#> 1. ├─dplyr::db_query_fields(con, dbplyr::sql("`synpuf_110k`.`person`"))#> 2. └─bigrquery:::db_query_fields.BigQueryConnection(con, dbplyr::sql("`synpuf_110k`.`person`"))#> 3. └─bigrquery:::bq_query_fields(sql, con@billing, default_dataset = ds)#> 4. └─bigrquery:::bq_post(url, body = bq_body(body, ...), query = list(fields = "statistics(query(schema(fields)))"))#> 5. └─bigrquery:::process_request(req)#> 6. └─bigrquery:::bq_check_response(status, type, content)#> 7. └─bigrquery:::signal_reason(json$error$errors[[1L]]$reason, json$error$message)#> 8. └─rlang::abort(message, class = paste0("bigrquery_", reason))DBI::dbDisconnect(con)
I would like to use
dplyr::tbl
to create a reference to a table in bigquery. I want to be able to usedbplyr::in_schema
to select both the dataset and table I want. Even though bigquery connections have a default dataset it is possible to query tables in datasets (aka schemas) other than the default.Created on 2023-03-07 with reprex v2.0.2
Session info
The text was updated successfully, but these errors were encountered: