Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SFSQL and SFSQL_PG driver for lazy_tbl #2

Closed
wants to merge 5 commits into from
Closed

SFSQL and SFSQL_PG driver for lazy_tbl #2

wants to merge 5 commits into from

Conversation

mdsumner
Copy link
Member

@mdsumner mdsumner commented Sep 23, 2020

SFSQL() wraps the basic sql query to read_sf(), passing in the DSN argument in the usual way (file, url, connection string) with a dummy layer value.

SFSQL_PG() adds another level to allow input of host,dbname,user,password which is passed down as the 'PG:...' connection string to SFSQL().

There is a lot I haven't considered, readonly is ignored, security/obfuscation on the connection string is ignored, and no detail on the difference between DB-drivers and non-DB:

i.e. DB drivers have the SQL passed in direct, so it is engine-specific, otherwise for non-DB afaik:

  • must not contain subquerys - no 'SELECT * FROM (SELECT * FROM layer) table'
  • can use special variables FID, OGR_GEOM_AREA (but only for filter, not select/mutate/transmute)
  • automatically keeps the geometry column _ogr_geometry_ whether selected or not (see next comment)

Discussed here: mdsumner/RGDALSQL#7

Other:

  • I wanted data.frame, tbl, sf classes allowed so the Results slot uses "ANY" .
  • there's an error-catch on statements that match "SELECT * (SELECT * )", hopefully to indicate that subqueries might be the problem for non-DB

Examples, first pseudocode for postgresql

library(sfdbi)   ## mdsumner/sfdbi@ogrsql branch

## PSEUDOCODE
db <- dbConnect(SFSQL_PG(), 
                host = Sys.getenv("host"),
                dbname = "databasename",
                user = Sys.getenv("user"),
                password = Sys.getenv("password")
)

## that is equivalent to 
#db <- dbConnect(SFSQL(), "PG:host='..' dbname='..' user='..' password='..'")

library(dplyr)
tbl(db, "planes")

Examples with SHP and GPKG

gfile <- system.file("gpkg/nc.gpkg", package = "sf", mustWork = TRUE)
sfile <- system.file("shape/nc.shp", package = "sf", mustWork = TRUE)

gpkg <- tbl(dbConnect(SFSQL(), gfile), "nc.gpkg")
gpkg %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME, geom) %>% 
  arrange(desc(NAME))  ## %>% show_query()


## note geom column *name*
shp <- tbl(dbConnect(SFSQL(), sfile), "nc")
shp %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME, `_ogr_geometry_`) %>% 
  arrange(desc(NAME))  #%>% show_query()

@mdsumner
Copy link
Member Author

note that non-DB drivers will keep the geometry column whether it's in the query or not

GPKG drops 'geom' as it's not selected, but SHP keeps it, I don't have a strong opinion on this ...

gpkg %>% 
  filter(NAME %LIKE% 'A%') %>% 
  mutate(abc = 1.3) %>% 
  select(abc, NAME)

image

@etiennebr
Copy link
Member

Thanks @mdsumner! This looks great. I'll give it a serious look in the next few days.

@mdsumner
Copy link
Member Author

mdsumner commented Oct 6, 2020

I went and made this PR into a package, so you can either use that or crib the DBI stuff from it when you're ready:

https://github.com/mdsumner/lazysf

If I get more into DBs I might come back and look at the more general case here, but rn I need this to get on with more general outputs from the lazy query.

@mdsumner mdsumner closed this Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants