Skip to content

Book outline: R, Docker, Postgres for SQL instruction

John David Smith edited this page Oct 24, 2018 · 12 revisions

Here’s a first draft:

FILES: 01-09

  • Introduction -- index.Rmd
  • How to use this book (01) -- 01-using-this-book.Rmd
  • Docker Hosting for Windows (02) -- 02-docker_hosting_for_windows.Rmd
  • This Book's Learning Goals and Use Cases (03) -- 03-learning-goals-use-cases.Rmd
  • Docker, Postgres, and R (04) -- 04-docker-setup-postgres-connect-with-r.Rmd
  • The dvdrental database in Postgres in Docker (05) -- 05-docker-setup-postgres-with-dvdrental.Rmd

FILES: 11-19

  • Introduction to DBMS queries (11) -- 11-elementary-queries.Rmd
    • Limit rows: filter / WHERE
    • Limit columns: select / SELECT
    • Mapping data types from SQL to R
    • Column name mapping in SQL and R
    • Calculating new variables: mutate and ?
  • Joins and complex queries (13) -- 13-sql_joins-and-complex-queries.Rmd
    • kinds: left, right, anti, etc.
    • order of joins
    • explain (belongs with table indexes or here?)
  • SQL Quick start - simple retrieval (15) -- 15-sql_quick_start_simple_retrieval.Rmd
  • Dividing the work between R and the DBMS
    • What should stay on the server / DBMS side
    • What goes on the R side
    • Criteria:
      • Legibility
      • code for humans versus code for computers
      • one-time use for exploration versus repeated / scheduled use
  • tidy data is 3rd normal form: basic SQL & data frames
  • simple retrieval (one table)
  • Overview of dplyr -> SQL command mapping

FILES: 21-29

  • Getting metadata about and from the database (21) - 21-r-query-postgres-metadata.Rmd
    • What’s in a database
      • Expanded diagram
      • How to query the metadata
    • Finding traps
      • name reuse
      • ER diagram
      • Watch for dates
  • Drilling into Your DB Environment (22) - 22-drilling-into-db-environment.Rmd
  • Digging into some hygiene issues
    • passwords — how to store them (not in plain text)
    • how to avoid abusing shared resources
    • table indexes

FILES: 31-39

  • Taking advantage of PostgreSQL functions
  • Taking advantage of the R function library
    • Beyond dplyr
    • lubridate
    • stringr

FILES: 51-59 A worked example

  • Loading a new public dataset
  • Dealing with real issues

FILES: 71-79

  • Explain queries (71) - 71-explain-queries.Rmd
    • instrumentation: adding timings to your code
  • SQL queries behind the scenes (72) - 72-sql-query-steps.Rmd
    • DB privileges (important if you want to create staging tables in the DB)
  • Writing to the DBMS (73) - 73_write-to-the-database.Rmd
    • scope
    • Staging intermediate datasets
    • ?

  • Appendix A: Other resources (89) -- 89-resources.Rmd
  • APPENDIX C - Mapping your local environment (92) -- 92-environment_diagram.Rmd
  • APPENDIX D - Quick Guide to SQL (94) - 94-SQLGuide.Rmd