Set-up backend to load GerryDB tables #1

Merged
raphaellaude merged 33 commits into main from simple-pop-table-for-testing
Jul 20, 2024

raphaellaude
Collaborator

@raphaellaude raphaellaude commented Jul 17, 2024

Description

DB

  • Add a GerryDBTable tracker table that records which GerryDB tables have been loaded
  • Add a click command that loads a GerryDB view into the gerrydb schema
    • e.g. python cli.py import-gerrydb-view -g s3://districtr-v2-dev/c6c23a64a3234171853c9897095f001b.gpkg --layer ks_demo_view_census_vtd --rm
  • Remove MongoDB code, since we're opting for full Postgres
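
The -g argument in the example command is an S3-style URL, and the import log in this PR shows it being parsed into a scheme, a netloc (the bucket) and a path (the object key). A minimal sketch of that split, using only the standard library. The helper name split_s3_url is hypothetical, and the /data default is only inferred from the "Deleted file /data/..." log line; the real cli.py may be organized differently.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_s3_url(url: str, download_dir: str = "/data") -> dict:
    """Split an s3:// URL into bucket, object key, and a local target path.

    Hypothetical helper: the name and the /data default are assumptions,
    not taken from the PR's code.
    """
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got scheme {parsed.scheme!r}")

    # The object key is the URL path without its leading slash.
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError("URL has no object key")

    file_name = PurePosixPath(key).name
    return {
        "bucket": parsed.netloc,
        "key": key,
        "file_name": file_name,
        "local_path": str(PurePosixPath(download_dir) / file_name),
    }


parts = split_s3_url(
    "s3://districtr-v2-dev/c6c23a64a3234171853c9897095f001b.gpkg"
)
print(parts["bucket"], parts["local_path"])
```

Keeping the split in one small function makes it easy to check up front that the URL really points at an object before any download starts.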

Infra

  • Mount volumes (10 GB each) to the Fly machines for temporary storage (e.g. while loading layers)
  • Update image to include GDAL
  • Note: this works on prod and on my machine because I've configured both with the right Cloudflare secrets for now. Let's discuss w/ @mduchin whether they already have R2 buckets set up
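
This description does not show how the GeoPackage layer is actually written to Postgres. With GDAL in the image, one common route is the ogr2ogr command-line tool, so here is a sketch under that assumption. It only builds the argument list and does not run anything. The function name build_ogr2ogr_args and the pg_dsn parameter are hypothetical. The target schema is set through -nln with a schema-qualified table name.

```python
from typing import List


def build_ogr2ogr_args(
    gpkg_path: str,
    layer: str,
    pg_dsn: str,
    schema: str = "gerrydb",
) -> List[str]:
    """Build an ogr2ogr invocation that copies one GeoPackage layer to Postgres.

    Assumption: the import shells out to ogr2ogr. The PR only states that
    GDAL was added to the image, not how it is used.
    """
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",            # output driver
        "-nln", f"{schema}.{layer}",   # schema-qualified destination table
        f"PG:{pg_dsn}",                # destination datasource
        gpkg_path,                     # source GeoPackage on the mounted volume
        layer,                         # source layer to copy
    ]


args = build_ogr2ogr_args(
    "/data/c6c23a64a3234171853c9897095f001b.gpkg",
    "ks_demo_view_census_vtd",
    "dbname=districtr_v2_api",
)
print(" ".join(args))
```

Passing the list to subprocess.run(args, check=True) would execute it; building it as a list rather than a shell string avoids quoting problems with layer names.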

CI/CD

  • Update GHA to support Postgres unit testing in CI/CD
  • Update test module to spin up and tear down an empty database for unit tests, à la Django
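
The spin-up/tear-down pattern for a test database can be sketched as a context manager that creates an empty database before the tests and drops it afterwards, even if a test fails. This is a sketch, not the PR's test module: temporary_database and the execute callable are hypothetical names, and execute stands in for whatever runs SQL against the server.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_database(
    execute: Callable[[str], None], name: str
) -> Iterator[str]:
    """Create an empty database, yield its name, and always drop it afterwards.

    `execute` is a hypothetical callable that runs one SQL statement. In
    Postgres, CREATE DATABASE cannot run inside a transaction block, so the
    underlying connection would need to be in autocommit mode.
    """
    # Only allow simple identifiers so the name is safe to interpolate.
    if not name.isidentifier():
        raise ValueError(f"unsafe database name: {name!r}")

    execute(f'CREATE DATABASE "{name}"')
    try:
        yield name
    finally:
        execute(f'DROP DATABASE IF EXISTS "{name}"')


# Usage with a stand-in executor that just records the statements.
statements = []
with temporary_database(statements.append, "districtr_test") as db_name:
    pass  # run migrations and unit tests against db_name here
print(statements)
```

In a pytest suite the same body would fit a session-scoped fixture, with the yield handing the database name to the tests.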

Reviewers

Checklist

  • Added/Updated related documentation (if applicable).
  • Added/Updated related unit tests (if applicable).

Screenshots (if applicable):

@raphaellaude raphaellaude self-assigned this Jul 17, 2024
@raphaellaude raphaellaude marked this pull request as draft July 17, 2024 04:51
@raphaellaude
Collaborator Author

yay

@raphaellaude raphaellaude changed the title from "Simple pop table for testing" to "Set-up backend testing environment" Jul 18, 2024
@raphaellaude raphaellaude marked this pull request as ready for review July 18, 2024 03:58
@raphaellaude
Collaborator Author

OK, got it all working on the server! Example of loading a GerryDB GPKG from R2 into prod Postgres:

root:/app# psql $DATABASE_URL
psql (13.15 (Debian 13.15-0+deb11u1), server 15.6 (Debian 15.6-1.pgdg120+2))
WARNING: psql major version 13, server major version 15.
         Some psql features might not work.
Type "help" for help.

districtr_v2_api=# \l
                                    List of databases
       Name       |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
------------------+----------+----------+------------+------------+-----------------------
 districtr_v2_api | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 postgres         | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 repmgr           | repmgr   | UTF8     | en_US.utf8 | en_US.utf8 |
 template0        | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
                  |          |          |            |            | postgres=CTc/postgres
 template1        | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
                  |          |          |            |            | postgres=CTc/postgres
(5 rows)

districtr_v2_api=# \c districtr_v2_api
psql (13.15 (Debian 13.15-0+deb11u1), server 15.6 (Debian 15.6-1.pgdg120+2))
WARNING: psql major version 13, server major version 15.
         Some psql features might not work.
You are now connected to database "districtr_v2_api" as user "districtr_v2_api".
districtr_v2_api=# \dt
                  List of relations
 Schema |      Name       | Type  |      Owner
--------+-----------------+-------+------------------
 public | alembic_version | table | districtr_v2_api
 public | gerrydbtable    | table | districtr_v2_api
 public | spatial_ref_sys | table | districtr_v2_api
(3 rows)

districtr_v2_api=# select * from gerrydbtable;
 created_at | updated_at | id | uuid | name
------------+------------+----+------+------
(0 rows)

districtr_v2_api=# exit
root:/app# python cli.py import-gerrydb-view -g s3://districtr-v2-dev/c6c23a64a3234171853c9897095f001b.gpkg --layer ks_demo_view_census_vtd --rm
Importing GerryDB view...
INFO:__main__:URL: ParseResult(scheme='s3', netloc='districtr-v2-dev', path='/c6c23a64a3234171853c9897095f001b.gpkg', params='', query='', fragment='')
INFO:__main__:File name: c6c23a64a3234171853c9897095f001b.gpkg
INFO:__main__:Importing GerryDB view. Got response:
{'ResponseMetadata': {'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sat, 20 Jul 2024 16:13:52 GMT', 'content-length': '16130048', 'connection': 'keep-alive', 'accept-ranges': 'bytes', 'etag': '"d9b6a6a7aae65de494855eab4df7c644-2"', 'last-modified': 'Fri, 19 Jul 2024 02:11:14 GMT', 'x-amz-mp-parts-count': '2', 'x-amz-storage-class': 'STANDARD_IA', 'vary': 'Accept-Encoding', 'server': 'cloudflare', 'cf-ray': '8a643db01da0434a-EWR'}, 'RetryAttempts': 0}, 'AcceptRanges': 'bytes', 'LastModified': datetime.datetime(2024, 7, 19, 2, 11, 14, tzinfo=tzutc()), 'ContentLength': 16130048, 'ETag': '"d9b6a6a7aae65de494855eab4df7c644-2"', 'Metadata': {}, 'StorageClass': 'STANDARD_IA', 'PartsCount': 2}
INFO:__main__:File already exists. Skipping download.
INFO:__main__:GerryDB view imported successfully
INFO:__main__:Deleted file /data/c6c23a64a3234171853c9897095f001b.gpkg
GerryDB view imported successfully
2024-07-20 16:13:52,958 INFO sqlalchemy.engine.Engine select pg_catalog.version()
INFO:sqlalchemy.engine.Engine:select pg_catalog.version()
2024-07-20 16:13:52,958 INFO sqlalchemy.engine.Engine [raw sql] {}
INFO:sqlalchemy.engine.Engine:[raw sql] {}
2024-07-20 16:13:52,960 INFO sqlalchemy.engine.Engine select current_schema()
INFO:sqlalchemy.engine.Engine:select current_schema()
2024-07-20 16:13:52,960 INFO sqlalchemy.engine.Engine [raw sql] {}
INFO:sqlalchemy.engine.Engine:[raw sql] {}
2024-07-20 16:13:52,961 INFO sqlalchemy.engine.Engine show standard_conforming_strings
INFO:sqlalchemy.engine.Engine:show standard_conforming_strings
2024-07-20 16:13:52,962 INFO sqlalchemy.engine.Engine [raw sql] {}
INFO:sqlalchemy.engine.Engine:[raw sql] {}
2024-07-20 16:13:52,964 INFO sqlalchemy.engine.Engine BEGIN (implicit)
INFO:sqlalchemy.engine.Engine:BEGIN (implicit)
2024-07-20 16:13:52,965 INFO sqlalchemy.engine.Engine
        INSERT INTO gerrydbtable (uuid, name, updated_at)
        VALUES (%(uuid)s, %(name)s, now())
        ON CONFLICT (name)
        DO UPDATE SET
            updated_at = now()

INFO:sqlalchemy.engine.Engine:
        INSERT INTO gerrydbtable (uuid, name, updated_at)
        VALUES (%(uuid)s, %(name)s, now())
        ON CONFLICT (name)
        DO UPDATE SET
            updated_at = now()

2024-07-20 16:13:52,965 INFO sqlalchemy.engine.Engine [generated in 0.00058s] {'uuid': '8317219d-9798-4494-98b7-b0f935bdf7dc', 'name': 'ks_demo_view_census_vtd'}
INFO:sqlalchemy.engine.Engine:[generated in 0.00058s] {'uuid': '8317219d-9798-4494-98b7-b0f935bdf7dc', 'name': 'ks_demo_view_census_vtd'}
2024-07-20 16:13:52,967 INFO sqlalchemy.engine.Engine COMMIT
INFO:sqlalchemy.engine.Engine:COMMIT
INFO:__main__:GerryDB view upserted successfully.
root:/app# psql $DATABASE_URL
psql (13.15 (Debian 13.15-0+deb11u1), server 15.6 (Debian 15.6-1.pgdg120+2))
WARNING: psql major version 13, server major version 15.
         Some psql features might not work.
Type "help" for help.

districtr_v2_api=# select * from gerrydbtable;
          created_at           |          updated_at           | id |                 uuid                 |          name
-------------------------------+-------------------------------+----+--------------------------------------+-------------------------
 2024-07-20 16:13:52.965882+00 | 2024-07-20 16:13:52.965882+00 |  1 | 8317219d-9798-4494-98b7-b0f935bdf7dc | ks_demo_view_census_vtd
(1 row)

districtr_v2_api=# select path, total_pop from gerrydb.ks_demo_view_census_vtd order by total_pop desc limit 2;
      path       | total_pop
-----------------+-----------
 vtd:20173501960 |      8040
 vtd:2001500001a |      7435
(2 rows)

districtr_v2_api=#
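
The INSERT ... ON CONFLICT (name) DO UPDATE statement in the log above means that re-importing the same layer refreshes updated_at rather than adding a second row, and, because only updated_at is listed in the SET clause, the originally stored uuid is kept. A small self-contained sketch of that behavior. It uses an in-memory SQLite database as a stand-in for Postgres (an assumption made so it runs anywhere), with CURRENT_TIMESTAMP in place of now() and a simplified table definition that is not the PR's actual migration.

```python
import sqlite3

# Same shape as the logged statement, adapted to SQLite syntax.
UPSERT = """
INSERT INTO gerrydbtable (uuid, name, updated_at)
VALUES (:uuid, :name, CURRENT_TIMESTAMP)
ON CONFLICT (name)
DO UPDATE SET updated_at = CURRENT_TIMESTAMP
"""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the gerrydbtable tracker table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE gerrydbtable (
            id INTEGER PRIMARY KEY,
            uuid TEXT NOT NULL,
            name TEXT NOT NULL UNIQUE,  -- ON CONFLICT (name) needs this constraint
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            updated_at TEXT
        )
        """
    )
    return conn


def upsert_view(conn: sqlite3.Connection, uuid: str, name: str) -> None:
    """Insert a tracker row, or refresh updated_at if the name already exists."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, {"uuid": uuid, "name": name})


conn = make_demo_db()
upsert_view(conn, "8317219d-9798-4494-98b7-b0f935bdf7dc", "ks_demo_view_census_vtd")
# Second import of the same layer with a different uuid: no new row is created.
upsert_view(conn, "00000000-0000-0000-0000-000000000000", "ks_demo_view_census_vtd")
rows = conn.execute("SELECT id, uuid, name FROM gerrydbtable").fetchall()
print(rows)
```

If the tracker should follow the most recent import's uuid instead, the SET clause would also need uuid = excluded.uuid; whether that is desirable here is a design question for the PR author.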

@raphaellaude raphaellaude changed the title from "Set-up backend testing environment" to "Set-up backend to load GerryDB tables" Jul 20, 2024
env:
  POSTGRES_SCHEME: postgresql+psycopg
  POSTGRES_USER: postgres
  POSTGRES_PASSWORD: postgres
Collaborator

just saying this out loud: this is fine, bc this is an ephemeral database used only for testing.
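
For context on the env block under review: variables like these are typically combined into a SQLAlchemy-style connection URL. A minimal sketch, assuming that is how they are consumed. POSTGRES_SCHEME, POSTGRES_USER and POSTGRES_PASSWORD come from the workflow snippet; POSTGRES_SERVER, POSTGRES_PORT and POSTGRES_DB, their defaults, and the function name database_url are hypothetical.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Assemble a connection URL from POSTGRES_* environment variables.

    Only SCHEME, USER and PASSWORD appear in the PR; the remaining variable
    names and their defaults are assumptions made for illustration.
    """
    env = os.environ if env is None else env
    scheme = env.get("POSTGRES_SCHEME", "postgresql+psycopg")
    # Percent-encode credentials so characters like "@" or "/" stay valid.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env.get("POSTGRES_SERVER", "localhost")
    port = env.get("POSTGRES_PORT", "5432")
    db = env.get("POSTGRES_DB", "postgres")
    return f"{scheme}://{user}:{password}@{host}:{port}/{db}"


ci_env = {
    "POSTGRES_SCHEME": "postgresql+psycopg",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "postgres",
}
print(database_url(ci_env))
```

Because the CI database is throwaway, hard-coded credentials like these are low risk; the same function would read real secrets from the environment elsewhere.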

@raphaellaude raphaellaude merged commit 7646b12 into main Jul 20, 2024
1 check passed
@raphaellaude raphaellaude deleted the simple-pop-table-for-testing branch July 20, 2024 18:14