Updating/Replacing feature would be nice.

Heya, I am working with the dataset from Dublin Bus. This data updates regularly and I will create a script to automatically update it on an x interval. When trying to run the same script again it will fail as it will try to create the tables while they're already created. It would be nice to have a drop/replace schema option or to be able to add data that has not yet been added to the table. 

Currently, I'm running my 'replace' as follows, which is purely a hack but it's a personal project so it's no bother.

1. I create a script via `gtfs-via-postgres [...] | sponge > gtfs.sql`
2. I replace `BEGIN;` in the script to:
```
BEGIN;
drop schema public cascade;
create schema public;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO public;
CREATE EXTENSION IF NOT EXISTS postgis;
```
using a script:
```
echo 'Replacing in SQL file to replace public schema'

sed -i -e 's/BEGIN;/BEGIN;\
drop schema public cascade;\
create schema public;\
GRANT ALL ON SCHEMA public TO postgres;\
GRANT ALL ON SCHEMA public TO public;\
CREATE EXTENSION IF NOT EXISTS postgis;/' gtfs.sql

echo 'Replacing done'
```
3. I run the gtfs.sql script, in a transaction this replaces the public schema to be filled by the gtfs data.sql script

This works but is not pretty. The main problem is **creating the indexes takes 20 minutes** on my local pc. (using --stops-location-index)

-----

I currently don't have the time to fully edit the project but starting from PostgreSQL 17, the ON_ERROR 'ignore' option on the COPY command would be a big help. This ignores any copy that cannot be done. Example query:

```
COPY public.routes (
                             route_id,
                             agency_id,
                             route_short_name,
                             route_long_name,
                             route_desc,
                             route_type,
                             route_url,
                             route_color,
                             route_text_color,
                             route_sort_order
    ) FROM STDIN WITH(format csv, ON_ERROR 'ignore')
```

This together with some `IF NOT EXISTS` on index and create table queries could help. If I get some time I will try to test this and make a PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Updating/Replacing feature would be nice. #68

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Updating/Replacing feature would be nice. #68

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions