You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Heya, I am working with the dataset from Dublin Bus. This data updates regularly and I will create a script to automatically update it on an x interval. When trying to run the same script again it will fail as it will try to create the tables while they're already created. It would be nice to have a drop/replace schema option or to be able to add data that has not yet been added to the table.
Currently, I'm running my 'replace' as follows, which is purely a hack but it's a personal project so it's no bother.
I create a script via gtfs-via-postgres [...] | sponge > gtfs.sql
I replace BEGIN; in the script to:
BEGIN;
drop schema public cascade;
create schema public;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO public;
CREATE EXTENSION IF NOT EXISTS postgis;
using a script:
echo 'Replacing in SQL file to replace public schema'
sed -i -e 's/BEGIN;/BEGIN;\
drop schema public cascade;\
create schema public;\
GRANT ALL ON SCHEMA public TO postgres;\
GRANT ALL ON SCHEMA public TO public;\
CREATE EXTENSION IF NOT EXISTS postgis;/' gtfs.sql
echo 'Replacing done'
I run the gtfs.sql script, in a transaction this replaces the public schema to be filled by the gtfs data.sql script
This works but is not pretty. The main problem is creating the indexes takes 20 minutes on my local pc. (using --stops-location-index)
I currently don't have the time to fully edit the project but starting from PostgreSQL 17, the ON_ERROR 'ignore' option on the COPY command would be a big help. This ignores any copy that cannot be done. Example query:
Heya, I am working with the dataset from Dublin Bus. This data updates regularly and I will create a script to automatically update it on an x interval. When trying to run the same script again it will fail as it will try to create the tables while they're already created. It would be nice to have a drop/replace schema option or to be able to add data that has not yet been added to the table.
Currently, I'm running my 'replace' as follows, which is purely a hack but it's a personal project so it's no bother.
gtfs-via-postgres [...] | sponge > gtfs.sql
BEGIN;
in the script to:using a script:
This works but is not pretty. The main problem is creating the indexes takes 20 minutes on my local pc. (using --stops-location-index)
I currently don't have the time to fully edit the project but starting from PostgreSQL 17, the ON_ERROR 'ignore' option on the COPY command would be a big help. This ignores any copy that cannot be done. Example query:
This together with some
IF NOT EXISTS
on index and create table queries could help. If I get some time I will try to test this and make a PR.The text was updated successfully, but these errors were encountered: