78 - Fix checksum migration issue #79

Merged
merged 6 commits into from
May 7, 2021
Changes from 2 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
node_modules
.idea/
45 changes: 42 additions & 3 deletions README.md
@@ -6,6 +6,7 @@ The focus is specifically on CHT application data currently stored in CouchDB. I

This version is built for medic/cht-core#3.0.0 and above. For replicating data from earlier versions, see the 2.0.x branch and associated tags.


## Installation Steps (if applicable)

1. Clone repository
@@ -38,16 +39,25 @@ Run it locally with environment variables: `npm ci && node .`

Run it locally in interactive mode: `npm ci && node . -i`


## Running tests through docker-compose

Run tests with:

```
docker-compose build --build-arg node_version=[node version] test
docker-compose up
```

Then in another terminal:

```
docker-compose run test grunt test
```

Run tests in interactive watch mode with: `docker-compose run test npm run watch`.


## Running tests against local couch and postgres databases

Run tests with: `grunt test`.
@@ -59,10 +69,12 @@ Environment variables required for the integration tests to run correctly:

NB: The integration tests destroy and re-create the given databases each time they are run.


## Required database setup

We support PostgreSQL 9.4 and greater. The user passed in the postgres url needs to have full creation rights on the given database.


## Example usage

You should probably install medic-analytics as a service and leave it to do its thing, as it should be able to run independently without any user input.
@@ -86,7 +98,8 @@ end script
```
- The service is then a standard service, e.g. `service couch2pg-example-client start`

### Installing as a service using Systemd (18.04.3 LTS [Bionic Beaver])
### Installing as a service using Systemd (18.04.3 LTS [Bionic Beaver])

Setting up couch2pg using systemd is also pretty simple. You will need sudo rights on the server; then follow the steps listed below:

- Install git and clone this repo onto your server, check out the relevant tag `git checkout tag_id`, and run `npm ci`.
@@ -126,6 +139,32 @@ WantedBy=multi-user.target
- Start the service `sudo service couch2pg-sample-client start`
- If all goes well the service should start smoothly.
- You can check the service logs using `journalctl` like this `journalctl -u couch2pg-sample-client --since today`




## Known issues
Member

We should really have a release-notes.md to contain this information, instead of the readme.

Contributor Author

Agreed. @garethbowen, do you want me to prepare a new PR for the release? I can create the release notes file there and move this content into it. The release version should be 3.2.1, right?


### Error "Checksum failed for migration ..." when upgrading from 3.2.0 to latest

In version 3.2.0 of medic-couch2pg, one of the SQL migration files was changed, which causes the process to fail if the PostgreSQL database already has data. The change was later reverted, but if you first started using this tool with version 3.2.0, you won't be able to upgrade to newer versions until you run the following SQL script against the PostgreSQL database:

```sql
UPDATE xmlforms_migrations
SET md5 = 'e0535c9fe3faef6e66a31691deebf1a8'
WHERE version = '201606200952' AND
md5 = '40187aa5ee95eda0e154ecefd7512cda';
```

See more details about the error in [#78](https://github.com/medic/medic-couch2pg/issues/78).
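
To tell whether a database is affected before running the update, it can help to compare the `md5` value stored in `xmlforms_migrations` with the two checksums quoted above. The following is a minimal sketch, not part of the tool: it assumes the stored checksum is a plain MD5 hex digest of the migration file contents (the migration runner may normalize line endings before hashing), and the helper names `md5Hex`, `md5OfMigration` and `classifyStoredChecksum` are hypothetical.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical helper: MD5 hex digest of a string, e.g. a migration file's contents.
function md5Hex(contents) {
  return crypto.createHash('md5').update(contents, 'utf8').digest('hex');
}

// Hypothetical helper: hash a migration file on disk.
// Assumption: no newline normalization is applied before hashing.
function md5OfMigration(path) {
  return md5Hex(fs.readFileSync(path, 'utf8'));
}

// Checksums quoted in the UPDATE statement above for version 201606200952.
const CHANGED_MD5 = '40187aa5ee95eda0e154ecefd7512cda';  // value the UPDATE replaces
const REVERTED_MD5 = 'e0535c9fe3faef6e66a31691deebf1a8'; // value the UPDATE sets

// Classify the md5 currently stored for that migration.
function classifyStoredChecksum(stored) {
  if (stored === CHANGED_MD5) return 'needs-update';
  if (stored === REVERTED_MD5) return 'ok';
  return 'unknown';
}
```

A stored value that classifies as `needs-update` is the case the `UPDATE` statement above is meant to repair; `unknown` suggests a different cause that is worth checking against the linked issue.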

### Error installing deps `ERR! ... node-pre-gyp install --fallback-to-build`

If you get an error like the following while installing the Node.js dependencies locally or building the Docker image:

```
...
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] install: `node-pre-gyp install --fallback-to-build`
```

It is probably related to a gcc library that fails with some versions of Node and npm. Try Node 10 without updating the `npm` version that comes with it.
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
-- filter contact docs into one place
CREATE OR REPLACE VIEW raw_contacts AS SELECT * FROM couchdb WHERE doc->>'type' IN ('clinic', 'district_hospital', 'health_center', 'person', 'contact');
CREATE OR REPLACE VIEW raw_contacts AS SELECT * FROM couchdb WHERE doc->>'type' IN ('clinic', 'district_hospital', 'health_center', 'person');

-- extract JSON data from contact docs and cache it
DROP MATERIALIZED VIEW IF EXISTS contactview_metadata CASCADE;
CREATE MATERIALIZED VIEW contactview_metadata AS
SELECT doc->>'_id' AS uuid, doc->>'name' AS name, doc->>'type' AS type, doc->>'contact_type' AS contact_type, doc#>>'{contact,_id}' AS contact_uuid, doc#>>'{parent,_id}' AS parent_uuid, doc->>'notes' AS notes,
SELECT doc->>'_id' AS uuid, doc->>'name' AS name, doc->>'type' AS type, doc#>>'{contact,_id}' AS contact_uuid, doc#>>'{parent,_id}' AS parent_uuid, doc->>'notes' AS notes,
TIMESTAMP WITH TIME ZONE 'epoch' + (doc->>'reported_date')::numeric / 1000 * interval '1 second' AS reported
FROM raw_contacts;

56 changes: 56 additions & 0 deletions libs/analytics/migrations/202104281912.do.0078-contactType.sql
@@ -0,0 +1,56 @@
-- Recreates the view only to add the new column contact_type
DROP MATERIALIZED VIEW IF EXISTS contactview_metadata CASCADE;
CREATE MATERIALIZED VIEW contactview_metadata AS
SELECT doc->>'_id' AS uuid,
doc->>'name' AS name,
doc->>'type' AS type,
doc->>'contact_type' AS contact_type, --> only this is new
doc#>>'{contact,_id}' AS contact_uuid,
doc#>>'{parent,_id}' AS parent_uuid,
doc->>'notes' AS notes,
TIMESTAMP WITH TIME ZONE 'epoch' + (doc->>'reported_date')::numeric / 1000 * interval '1 second' AS reported
FROM raw_contacts;

CREATE UNIQUE INDEX contactview_metadata_uuid ON contactview_metadata (uuid);
CREATE INDEX contactview_metadata_contact_uuid ON contactview_metadata (contact_uuid);
CREATE INDEX contactview_metadata_parent_uuid ON contactview_metadata (parent_uuid);
CREATE INDEX contactview_metadata_type ON contactview_metadata (type);

-- NOTE: The recreation of the view above caused 4 other views to be dropped in cascade,
-- here are the scripts to recreate them:

Member

I think we should link to the original migration that created each of these (including the line numbers). Something like: https://github.com/medic/medic-couch2pg/blob/master/libs/analytics/migrations/201711071603.do.2635-removeNestedContactReferences.sql#L26

Contributor Author

Yes better to keep track of that, adding...

CREATE VIEW contactview_hospital AS
Member

Super sad we have to have all this duplicated code.

Contributor Author

Yes, next time we use Postgrator to maintain views, it may be better to use *.js scripts so the SQL can be reused more easily.

It is also sad that Postgres does not allow adding a column to a view without recreating it manually. Whether the engine needs to recreate the entire view or not should be an internal decision made by the engine.

Member

I actually read a little about this, and people were complaining that they can't update materialized views in pgAdmin, which is arguably even more popular than Postgrator. pgAdmin may have improved since those posts, but my point is that this seems to be a common problem with other pg tools and not something that's easily automated.

Contributor Author

Yes, the problem is that Postgres does not support ALTER VIEW xxx ADD COLUMN ... the way it does for tables. It does allow some ALTER changes, but the most basic one, adding a column, is not supported.

Contributor Author

... and my point is that even if that is not possible without recreating the whole materialized view, that should be an implementation detail for the engine, not for the user. Postgres could support the ALTER VIEW xxx ADD COLUMN ... syntax and add a note in the documentation, such as: if you add a column to a materialized view, the engine will internally recreate the whole view and its dependent views and indexes, which can take a while; use with caution ...

Member

yea, I agree it's a less than ideal situation.

SELECT cmd.uuid, cmd.name
FROM contactview_metadata AS cmd
WHERE cmd.type = 'district_hospital';

CREATE VIEW contactview_chw AS
SELECT chw.name, pplfields.*, chwarea.uuid AS area_uuid,
chwarea.parent_uuid AS branch_uuid
FROM contactview_person_fields AS pplfields
INNER JOIN contactview_metadata AS chw ON (chw.uuid = pplfields.uuid)
INNER JOIN contactview_metadata AS chwarea ON (chw.parent_uuid = chwarea.uuid)
WHERE pplfields.parent_type = 'health_center';

CREATE VIEW contactview_clinic AS
SELECT cmd.uuid, cmd.name, chw.uuid AS chw_uuid, cmd.reported AS created
FROM contactview_metadata AS cmd
INNER JOIN contactview_chw AS chw ON (cmd.parent_uuid = chw.area_uuid)
WHERE type = 'clinic';

CREATE VIEW contactview_clinic_person AS
SELECT
raw_contacts.doc ->> '_id' AS uuid,
raw_contacts.doc ->> 'name' AS name,
raw_contacts.doc ->> 'type' AS type,
raw_contacts.doc #>> '{parent,_id}' AS family_uuid,
raw_contacts.doc ->> 'phone' AS phone,
raw_contacts.doc ->> 'alternative_phone' AS phone2,
raw_contacts.doc ->> 'date_of_birth' AS date_of_birth,
cmeta.type AS parent_type
FROM
raw_contacts
LEFT JOIN contactview_metadata AS cmeta ON (doc #>> '{parent,_id}' = cmeta.uuid)
WHERE
raw_contacts.doc->>'type' = 'person' AND
raw_contacts.doc->>'_id' IN (SELECT contact_uuid FROM contactview_metadata WHERE type = 'clinic');