Schema evolution, mapping and query logging improvement #20

Merged on Aug 24, 2020 (14 commits)

bcogrel (Collaborator) commented Aug 20, 2020

Schema evolution

In the development setting, we now also use logical replication and schema versioning. The Python script has been extended to regenerate some derived tables and their triggers and to repopulate them, without having to recreate the slave instance from scratch.

These changes address issue #11.

ODH tourism mapping

The mapping has been enriched to include the properties needed by the JSON-LD snippet generator.

Query logging improvement

Following the request from Alex Lanz, here are two new kinds of messages.

  1. Merged message (merging the messages query:reformulated, query:result-set-unblocked and query:last-result-fetched):
{
  "@timestamp": "2020-08-20T13:40:06.092Z",
  "message": "query:all",
  "application": "ontop-odh",
  "payload": {
    "queryId": "cca86ae7-381d-453e-9676-7b9da26e9807",
    "classesUsedInQuery": [
      "http://noi.example.org/ontology/odh#Pizzeria"
    ],
    "propertiesUsedInQuery": [
      "http://www.opengis.net/ont/geosparql#asWKT",
      "http://schema.org/name",
      "http://schema.org/geo"
    ],
    "tables": [
      "\"v_gastronomiesopen_CategoryCodes\"",
      "\"v_gastronomiesopen\""
    ],
    "reformulationDuration": 185,
    "reformulationCacheHit": false,
    "httpHeaders": {
      "referer": "http://localhost:8080/"
    },
    "extractedQueryTemplate": {
      "hash": "11b919109b240898de92ba5a8667a9007bdbbb5a22f72b61446df05315399fa5",
      "parameters": {
        "v0": "\"it\""
      }
    },
    "sparqlQuery": "PREFIX schema: <http://schema.org/>\nPREFIX geo: <http://www.opengis.net/ont/geosparql#>\nPREFIX : <http://noi.example.org/ontology/odh#>\n\nSELECT ?pos ?posLabel\nWHERE {\n  ?p a :Pizzeria ;\n     geo:asWKT ?pos ;\n     schema:name ?posLabel ;\n     schema:geo ?geo .\n  FILTER (lang(?posLabel) = 'it')\n}\n",
    "reformulatedQuery": "ans1(pos,posLabel)\nCONSTRUCT [pos, posLabel] [posLabel/RDF(VARCHARToTEXT(Detail-it-Title1m18),@it), pos/RDF(||5(\"POINT (\"^^TEXT,Longitude11m24,\" \"^^TEXT,Latitude12m24,\")\"^^TEXT),http://www.opengis.net/ont/geosparql#wktLiteral)]\n   NATIVE [Detail-it-Title1m18, Latitude12m24, Longitude11m24]\nSELECT v4.\"Detail-it-Title1m18\" AS \"Detail-it-Title1m18\", v4.\"Latitude12m24\" AS \"Latitude12m24\", v4.\"Longitude11m24\" AS \"Longitude11m24\"\nFROM (SELECT DISTINCT v2.\"Detail-it-Title\" AS \"Detail-it-Title1m18\", v2.\"Latitude\" AS \"Latitude12m24\", v2.\"Longitude\" AS \"Longitude11m24\", v1.\"gastronomiesopen_Id\" AS \"gastronomiesopen_Id1m75\"\nFROM \"v_gastronomiesopen_CategoryCodes\" v1, \"v_gastronomiesopen\" v2\nWHERE (v2.\"Longitude\" IS NOT NULL AND v2.\"Latitude\" IS NOT NULL AND v2.\"Detail-it-Title\" IS NOT NULL AND v1.\"gastronomiesopen_Id\" = v2.\"Id\" AND 'Pizzeria' = v1.\"Shortname\")\n) v4\n\n",
    "executionBeforeUnblockingDuration": 37,
    "executionAndFetchingDuration": 190,
    "totalDuration": 375,
    "resultCount": 337
  }
}
  2. Nginx log message:
{
  "@timestamp": "2020-08-20T13:40:06+00:00",
  "message": "query:reverse-proxy",
  "application": "ontop-odh",
  "payload": {
    "cacheStatus": "MISS",
    "httpHeaders": {
      "referer": "http://localhost:8080/",
      "client-app": "",
      "prepared-query": ""
    },
    "remoteAddr": "192.168.0.1",
    "request": "POST /sparql HTTP/1.1",
    "requestBody": "query=PREFIX+schema%3A+%3Chttp%3A%2F%2Fschema.org%2F%3E%0APREFIX+geo%3A+%3Chttp%3A%2F%2Fwww.opengis.net%2Font%2Fgeosparql%23%3E%0APREFIX+%3A+%3Chttp%3A%2F%2Fnoi.example.org%2Fontology%2Fodh%23%3E%0A%0ASELECT+%3Fpos+%3FposLabel%0AWHERE+%7B%0A++%3Fp+a+%3APizzeria+%3B%0A+++++geo%3AasWKT+%3Fpos+%3B%0A+++++schema%3Aname+%3FposLabel+%3B%0A+++++schema%3Ageo+%3Fgeo+.%0A++FILTER+(lang(%3FposLabel)+%3D+'it')%0A%7D%0A",
    "status": "200",
    "bodyBytesSent": "8419",
    "requestTime": "0.402"
  }
}
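
As a rough illustration of how a dashboard could consume the two new message kinds, the sketch below parses both payloads, decodes the form-encoded requestBody of the query:reverse-proxy message, and compares it with the sparqlQuery of the query:all message. The helper names are hypothetical and the sample messages are trimmed-down versions of the ones above. It also assumes that the Ontop durations are in milliseconds and that requestTime is in seconds. Matching on the query text is only one possible way to correlate the two logs, not something this PR prescribes.

```python
import json
from urllib.parse import parse_qs

# Trimmed-down samples of the two messages shown above (not the full payloads).
QUERY_ALL = json.dumps({
    "message": "query:all",
    "payload": {
        "sparqlQuery": "SELECT ?pos ?posLabel\nWHERE {\n  ?p a :Pizzeria .\n}\n",
        "reformulationDuration": 185,
        "executionAndFetchingDuration": 190,
        "totalDuration": 375,
        "resultCount": 337,
    },
})
REVERSE_PROXY = json.dumps({
    "message": "query:reverse-proxy",
    "payload": {
        "cacheStatus": "MISS",
        "requestBody": "query=SELECT+%3Fpos+%3FposLabel%0AWHERE+%7B%0A++%3Fp+a+%3APizzeria+.%0A%7D%0A",
        "requestTime": "0.402",
    },
})


def sparql_from_request_body(body):
    """Decode a form-encoded POST body and return its 'query' parameter, if any."""
    values = parse_qs(body).get("query")
    return values[0] if values else None


def summarize(query_all_line, proxy_line):
    """Combine one query:all message with one query:reverse-proxy message.

    Hypothetical helper: assumes Ontop durations are in milliseconds and the
    reverse-proxy requestTime is in seconds.
    """
    ontop = json.loads(query_all_line)["payload"]
    proxy = json.loads(proxy_line)["payload"]
    proxy_ms = round(float(proxy["requestTime"]) * 1000)
    return {
        "same_query": sparql_from_request_body(proxy["requestBody"]) == ontop["sparqlQuery"],
        "cache_status": proxy["cacheStatus"],
        "ontop_total_ms": ontop["totalDuration"],
        "proxy_total_ms": proxy_ms,
        # Time spent outside Ontop (network, proxy), under the unit assumptions above.
        "overhead_ms": proxy_ms - ontop["totalDuration"],
    }


if __name__ == "__main__":
    print(summarize(QUERY_ALL, REVERSE_PROXY))
```

In the sample above, totalDuration (375) happens to equal reformulationDuration plus executionAndFetchingDuration (185 + 190); whether that holds for every query is not stated in this PR.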

The other messages are still emitted so as not to break the existing dashboard. Once the dashboard has been migrated to the query:all messages, please remove the following entries from the file vkg/odh.docker.properties:

ontop.queryLogging.decompositionAndMergingMutuallyExclusive=false
ontop.queryLogging.decomposition=true
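
For whoever performs that cleanup, here is a minimal sketch of stripping the two entries from the properties text. The function name and the extra sample entry are hypothetical, and only the simple key=value form used above is handled (not the `:` or whitespace separators that Java properties files also allow).

```python
# The two keys named above; everything else in the file is left untouched.
KEYS_TO_REMOVE = frozenset({
    "ontop.queryLogging.decompositionAndMergingMutuallyExclusive",
    "ontop.queryLogging.decomposition",
})


def drop_properties(text, keys=KEYS_TO_REMOVE):
    """Return the properties text without the lines that set any of `keys`."""
    kept = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        key = stripped.split("=", 1)[0].strip()
        if stripped and not is_comment and key in keys:
            continue  # drop this entry
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    # "some.other.key" is a made-up entry standing in for the rest of the file.
    sample = (
        "some.other.key=value\n"
        "ontop.queryLogging.decompositionAndMergingMutuallyExclusive=false\n"
        "ontop.queryLogging.decomposition=true\n"
    )
    print(drop_properties(sample), end="")
```

To apply it to the real file, read vkg/odh.docker.properties, pass its content through the function, and write the result back.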

bcogrel changed the title from "Schema evolution, mapping improvement" to "Schema evolution, mapping and query logging improvement" on Aug 20, 2020

bertolla (Contributor) commented

I noticed different credentials (flyway, postgres) in the source code. Should we adapt the pipelines to inject credentials or would that only become overhead?

bcogrel (Collaborator, Author) commented Aug 24, 2020

The default values in the .env.example file should hopefully not cause problems for the test and production deployments, as these values are overridden in their respective Jenkins scripts. See for instance https://github.com/ontopic-vkg/odh-vkg/blob/master/infrastructure/Jenkinsfile-Test.groovy#L15 .

These default values are needed for the dev mode, which now uses Flyway as well.

bertolla merged commit 4afa398 into noi-techpark:development on Aug 24, 2020