Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix optimize_expression to genereate atlas of feature with textual PKs #72

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions atlasprint/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -312,17 +312,18 @@ def optimize_expression(layer, expression):
logger.info("'$id' not found in the expression, returning the input expression.")
return expression

# extract indexes of primary keys
primary_keys = layer.primaryKeyAttributes()
if len(primary_keys) != 1:
logger.info("Primary keys are not defined in the layer '{}'.".format(layer.id()))
return expression

field = layer.fields().at(0)
if not field.isNumeric():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your proposal. I am not sure to understand correctly the context you are targeting with this PR.

It seems to me $id will always return an integer, the internal layer feature ID, which:

  • corresponds exactly to the integer primary key of the table if the field type is integer.
  • does not correspond to the primary key value if the field type is text (or other even bigint, for big datasets, $id will return the feature line number as it is returned, for example for a big PostgreSQL table).

For example, $id will return the value of id_integer_pk for this layer:

id_integer_pk name
1 one
2 two

but will not return the value of id_text_pk for this layer, but 1 or 2:

id_text_pk name
AB one
CD two

Can you please share a QGIS project with an example layer (FlatGeoBuf, GeoJSON, etc.), so that we can test it on real data ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mdouchin

Please find some data here: https://github.com/maxencelaurent/qgis-atlasprint-issue-71

As you mentioned, there are two cases:

In the first one, QGIS is able to retrieve the PK from existing fields. Thus, layer.primaryKeyAttributes() returns the list of indices of the fields that make up the PK. LWC use this information to build the GetPrintAtlas queries: EXP_FILTER="$id IN ('68ca5ce1-7ac4-4fbe-9a76-3c0d3d41fa53')"

Under some circumstances (eg. postgresql), such a field may be of type UUID. This behaviour almost only exists for postgresql layers. I couldn't reproduce it with sqlite for example.

In the second one, QGIS is not able to retrieve the PK from existing fields. Hence, it uses an internal $id field which is always of type integer. layer.primaryKeyAttributes() always returns an empty list. LWC uses this @id to build the query : EXP_FILTER="$id IN (1)".

On the atlas-plugin side, the EXP_FILTER is optimized. The purpose of this optimization is to replace the $id with the effective field name of the PK.

The plugin verifies that :

  1. The PK is made of exactly one field
  2. The first field of the layer is numeric

If both tests are true, the filter is optimized : $id is replaced with the name of the first field.

In my data, you'll find some layers:

text_pk_first_place & text_pk_sec_place

  • SQLite layer wtih text PK
  • QGIS failed to retrieve PK ⚠️
  • interal $id is used.
  • The EXP_FILTER is not optimized
  • GetPrintAtlas works fine.

int_pk_first_place

  • SQLite layer wtih int PK
  • QGIS uses effective PK
  • The EXP_FILTER is optimized.
  • GetPrintAtlas works fine ✅

int_pk_sec_place

  • SQLite layer wtih int PK
  • QGIS uses effective PK
  • The EXP_FILTER is not optimized ⚠️
  • GetPrintAtlas works fine

uuid_pk_first_place & uuid_pk_sec_place

  • SQLite layer wtih uuid PK
  • QGIS failed to retrieve PK ⚠️
  • interal $id is used.
  • The EXP_FILTER is not optimized.
  • GetPrintAtlas works fine.

int_pk_sec_place_but_other_int_first

  • SQLite layer wtih integer PK
  • the first field if not the PK
  • the first field is a number
  • QGIS uses effective PK
  • The EXP_FILTER is badly optimized ⚠️
  • GetPrintAtlas fails 💀 💥

psql_uuid

  • PostgreSQL layer wtih uuid PK
  • QGIS uses effective PK
  • The EXP_FILTER is not optimized.
  • GetPrintAtlas fails 💀 💥

logger.info("The field '{}' is not numeric in layer '{}'.".format(field.name(), layer.id()))
return expression
# extract primary key from fields list
pk_index = primary_keys[0]
pk_field = layer.fields().at(pk_index)

expression = expression.replace('$id', '"{}"'.format(field.name()))
logger.info('$id has been replaced by "{}" in layer "{}"'.format(field.name(), layer.id()))
# replace `$id` with effective PK name
expression = expression.replace('$id', '"{}"'.format(pk_field.name()))
logger.info('$id has been replaced by "{}" in layer "{}"'.format(pk_field.name(), layer.id()))
Comment on lines +326 to +327
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expression = expression.replace('$id', '"{}"'.format(pk_field.name()))
logger.info('$id has been replaced by "{}" in layer "{}"'.format(pk_field.name(), layer.id()))
expression = expression.replace('$id', f'"{pk_field.name()}"')
logger.info(f'$id has been replaced by "{pk_field.name()}" in layer "{layer.id()}"')

Let's switch to fstrings now at the same time


return expression
17 changes: 11 additions & 6 deletions tests/test_tools.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,21 +37,26 @@ def test_optimize_filter():
assert "$id in ('1','2')" == optimize_expression(layer, "$id in ('1','2')")

# One primary key
layer.primaryKeyAttributes = lambda: ['primary']
layer.primaryKeyAttributes = lambda: [0]

assert '"primary"=3' == optimize_expression(layer, '$id=3')
assert '"primary" in (\'1\',\'2\')' == optimize_expression(layer, "$id in ('1','2')")

# Two primary keys
layer.primaryKeyAttributes = lambda: ['primary', 'name']
layer.primaryKeyAttributes = lambda: [0, 1]
assert '$id=3' == optimize_expression(layer, '$id=3')

# One primary key but it's not integer
# One primary key type string
layer = QgsVectorLayer('None?field=primary:string(20)&field=name:string(20)', 'test', 'memory')
layer.primaryKeyAttributes = lambda: ['primary']
assert '$id=3' == optimize_expression(layer, '$id=3')
layer.primaryKeyAttributes = lambda: [0]
assert '"primary"=3' == optimize_expression(layer, '$id=3')

# One primary key type double
layer = QgsVectorLayer('None?field=primary:double(20,20)&field=name:string(20)', 'test', 'memory')
layer.primaryKeyAttributes = lambda: ['primary']
layer.primaryKeyAttributes = lambda: [0]
assert '"primary"=3' == optimize_expression(layer, '$id=3')

# One primary key type int but not the first attribute of the layer
layer = QgsVectorLayer('None?field=count:integer&field=primary:integer&field=name:string(20)', 'test', 'memory')
layer.primaryKeyAttributes = lambda: [1]
assert '"primary"=3' == optimize_expression(layer, '$id=3')
Loading