Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for partial indexes #159

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

timgraham
Copy link
Collaborator

refs #104

django_mongodb/indexes.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments! Will circle back and ask some more tomorrow :)

django_mongodb/schema.py Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
@timgraham
Copy link
Collaborator Author

After e971120, simple conditions no longer work, but we may need a separate path for generating this MQL anyway.

======================================================================
FAIL: test_boolean_restriction_partial (indexes.tests.PartialIndexTests.test_boolean_restriction_partial)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/tests/indexes/tests.py", line 453, in test_boolean_restriction_partial
    self.assertEqual(
AssertionError: "{'published': {'$eq': True}}" != "{'parent__field__0': {'$eq': True}}"
- {'published': {'$eq': True}}
+ {'parent__field__0': {'$eq': True}}

@WaVEV

This comment was marked as resolved.

@timgraham
Copy link
Collaborator Author

Can the following be supported? If not, we can edit the test with a condition that is supported.

======================================================================
ERROR: test_multiple_conditions (indexes.tests.PartialIndexTests.test_multiple_conditions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/django/test/testcases.py", line 1457, in skip_wrapper
    return test_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/tests/indexes/tests.py", line 486, in test_multiple_conditions
    sql = str(index._get_condition_mql(Article, schema_editor=editor))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/indexes.py", line 10, in _get_condition_mql
    return where.as_mql_idx(compiler, schema_editor.connection)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/query.py", line 316, in where_node_idx
    mql = child.as_mql_idx(compiler, connection)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/lookups.py", line 26, in builtin_lookup_idx
    return connection.mongo_operators_idx[self.lookup_name](lhs_mql, value)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'contains'

Also, please add tests/indexes_ in this repo so we can make sure we have complete test coverage for the new index MQL generation.

@timgraham
Copy link
Collaborator Author

Maybe it makes sense to put the as_mql_idx methods in indexes.py?

Copy link
Collaborator Author

@timgraham timgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure all operators in mongo_operators_idx are tested. Also, it looks like we may need quite a few tests to cover all the branches in where_node_idx.

django_mongodb/base.py Outdated Show resolved Hide resolved
tests/indexes_/test_mql.py Outdated Show resolved Hide resolved
tests/indexes_/test_mql.py Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@timgraham timgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is coming along. I pushed some edits in lieu of leaving some comments.

django_mongodb/base.py Outdated Show resolved Hide resolved
tests/indexes_/models.py Outdated Show resolved Hide resolved
django_mongodb/features.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
tests/indexes_/test_mql.py Outdated Show resolved Hide resolved
tests/indexes_/models.py Outdated Show resolved Hide resolved
tests/indexes_/models.py Outdated Show resolved Hide resolved
@WaVEV
Copy link
Collaborator

WaVEV commented Nov 7, 2024

Found a bug in where.as_mql (should rise EmptyResultSet), and also I have to research in which way the test from constraints will be adapted, it isn't straightforward

@timgraham
Copy link
Collaborator Author

timgraham commented Nov 7, 2024

Oh, I didn't notice the constraints problems you mentioned previously. I added a Django fork commit to place isnull usage in UniqueConstraint.condition tests, but other problems like this remain which I think it what you're referring to:

======================================================================
ERROR: test_validate_expression_condition (constraints.tests.UniqueConstraintTests.test_validate_expression_condition)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/django/test/testcases.py", line 1457, in skip_wrapper
    return test_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/tests/constraints/tests.py", line 1014, in test_validate_expression_condition
    constraint.validate(UniqueConstraintProduct, non_unique_product)
  File "/home/tim/code/django/django/db/models/constraints.py", line 461, in validate
    if (self.condition & Exists(queryset.filter(self.condition))).check(
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/db/models/query_utils.py", line 140, in check
    return compiler.execute_sql(SINGLE) is not None
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 246, in execute_sql
    query = self.build_query(
            ^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 379, in build_query
    query.project_fields = self.get_project_fields(columns, ordering_fields)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 563, in get_project_fields
    fields.update(fields.pop(self.collection_name, {}))
                             ^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/utils/functional.py", line 47, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
                                         ^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 438, in collection_name
    base_table = next(
                 ^^^^^
StopIteration

The SQL:

SELECT 1 AS "_check"
WHERE COALESCE((EXISTS
                  (SELECT 1 AS "a"
                   FROM "constraints_uniqueconstraintconditionproduct" U0
                   WHERE (U0."name" = 'p1'
                          AND U0."color" IS NULL)
                   LIMIT 1)), 1);

I think it fails because in MongoDB, we have no notion of executing a query without a collection.

I wonder if we could make it work without the outer query? But it looks tricky because I don't see a place we could hook into to customize the query: https://github.com/django/django/blob/042b381e2e37c0c37b8a8f6cc9947f1a2ebfa0dd/django/db/models/query_utils.py#L117-L151

@WaVEV
Copy link
Collaborator

WaVEV commented Nov 10, 2024

Oh, I didn't notice the constraints problems you mentioned previously. I added a Django fork commit to place isnulll usage in UniqueConstraint.condition tests, but other problems like this remain which I think it what you're referring to:

======================================================================
ERROR: test_validate_expression_condition (constraints.tests.UniqueConstraintTests.test_validate_expression_condition)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tim/code/django/django/test/testcases.py", line 1457, in skip_wrapper
    return test_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/tests/constraints/tests.py", line 1014, in test_validate_expression_condition
    constraint.validate(UniqueConstraintProduct, non_unique_product)
  File "/home/tim/code/django/django/db/models/constraints.py", line 461, in validate
    if (self.condition & Exists(queryset.filter(self.condition))).check(
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/db/models/query_utils.py", line 140, in check
    return compiler.execute_sql(SINGLE) is not None
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 246, in execute_sql
    query = self.build_query(
            ^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 379, in build_query
    query.project_fields = self.get_project_fields(columns, ordering_fields)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 563, in get_project_fields
    fields.update(fields.pop(self.collection_name, {}))
                             ^^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django/django/utils/functional.py", line 47, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
                                         ^^^^^^^^^^^^^^^^^^^
  File "/home/tim/code/django-mongodb/django_mongodb/compiler.py", line 438, in collection_name
    base_table = next(
                 ^^^^^
StopIteration

The SQL:

SELECT 1 AS "_check"
WHERE COALESCE((EXISTS
                  (SELECT 1 AS "a"
                   FROM "constraints_uniqueconstraintconditionproduct" U0
                   WHERE (U0."name" = 'p1'
                          AND U0."color" IS NULL)
                   LIMIT 1)), 1);

I think it fails because in MongoDB, we have no notion of executing a query without a collection.

I wonder if we could make it work without the outer query? But it looks tricky because I don't see a place we could hook into to customize the query: https://github.com/django/django/blob/042b381e2e37c0c37b8a8f6cc9947f1a2ebfa0dd/django/db/models/query_utils.py#L117-L151

I added like a dummy collection to handle this and created the pr fixing the remaining unit tests.
mongodb-forks/django#11

django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Outdated Show resolved Hide resolved
django_mongodb/lookups.py Outdated Show resolved Hide resolved
django_mongodb/indexes.py Show resolved Hide resolved
@WaVEV WaVEV marked this pull request as ready for review November 15, 2024 05:31
@timgraham timgraham requested a review from Jibola November 18, 2024 14:23
Copy link
Collaborator

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. Would love clarity on the $facet question

Comment on lines 438 to 454
try:
base_table = next(
v
for k, v in self.query.alias_map.items()
if isinstance(v, BaseTable) and self.query.alias_refcount[k]
)
except StopIteration:
# Use a dummy collection if the query doesn't specify a table
# (such as Constraint.validate() with a condition).
query = self.query_class(self)
query.aggregation_pipeline = [{"$facet": {"__null": []}}]
self.subqueries.insert(0, query)
return "__null"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My one worry here is $facet is subject to the 16MB document size limit, so in the case of using Constraint we should note the corner cases and limitations properly. (Especially if this is a blocker for getting partial indexing)

Copy link
Collaborator

@WaVEV WaVEV Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the facet execution will have a very short pipeline, is just a way to insert a document so the subqueries won't get empty.
I mean
just this example:

select 1
where exists(some subquery checking that an index exists)

This is translated as:

[{"$facet": {"__null": []}},
 {"$lookup": {"the subquery"}}
 {"$match": {"condition to check from the subquery"}}
]

If a query over an empty collection is executed, the result will be 0 rows. To avoid that, I put a single tiny document

django_mongodb/indexes.py Outdated Show resolved Hide resolved
@timgraham timgraham force-pushed the partial-indexes branch 2 times, most recently from 176cd33 to 76ca293 Compare November 21, 2024 15:54
@timgraham timgraham merged commit 84d0f04 into mongodb-labs:main Nov 21, 2024
10 checks passed
@timgraham timgraham deleted the partial-indexes branch November 21, 2024 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants