Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INTPYTHON-348 add support for QuerySet.raw_aggregate() #183

Merged
merged 2 commits into from
Nov 22, 2024

Conversation

aclark4life
Copy link
Collaborator

@aclark4life aclark4life commented Nov 4, 2024

A diff of how the tests added here compare to Django's raw_query tests: mongodb-forks/django#12

fixes #163

django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
@aclark4life

This comment was marked as resolved.

django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
@aclark4life

This comment was marked as resolved.

@aclark4life aclark4life marked this pull request as ready for review November 5, 2024 19:30
@aclark4life

This comment was marked as resolved.

django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
@timgraham
Copy link
Collaborator

test_missing_fields fails because self.queryset.columns returns all model fields (MongoRawQuery.get_columns()) rather than the fields in the query (RawQuery.get_columns() uses cursor.description). I'm not sure how we could solve this besides "peeking" at the first document to see what keys the document has.

@aclark4life
Copy link
Collaborator Author

test_missing_fields fails because self.queryset.columns returns all model fields (MongoRawQuery.get_columns()) rather than the fields in the query (RawQuery.get_columns() uses cursor.description). I'm not sure how we could solve this besides "peeking" at the first document to see what keys the document has.

@Jibola 🤔 ^

@aclark4life
Copy link
Collaborator Author

@Jibola @timgraham In case it helps, less args to $project reproduces the error


(Pdb) query
[{'$project': {'dob': 1, 'last_name': 1, 'first_name': 1, 'id': 1}}]

(Pdb) Author.objects.raw_mql(query)
<MongoRawQuerySet: [{'$project': {'dob': 1, 'last_name': 1, 'first_name': 1, 'id': 1}}]>

(Pdb) [i for i in Author.objects.raw_mql([{'$project': {'dob': 1} }])]
*** IndexError: list index out of range

@timgraham
Copy link
Collaborator

Right. What I meant by "peeking at the first document" is that we can't use resolve_model_init_order(). Instead we have to figure out the info that method generates (what columns and annotations are returned and what database converters we need to process them) by examining the keys of the first document in queryset_iterator.

@aclark4life
Copy link
Collaborator Author

E.g. something like this?


diff --git a/django_mongodb/query.py b/django_mongodb/query.py
index 89b0085..10b56ec 100644
--- a/django_mongodb/query.py
+++ b/django_mongodb/query.py
@@ -1,7 +1,7 @@
 from functools import reduce, wraps
 from operator import add as add_operator
 
-from django.core.exceptions import EmptyResultSet, FullResultSet
+from django.core.exceptions import EmptyResultSet, FullResultSet, FieldDoesNotExist
 from django.db import DatabaseError, IntegrityError, NotSupportedError, connections
 from django.db.models import QuerySet
 from django.db.models.expressions import Case, Col, When
@@ -360,6 +360,23 @@ class MongoRawQuerySet(RawQuerySet):
     def iterator(self):
         yield from MongoRawModelIterable(self)
 
+    def resolve_model_init_order(self, query):
+        """Resolve the init field names and value positions."""
+        converter = connections[self.db].introspection.identifier_converter
+        model_init_fields = [
+            f for f in self.model._meta.fields if converter(f.column) in self.columns and converter(f.column) in [i for i in query][0].keys()
+        ]
+        annotation_fields = [
+            (column, pos)
+            for pos, column in enumerate(self.columns)
+            if column not in self.model_fields
+        ]
+        model_init_order = [
+            self.columns.index(converter(f.column)) for f in model_init_fields
+        ]
+        model_init_names = [f.attname for f in model_init_fields]
+        return model_init_names, model_init_order, annotation_fields
+
 
 class MongoRawModelIterable(RawModelIterable):
     """
@@ -379,10 +396,10 @@ class MongoRawModelIterable(RawModelIterable):
                 model_init_names,
                 model_init_pos,
                 annotation_fields,
-            ) = self.queryset.resolve_model_init_order()
+            ) = self.queryset.resolve_model_init_order(query)
             model_cls = self.queryset.model
             if model_cls._meta.pk.attname not in model_init_names:
-                raise exceptions.FieldDoesNotExist(
+                raise FieldDoesNotExist(
                     "Raw query must include the primary key"
                 )
             fields = [self.queryset.model_fields.get(c) for c in self.queryset.columns]


That works for me along with mongodb-forks/django@1351d2c

@timgraham
Copy link
Collaborator

I haven't examined the details, but yea, something like that. Probably it would be a cleaner abstraction to pass a columns argument rather than query (and then could columns be used instead of self.columns in that method?).

Btw, in any code you copy you can can omit identifier_converter since that isn't customized by our backend and the default implementation returns the value unchanged.

django_mongodb/query.py Outdated Show resolved Hide resolved
django_mongodb/query.py Outdated Show resolved Hide resolved
@timgraham timgraham changed the title INTPYTHON-348 add support for QuerySet.raw_mql() INTPYTHON-348 add support for QuerySet.raw_aggregate() Nov 19, 2024
@timgraham timgraham requested a review from Jibola November 21, 2024 14:41
@aclark4life aclark4life merged commit 36c5718 into mongodb-labs:main Nov 22, 2024
10 checks passed
@aclark4life aclark4life deleted the INTPYTHON-348 branch November 22, 2024 14:12
@aclark4life
Copy link
Collaborator Author

Thank you @timgraham @Jibola @WaVEV! 🎉

Copy link
Collaborator

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left a comment or two :)

Comment on lines +14 to +22
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
# Protect against annotations being passed to __init__ --
# this'll make the test suite get angry if annotations aren't
# treated differently than fields.
for k in kwargs:
assert k in [f.attname for f in self._meta.fields], (
"Author.__init__ got an unexpected parameter: %s" % k
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this init needed?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from Django's test suite.

Comment on lines +35 to +38
def _execute_query(self):
connection = connections[self.using]
collection = connection.get_collection(self.model._meta.db_table)
self.cursor = collection.aggregate(self.pipeline)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this would be the only thing that gets updated were we to create a raw_find version if I understand correctly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. With some adjustments for the fact that find() has more arguments (at least filter and projection).

Incidentally, that raises the question as to whether raw_aggregate() should also accept arbitrary **kwargs to pass to Collection.aggregate().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

INTPYTHON-348 add QuerySet.raw_aggregate()
4 participants