-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INTPYTHON-424 add django_mongodb.parse_uri() to configure DATABASES #195
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on! Provided some feedback. Let me know your thoughts.
Via @timgraham we may look to https://github.com/jazzband/dj-database-url/blob/master/tests/test_dj_database_url.py for some testing inspiration. |
@Jibola @timgraham I think this is ready except for:
Tests for this are running in Actions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test with various URIs to exercise the different code paths and the cases when values aren't present.
For more advanced use cases (rather than adding some additional options to this function), it could be reasonable to say "Dump the results, i.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pending a few more tests.
@timgraham for the final ✅
OK added a couple more localhost tests. |
Looks like there's still some linting issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I left this comment before but I can't find it. I think we should call the function something more than just parse()
, like parse_uri()
. Unlike dj-database-url
, our library has many purposes and therefore many things that could be parsed.
Thanks for the feedback, @Jibola and @timgraham ! Updates:
|
a57ee5f
to
33b5481
Compare
django_mongodb.parse
tests/utils_/test_parse_uri.py
Outdated
def test_no_database(self, mock_resolver): | ||
settings_dict = parse_uri("mongodb://cluster0.example.mongodb.net/") | ||
self.assertEqual(settings_dict["ENGINE"], "django_mongodb") | ||
self.assertIsNone(settings_dict["NAME"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out, based on the spec for NAME
this should end up returning an empty string if it comes back as None
.
https://docs.djangoproject.com/en/5.1/ref/settings/#name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, in this case, I think it's best to still have the name
as a passed through kwarg because Atlas URIs do often not provide a database name. I'll provide an example in the code above about how this would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the URI does not contain a database name, then we don't need to add NAME
. 1519e0f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the None/'' distinction matters. What does it mean to use Django + MongoDB without specifying a database name? I think NAME must be required, although it's not the job of parse_uri()
to raise an error. Currently, when trying runserver without NAME:
Traceback (most recent call last):
File "/opt/python3.12.0/lib/python3.12/threading.py", line 1052, in _bootstrap_inner
self.run()
File "/opt/python3.12.0/lib/python3.12/threading.py", line 989, in run
self._target(*self._args, **self._kwargs)
File "/home/tim/code/django/django/utils/autoreload.py", line 64, in wrapper
fn(*args, **kwargs)
File "/home/tim/code/django/django/core/management/commands/runserver.py", line 136, in inner_run
self.check_migrations()
File "/home/tim/code/django/django/core/management/base.py", line 581, in check_migrations
executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django/django/db/migrations/executor.py", line 18, in __init__
self.loader = MigrationLoader(self.connection)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django/django/db/migrations/loader.py", line 58, in __init__
self.build_graph()
File "/home/tim/code/django/django/db/migrations/loader.py", line 235, in build_graph
self.applied_migrations = recorder.applied_migrations()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django/django/db/migrations/recorder.py", line 89, in applied_migrations
if self.has_table():
^^^^^^^^^^^^^^^^
File "/home/tim/code/django/django/db/migrations/recorder.py", line 64, in has_table
tables = self.connection.introspection.table_names(cursor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django-mongodb/django_mongodb/introspection.py", line 10, in table_names
return sorted([x["name"] for x in self.connection.database.list_collections()])
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django-mongodb/django_mongodb/base.py", line 150, in __getattr__
return getattr(self, attr)
^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django-mongodb/django_mongodb/base.py", line 150, in __getattr__
return getattr(self, attr)
^^^^^^^^^^^^^^^^^^^
File "/home/tim/code/django-mongodb/django_mongodb/base.py", line 150, in __getattr__
return getattr(self, attr)
^^^^^^^^^^^^^^^^^^^
[Previous line repeated 741 more times]
RecursionError: maximum recursion depth exceeded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the bug I found a while back! OK I put it back 56fe609
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NAME
has to be set in DATABASES
else Django crashes. Once #196 is merged that will be enforced by get_connect_params
. So end users that try to do parse_uri("mongodb://localhost")
(or something like it, without setting NAME
) will get an error from get_connect_params
.
Our parse_uri
is inspired by dj-database-url
which is inspired by The Twelve-Factor App which prioritizes what @timgraham was arguing about with HOST
. HOST
should contain a HOST
and nothing else so that:
Resources can be attached to and detached from deploys at will. For example, if the app’s database is misbehaving due to a hardware issue, the app’s administrator might spin up a new database server restored from a recent backup. The current production database could be detached, and the new database attached – all without any code changes.
dj-database-url
set the precedent that additional kwargs
passed into parse_uri
are passed to the DATABASES
dictionary and NAME
has already been parsed in the URL, so we definitely would not consider adding an additional kwarg
for NAME
. As @timgraham says, just direct users to the documentation that says "The database NAME
is part of the URL" per Twelve-Factor App and NAME
is required.
AFAICT this works OK for me with Atlas, so if there's any unexpected behavior on the MongoDB side, then we can document that too. E.g. "We understand you are not used to specifying a database NAME
with Atlas!"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading https://www.mongodb.com/docs/manual/reference/connection-string/, one thing that could be confusing is that the example connection URL is: mongodb+srv://[username:password@]host[/[defaultauthdb][?options]]
.
Does it imply that the authentication database could be different from the database that the user may want to store their Django models in? Reading https://pymongo.readthedocs.io/en/stable/examples/authentication.html#default-database-and-authsource, it looks like defaultauthdb
can actually be the default database and you can specify a different authentication db using authSource
in the query string. So I think our implementation is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think it's OK too, but playing devil's advocate for @Jibola , Django says NAME
can be an empty string and if we assume that MongoDB can auto-assign a collection name then we're back to the logistics of supporting an empty string NAME
which involves an alternative fix for #196 and additional fixes for parse_uri
. I don't have a strong preference either way and may even lean towards supporting the empty string NAME
but I'll pass back to @Jibola for additional feedback.
I will note though that even though Django says NAME
can be an empty string, that appears just a practical matter for Django to leave it to the back end to decide. If your database is SQLite and your NAME
is empty what happens? I think SQLite picks a file name. If your database is Postgresql and your NAME
is empty what happens? I think you get an error, but I'm not sure. Same with MySQL. If the MongoDB preference is for NAME
to be an empty string and there is no error, then I imagine we can get Django to play along with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Django's documentation says the the default of NAME is an empty string, but that doesn't mean it's a valid configuration. Some database backends like sqlite3 raise an error if NAME isn't provided, others like have the notion of a "no db" connection ('NAME': None
) that is of some limited use. For example, when testing on PostgreSQL, "Normally Django will use a connection to the 'postgres' database to avoid running initialization queries against the production."
If there are cases where using Django + MongoDB without a database name is a valid configuration, we need the details. The behavior I see is:
>>> db_name = ''
>>> MongoClient(...)[db_name]
...
File ".../pymongo/synchronous/mongo_client.py", line 1183, in __getitem__
return database.Database(self, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "../pymongo/synchronous/database.py", line 133, in __init__
_check_name(name)
File ".../pymongo/database_shared.py", line 27, in _check_name
raise InvalidName("database name cannot be the empty string")
pymongo.errors.InvalidName: database name cannot be the empty string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well played
a378ad6
to
f218620
Compare
Supersedes #194