Case insensitive sorting using createSubscribeQuery #249

Closed
naserzandi opened this issue Oct 1, 2018 · 2 comments

@naserzandi

I'm trying to subscribe to a query that sorts its results.
The following query sorts the results, but the sort is case-sensitive:
connection.createSubscribeQuery(collection, { $sort: { title: 1 } });
How can I make it sort case-insensitively?

@curran
Contributor

curran commented Apr 18, 2019

I suggest closing this due to inactivity.

The query behavior depends on the database adapter. If you're using MongoDB, see this thread: https://stackoverflow.com/questions/22931177/case-insensitive-sorting-in-mongodb

@ericyhwang
Contributor

I wrote a reply back in October on the mailing list:
https://groups.google.com/d/msg/sharejs/9aEytBzU_24/XvRuvesbAgAJ

I'll paste a copy of that reply here.


Option 1 - Add a meta property for a normalized string

Summary - This is possible with sharedb/sharedb-mongo today. It lets you fully control the normalization in code, but it results in some extra data stored in the DB. It also requires a migration for any documents that predate the normalization, and again whenever the normalization algorithm changes, which is extra developer overhead.

With this approach, you'd add server middleware that sets a normalized string form of the field onto the Share meta property, and then you'd sort by that normalized field.

The middleware would look something like this:

backend.use('commit', function(request, next) {
  var data = request.snapshot.data || {};
  if (request.collection === 'my_collection') {
    request.snapshot.m.myTitleLowerCase = (data.myTitle || '').toLowerCase();
    // Note: Consider using toLocaleLowerCase(locale) if you know the locale of the text.
  }
  next();
});

Note - The middleware sets the normalized string underneath the meta property, m, because Share meta properties don't undergo OT and currently don't get sent to clients at all. You can't do the same thing with fields under snapshot.data, because the client issuing the commit would be unaware of the server-side change made before the commit.

Then, you'd sort like this - you sort on _m.myTitleLowerCase because sharedb-mongo puts the meta property on _m:

connection.createSubscribeQuery(collection, { $sort: { '_m.myTitleLowerCase': 1 } });
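To illustrate why sorting on a lowercased copy helps, here is a minimal in-memory sketch. The sample titles are made up, and it uses plain JavaScript arrays rather than ShareDB or Mongo; it only mirrors the idea that a binary string comparison puts uppercase ASCII letters before lowercase ones, while comparing lowercased keys does not.

```javascript
// Hypothetical sample titles, for illustration only.
var titles = ['banana', 'Cherry', 'apple'];

// The default sort compares UTF-16 code units, so 'C' (67) sorts before 'a' (97).
var caseSensitive = titles.slice().sort();

// Comparing lowercased keys is the in-memory analogue of sorting on
// a stored normalized field such as _m.myTitleLowerCase.
function compareByLowerCase(a, b) {
  var x = a.toLowerCase();
  var y = b.toLowerCase();
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}
var caseInsensitive = titles.slice().sort(compareByLowerCase);

console.log(caseSensitive);   // [ 'Cherry', 'apple', 'banana' ]
console.log(caseInsensitive); // [ 'apple', 'banana', 'Cherry' ]
```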

Option 2 - Use Mongo collation

Summary - This would require a small addition to sharedb-mongo before it would work, and it also requires Mongo 3.4+. You can do it in place without adding more data to the documents, though indexes do take some extra disk space. Also, Mongo only lets you specify one collation per query, so the collation would also affect filtering on string fields, for any query that used collation.

Mongo 3.4 added support for collation when comparing text strings.

With a plain Mongo client, you'd do so using cursor.collation:

db.my_collection.find(query)
  .sort({myTitle: 1})
  .limit(1)
  .collation({locale: 'en_US', strength: 1 /*or 2*/});

For performance, you'd probably want a case-insensitive index on that field, meaning an index created with the same collation as the query.
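As a rough sketch of creating such an index, assuming you have a MongoDB Node driver collection handle for my_collection: the helper name below is made up for illustration, and strength 2 is just one of the two values mentioned above. Mongo only uses a collated index for queries that specify the same collation.

```javascript
// Hypothetical helper: creates an index on myTitle that uses a
// case-insensitive collation. `collection` is assumed to be a MongoDB
// Node driver Collection, which provides createIndex(keys, options).
function ensureCaseInsensitiveTitleIndex(collection) {
  return collection.createIndex(
    {myTitle: 1},
    {collation: {locale: 'en_US', strength: 2}}
  );
}
```

The rough mongo shell equivalent would be db.my_collection.createIndex({myTitle: 1}, {collation: {locale: 'en_US', strength: 2}}).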

In sharedb-mongo, you generally specify query modifiers using dollar-sign prefixed query properties, like $sort or $hint. So it would look something like:

connection.createSubscribeQuery(collection, {
  $sort: { myTitle: 1 },
  $collation: {locale: 'en_US', strength: 1 /*or 2*/},
});

However, sharedb-mongo doesn't implement $collation. Yet!

It wouldn't be too hard to add it. It involves adding an entry to this map, with associated unit/integration tests.

The one slightly tricky part with the tests is that Share runs its tests against Mongo server 2.6, 3.6, and 4.0, and collation won't work when running against 2.6. I believe it should be possible to check the MONGODB_VERSION env variable set in Travis and conditionally run specific test cases that way.
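A minimal sketch of that version check, assuming MONGODB_VERSION holds a dotted version string such as '2.6' or '3.6' (the exact format used in the Travis config is an assumption here, and supportsCollation is a made-up helper name):

```javascript
// Hypothetical helper: returns true when the version string is at least 3.4,
// the first Mongo release with collation support.
function supportsCollation(version) {
  var match = /^(\d+)\.(\d+)/.exec(version || '');
  if (!match) return false;
  var major = Number(match[1]);
  var minor = Number(match[2]);
  return major > 3 || (major === 3 && minor >= 4);
}

var collationSupported = supportsCollation(process.env.MONGODB_VERSION);

// In a Mocha test file, the collation cases could then be gated like this:
// (collationSupported ? describe : describe.skip)('$collation', function() {
//   /* collation test cases */
// });
```

Note that an unset or unparseable variable is treated as "not supported", so the collation tests would be skipped rather than fail in that case.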

ShareDB Mongo issue to add $collation - share/sharedb-mongo#70

Option 3 - Shallow copy and sort client-side

I don't have time to flesh this out right now, but if you're subscribing to all matching data - in other words, not using a limit - then you could do the sorting client-side on a shallow copy of the results.
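A rough sketch of what that could look like, assuming each result is an object with a data property (as ShareDB docs are) and that the title lives in data.myTitle. The helper name and the choice of the 'en' locale are assumptions for illustration, not part of ShareDB.

```javascript
// Hypothetical helper: returns a new array sorted case-insensitively by
// data[field]. The input array is copied with slice(), so the original
// results array is left untouched.
function sortByFieldIgnoringCase(docs, field) {
  return docs.slice().sort(function(a, b) {
    var x = String((a.data || {})[field] || '');
    var y = String((b.data || {})[field] || '');
    // sensitivity 'accent' treats 'a' and 'A' as equal but 'a' and 'á' as different.
    return x.localeCompare(y, 'en', {sensitivity: 'accent'});
  });
}
```

With a subscribed query, this would be called as something like sortByFieldIgnoringCase(query.results, 'myTitle'), and called again whenever the results change.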
